How and when to use A/B or Multivariate Test Design
When running a testing programme you will have to decide for each test whether it should be run as an A/B test or a multivariate test. The decisions you make when designing your experiments will significantly affect the depth of insight you gain, the speed of testing and of delivering winning variations, and therefore the overall impact of your testing efforts.
The 1,024 Variation Test
So when should we run that 1,024 variation multivariate test (Google actually did this) and when would an A/B test be more appropriate? First, let’s consider some of the important variables:
- Time to test - The number of variations in your experiment will affect how long your test needs to run.
- Traffic required - Your traffic volumes will also determine how quickly you can get results.
- Depth of Insights - The number of variations and combinations will impact the amount of insight and learnings that you can draw from your test.
- Speed of Deployment of winning variations - Depending on the complexity of the experiment/variations, you may need to allocate more design and build time to tests with a large number of variations.
- Maturity of site experience - Whether your experience is already well optimised or needs a complete overhaul will affect which types of test are most suitable.
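To make the link between number of variations, traffic and time to test more concrete, here is a rough sketch in Python. It assumes an even traffic split, a standard two-sided two-proportion z-test at 95% confidence and 80% power, and no correction for multiple comparisons. The function names and the example figures (2,000 visitors a day, a 3% baseline conversion rate, a 10% relative uplift) are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_uplift, alpha=0.05, power=0.8):
    """Approximate visitors needed in each variation (two-sided two-proportion z-test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def estimated_days(num_variations, daily_visitors, baseline_rate, relative_uplift):
    """Days needed to fill every variation, assuming an even split (control counts as one)."""
    per_variation = sample_size_per_variation(baseline_rate, relative_uplift)
    return ceil(num_variations * per_variation / daily_visitors)


# Hypothetical inputs: 2,000 visitors/day, 3% baseline rate, 10% relative uplift.
for n in (2, 4, 27):
    days = estimated_days(n, daily_visitors=2000, baseline_rate=0.03, relative_uplift=0.10)
    print(f"{n} variations: roughly {days} days")
```

Under these assumptions the required run time grows in direct proportion to the number of variations, which is why traffic volume matters so much when choosing a test design.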
Some Caveats
- This post is written on the basic assumption that you don't have traffic to burn. Our experience is that for many businesses testing takes longer than they would like, and factoring this into experiment design is important.
- When I refer to traffic I am assuming that the traffic converts at a good rate and there will be good numbers of final conversions for testing.
A/B Testing
A/B tests can be run in a few different ways: as single tests on single features, as a series of tests (e.g. A/B, B/C, etc.) or as straight A/B/C/n tests.
Pros
- Quicker to design, build and configure in your testing tool.
- Takes significantly less time to get meaningful results.
- Winning variables can be implemented more quickly due to faster results.
- Easier to segment further in analytics.
- Easier to evaluate a wide range of metrics.
Cons
- The depth of insight is shallower. It could take a long series of tests to uncover the best combination of variables that could be discovered during a single multivariate test.
- You may never get round to testing “unlikely” ( <- assumptions, yikes!) variations/combinations which could easily be tested as part of a multivariate test.
When is it most suitable?
- When you have small or above average traffic numbers.
- When you want to segment your results, i.e. by traffic source, device, etc.
- When you want results quickly.
Multivariate Testing
Multivariate testing allows you to combine multiple variations for multiple sections of the page and identify which combination of variations delivers the greatest impact.
Full-factorial multivariate testing is the most common form. It tests every possible combination, so if you have 3 variations in each of 3 sections you will have 3 × 3 × 3 = 27 combinations. With an even split, each combination will receive one twenty-seventh of the test traffic.
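The full-factorial arithmetic can be sketched in a few lines of Python. The section names and variation labels below are hypothetical placeholders, and the traffic share assumes an even split across combinations.

```python
from itertools import product

# Hypothetical page sections and variations, for illustration only.
sections = {
    "headline": ["H1", "H2", "H3"],
    "hero_image": ["I1", "I2", "I3"],
    "call_to_action": ["C1", "C2", "C3"],
}


def full_factorial(sections):
    """Return every combination of one variation per section."""
    names = list(sections)
    return [dict(zip(names, combo)) for combo in product(*sections.values())]


combinations = full_factorial(sections)
traffic_share = 1 / len(combinations)  # even split across all combinations

print(len(combinations))  # 27
print(f"Each combination receives {traffic_share:.1%} of test traffic")
```

The count multiplies quickly: ten sections with two variations each, for example, would give 2^10 = 1,024 combinations. That is just one arrangement that reaches that number, not a description of any particular real test.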
Pros
- Allows you to test a number of hypotheses at once.
- Easy to involve a wider range of ideas from different team members, which can help remove any disagreement across the team.
- Greater depth of insight gained by testing a range of variations.
- You stack the odds in your favour by running more variations and making it more likely that you will see an improvement over the control.
Cons
- Depending on the scope of the test it may require more effort to prepare content, visuals and build the different variations, as well as setting up all the variables in your testing tool. This will affect how quickly you can design and launch the test. However, it may still be an efficient way to work compared to running a series of A/B tests and having to design, build, configure multiple times.
- Requires significantly more traffic in order to reach the minimum levels needed to call a test.
- Can take a long time to reach statistical significance.
- Even once you reach statistical significance, if there are too many variables or not enough traffic then it can prove difficult to segment data further when analysing the results.
When is it most suitable?
- When you have large traffic volumes.
- When you are looking to refine mature page templates or designs.
Batch Testing
Batch testing is essentially an A/B test in which the variation bundles together a range of changes across different sections. It can be really useful if you have a range of well-researched, user-driven hypotheses and have constraints on the number of tests that you can run.
Pros
- You can test a range of hypotheses.
- Quicker to run than a multivariate test.
- Often quicker to implement.
Cons
- Further testing will probably be required to determine the impact of individual elements. Some elements may be improvements while others may have a negative impact. You will have to rely on the net impact of changes being either positive or negative.
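A small Python illustration of why a batch test only shows the net impact. The per-element effects below are invented for the example (in a real batch test you would not know them), and the sketch assumes the elements act independently and combine multiplicatively.

```python
# Hypothetical relative effects of each change on conversion rate (not real data).
element_effects = {
    "new_headline": 0.06,
    "shorter_form": 0.04,
    "new_hero_image": -0.05,
}


def net_effect(effects):
    """Combine relative effects multiplicatively, assuming no interaction between elements."""
    combined = 1.0
    for effect in effects.values():
        combined *= 1 + effect
    return combined - 1


batch_uplift = net_effect(element_effects)
print(f"Net uplift seen by the batch test: {batch_uplift:.1%}")  # 4.7%
```

In this made-up case the batch test would report a net uplift of about 4.7%, with nothing to indicate that one of the three changes is dragging the result down. Isolating that element is what the follow-up tests are for.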
When is it most suitable?
- When you are looking to see the impact of a range of changes quickly without a huge number of visitors to the test page.
- When you have strong insights from user research or partially validated hypotheses from other tests.
Summary
We judge each test separately at the experiment design stage and choose the most appropriate type of test to run. In most cases we tend to use simple A/B testing because we like to get results quickly. We also like to go into a lot of detail when analysing test results, segmenting the data further to see what else we can learn. A/B testing allows us to move through testing cycles more quickly and build momentum for our testing programme. In some cases, where we believe there is a significant opportunity to change a number of variables at once, we will run batch tests. This approach allows us to attempt to improve net performance first and then return later to find the optimal configuration of variables.
Undoubtedly, multivariate testing will give you far better data about the best combination of variables to improve conversion. For large sites with large traffic numbers and reasonable conversion rates it can provide rich and valuable insights. But multivariate testing is greedy: it requires a lot more traffic and time, which can be a problem for a lot of businesses.
When speed is important there are benefits to A/B testing. If a multivariate test takes 6 weeks to run, whereas an A/B test could finish in 2 weeks and let you implement an improved version, then there is a potential 4-week gap during which you are not seeing the full benefit of testing.
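As a back-of-the-envelope sketch of that gap in Python: the 6-week and 2-week durations come from the example above, while the weekly conversion volume and the uplift of the winning variation are hypothetical figures chosen only to show the arithmetic.

```python
def conversions_gained_during_gap(weekly_conversions, relative_uplift, mvt_weeks=6, ab_weeks=2):
    """Extra conversions from deploying an A/B winner while the longer test would still be running."""
    gap_weeks = mvt_weeks - ab_weeks
    return gap_weeks * weekly_conversions * relative_uplift


# Hypothetical: 500 conversions a week and a winning variation worth a 5% relative uplift.
extra = conversions_gained_during_gap(weekly_conversions=500, relative_uplift=0.05)
print(f"Roughly {extra:.0f} extra conversions over the 4-week gap")
```

This deliberately ignores the larger uplift a multivariate test might eventually find. It only illustrates the cost of waiting.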
Thanks to Matt Lacey for sharing his advice and opinions in this post. Matt Lacey is Head of Optimisation at PRWD. You can follow him on Twitter or connect on LinkedIn.