How do you know when to end your A/B or multivariate test? Many optimization teams stop their tests too early; others run them too long. We chatted with Paul Terry, Senior Optimization Consultant at SiteSpect and former Web Optimization Analyst at PRIMEDIA, to find out what to consider before ending a test.
How do you know how long a test should run?
Tests should normally run for at least two cycles (usually two weeks) to generate informative results, because it can take two to three visits (or more) before your visitors take action.
Further, tests should run until metrics maintain statistical significance for two to three days. Often, tests need to run longer before supporting metrics reach significance or at least stabilize. Stopping a test early because you think you have a winner increases the risk of statistically invalid data and may introduce time bias from events and/or conversion cycles. SiteSpect’s Time Trends charts show when the deltas for your primary metrics stabilize and when tests reach statistical significance.
Organizations new to testing typically make the mistake of running tests only on high-traffic days, but this strategy does not meet the criteria for minimum duration. I don’t recommend looking at a narrow slice of time with a lot of traffic and extrapolating it to the entire cycle.
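The two duration criteria above (a minimum of two weeks, plus significance that holds for two to three days) can be combined into a simple check. The following is a minimal sketch, not SiteSpect functionality: the function name `ready_to_stop`, the 0.05 significance threshold, and the idea of feeding in one cumulative p-value per day are all assumptions made for illustration.

```python
from typing import Sequence


def ready_to_stop(
    daily_p_values: Sequence[float],
    min_days: int = 14,      # assumption: two one-week cycles
    stable_days: int = 3,    # assumption: significance must hold this many days
    alpha: float = 0.05,     # assumption: conventional significance threshold
) -> bool:
    """Return True only when both duration criteria are met.

    daily_p_values holds one p-value per day of the test, oldest first,
    each computed on all data collected up to that day.
    """
    # Criterion 1: the test has covered the minimum duration.
    if len(daily_p_values) < min_days:
        return False
    # Criterion 2: the most recent days were all statistically significant.
    recent = daily_p_values[-stable_days:]
    return all(p < alpha for p in recent)


# A test that looks significant after five days is still too young to stop.
print(ready_to_stop([0.01] * 5))
# Fourteen days, with the last three significant: both criteria are met.
print(ready_to_stop([0.20] * 11 + [0.04, 0.03, 0.02]))
```

A check like this deliberately ignores how much traffic arrived on any single day, so a burst of high-traffic days cannot shorten the minimum duration.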
How do you know when to end your test?
- Check for danger signs.
End tests any time you suspect the Variation is operating incorrectly or is hurting relevant metrics to an unacceptable level. For example, if after a few days the Variation's conversion rate is down by 10-15%, the difference is near statistical significance, and that level of risk is too high for your organization, it’s time to end the test. It’s better to know that your hypothesis of increasing conversions was incorrect than to risk losing more conversions. At the other extreme, running a test too long wastes time waiting for marginal results and consumes test samples that could be applied to another test.
- Age-test visitors.
Sometimes, Campaigns should stop accepting new users so that time-in-Campaign is the same for all visitors before you end the test. Let tests run longer to let newer visitors age.
- Pause for QA time.
Before a release or other site change, pause your tests and re-QA them before allowing traffic to resume.
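The first check above, watching for danger signs, can be illustrated with a small calculation. This is a rough sketch under stated assumptions: it uses a standard pooled two-proportion z-test, which may differ from the statistics your testing tool reports, and the function names, the 10% loss limit, and the 0.10 "near significance" cutoff are hypothetical values chosen to mirror the example in the text.

```python
import math


def relative_lift_and_p_value(control_conv, control_n, variation_conv, variation_n):
    """Return (relative lift, two-sided p-value) from a pooled two-proportion z-test."""
    p_c = control_conv / control_n
    p_v = variation_conv / variation_n
    lift = (p_v - p_c) / p_c  # negative when the Variation converts worse
    pooled = (control_conv + variation_conv) / (control_n + variation_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variation_n))
    z = (p_v - p_c) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lift, p_value


def is_danger_sign(lift, p_value, max_loss=0.10, near_alpha=0.10):
    """Flag a Variation that is down by at least max_loss and near significance.

    max_loss and near_alpha are assumed thresholds; set them to the level
    of risk your organization can accept.
    """
    return lift <= -max_loss and p_value <= near_alpha


# Hypothetical counts: Control converts 500 of 5000, Variation 425 of 5000.
lift, p_value = relative_lift_and_p_value(500, 5000, 425, 5000)
print(f"lift={lift:.1%}, p={p_value:.4f}, danger={is_danger_sign(lift, p_value)}")
```

The thresholds are a business decision rather than a statistical one, which is why they are parameters here instead of fixed constants.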