3 A/B Testing Mistakes You Should Avoid


A/B, or split, testing is a tried and true method for introducing layout and content changes, conducting audience research, and optimizing conversion rates. On the surface, the process seems straightforward and universal: choose the audience, split the experience, see which variation performs better, and apply it. Like any tool, though, it can be used ineffectively, leading to wasted time or unreliable results.

Below are three major mistakes to steer clear of for better A/B testing results.

1. Not having a defined strategy and a hypothesis

Between popular from-the-rooftop examples like “company changes an action button color, gains 130+ percent conversions” and sobering stats like “only one in seven A/B tests shows results”, it can be hard to determine the priorities, expectations, and approaches for split testing. A common pitfall is to let personal tastes or the numerous online lists of “X best A/B testing cases” entirely dictate what, when, and how to test. This in turn results in too much time and too many resources wasted on cosmetic changes and on isolated, haphazard tests run in fits and starts, based on unsubstantiated (or even absent) hypotheses.

As an example, let’s take a look at testing open rates for different subject lines in an email marketing campaign. This process often consists of the following steps:

  • Single out a small sample group from the list of recipients.
  • Create two slightly different subject lines and send the email to that group, split 50/50 between them.
  • Wait for the results.
  • Send the email to the rest of the recipients using the ‘winner’ subject line.

This is a viable way to test individual emails; however, a more systematic approach to split testing works better. This is what you should do differently:

  • Use large sample sizes. Drive up traffic allocation to 50–100% of targeted recipients.
  • Define goals and KPIs. Use statistical formulas or significance calculators to verify the results instead of eyeballing the numbers.
  • Apply two completely different approaches for constructing subject lines, and then test them both in a series of emails. For example, what prompts better engagement and reaction from your target audience—creative, unique, intriguing pitches with a hook and a personal touch? Or straightforward product offers with pricing information front and center?
  • Run tests on a series of differently worded emails (instead of just one) to check the hypothesis.
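How large is "large"? One common way to get a rough answer is the standard sample size approximation for comparing two proportions. The sketch below is only an illustration: the function name, the 20% baseline open rate, and the 23% target are assumptions made for the example, not figures from any real campaign.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate number of recipients needed per variant (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical numbers: 20% baseline open rate, hoping to detect a lift to 23%.
print(sample_size_per_variant(0.20, 0.23))
```

With these assumed inputs the estimate lands at a few thousand recipients per variant, and it grows quickly as the expected difference shrinks. That is why a small sample group rarely settles a subtle wording change.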

If statistical significance is reached, the results will have the potential to not just improve the performance of one marketing email but shape a strategy for numerous future campaigns.
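For readers who want to check significance by hand rather than rely on an online calculator, a two-proportion z-test is one widely used option. The following is a minimal sketch using only the Python standard library; the function name and the open counts are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conversions_a, total_a, conversions_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = conversions_a / total_a
    rate_b = conversions_b / total_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (conversions_a + conversions_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: 600 of 3000 opens for subject line A, 690 of 3000 for B.
z, p_value = two_proportion_z_test(600, 3000, 690, 3000)
print(f"z = {z:.2f}, p = {p_value:.4f}, significant at 5%: {p_value < 0.05}")
```

A p-value below the chosen threshold (5% is a common convention) suggests the difference is unlikely to be noise. It says nothing about whether the difference is large enough to matter for the business, which is where the predefined goals and KPIs come in.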

A similar approach can be applied when testing website and app elements, content, titles, etc. Its scaled-up version could also be used to create an entire A/B testing framework for a project. First, prioritize significant changes, come up with actual hypotheses (strategy-defining and based on audience research), and define goals. Only then schedule and run tests according to the laid-out plan.

2. Not segmenting the audience

Failing to properly segment audiences is a marketing sin, and A/B testing is no exception. Yet this aspect is often neglected, which can result in a skewed interpretation of users’ actions. Direct website traffic and visitors from organic search, people from different age groups and locations, new and returning users: these groups often behave differently. So when preparing for a split test, it’s important to differentiate between them and apply segmentation.

For example, in eCommerce, new visitors are often more inclined to poke around and study the information provided, while returning ones are more likely to convert. The novelty effect can also come into play: those who’ve already been to your website or used your app might notice that something has changed and pay extra attention to it, specifically because they know it’s new (a video, new offer, design change, and so on). This needs to be taken into consideration to avoid unreliable or inapplicable test results.
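To see why segmentation matters, it can help to break test results down by segment instead of looking only at the overall rate. The sketch below assumes a simple, hypothetical record format of (segment, variant, converted) tuples; the data is invented for illustration and far too small to draw conclusions from.

```python
from collections import defaultdict

# Hypothetical visit records: (segment, variant, converted)
visits = [
    ("new", "A", False), ("new", "A", True),
    ("new", "B", False), ("new", "B", False),
    ("returning", "A", True), ("returning", "A", False),
    ("returning", "B", True), ("returning", "B", True),
]


def conversion_by_segment(records):
    """Return {(segment, variant): conversion rate} for the given records."""
    totals = defaultdict(lambda: [0, 0])  # key -> [conversions, visitors]
    for segment, variant, converted in records:
        totals[(segment, variant)][0] += int(converted)
        totals[(segment, variant)][1] += 1
    return {key: conv / count for key, (conv, count) in totals.items()}


rates = conversion_by_segment(visits)
for (segment, variant), rate in sorted(rates.items()):
    print(f"{segment:<10} {variant}: {rate:.0%}")
```

In this toy data, variant B does worse with new visitors and better with returning ones, a pattern that a single blended conversion rate would hide.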

Researching and understanding the audience both prior to and during testing is key. For this, use Google Analytics reports, AdWords targeting, heat and scroll maps, as well as user session recordings. Pick and choose from the multitude of testing tools available for collecting statistics, segmenting effectively, and learning more about your visitors to boost CRO.

3. Not taking into account different platforms (browsers and devices)

A great user experience is personalized, seamless, and consistent across major platforms. For example, a video streaming service with remarkable content and a well thought-out concept will fail if the buffering is slow, UX is subpar, and UI is clunky. Unless it’s a project of one of the media giants, a killer Chrome extension or an Android application with no convenient experience offered for Safari or iOS will see a portion of its revenue lost.

The same applies to split testing: an email subject line that works on a desktop can perform poorly on a mobile device, where it gets truncated to fit the screen. The same user’s sessions across channels might be treated as different users, which can trigger different content in a split test, resulting in both inaccurate results and confusion for that user. Likewise, content that renders differently from browser to browser will not perform consistently.

It is crucial then to ensure that every user gets the version of the test that was intended for them, and that they see it in the intended way. Things to keep in mind here include:

  • Cross-browser experience. In eCommerce, 17% of abandoned checkouts can be attributed to website crashes and errors. This and other issues can be at least partially solved by running cross-browser compatibility tests.

Which ones to focus on? As of 2019, Chrome’s market share has reached 62%. Safari comes second at 15%. However, where cross-platform testing is concerned, different versions, engines, and legacy browsers can amount to a lengthy list to go through. Use Google Analytics Browser & OS reports to evaluate technical data for a particular property, and employ an in-house QA solution or one of many third-party browser testing tools to run compliance checks.

  • Mobile, desktop, and multiple device experience. It depends on the product, but in general, the mobile-first approach has long been the status quo. Today, websites, mailing templates, and poll forms are expected to be designed as adaptive and responsive from the get-go, so mobile user experience should be a priority. This tendency reflects current device ownership and usage statistics: while multi-device ownership still prevails, the number of mobile-only internet users is soon to reach 50 million in the US alone. Mobile experience tests can have an impact on 2% of users, and should always be taken into account.

That said, even with clearly defined and separated mobile vs. desktop visitors, the question of the multiple-device user journey and testing still stands. The solution? One option is to require registration to access website or in-app content. Tracking account activity can then effectively solve a host of analytical issues, including the one described above.
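Once users are identified by account, one common technique for keeping their experience consistent is to derive the test variant from the account ID itself, so the same person lands in the same group on every device. The sketch below is an assumption-laden illustration: the function name, the experiment label, and the account IDs are all hypothetical, and real testing tools may handle assignment differently.

```python
import hashlib


def assign_variant(account_id, experiment, variants=("A", "B")):
    """Deterministically map an account to a variant for a given experiment."""
    # Including the experiment name keeps assignments independent across tests.
    key = f"{experiment}:{account_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# The same (hypothetical) account gets the same variant on every call,
# regardless of which device or browser the request comes from.
print(assign_variant("user-1042", "checkout-redesign"))
print(assign_variant("user-1042", "checkout-redesign"))
```

Because the assignment depends only on the account ID and the experiment name, no shared lookup table is needed between the website, the app, and the email system.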

For cases when exclusively account-based access isn’t an option, Google User ID and Google Signals (Cross Device) reports can be of some use. Announced in 2018, the latter is still in beta, but it does provide relevant—albeit limited—information and can be used for cross-device remarketing. However, it should be noted that there is no definitive answer to the question of multiple-device user tracking just yet.

Final Thoughts

Ceaseless testing and user analytics (as well as surveys and feedback gathering) are the bread and butter of digital marketing when it comes to evaluating customer satisfaction, one of the key factors for acquisition and retention. On the road to conversion rate optimization, it is of utmost importance to experiment and split test changes methodically.

Titles can always be tweaked and button colors played with, but a research-based framework and a strategy for real issues and solutions with major results should be the priority. Acquire data, segment audiences, define goals, and focus on value—and the rest will follow.

Author Bio:

Elena Yakimova is the Head of Web Testing Department at software testing company a1qa. She started her career in QA in 2008. Now Elena’s in-house QA team consists of 115 skilled engineers who have successfully completed more than 250 projects in telecom, retail, e-commerce, and other verticals.