This content, written by Sooji Kim, was originally posted on the Looker Blog on Aug 16, 2017. The content is subject to limited support.
If you’re anything like me, as soon as your manager came to you with the novel idea of testing the company’s website to improve conversion, you might have done a few (or all) of these things.
And, if you do actually search “how to A/B test,” you’ll get a ton of results—62,400,000 to be exact-ish. From beginner’s guides to “proven” tactics and ideas, it can get pretty overwhelming to figure out how to get your testing strategy and process started.
So when it came to A/B testing looker.com, I started where any employee of a data-obsessed company would: with our web analytics data. With that came a starting point for testing ideas, strategies, and processes that we continue to optimize and fine-tune, which I’ll be sharing a bit of here. I’m not going to call it a “definitive guide” or claim a “proven list” of things to do, but this is how we did it. And hopefully you can learn something from our successful and failed tests…So let’s get started!
Generating test ideas is both the most fun and the most difficult part of testing. When I started by searching the internets for ideas, I got more test suggestions than I knew what to do with. There are those tried and true tests you can run, from changing the copy to the color of a button. The hard part is figuring out what will have the highest impact.
Here at Looker, our resources are limited, so it was important to focus on the experiments that would have the biggest impact for the least effort. To do this, I turned to our web analytics data.
Through that, we were able to identify key pages that needed help, and how to target those tests to specific audience segments. Great! So, web analytics gave us the what, but we were still missing the why.
To find out why people were or weren’t converting, we ran polls on key pages, asking people a series of questions that would better inform us about why they were interacting with the site a certain way. This helped us learn which parts of these high-value pages to test.
Ideas also came from a series of brainstorming sessions with different groups and departments throughout the company, surveying current users, and, yes, some Google search results.
Once we had the ideas, we had to define the goal of each test and turn it into a hypothesis. We couldn’t just test a button color for the sake of testing a button color. Instead, we had to formulate each test as a hypothesis so that the results could be properly evaluated against the expected outcome and goal.
Once the goals and hypotheses were finalized for each test, we prioritized which tests to deploy by weighing a few variables.
From there, we deploy the test to web visitors, working across the marketing department to develop the copy, create the visual assets, and code the variations. A lot of our early tests were simple A/B tests that could be deployed quickly and easily, with the potential for high impact.
Tests would run anywhere from 2-4 weeks, depending on our traffic at any given time. More important than the length of time for a test, though, is your sample size. The smaller your sample size, the higher your risk of reaching a false positive.
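If you want a ballpark figure for “big enough,” a standard power calculation helps. Here’s a minimal sketch in Python using the statsmodels library; the 10% baseline conversion rate and 1.5-point lift are made-up numbers, not from our tests.

```python
# Rough sample-size estimate for a two-variation A/B test.
# The conversion rates below are illustrative, not real Looker numbers.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_cvr = 0.10   # current conversion rate on the page
target_cvr = 0.115    # smallest lift we'd care about detecting

# Turn the two proportions into a standardized effect size.
effect_size = abs(proportion_effectsize(baseline_cvr, target_cvr))

# Solve for visitors per variation at 5% significance and 80% power.
n_per_variation = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,   # acceptable false-positive rate
    power=0.80,   # chance of catching the lift if it's real
)
print(f"Visitors needed per variation: {n_per_variation:,.0f}")
```

Under those assumptions, the answer comes out to a few thousand visitors per variation, which is why lower-traffic pages need longer test windows.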
Unfortunately, I’m not a statistician or data scientist. So while I understand the importance of statistical significance, I’m not the best person to figure out how to calculate that.
Thankfully, there are a lot of products out there that calculate statistical significance for you—including Looker! Looker’s A/B Testing Block lets you quickly see the statistical significance of a test variation for the control and test user groups. You can also drop in different dimensions to see how the user groups, key test metrics, and test variations perform for different user attributes.
/end shameless plug
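For the curious, the check that tools like this perform often boils down to something like a two-proportion z-test. Here’s a rough Python sketch with made-up numbers; it’s just an illustration of the math, not what the Looker Block actually runs under the hood.

```python
# Illustrative significance check for a finished A/B test.
# Visitor and conversion counts below are made up for the example.
from statsmodels.stats.proportion import proportions_ztest

conversions = [180, 212]   # control, variation
visitors = [4000, 4000]    # visitors exposed to each version

# Two-sided z-test on the difference between the two conversion rates.
z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)

control_cvr = conversions[0] / visitors[0]
variation_cvr = conversions[1] / visitors[1]
print(f"Control CVR:   {control_cvr:.2%}")
print(f"Variation CVR: {variation_cvr:.2%}")
print(f"p-value:       {p_value:.3f}  (call it significant only if below 0.05)")
```

If the p-value stays above your threshold, the honest answer is “inconclusive,” which happens more often than you might expect.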
As anyone who has run a test knows, the results can be . . . unexpected. Though a majority of our tests come back as inconclusive, there have been a few that took all of us by surprise.
One simple test was to change some of the wording on the page where visitors request a demo of Looker. We were surprised by how big an impact such small changes had. We updated the copy to highlight important, yet less prominent, product features and saw a 13% higher conversion rate (CVR) than the control.
This is a great example of a test that made a small change, was easy to execute, and had a significant impact on CVR on a key page.
We have quite a few different personas and audiences that visit looker.com. We’re always thinking about the best way to deliver content to the visitor based on what they’re interested in.
One example of this is our video demo page. We have both a technical demo and a business user demo, so we decided to feature both videos on that page. We believed visitors would select the video they wanted to watch and, if they liked the content, fill out the form to request a trial.
Over the next few weeks, though, we noticed the CVR for this page decreasing, and we didn’t know what the problem was. Was it the videos themselves? Were we giving visitors too many choices about what action to take? Or were they just not seeing anything they liked?
Instead of making a guess, we decided to test it.
First, we decided to simplify the page, working with the hypothesis that featuring just one video would make the next step (form-fill) clearer, which would lead to an increase in CVR for this page.
We then added a layer of complication by introducing another variable to this test: we would also test which of the two videos converted better by serving visitors either the technical demo or the business demo.
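One practical detail when serving multiple variations: a returning visitor should keep seeing the same version, or your data gets muddy. A common way to do that, sketched below with hypothetical names rather than our actual setup on looker.com, is to hash a stable visitor ID into one of the buckets.

```python
# Minimal sketch of deterministic bucketing for an A/B/n test.
# Variation names and the experiment key are hypothetical examples.
import hashlib

VARIATIONS = ["control", "single_video_technical", "single_video_business"]

def assign_variation(visitor_id: str, experiment: str) -> str:
    """Hash visitor + experiment so the same visitor always lands in the same bucket."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return VARIATIONS[int(digest, 16) % len(VARIATIONS)]

# Example: a returning visitor hits the demo video page again and
# sees the same variation they saw last time.
print(assign_variation("visitor-12345", "demo-video-layout"))
```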
After a few weeks of testing, the results showed that our hypothesis about the page layout was correct. In the single-video variation, we saw a 15% increase in visits to the confirmation page. And since rolling the single-video layout out to 100% of visitors, we’ve seen even bigger increases in conversions for this page.
But it wasn’t so clear on the content side. While one video did perform slightly better than the other, the results were still too close to call, which meant we couldn’t reach statistical significance within the test timeframe. So the question now isn’t “What is the best content for this page?” but rather “What is the best content for this web visitor on this page right now?”
At Looker, we’re constantly testing the website. What you see on looker.com today could be completely different from what you see on Friday—not because we’re always redesigning, but because you’ll see a different variation of that test on Friday. Be sure to stop by to see what tests we’ve got going on!
Ready to start testing? Learn more about the A/B Testing Looker Block and see how Looker can help you understand your test results.