How we A/B test Looker.com at Looker

This content, written by Sooji Kim, was originally posted on the Looker Blog on Aug 16, 2017. The content is subject to limited support.

If you’re anything like me, as soon as your manager came to you with the novel idea of testing the company’s website to improve conversion, you might have done a few (or all) of these things.

  1. Stare blankly into the space between their eyes and nod.
  2. Say you’ll have something for them in a week.
  3. Try to figure out where to start.
  4. Google “how to A/B test.”

And, if you do actually search “how to A/B test,” you’ll get a ton of results—62,400,000 to be exact-ish. From beginner’s guides to “proven” tactics and ideas, it can get pretty overwhelming to figure out how to get your testing strategy and process started.

So when it came to A/B testing looker.com, I started where any employee of a data-obsessed company would: with our web analytics data. With that came a starting point for testing ideas, strategies, and processes that we continue to optimize and fine-tune, which I’ll be sharing a bit of here. I’m not going to call it a “definitive guide” or claim a “proven list” of things to do, but this is how we did it. And hopefully you can learn something from our successful and failed tests…So let’s get started!

Gathering and prioritizing test ideas

Generating test ideas is both the most fun and the most difficult part of testing. When I started by searching the internets for ideas, I got more test suggestions than I knew what to do with. There are those tried and true tests you can run, from changing the copy to the color of a button. The hard part is figuring out what will have the highest impact.

Here at Looker, our resources are limited, so it was important to focus on the experiments that would deliver the biggest impact with the fewest resources. To do this, I turned to our web analytics data to help us answer the following questions:

  • What are people doing on our website?
  • Where are people dropping off?
  • Who are the people that are converting?
  • Who are the people that aren’t converting?

Through that, we were able to identify key pages that needed help, and how to target those tests to specific audience segments. Great! So, web analytics gave us the what, but we were still missing the why.
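
As a rough illustration of that first quantitative pass, here is a minimal pandas sketch of the kind of cut this involves. The column names and numbers below are invented for illustration, not our actual analytics schema:

```python
# A toy version of the "what are people doing / who converts" analysis.
# The DataFrame stands in for a web analytics export; the columns
# (visitor_id, page, segment, converted) are hypothetical.
import pandas as pd

sessions = pd.DataFrame({
    "visitor_id": [1, 2, 3, 4, 5, 6],
    "page":       ["/demo", "/demo", "/pricing", "/demo", "/pricing", "/demo"],
    "segment":    ["technical", "business", "business", "technical", "technical", "business"],
    "converted":  [1, 0, 0, 1, 0, 0],
})

# Conversion rate per page flags the key pages that need help.
by_page = sessions.groupby("page")["converted"].agg(["count", "mean"])
by_page.columns = ["visits", "conversion_rate"]

# The same cut by audience segment shows who is and isn't converting.
by_segment = sessions.groupby("segment")["converted"].mean()

print(by_page)
print(by_segment)
```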

To find out why people were or weren’t converting, we ran polls on key pages, asking people a series of questions that would better inform us about why they were interacting with the site a certain way. This helped us learn which parts of these high-value pages to test.

Ideas also came from a series of brainstorming sessions with different groups and departments throughout the company, surveying current users, and, yes, some Google search results.

Deploying tests

Once we had the ideas, we had to define the goals of the test and turn them into hypotheses. We couldn’t just test a button color for the sake of testing a button color. Instead, we had to formulate a test as a hypothesis so that results could be properly evaluated against the expected outcome and goal.

Once the goals and hypotheses were finalized for each test, we prioritized which tests to deploy by looking at a few variables:

  • What question do we want to answer right away?
  • Which tests will yield the highest impact?
  • Which tests require the least amount of resources?

From there, we deployed each test to web visitors, working across the marketing department to develop the copy and visual assets and to code the variations. A lot of our early tests were simple A/B tests that could be deployed quickly and easily, with a potential for high impact.
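
For a sense of the mechanics, here is a minimal sketch of how a testing tool typically assigns a visitor to a variation: hash a stable visitor ID so the same person always sees the same version. This is not our actual tooling, and the experiment and variation names are made up:

```python
# Minimal sketch of deterministic variation assignment: hash a stable
# visitor ID so repeat visits land in the same bucket.
import hashlib

def assign_variation(visitor_id: str, experiment: str,
                     variations=("control", "variation")) -> str:
    """Map a visitor to a variation for a given experiment, deterministically."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variations)
    return variations[bucket]

# The same visitor always gets the same variation for the same experiment.
print(assign_variation("visitor-123", "demo-page-copy"))
print(assign_variation("visitor-123", "demo-page-copy"))  # identical result
```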

Measuring results

Tests would run anywhere from two to four weeks, depending on our traffic at any given time. More important than the length of time for a test, though, is your sample size. The smaller your sample size, the higher your risk of reaching a false positive.
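
If you want a rough feel for the numbers, the classic two-proportion approximation gives a back-of-the-envelope sample size. This sketch uses made-up baseline and lift figures, not our actual traffic:

```python
# Back-of-the-envelope sample size per variation, using the standard normal
# approximation for comparing two proportions. The baseline rate and minimum
# detectable lift below are illustrative numbers only.
from statistics import NormalDist

def sample_size_per_variation(baseline, relative_lift, alpha=0.05, power=0.8):
    """Visitors needed in each group to detect `relative_lift` over `baseline`."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2

# e.g. a 5% baseline conversion rate, hoping to detect a 15% relative lift
print(round(sample_size_per_variation(0.05, 0.15)))  # about 14,000 per group
```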

Unfortunately, I’m not a statistician or data scientist. So while I understand the importance of statistical significance, I’m not the best person to figure out how to calculate that.

Thankfully, there are a lot of products out there that calculate statistical significance for you—including Looker! Looker’s A/B Testing Block makes it quick and easy to see the statistical significance of a test variation for the control and test user groups. You can also drop in different dimensions to see how the user groups, key test metrics, and test variations perform for different user attributes.

/end shameless plug
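
If you'd rather see what's going on under the hood, the core of most A/B significance checks is a two-proportion z-test. This is just a sketch of that math with invented visit and conversion counts, not the Block itself:

```python
# A minimal two-proportion z-test: is variation B's conversion rate really
# different from A's? The counts passed in below are invented for illustration.
from statistics import NormalDist

def ab_significance(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for conversion rates A vs. B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = ab_significance(conv_a=420, n_a=10_000, conv_b=480, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # p below 0.05 suggests a real difference
```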

Surprising results

As anyone who has run a test knows, the results can be . . . unexpected. Though a majority of our tests come back as inconclusive, there have been a few that took all of us by surprise.

Copy test

One simple test was to change some of the wording on the page where visitors request a demo of Looker. We were surprised by how just making small changes resulted in a big impact. We updated the copy to highlight important, yet less prominent, product features and saw a 13% higher conversion rate (CVR) than the control.

This is a great example of a test that made a small change, was easy to execute, and had a significant impact on CVR on a key page.

Layout & content test

We have quite a few different personas and audiences that visit looker.com. We’re always thinking about the best way to deliver content to the visitor based on what they’re interested in.

One example of this is our video demo page. We have both a technical and a business user demo, so we decided to feature both videos on the page. We believed visitors would select the video they wanted to watch. Then, if they liked the content, they would fill out the form to request a trial.

Over the next few weeks though, we noticed CVR for this page decreasing, and we didn’t know what the problem was. Was it the videos themselves? Were we showing the visitor too many choices in the action they should take? Or were they just not seeing anything they liked?

Instead of making a guess, we decided to test it.

First, we decided to simplify the page, working with the hypothesis that featuring just one video would make the next step (form-fill) clearer, which would lead to an increase in CVR for this page.

We then added a layer of complication by introducing another variable to this test. We would also test which of the two videos would convert better by either serving the visitor the technical demo or the business demo.

After a few weeks of testing, the results showed that our hypothesis for the layout of the page was correct. In the single-video variation, we saw a 15% increase in visits to the confirmation page. And since rolling the single-video layout out to 100% of visitors, we’ve seen even bigger increases in conversions for this page.

But it wasn’t so clear on the content side. While one video did perform slightly better than the other, the results were too close to call, which meant we couldn’t reach statistical significance in the test timeframe. So the question now isn’t “What is the best content for this page?” but instead “What is the best content for this web visitor on this page right now?”

The recap (or TL;DR)

  • Ideas are everywhere! So it’s important to do your own research before you A/B test.
  • Look at both quantitative and qualitative information so you get both the what and why to formulate your tests around.
  • Make sure each of your tests has a goal and a hypothesis.
  • Prioritize your tests based on your own requirements, which should be clearly defined before you start testing.
  • Keep an eye on statistical significance so that you know your test results aren’t random or by chance.

At Looker, we’re constantly testing the website. What you see on looker.com today could be completely different from what you see on Friday—not because we’re always redesigning, but because you might be served a different test variation on Friday. Be sure to stop by to see what tests we’ve got going on!

Ready to start testing? Learn more about the A/B Testing Looker Block and see how Looker can help you understand your test results.
