The Conversion Chronicles, resources for improving your online conversion rates

How To Stop Thinking About The Money When Split Testing Your Website

One thing that constantly clouded my judgment with split-run testing was seeing a huge difference between test groups early in the test. I would get anxious to stop the test and switch to the "better" group. I kept wasting time checking test data after every conversion; I probably spent more time checking stats than I did thinking about what else to test. Worst of all, my lack of patience led to a lot of bad decisions.

One day, I got tired of it and decided to prove to myself the theories I was (supposedly) working by.

I set up a somewhat complex test involving three distinct attributes with two values each, displayed across two separate pages, and started waiting for the results.

That test had a twist - all attributes had the same values for all groups. I was essentially giving a placebo to all my test subjects to see whether the groups would behave differently.

Guess what? Initially, the best group (a specific combination of values for all attributes) had a conversion rate that was three times higher than that of the worst group. It was also twice the average rate across all groups. It wasn't just a small difference. It was HUGE.

Something like 1.5% vs 4.5% seems like enough of a difference - doesn't it?
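That kind of early swing is exactly what random noise alone produces. The quick simulation below is a hedged sketch - the 3% base rate, the 200- and 20,000-visitor group sizes, and the eight groups are made-up numbers for illustration, not the article's actual data - showing identical "variants" diverging wildly at small sample sizes and converging as data accumulates:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

TRUE_RATE = 0.03  # every "variant" converts at the same 3% - a pure placebo test

def observed_rate(n_visitors):
    """Simulate n_visitors hitting the same placebo page and
    return the observed conversion rate."""
    conversions = sum(1 for _ in range(n_visitors) if random.random() < TRUE_RATE)
    return conversions / n_visitors

# With only 200 visitors per group, identical pages can look very different.
early = [observed_rate(200) for _ in range(8)]
print("early rates:", [f"{r:.1%}" for r in early])

# With 20,000 visitors per group, the rates converge on the true 3%.
late = [observed_rate(20_000) for _ in range(8)]
print("late rates: ", [f"{r:.1%}" for r in late])
```

Run it a few times with different seeds and the small-sample groups will routinely show two- or three-fold "differences" that are nothing but noise.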

If I had been running a real test with different values, I would have stopped right there and made the best combination my new control. I would also have started thinking about how I was going to spend all the extra money the new conversion rate would bring in. (Thoughts about extra money, promotions, or compliments from a manager are especially influential if you have just read a few articles about the wonders of split-run testing in which people pitch their software or service.) I would have thought I had finally found the silver bullet and would own my competitors in no time. And once again, I would have been wrong.

Naturally, after I allowed more time for the test and collected more data, the numbers evened out. But I will never forget my amazement at the initial difference - and I'm not talking about 10 clicks; I had enough data (or so I thought). That test really taught me to be patient and not get excited ahead of time.

Since then, I have been running a separate hidden attribute with 16 values. This attribute does not affect anything the visitor sees; it simply assigns one of 16 possible values (A, B, C, D, E...) to each visitor. I use it with all my tests, and when I get surprisingly good numbers for the real attributes, I cross-reference my data with the hidden attribute to make sure all values had an equal distribution across all paths I'm testing.
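A hidden attribute like this is straightforward to implement. Here is a minimal sketch - the hashing scheme and the `visitor_id` parameter are my own assumptions, not the author's actual setup - that deterministically maps each visitor to one of 16 labels, so the label stays stable across visits without storing any extra state:

```python
import hashlib

HIDDEN_VALUES = "ABCDEFGHIJKLMNOP"  # the 16 possible hidden labels

def hidden_label(visitor_id: str) -> str:
    """Assign one of 16 hidden labels to a visitor.

    Hashing the visitor id makes the assignment deterministic
    (the same visitor always gets the same label) while spreading
    labels roughly uniformly across visitors.
    """
    digest = hashlib.md5(visitor_id.encode()).digest()
    return HIDDEN_VALUES[digest[0] % 16]

print(hidden_label("visitor-42"))  # same id always yields the same label
```

Because the label has no effect on what the visitor sees, any difference between label groups in your reports can only be noise - which is exactly what makes it a useful sanity check.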

For example, if I'm testing two versions of the shopping cart to see which one results in more people clicking through to the checkout screen, I also take the hidden attribute, form two groups - (A,C,E...) vs (B,D,F...) - and check whether they have about the same number of total visitors, visitors who reached the cart, and visitors who clicked through to the checkout screen.

If the numbers look "normal", I check (A,B,C,D...) vs (I,J,K,L...) or some other combination. When I truly have enough data, all those groups should be roughly equal at each page of the visitor path.
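The check described above can be sketched as follows. Everything here is hypothetical - the three funnel stages, the traffic volumes, and the 30%/20% drop-off rates are invented for illustration; in practice the counts would come from your tracking data:

```python
import random
from collections import defaultdict

random.seed(1)
LABELS = "ABCDEFGHIJKLMNOP"

# Simulate a three-stage funnel; the hidden label never changes behavior.
counts = {stage: defaultdict(int) for stage in ("visited", "cart", "checkout")}
for _ in range(16_000):
    label = random.choice(LABELS)
    counts["visited"][label] += 1
    if random.random() < 0.30:          # 30% of visitors reach the cart
        counts["cart"][label] += 1
        if random.random() < 0.20:      # 20% of those click through to checkout
            counts["checkout"][label] += 1

def group_total(stage, labels):
    """Total count at a funnel stage for a group of hidden labels."""
    return sum(counts[stage][l] for l in labels)

# Split the hidden labels into two halves and compare each stage.
odd, even = LABELS[::2], LABELS[1::2]   # (A,C,E,...) vs (B,D,F,...)
for stage in ("visited", "cart", "checkout"):
    a, b = group_total(stage, odd), group_total(stage, even)
    print(f"{stage:9s} {a:5d} vs {b:5d}")
```

If the two halves differ sharply at any stage, something other than your test - tracking bugs, caching, bot traffic - is skewing the data, and the "winning" variant deserves extra suspicion.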

This is not a rigorously scientific process - I always back my test results with concrete formulas - but it does help me "calm down" and stop thinking about the money.
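One concrete formula that fits here is the standard two-proportion z-test. The article does not say which formulas the author uses or how many visitors were in each group, so this is a sketch under my own assumption of 200 visitors per group - and under that assumption, the "huge" 1.5% vs 4.5% gap from earlier does not even reach conventional 95% significance:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: how many standard errors apart are
    two observed conversion rates? |z| >= 1.96 is the conventional
    95% significance threshold."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate if both groups were identical
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# 1.5% vs 4.5% with an assumed 200 visitors per group:
# 3 vs 9 conversions - a threefold difference, yet |z| < 1.96.
print(two_proportion_z(3, 200, 9, 200))

# The same rates with 10,000 visitors per group are overwhelmingly significant.
print(two_proportion_z(150, 10_000, 450, 10_000))
```

The same ratio of conversion rates can be pure noise or an overwhelming result depending entirely on sample size - which is the whole lesson of the placebo test.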
Author: Konstantin Goudkov, Affiliate manager

Konstantin Goudkov is an affiliate manager specializing in split-run testing, tracking, and visitor segmentation. He researches non-traditional methods of testing and tracking that do not depend on the limitations imposed by the features of popular commercial software packages.