The Conversion Chronicles, resources for improving your online conversion rates

Multivariate and AB Testing - Frequently Asked Questions about online experiments

What are online experiments?

Online experiments is the collective name for AB and multivariate tests.

What is AB testing?

AB testing is the principle we are all familiar with: deciding which is the better of two alternatives, A or B. Suppose you have to decide which slogan would be best for your campaign. You can then send slogan A to 50% of your public and slogan B to the other 50%, and compare the reactions of both groups.

You can also compare more than 2 alternatives. For example, if you would like to compare 4 different pages, you would have an ABCD test.

AB testing is also known as split-path or champion-challenger testing.
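
As a small illustration of comparing the groups, the sketch below computes the conversion rate per variation and picks the highest one. The visitor and conversion counts are made-up numbers for a hypothetical ABCD test, and the function name is only an assumption for this example. A higher rate on its own does not prove a real difference; that is what the statistics under Result Interpretation are for.

```python
def conversion_rates(results):
    """Return the conversion rate per variation from (visitors, conversions) counts."""
    return {
        name: conversions / visitors
        for name, (visitors, conversions) in results.items()
    }


# Hypothetical counts for an ABCD test: variation -> (visitors, conversions)
results = {
    "A": (1000, 50),
    "B": (1000, 65),
    "C": (1000, 48),
    "D": (1000, 55),
}

rates = conversion_rates(results)
best = max(rates, key=rates.get)  # variation with the highest observed rate
print(rates, best)
```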

What is multivariate testing?

A multivariate test is like doing multiple AB tests on the same web page. You choose certain elements on a page, such as a header or an image, and create different alternatives for each of them. Make sure you can show each alternative independently of the others. So if you show header A, it should be possible to combine it with either image A or image B.

Since you are doing multiple AB tests at once, you will enjoy the cumulative conversion improvement of every test. Had you run all these tests separately, it would have taken a lot more of your time! Besides this, you will also be able to determine which element, e.g. the header or the image, had the biggest impact on conversion.
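
To see what "independent" means in practice, this sketch lists every page you get when two headers can be freely combined with two images. The element names are placeholders, not taken from any particular tool.

```python
from itertools import product

# Placeholder alternatives for two page elements
headers = ["header A", "header B"]
images = ["image A", "image B"]

# Every header can be shown with every image
pages = list(product(headers, images))

for header, image in pages:
    print(header, "+", image)
```

With two alternatives for each of the two elements, this gives 2 x 2 = 4 different pages.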

When should I do online experiments?

The first and most important condition for doing an online experiment is that you have a page or campaign creative that serves a clear and measurable goal. For example, a landing page for a campaign that asks visitors to leave their email address, or a Google AdWords ad that tries to attract as many buyers for your products as possible.

Online experiments are especially useful in these situations:

1) You want to see the steepest conversion improvement possible in the shortest period of time,
2) you would like to implement a new feature and want to know the impact on your visitors,
3) you already benefit from very small conversion improvements, like 5%,
4) you would like a trade-off between conflicting design guidelines (Jakob Nielsen).

When should I do AB testing and when should I do multivariate testing?

A multivariate test with a single element (e.g. a header) and only two variations is an AB test. So you could say that AB testing is a very simple multivariate test. If you are new to online experiments, AB testing is a good place to start. When you come up with a test where you would like to change more than just one element, the time is right to move from AB to multivariate testing.

What are the costs and what are the benefits?

The costs for an experiment are the following:

a) the license for the software (if needed)
b) the time you and your team spend brainstorming about a good experiment
c) the designers and copywriters of the creatives
d) the web analyst to set up and implement the experiment and interpret the results

The benefits are:

a) the conversion improvement times the annual revenue
b) that you give your ideas and gut feeling a solid, scientific basis to build your business on
c) not going through the painful process of implementing new features only to remove them later when you find out that you lose customers because of them
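
Benefit a) can be put into numbers with a back-of-the-envelope calculation like the one below. The figures are invented, the function names are assumptions for this example, and it assumes that revenue grows in step with the conversion rate, which will not hold for every business.

```python
def annual_benefit(annual_revenue, conversion_improvement):
    """Extra yearly revenue: the conversion improvement times the annual revenue."""
    return annual_revenue * conversion_improvement


def net_benefit(annual_revenue, conversion_improvement, total_costs):
    """Extra yearly revenue minus the total costs of the experiment."""
    return annual_benefit(annual_revenue, conversion_improvement) - total_costs


# Hypothetical: 1,000,000 annual revenue, 5% improvement, 20,000 in total costs
extra = annual_benefit(1_000_000, 0.05)
net = net_benefit(1_000_000, 0.05, 20_000)
print(round(extra), round(net))
```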

How does the data produced from MVT fit or align with other web analytics data and reporting, or are they separate?

One complements the other; they go hand in glove.
You use the web analytics data to identify the business driver pages, the pages that are important for reaching your business objectives. Once you know these, you know where you can gain the highest conversion improvements.
The next step is to come up with good ideas to test. You can use web analytics data to get an idea of what might interest your visitors. Suppose a lot of visitors to your site come from a specific referrer, but leave immediately after arriving at your landing page. You should then take a good look at that referring website and think of ways of keeping those visitors on your site.

Once you have some good ideas to test, it is time to use the multivariate software to test the different headers and images and see which generates the biggest conversion improvement.

Tool Selection

Can you name a few vendors of multivariate testing tools?

The market for multivariate testing tools is changing rapidly, but these are the biggest brands in the industry:

- Optimost
- SiteSpect
- Offermatica
- Kefta
- Memetrics
- Vertster
- SPSS

The last one might surprise some people. However, SPSS can be used in conjunction with other web analytics software. This is only suitable for very advanced testers.

What are things to take into account when choosing a vendor?

When choosing a multivariate testing tool, make sure you ask the vendor about the following:

* Technique; there are different ways of implementing an experiment. Which one is best for your organization?
* Support; where is the support team located and what is the response time?
* Possibilities; for example, can you do multiple tests at the same time?
* Reporting; is it understandable? Can you interpret the results yourself?
* Price!

Implementation & setup

Is it difficult to do an experiment?

Not really. All vendors try to make the implementation process of an experiment as easy as possible. Do expect a steep learning curve when you start doing experiments though.

How do I implement an experiment?

This depends on the experiment you have in mind and the possibilities offered by the vendor of your tool. All tools, however, offer one or more of the following options: JavaScript, DNS or an API.

The JavaScript solution is used frequently. To implement an experiment, you put pieces of JavaScript at the places where you would like to test different alternatives of, for example, an image. These scripts contact the multivariate server, and this server decides which image should be shown.

With the DNS solution, when somebody opens a page in their browser that is part of the experiment, the DNS (Domain Name System) will recognize this and send the visitor, just for that page, to the experiment server. That server then decides which variations should be shown and serves this page instead of the normal one.

Only SiteSpect supports this way of implementing experiments.

The API (Application Programming Interface) is an approach for the technically very savvy. The most important difference from the other 2 approaches is that the test pages are created on your server, not in the browser.
Whenever you create a new page, e.g. in your Content Management System, you can tell the system that you would like this new page to be an experiment. When used properly, you don't have to use the web interface that comes with the tool anymore. Experiments are now completely integrated with your own system.

This asks for extra development time, since this functionality doesn't come standard with most Content Management Systems.

Other options, like redirects, are quickly losing ground in favor of the options described above.

Can I use multivariate testing on campaigns?

Yes, you can. You can use multivariate tests to optimize Google AdWords, banners, newsletters and all other online marketing initiatives. Multivariate testing can be used to optimize anything, from online campaigns to offline processes such as assembly lines in car factories.

However, the implementation of multivariate tests for campaigns is completely different from that of a test on web pages. None of the usual techniques (see: "How do I implement an experiment?") can be used in AdWords, banners or e-mail. You will have to do this differently. Consult your web analyst on how to proceed with this.

How long does it take to get results or how many visitors do I need for an experiment?

This depends on two things: the number of variations in a test and the number of conversions. If you only want to test a few things, e.g. a header, a piece of text and an image, and you think of 2 variations for each of them, you have 2 x 2 x 2 = 8 unique pages. If, however, you have four variations for each of them, you will end up with 4 x 4 x 4 = 64 unique pages. The more variations you would like to show, the more visitors you need before you have results. And this means more time.

The number of visitors also has an influence. The more visitors you have, the sooner you will have results and the less time you need for your experiment.
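
The arithmetic above can be turned into a rough duration estimate. This is only a sketch: the target of 1000 visitors per unique page and the traffic of 2000 visitors per day are invented numbers, and the real number of visitors you need depends on your conversion rate and the size of the difference you want to detect.

```python
import math


def unique_pages(elements, variations_per_element):
    """Number of unique pages when every element has the same number of variations."""
    return variations_per_element ** elements


def estimated_days(elements, variations_per_element, visitors_per_page, visitors_per_day):
    """Days needed to send the assumed number of visitors to every unique page."""
    total_visitors = unique_pages(elements, variations_per_element) * visitors_per_page
    return math.ceil(total_visitors / visitors_per_day)


# 3 elements (header, text, image), assumed 1000 visitors per page, 2000 visitors a day
print(unique_pages(3, 2), estimated_days(3, 2, 1000, 2000))
print(unique_pages(3, 4), estimated_days(3, 4, 1000, 2000))
```

Under these assumptions, going from 2 to 4 variations per element makes the test eight times as long.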

Because of the well-known cookie deletion problem, it is recommended to keep a test shorter than 3 weeks.

Consult your web analyst to estimate the amount of time needed for the experiment.

If I do an online experiment, will this affect my search engine rankings?

In most cases, the answer is no. However, this depends on the vendor of the testing tool. For a definite answer, consult your vendor.

Will repeat visitors see the same page?

They have to see the same page. This is one of the fundamental principles of doing an experiment. If the same visitor sees different pages on different visits it is impossible to determine which page caused the conversion.
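
One possible way to make sure a returning visitor gets the same page is to derive the variation from a stable visitor ID. The sketch below assumes such an ID is stored in a cookie; the function name and the ID format are made up for this example, and a testing tool may just as well store the chosen variation itself.

```python
import hashlib


def assign_variation(visitor_id, variations):
    """Map a visitor ID to one variation; the same ID always gives the same result."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    index = int(digest, 16) % len(variations)
    return variations[index]


variations = ["page A", "page B"]
first_visit = assign_variation("visitor-123", variations)
second_visit = assign_variation("visitor-123", variations)
print(first_visit == second_visit)
```

If the cookie is deleted, the visitor gets a new ID and may therefore be assigned a different page.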

What happens if a visitor deletes his or her cookies?

When a visitor returns to the same page the following day, he or she should see the same treatment of the test page. You can't show a new variation; the exact same images and text should be shown as on the previous day.

Visitors, as with all other web analytics software, are tracked using cookies. When a visitor deletes his or her cookies, he or she will be seen as a new visitor and receive a new treatment in the experiment.

You minimize the risk of this by keeping the time you run the experiment as short as possible.

Result Interpretation

I don't know any statistics, can I do online experiments?

Yes, you can. Vendors are doing their best to make their tools as easy to use as possible. After a short training, everybody is able to work with them and understand how to interpret the results. Make sure your vendor has an intuitive interface for doing experiments.

What is meant by a "Level of Confidence"?

A good explanation can be found on Wikipedia.

What is meant by the "Margin of error"?

A good explanation can be found on Wikipedia.
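
As a rough illustration of how the level of confidence and the margin of error relate, the sketch below calculates the margin of error around a measured conversion rate. It uses the common normal approximation, which is an assumption of this example and not necessarily what your tool does; the counts are invented.

```python
import math
from statistics import NormalDist


def margin_of_error(conversions, visitors, confidence=0.95):
    """Margin of error of a conversion rate, using the normal approximation."""
    rate = conversions / visitors
    # z-value that belongs to the chosen level of confidence (two-sided)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(rate * (1 - rate) / visitors)


# Hypothetical: 50 conversions out of 1000 visitors
moe_95 = margin_of_error(50, 1000, confidence=0.95)
moe_80 = margin_of_error(50, 1000, confidence=0.80)
print(moe_95, moe_80)
```

A higher level of confidence gives a wider margin of error for the same data.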

What is (statistical) significance?

A good explanation can be found on Wikipedia.
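
For an AB test, one common way to check significance is a two-proportion z-test. The sketch below is a minimal version under that assumption; the function name and the counts are made up, and vendors may use other methods.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, visitors_a, conv_b, visitors_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # Pooled rate under the assumption that A and B perform the same
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: A converts 50 of 1000 visitors, B converts 80 of 1000
p_value = two_proportion_p_value(50, 1000, 80, 1000)
print(p_value < 0.05)
```

A p-value below 0.05 corresponds to a difference that is significant at a 95% level of confidence.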

Who is Taguchi and why is he so important for multivariate testing?

Genichi Taguchi is a Japanese engineer and statistician. He has made a substantial contribution to the statistics that are used in multivariate testing. Because of this a lot of vendors are mentioning his name.

An interesting detail is that he used his statistical formulas to optimize factory processes, like the production of cars or phone switches.

What are "interaction effects"?

Interaction effects take place when certain variations of one element have an influence on the effect of other elements. Imagine you would like to optimize your newsletter and, besides different messages in your e-mail, you are also sending out e-mails with different subjects. The subject has a big influence on whether or not people actually open the e-mail. So you can imagine that an unattractive subject will result in few people actually reading the message.

A logical outcome of the test would therefore be that different variations of the message do have an influence on the click-through to your website, unless an unattractive subject is shown. This is called an interaction effect.

This term is used only in multivariate tests (see: "What is multivariate testing?").
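
The newsletter example can be expressed in numbers. In this sketch the click-through rates are invented: the message makes a difference with an attractive subject and no difference with an unattractive one. The interaction effect is taken here as the difference between those two message effects, which is one simple way to quantify it.

```python
# Hypothetical click-through rates: (subject, message) -> rate
click_rates = {
    ("attractive", "message A"): 0.10,
    ("attractive", "message B"): 0.16,
    ("unattractive", "message A"): 0.02,
    ("unattractive", "message B"): 0.02,
}


def message_effect(rates, subject):
    """Effect of switching from message A to message B for a given subject."""
    return rates[(subject, "message B")] - rates[(subject, "message A")]


def interaction_effect(rates):
    """How much the message effect depends on which subject is shown."""
    return message_effect(rates, "attractive") - message_effect(rates, "unattractive")


print(message_effect(click_rates, "attractive"))
print(message_effect(click_rates, "unattractive"))
print(interaction_effect(click_rates))
```

If the interaction effect were zero, the message would have the same effect regardless of the subject.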

What is meant by the "main effects"?

A main effect is the isolated influence of an element on the page. If your experiment contains 3 elements you are varying, e.g. an image, a piece of text and a header, determining the main effects means calculating the conversion rates for each of these elements separately. You are completely ignoring the interaction effects by doing this.

The term "main effects" is used in multivariate tests only, not AB tests (see: "What is multivariate testing?")
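
A minimal sketch of calculating main effects: for every variation of every element, add up the visitors and conversions of all pages that showed that variation. The counts and names below are invented, and the example uses only two elements to keep it short.

```python
from collections import defaultdict


def main_effects(results, element_names):
    """Conversion rate per (element, variation), pooled over all other elements."""
    totals = defaultdict(lambda: [0, 0])  # (element, variation) -> [visitors, conversions]
    for combination, (visitors, conversions) in results.items():
        for element, variation in zip(element_names, combination):
            totals[(element, variation)][0] += visitors
            totals[(element, variation)][1] += conversions
    return {key: conv / vis for key, (vis, conv) in totals.items()}


# Hypothetical results: (header, image) -> (visitors, conversions)
results = {
    ("header 1", "image 1"): (1000, 40),
    ("header 1", "image 2"): (1000, 50),
    ("header 2", "image 1"): (1000, 60),
    ("header 2", "image 2"): (1000, 70),
}

effects = main_effects(results, ["header", "image"])
for key, rate in effects.items():
    print(key, rate)
```

In these made-up numbers the header variations differ more than the image variations, so the header has the bigger main effect.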

What is a "fully factorial design"?

A fully factorial design means that you show all different, possible combinations of the variations to your visitors.

Suppose an experiment consists of 5 elements (images, text, headers, calls to action, etc.) and 4 variations for each of these. This would mean you would have to show 4 x 4 x 4 x 4 x 4 = 1024 different pages!

The disadvantage of doing an experiment fully factorial is that you will need more visitors than when you only concentrate on the isolated or main effects of the elements you are testing.

The advantage is that you not only measure the main effects, but also the interaction effects.

The term "fully factorial design" is used in multivariate testing only, not AB tests (see: "What is multivariate testing?")

What is a "fractional design"?

Instead of doing a test "fully factorial", you can also use a "fractional design". This means that instead of showing all possible combinations, you show a certain fraction of these, just enough to calculate the conversion rate for each of the elements. You use a statistical trick to estimate the results of the pages you never showed your visitors.

The advantage of using this is that the number of visitors you need for a test before you get any results can be much smaller than when you use a "fully factorial design". The disadvantage, however, is that you are completely ignoring the interaction effects.

The term "fractional design" is used in multivariate testing only, not AB tests (see: "What is multivariate testing?")
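
To give an idea of what such a fraction can look like, the sketch below builds a textbook half fraction for 3 elements with 2 variations each: 4 of the 8 possible pages, chosen so that every variation of every element is still shown equally often. This is a generic example and not the design of any particular vendor.

```python
from itertools import product

# All 8 combinations of 3 elements with 2 variations each (coded 0 and 1)
full_design = list(product([0, 1], repeat=3))

# Half fraction: the third element's variation is fixed by the first two
fraction = [(a, b, c) for a, b, c in full_design if c == (a + b) % 2]


def times_shown(design, element_index, variation):
    """How many pages in the design show this variation of this element."""
    return sum(1 for page in design if page[element_index] == variation)


print(fraction)
for element_index in range(3):
    print(element_index, times_shown(fraction, element_index, 0), times_shown(fraction, element_index, 1))
```

Because the third element is tied to the other two, its main effect cannot be separated from the interaction between them, which is why a design like this ignores interaction effects.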

What is so predictive about online experiments?

The outcome of a test should not be a lucky guess. In fact, the reason you start testing your content is that you don't want to build your business on lucky guesses anymore! If you repeated the experiment, you would want to see the exact same answer to your question: "what is the best page?" And since this can be guaranteed with an 80 to 95% chance, you can predict the future conversion rate. Not with 100% certainty, but much better than the average crystal ball!

Comments? Additions? Remarks?

Please send an e-mail to Eelco van Kuik (eelco dot van dot kuik at arroba dot nl)

Reading material:

Scientific Web Site Optimization Using AB Split Testing, Multi Variable Testing And The Taguchi Method