Why Split-Run Testing Does Not Work…for Those Who Are Doing It Wrong

    May 6, 2004

There is a wealth of information out there about the benefits of split-run testing or how to conduct such tests, and a huge selection of software solutions that help to implement it.

I am not going to attempt to describe the concepts of testing, nor will I pitch any software solution.

I want to talk about why so many people fail in their attempt to improve the conversion rate of their sites while using split-run testing techniques.

I often hear things like:

“The only thing that seems to affect my conversion rate is the headline.”


“My visitors don’t seem to react better to any changes I make besides lowering the price.”

I would say that 90% of people never bother to start testing content or offerings. And out of the 10% who do test their content, 90% fail to produce any positive changes – for one simple reason. I’ll tell you the reason in a little bit.

To understand the obviousness of the mistake that most people make, read the following paragraph and tell me if it makes any sense.

Every day, you randomly change some attribute on your site. On Monday, you move your newsletter sign-up box from the bottom left to the top right corner. On Tuesday, you flip a coin: if it comes up heads, you keep your changes, and if it comes up tails, you move the sign-up box back to its original place. On Wednesday, you remove all testimonials from your site, and on Thursday, you flip a coin again to determine if you should put the testimonials back where they were before. And so on. You do not measure the performance of your site, but simply change stuff around without thinking about it too much. You use your lucky coin to determine if the changes you made should be kept or not.

Do you really believe such a course of action will achieve any positive results?

I didn’t think so.

And yet, most of the people who perform split-run tests do just that.

Here is a specific example.

Let’s imagine that you ran a small campaign for two versions of a sales letter (control and test).

Once you finished the campaign, you got the following results:

control group: 12 orders for 619 visitors – 1.94% conversion rate
test group: 15 orders for 567 visitors – 2.65% conversion rate

Based on such a test, you will probably conclude that your test group performed better and will make that content your new control. After all, 2.65% seems much better than 1.94%, and a total of 27 orders seems like enough for a solid conclusion. It’s pretty close to the popular belief that 30 orders (or 1000 visitors) is enough to get solid data.

Well, the moment you turn your test group content into the primary version of your site is the moment you flip your lucky coin to make a decision.

You see, based on those numbers, the probability that your test does not represent reality is 44.16%. In other words, you might as well have sent visitors to two identical versions of a page, and you would have gotten similar results almost half of the time.

44.16% is pretty close to 50%. And the way I see it, 50% is just as good as flipping a coin without doing any measurements.
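To make the "two identical versions of a page" idea concrete, here is a minimal Python sketch that simulates it. It is an illustration only, not the method behind the 44.16% figure. The function name, the number of trials, and the choice of a single shared conversion rate (all 27 orders divided by all 1186 visitors) are assumptions made for this example.

```python
import random

def simulate_identical_pages(visitors_a, visitors_b, true_rate, observed_gap,
                             trials=2000, seed=1):
    """Fraction of trials in which two identical pages differ by at least observed_gap."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Every visitor orders with the same probability on both pages.
        orders_a = sum(rng.random() < true_rate for _ in range(visitors_a))
        orders_b = sum(rng.random() < true_rate for _ in range(visitors_b))
        gap = abs(orders_a / visitors_a - orders_b / visitors_b)
        if gap >= observed_gap:
            hits += 1
    return hits / trials

observed_gap = 15 / 567 - 12 / 619       # gap between test and control in the example
shared_rate = (12 + 15) / (619 + 567)    # assumption: one common true conversion rate
fraction = simulate_identical_pages(619, 567, shared_rate, observed_gap)
print(f"Identical pages showed a gap this large in {fraction:.0%} of trials")
```

If the printed share is anywhere near one half, the observed gap is the kind of difference that pure chance produces routinely.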

How did I get this number – 44.16%?

I did not use any rules of thumb, nor did I employ my gut feeling. I used statistics to calculate the exact answer to my question: How certain can I be that my test data is reliable?
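The article does not say which statistical method was used, so the following is only a sketch of one common approach: a pooled two-proportion z-test, written in Python with the standard library. The function name is hypothetical. For the example numbers it gives a little over 40%, which is in the same neighborhood as 44.16% but not identical, presumably because a different method was used to get that figure.

```python
import math

def two_proportion_p_value(orders_a, visitors_a, orders_b, visitors_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a = orders_a / visitors_a
    rate_b = orders_b / visitors_b
    # Pooled rate: what both groups would share if the pages performed identically.
    pooled = (orders_a + orders_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), with Phi the standard normal CDF.
    return math.erfc(abs(z) / math.sqrt(2))

p = two_proportion_p_value(12, 619, 15, 567)
print(f"Chance of a gap this large with no real difference: {p:.2%}")
```

A small p-value (0.05 is a common threshold) suggests the difference is unlikely to be chance alone. A value near one half says the test has told you very little.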

I will show you a quick and easy way to get such an answer in a moment, but before that, read a few words of caution that might save you a lot of time and money in the long run.

It seems that if you make enough changes with at least a little improvement added by each change, you are bound to make your content convert better. That would be true if all (or most) of your changes resulted in a positive improvement, but in reality they don’t. At least not when you flip a coin to decide the fate of a modification that you are testing.

Once you start calculating statistics and see the real numbers, you will be tempted to stop your test halfway and accept unreliable results as “good enough.” You might start fantasizing about how much more money you will make with the improved conversion rate, only to be disappointed by the lack of real improvement because you decided to cut the test short.

If you do this, you will end up making changes without any substantial proof that those changes increase the conversion rate. So you will no longer be testing, but simply guessing.

The significance of your changes depends on a combination of several factors such as the difference in conversion rate, total number of visitors, and the number of sales in each group.
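To illustrate how those factors interact, here is a short Python sketch that keeps the two conversion rates from the example fixed and only scales up the traffic. It assumes a pooled two-proportion z-test as the measure of significance, which may not be the method used elsewhere in this article. The helper name and the scale factors are made up for the illustration.

```python
import math

def p_value(orders_a, visitors_a, orders_b, visitors_b):
    """Two-sided p-value from a pooled two-proportion z-test (assumed method)."""
    pooled = (orders_a + orders_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (orders_b / visitors_b - orders_a / visitors_a) / std_err
    return math.erfc(abs(z) / math.sqrt(2))

# Same 1.94% vs 2.65% conversion rates, with more and more visitors behind them.
for scale in (1, 2, 5, 10):
    p = p_value(12 * scale, 619 * scale, 15 * scale, 567 * scale)
    print(f"{scale:>2}x traffic: p-value {p:.3f}")

p_small = p_value(12, 619, 15, 567)
p_large = p_value(120, 6190, 150, 5670)
```

The same difference in conversion rate that is indistinguishable from noise at roughly 600 visitors per group becomes convincing once about ten times as many visitors stand behind it.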

So how did I come up with that number?

I’m not going to give you a set of lengthy and boring statistical formulas. Instead, I’ll give you free access to a calculator that was created for this specific purpose.

It does not cost you any money, and I don’t require your e-mail address to use it.

Here it is:


I hope it saves you at least as much time and money as it saves me, day after day.

Good luck with your testing!

Konstantin Goudkov studies the psychology of pricing, ways
to manipulate prices for maximum profit, and tactics to
control consumer price perception.
You can find his latest report at: