Statistical Significance – Overrated

    March 27, 2007

I’m a big advocate of measuring the success of your site, but many marketers find the statistics intimidating. They are more comfortable with the dreaded "anecdotal evidence" than they are with numbers. That’s changing, but slowly. I wonder whether the intimidation might be caused by statisticians themselves.

One of the most intimidating parts of the numbers game is statistical significance. Many marketers struggle with this concept, even those normally comfortable with numerical analysis.

When a statistician says that a number is not statistically significant, the uninitiated take that to mean that no conclusions can be drawn from it. But that’s wrong. Let’s take an example.

If you tested three different e-mail messages to see which one has the highest click rate, the statistician (or your computer statistics program) might tell you that the third version was significantly worse than the first two, but that the difference between the first two was insignificant. What do you do?

You could run another test. You could eliminate the third version and retest the first two. Your numbers person will tell you that you need many more e-mail recipients to achieve statistical significance, because the two e-mails are quite close to each other in effectiveness.
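To make the three-e-mail example concrete, here is a minimal sketch of the kind of check a statistics program might run, using a pooled two-proportion z-test. The click counts and the 5,000 recipients per version are made-up numbers for illustration, and the function name is hypothetical.

```python
import math

def two_proportion_p_value(clicks_a, sends_a, clicks_b, sends_b):
    """Two-sided p-value for the difference between two click rates (pooled z-test)."""
    rate_a = clicks_a / sends_a
    rate_b = clicks_b / sends_b
    # Pool both groups to estimate the click rate under "no real difference".
    pooled = (clicks_a + clicks_b) / (sends_a + sends_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical results: versions 1, 2 and 3 got 260, 250 and 180 clicks.
p_1_vs_2 = two_proportion_p_value(260, 5000, 250, 5000)
p_1_vs_3 = two_proportion_p_value(260, 5000, 180, 5000)

print(f"version 1 vs 2: p = {p_1_vs_2:.3f}")
print(f"version 1 vs 3: p = {p_1_vs_3:.5f}")
```

With these invented counts, the gap between versions 1 and 3 clears the usual 0.05 threshold easily, while the gap between versions 1 and 2 does not come close.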

If you want to “do it wrong quickly,” then just send out the one that tested better, regardless of whether it is a statistically significant difference. Why? Because the two e-mails are close enough to each other that sending either one is probably OK. And chances are that the one that tested slightly better really is better, even if you don’t know for sure.

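To understand why this is true, you need to understand what “statistical significance” means. By the usual convention, a result is significant when a difference that large would show up by chance less than 5 percent of the time if the two versions were really identical. That is often shortened to being “95 percent sure” that your conclusion is correct, which is looser than what the test actually says. Either way, even when you achieve statistical significance, you can still be wrong: about one in 20 tests of two truly identical versions will hand you a false winner.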
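One way to see where the one-in-20 figure comes from is to simulate tests in which both e-mails are truly identical and count how often the test declares a significant difference anyway. The sketch below assumes a 5 percent true click rate, 500 recipients per version and a pooled z-test. All of those are illustrative choices, not anything prescribed above.

```python
import math
import random

def p_value(clicks_a, clicks_b, sends):
    """Two-sided pooled z-test p-value for two groups of equal size."""
    pooled = (clicks_a + clicks_b) / (2 * sends)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation at all, so no evidence of a difference
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / sends)
    z = (clicks_a - clicks_b) / sends / std_err
    return math.erfc(abs(z) / math.sqrt(2))

def false_alarm_rate(true_rate=0.05, sends=500, trials=2000, seed=1):
    """Share of identical-vs-identical tests that still come out 'significant'."""
    rng = random.Random(seed)
    alarms = 0
    for _ in range(trials):
        # Both versions share the same true click rate.
        clicks_a = sum(rng.random() < true_rate for _ in range(sends))
        clicks_b = sum(rng.random() < true_rate for _ in range(sends))
        if p_value(clicks_a, clicks_b, sends) < 0.05:
            alarms += 1
    return alarms / trials

rate = false_alarm_rate()
print(f"false alarms: {rate:.1%} of tests")
```

The printed share should land somewhere near 5 percent, which is the one-in-20 rate of false winners that the 0.05 threshold accepts by design.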

A different statistical methodology, called Bayesian probability, is gaining traction. It is based on the persuasiveness of the data. Bayesian aficionados argue that when you know where you are starting from, such as the conversion rate for your shopping cart page, you can persuade yourself that a new page design is “working” with a very small number of successes. A result from that small a sample is not statistically significant, but it is probably the right conclusion for your decision.
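As a rough illustration of the Bayesian view, the sketch below treats the existing page's conversion rate as known and asks how probable it is that the new page converts better, given a small number of visitors. The 2 percent baseline, the 6 conversions from 150 visitors, the flat Beta(1, 1) prior and the function name are all assumptions made for the example. A real analysis might well pick a more informed prior.

```python
import random

def prob_new_beats_baseline(baseline_rate, conversions, visitors,
                            draws=100_000, seed=0):
    """Monte Carlo estimate of P(new page's true rate > known baseline rate)."""
    rng = random.Random(seed)
    # A flat Beta(1, 1) prior updated with the observed data gives a
    # Beta(1 + successes, 1 + failures) posterior for the new page's rate.
    alpha = 1 + conversions
    beta = 1 + (visitors - conversions)
    wins = sum(rng.betavariate(alpha, beta) > baseline_rate for _ in range(draws))
    return wins / draws

# Hypothetical numbers: the current cart page converts at 2 percent,
# and the new design got 6 conversions from its first 150 visitors.
prob = prob_new_beats_baseline(0.02, conversions=6, visitors=150)
print(f"probability the new page is better: {prob:.0%}")
```

Under these assumptions the probability comes out well above 90 percent, even though 150 visitors is far too few for a conventional significance test to be comfortable.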

If you can’t stop yourself from aiming for statistical significance, remember that the bigger the change, the smaller the sample size you need. So try to go after things that raise your conversion rate by ten percent rather than by one percent. That might sound obvious, but often we need to make a concerted effort to “think big” rather than shooting only for small improvements.
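To put rough numbers on that, here is a sketch using the standard normal-approximation formula for comparing two proportions. It assumes a 5 percent starting conversion rate, reads "ten percent" and "one percent" as relative lifts, and uses the conventional 5 percent significance level with 80 percent power. Those settings and the function name are assumptions for illustration.

```python
import math
from statistics import NormalDist

def recipients_per_version(base_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate sample size per version needed to detect a relative lift."""
    new_rate = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + new_rate * (1 - new_rate)
    n = (z_alpha + z_power) ** 2 * variance / (new_rate - base_rate) ** 2
    return math.ceil(n)

small_change = recipients_per_version(0.05, 0.01)  # 5.00% -> 5.05%
big_change = recipients_per_version(0.05, 0.10)    # 5.00% -> 5.50%

print(f"one percent lift: {small_change:,} recipients per version")
print(f"ten percent lift: {big_change:,} recipients per version")
```

Because the required sample grows with the inverse square of the difference, the one percent lift needs on the order of a hundred times as many recipients as the ten percent lift.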

Better yet, don’t get hung up on statistical significance at all. It’s better to make ten decisions in a row with 70 percent confidence than just one that you are 95 percent sure of. When you make frequent changes, the ones that turn out to be wrong will be found out soon enough.
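The arithmetic behind that trade-off can be sketched in a few lines. It assumes the ten decisions are independent and that each really does have a 70 percent chance of being right, which is a simplification.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent tries (binomial)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Expected number of correct decisions in the same stretch of time.
expected_right_fast = 10 * 0.70  # ten quick decisions
expected_right_slow = 1 * 0.95   # one carefully tested decision

# Chance that more than half of the ten quick decisions are right.
p_majority = prob_at_least(6, 10, 0.70)

print(f"quick decisions: about {expected_right_fast:.0f} right on average")
print(f"careful decision: about {expected_right_slow:.2f} right on average")
print(f"chance most quick decisions are right: {p_majority:.0%}")
```

On these assumptions the quick approach banks about seven good decisions for every one the cautious approach makes, at the price of roughly three wrong ones that later changes have to catch.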

If you’re still thinking that you need statistics to prove that your results aren’t random, just remember the words of comedian Dick Cavett: “Just think of all the billions of coincidences that don’t happen.”