More on the Great comScore Debate

    April 25, 2007

I’ve gotten several comments on my comScore post – and because the issue is so topical I wanted to respond to them directly in a post.

The longest, most detailed and, frankly, most baffling post was by Anil Batra. In my original take on this issue, I disagreed sharply with Anil about several aspects of the comScore findings. First, I argued that knowing the "real" number of visitors to your site isn’t useless and that not all analytics is simply an exercise in trend following and comparison. In the real world, the actual numbers often matter and matter greatly. That said, the main thrust of my argument was that Anil was misunderstanding the potential impact of the comScore findings by treating them as relevant only to reporting on total site traffic. Admittedly, this is the arena comScore is concerned with. But for a web analyst, data that suggests severe problems with cookie persistence has an impact far beyond mere traffic. Lack of cookie persistence will deeply color almost any analysis that spans sessions: campaign attribution, "new" visitor analysis, "repeat" visitor analysis, "customer" analysis, attrition, sales cycle analysis, etc.

Let’s start with Anil’s arguments concerning traffic:

"The point I was trying to make is that you have to take everything in context. Going to Gary’s example of a conference, let’s say conference A tell you they attract 5,000 visitors and the other conference B says they get 4,000 visitors. Next day a third party comes out and says that all the conferences numbers reported by any conference are inflated and actual number is 75% of what they state then what’s the net result? Well Conference A is still better than conference B. Only thing is that they each now have 3750 and 3000 visitors respectively. Every conference in the world will have the same issue, their rank is still the same. I don’t think based on this information conferences will start charging less for the booth. However the rate per visitor has gone up for you but you can’t do much, that’s the market rate. Same argument goes for sites that sell advertising based on how many users they reach."

Unfortunately, however, advertisers don’t work in a world where their only option is the web. If I tell an advertiser that my reach is 1 million visitors, that’s going to be compared to other sites with a reach of 1 million visitors but also to radio, print, TV and more. So unless reach just flat out doesn’t matter to advertisers, I fail to see how massive and consistent mis-reporting of web site traffic isn’t an issue.

Nor is it at all the case that we should expect cookie persistence issues to be the same for every type of site. I already pointed out two significant reasons why traffic estimates would differ across types of sites (the percentage of Firefox users and the percentage of heavy repeat visitors). So it simply isn’t the case that every site’s numbers will be equally inflated. Which pretty much seems to demolish Anil’s case, since not every site will be affected equally.
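To make the point concrete, here is a rough sketch in Python of how uneven inflation can reorder sites. Everything in it is an assumption for illustration: the function `reported_uniques`, the two site names, and every number are made up, and the model (a share of visitors who clear cookies between every visit, each of whose visits is then counted as a new unique) is deliberately crude. None of these figures come from the comScore study.

```python
def reported_uniques(real_visitors, deleter_share, visits_per_deleter):
    """Cookie-based unique-visitor count under a deliberately crude model.

    Assumption: a share of real visitors clear cookies between every visit,
    so each of their visits is counted as a brand-new unique visitor.
    Everyone else is counted exactly once.
    """
    keepers = real_visitors * (1 - deleter_share)
    deleters = real_visitors * deleter_share
    return keepers + deleters * visits_per_deleter


# Hypothetical sites: same share of cookie deleters, different repeat behavior.
sites = {
    "occasional-read site": dict(
        real_visitors=1_000_000, deleter_share=0.3, visits_per_deleter=1
    ),
    "daily-habit site": dict(
        real_visitors=800_000, deleter_share=0.3, visits_per_deleter=4
    ),
}

for name, params in sites.items():
    reported = reported_uniques(**params)
    inflation = reported / params["real_visitors"]
    print(
        f"{name}: real={params['real_visitors']:,} "
        f"reported={reported:,.0f} inflation={inflation:.2f}x"
    )
```

With these made-up inputs, the site with fewer real visitors reports more uniques than the larger one, simply because its deleters come back more often. That is the opposite of the conference example, where a uniform 75% correction leaves the ranking intact.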

So let’s go on to point number two – the impact of cookie deletion on many critical web analyses. Here’s Anil’s take:

"I understand Gary’s issue about repeat users and new users. But again, if you use two different systems they will report different numbers so which one is correct?
As Jacques Warren pointed out as a response to Gary’s post, the right solution (at this time) is to provide a reason for users to not delete their cookie (or give a reason to login). If Gary care’s about repeat users then I am sure he has strategies to get them engaged and give them a reason to login (or not delete cookies). Give users a reason to be loyal and they will be. Then you won’t have to worry about cookie deletion and hence your numbers will be accurate. Till you get to that level any number is a close estimate weather it is panel based or cookie based; and is not worth loosing sleep over."

This is the part I really don’t get. This isn’t an issue about comScore vs. web analytics. When you use two different systems, then one of them is more correct, or both are correct, or both are wrong. There is a real world we are trying to measure. And the problem is that when you do an analysis of "New" visitors and a significant percentage of your "New" visitors aren’t new, then your analysis sucks. It’s that simple. You aren’t looking at the right data and you have no basis for drawing any conclusions from it. And if you do draw conclusions from the data, they are probably wrong. For an analyst to suggest that very large non-random errors in the data don’t matter is, to say the least, perplexing. It’s as if someone told me that, though I meant to poll Democrats and got half Republicans, it won’t affect my survey findings on the Democratic primary!
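A small worked example shows how quickly this kind of contamination distorts a "New" visitor analysis. The sketch below is hypothetical throughout: the function `observed_new_rate`, the visitor counts, and the conversion rates are all invented for illustration, and it assumes mislabeled returning visitors convert at the same rate as other returning visitors.

```python
def observed_new_rate(true_new, returning_mislabeled, new_rate, returning_rate):
    """Conversion rate a cookie-based tool would report for its "New" segment.

    Assumption: returning visitors who deleted their cookies are counted as
    new, but still convert at the returning-visitor rate.
    """
    conversions = true_new * new_rate + returning_mislabeled * returning_rate
    return conversions / (true_new + returning_mislabeled)


# Hypothetical figures, chosen only for illustration.
rate = observed_new_rate(
    true_new=7_000,              # genuinely first-time visitors
    returning_mislabeled=3_000,  # returning visitors whose cookies are gone
    new_rate=0.01,               # assumed true conversion rate, new visitors
    returning_rate=0.05,         # assumed true conversion rate, returning visitors
)
print(f'Reported "New" conversion rate: {rate:.1%}')
```

With these invented inputs, the "New" segment appears to convert at more than double the rate of the genuinely new visitors in it. Because the error depends on who deletes cookies and how they behave, it does not wash out in a trend or a comparison.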

Jacques Warren may well be right – the only solution we may be left with is asking the measurement tail to wag the web site dog. But how can anyone think this is something we shouldn’t be worried about? (Jacques certainly seems to be!) First, we have to know whether this is true, and then we have to let clients (or bosses) know that if they want measurement they have to fundamentally change their measurement approach to one based on opt-in principles. If you don’t think this is a big deal, go talk to industries that live in that world.

Nor is it reasonable to say that a site should be able to engage visitors enough to get them not to delete their cookies. If cookie deletion is a function of mass deletion or browser exit settings, that simply isn’t an option. And we all know what great success the measurement community had convincing everyone that 3rd Party cookies aren’t a privacy issue. I look forward to telling my clients that they have to become evangelists for cookie persistence. I’m sure they will love that!

And not all sites are Amazon.com – many sites aren’t looking to marry visitors – they just want to date them occasionally. It’s unreasonable to think that many sites can get users to opt in for measurement purposes when all they are doing is reading an article.

I’m not taking the comScore study for granted. As I pointed out in my original post, there are several good reasons to doubt the results and it may be that we don’t have to fundamentally change our ways of doing business. But as far as I’m concerned, if you’re a web analyst and you aren’t worried about the comScore results then you aren’t getting it.

Which brings me to Clint’s concerns about the study being so limited. I share his concern – and it’s probably the main reason I think we shouldn’t jump to conclusions about the quality of our data. There are just too many studies based on single points of reference that turn out to be seriously flawed when applied to the much larger industry. And, as I pointed out in my original post, no site is likely to be more vulnerable to serial deleters than a major portal.

I also agree with much of what Jacques says – assuming that we really do find our numbers are this flawed. It will certainly mean that sites need to focus measurement much more on opt-in users – and find many new ways to drive that level of commitment. I won’t pretend I think this is easy or that it won’t have a big impact on our business – and I’d much prefer a truly workable technical solution. And there are, as I also discussed, technical methods for screening off the effects of cookie deletion from many kinds of analysis. If the community as a whole ends up buying into the comScore numbers, I think that all of these directions will emerge as very important.