The Subpoena Thing: Gov’t Keyword Research

    January 26, 2006

It’s time to say a few words about why the government’s request for search query data bothers me. As he so often does, Danny summarizes a controversy nicely.

The FAQ at CNET is also useful.

To be clear, if Google complies in a fashion similar to what AOL and Microsoft have already done, they’ll be handing over search query information that has all the personally identifiable material stripped out. That in itself is unsurprising and uncontroversial because it’s info most of us can dredge up on demand, or at least approximate. And as Danny points out, anyone who’s taken a tour of the Googleplex has enjoyed a variety of live search displays along with their M&M’s.

Indeed, I regularly give people plenty of tips on how to gather such information for themselves for market research purposes. It’s called keyword research! You can use tools like Wordtracker, which arguably make use of metasearch data from companies like Infospace. Or you can create your very own AdWords account, run it in real time, and discover how many impressions selected queries get in a month.

What you can’t and shouldn’t be able to find out is who is doing the searching. Unfortunately, that comes back to the ever-blurring line between search personalization and the “convenience of user accounts” on one side, and the invasion of privacy on the other. The major portals are glib about gathering all this info. Maybe the government wanting to look at it should be a cold splash of water in the face for Google: now you know why we’re so concerned about you looking at it.

So – keyword research is no big deal. It’s actually sort of funny the way law enforcement types try to do Internet research, asking for things that aren’t hard to get. On Search Engine Watch Forums, though, for some reason I got into the thick of this debate taking a strong stance against the government’s action. Mainly because I don’t want to be sitting here in a year when Phase 2 hits, and the personally identifiable bits are now left in. And the request is for six months’ worth. So the vice squad ramps up and 100,000 people are suddenly under investigation. Moral panic. Finger pointing. Etc. A few real criminals caught.

There is some precedent for governments accessing needed info from companies like Google, obviously, to make a “bust.” Orkut was becoming a bit of a hotbed for criminal communication (to say nothing of orgy planning in Brazil) until some “busts” put that to rest. The question is actually why they haven’t done it a lot more. Obviously, the answer must be because they don’t really have a strong agenda to track down every drug cartel or murderer on the planet. And plots and criminal activity must first be discovered by authorized means, not by spying on everyone, all the time.

So back to the current government request: taken at face value, it’s innocuous. I joked to some colleagues over coffee yesterday that our office would look great if we hooked up a live display to Metaspy and blasted the random, rolling feed of user search queries (family filtered) out into the street.

“chocolate dream cake”
“australian open”

Big deal, right?

How about unfiltered Metaspy? Hmm, not that many porn searches, but it’s only 6:30 p.m.

“plasma edtv”
… and plenty of others with no bad words, and nothing dirty

The following one, though, had no individual bad words, yet a strong link to something illegal and immoral:

“naked 13 year old girls”

It only took three screens of random queries on Metaspy Exposed to see a puerile search like this, assuming that a search should be taken as a sincere intent to look for just that, which is debatable.

So it’s this that is at the root of the desire to investigate, and it’s neither the beginning nor end of law enforcement efforts to stop child pornography and child molestation. It’s well known that FBI agents sit in chat rooms attempting to catch pedophiles, for example.

The government then could have easily accessed this type of information without going to too much effort. And eventually they will have a full dossier on it. Not difficult.

But where does this lead?

If I were a government law enforcement officer and the only tool at my disposal were the weak random sample data on Metaspy, I’d still have tons to go on. By screen 3, I’m already on the trail of someone interested in child porn. (Let’s leave aside the next question: what if you discover far worse searches, by far more people than you can possibly arrest in a lifetime?)

The next question might now be directed by law enforcement agencies directly to the owner of that query data. In this case Infospace. In some other case, Google. “We see a query for child porn here. Give us as much data on that query as possible so we can begin attempting to monitor that particular searcher’s online behavior.” In other words, to see if this person is doing something else, possibly illegal, and whether a case can be made against them so they’re arrested.

Would there be anything wrong with that? I know there are going to be two sides of that debate. It’s not wrong to find actual criminals, of course.

But it’s this scenario that makes me sceptical that any law and order oriented administration could resist drilling further down into the data, once they see the smoking gun. And they will. They’ll see piles of incriminating-sounding search queries. Now what?

So… what about the next government request? And the one after that? That’s the one that worries me, which is why I think it’s important to speak up now to establish the proper line where commercial interests and personal liberties must be defended, no matter what the purported “greater mission.”

And back to the everyday stuff people are searching for, “unfiltered”:

“haida clipart”
“optiquest q75 driver”
“calculate grade point average”
“e-file taxes free”
“punk rock bowling 2006 las vegas photos”
“funny quotes”

It’s at times like this that I wish every screen were as boring as that. Sadly, no.

Andrew Goodman is Principal of Page Zero Media, a marketing consultancy which focuses on maximizing clients’ paid search marketing campaigns.

In 1999 Andrew co-founded, an acclaimed “guide to portals” which foresaw the rise of trends such as paid search and semantic analysis.