Google’s Kenya Scraping Deception

    January 13, 2012

How does Google populate its search engine results? Through bots that crawl links to discover, refresh, and update content on the web. Granted, this is an elementary explanation, but it sums up the process. There are ways to block Google’s army of content-seeking bots, and Google willingly shares that information.

How Google uses that information, save for any misrepresentation that may occur after the data is indexed, is their business. If you don’t want your site indexed by Google’s army of link-following bots, block them in your site’s robots.txt file. Google willingly shares the directions on how to do so.
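For anyone who has never written one, here is a minimal sketch of what that blocking looks like. The robots.txt content, the directory.example URL, and the is_allowed helper are all hypothetical, made up for illustration; the check itself relies on Python's standard urllib.robotparser module. Keep in mind that robots.txt is advisory: it only keeps out crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary directory site:
# shut out Googlebot entirely, leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    listing = "https://directory.example/listing/123"  # made-up URL
    print(is_allowed(ROBOTS_TXT, "Googlebot", listing))     # False
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", listing))  # True
```

The two rule groups are the whole trick: a "Disallow: /" under the Googlebot user agent closes the site to that crawler, while the empty "Disallow:" under the wildcard leaves it open to everyone else.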

With that in mind, the recent story concerning Google and the Kenyan business directory Mocality, which was covered extensively by The Next Web and on the Mocality blog, looks like a case of the very loud noise of potential Google improprieties drowning out a very simple signal. Setting aside the issue of misrepresentation, wasn’t the data in question, at the heart of the matter, acquired the same way Google gets all of its content?

A simple content scrape of, in this case, a very public web directory?

Google apparently crawled Mocality’s business listings and contacted the business owners directly, promoting the Google-sponsored Getting Kenyan Businesses Online (GKBO) program. So far, so good, right? How Google got the data should not be an issue. What happens with it, especially if Google representatives are claiming a coalition where none exists, is another story entirely.

Unfortunately, both subjects are at issue when only one of them should be. If the following passage from Mocality’s blog post about the situation is true:

…you can clearly hear Douglas identify himself as Google Kenya employee, state, and then reaffirm, that GKBO is working in collaboration with Mocality, and that we are helping them with GKBO, before trying to offer the business owner a website (and upsell them a domain name). Over the 11 minutes of the whole call he repeatedly states that Mocality is with, or under (!) Google.

Then Google has a lot to be mortified about (more on that in a moment). However, Mocality also takes issue with Google’s scraping of the content, something made apparent in the conclusion of its post:

Since October, Google’s GKBO appears to have been systematically accessing Mocality’s database and attempting to sell their competing product to our business owners. They have been telling untruths about their relationship with us, and about our business practices, in order to do so. As of January 11th, nearly 30% of our database has apparently been contacted.

The misrepresentations, or “untruths,” are indeed an issue, but Google’s scraping of the content is not. Again, use robots.txt to keep their bots out of your content if you don’t want Google accessing your hard work. As for the misrepresentation, the following statement was issued by Nelson Mattos, Vice-President for Google’s Product and Engineering, Europe and Emerging Markets:

“We were mortified to learn that a team of people working on a Google project improperly used Mocality’s data and misrepresented our relationship with Mocality to encourage customers to create new websites. We’ve already unreservedly apologised to Mocality. We’re still investigating exactly how this happened, and as soon as we have all the facts, we’ll be taking the appropriate action with the people involved.”

Now, were these employees instructed to deceive the businesses they were contacting, or were they doing it of their own volition? Unless the employees involved are questioned, we’ll probably never know.