Search Spam Comes From Few Places

Microsoft researchers teamed up with University of California, Davis researchers to pinpoint exactly where "the bottleneck" of Web spam occurs and how legitimate advertisers inadvertently end up in bad neighborhoods. The majority of spam, they found out, comes from the same few places, and the middlemen are some names you might recognize.

The study (PDF; recommended reading for the full picture) was authored by Microsoft’s Yi-Min Wang and Ming Ma and UCD’s Yuan Niu and Hao Chen, and its results are startling. Using their "Strider Search Ranger System," an automated spam detection system, the study authors found that:

  • Blogspot and AOL Hometown domains were used for the vast majority of spammy doorway pages. At least three in four (75%) Blogspot URLs appearing in the top 50 results for commercial queries were spam, totaling 22% of all spam appearances.
  • Three IP blocks accounted for huge percentages of redirection spam (links that lead to made-for-ads pages) and for spam-ads clickthrough traffic.
  • Three syndicators were located most at the center of the redirection chains: LookSmart.com; FindWhat.com; and 7Search.com.
  • Nearly 60% of keywords returning spam URLs were related to drugs and ringtones.

Based on that, the researchers developed what they called a five-layer double-funnel model to illustrate the sophisticated middleman/syndication circuit that matches advertisers with undesirable spammy URLs. Loosely comparable in structure to the DNA double helix, the model shows the complicated systems that sit between end users (searchers) and advertisers.

While searchers are clicking in one direction, advertisements are coming the opposite way, as if passing on the road.

From the user side it goes:

Doorway — Redirection Domain — Aggregators — Syndicator — Advertiser

From the advertiser side it goes:

Advertiser — Syndicator — Aggregators — Redirection Domain — Doorway

That is the "Spam Double-Funnel."
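As a minimal sketch, the two directions can be treated as one ordered list of layers read from either end. The layer names come from the list above. The function name `funnel_path` and the list representation are assumptions made for illustration, not anything taken from the study.

```python
# Layer names come from the article; the function name and structure are
# hypothetical and only illustrate the two directions of travel.
LAYERS = [
    "Doorway",
    "Redirection Domain",
    "Aggregators",
    "Syndicator",
    "Advertiser",
]


def funnel_path(side):
    """Return the layers in the order seen from the 'user' or 'advertiser' side."""
    if side == "user":
        return list(LAYERS)
    if side == "advertiser":
        return list(reversed(LAYERS))
    raise ValueError("side must be 'user' or 'advertiser'")


print(" -> ".join(funnel_path("user")))
print(" -> ".join(funnel_path("advertiser")))
```

Read from either end, the Aggregators layer is the midpoint where click traffic and ad traffic meet.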

The researchers note that among the top ten Live Search results for "cheap ticket," three doorway pages appeared:

–http://-cheapticket.blogspot.com/
–http://sitegtr.com/all/cheap-ticket.html
–http://cheap-ticketv.blogspot.com/

Their ranking is related to comment spam in open forums, where the URLs are often posted. The URLs redirect to known-spammer domains like:

–vip-online-search.info
–searchadv.com
–webresourses.info

Surprisingly, ads for reputable online travel firm Orbitz showed up on all three spam domains. The researchers assume that Orbitz did not intend for its brand to appear in these bad neighborhoods. The same scenario played out for other well-known advertisers like Shopping.com, DealTime.com, BizRate.com, eBay, and Shopzilla.
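To make the hosting pattern concrete, the sketch below tallies the three doorway URLs quoted above by hosting domain. It is only an illustration. The helper `hosting_domain` is hypothetical, and taking the last two hostname labels is a naive assumption that would misfire on suffixes such as `.co.uk`.

```python
from collections import Counter
from urllib.parse import urlparse

# The three doorway URLs exactly as quoted in the article.
DOORWAY_URLS = [
    "http://-cheapticket.blogspot.com/",
    "http://sitegtr.com/all/cheap-ticket.html",
    "http://cheap-ticketv.blogspot.com/",
]


def hosting_domain(url):
    """Return the last two labels of the hostname (a naive approximation)."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


# Count how many doorway pages sit on each hosting domain.
counts = Counter(hosting_domain(u) for u in DOORWAY_URLS)
for domain, n in counts.most_common():
    print(domain, n)
```

On this tiny sample, two of the three doorway pages sit on Blogspot.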

Research showed that 60 percent of the redirection chains involved ads syndicated through LookSmart, FindWhat, and 7Search. But the majority of the Web spam pages themselves were traced to three specific IP blocks:

  • 22-25% of all spam appearances originated from IP block 209.8.25.150~209.8.25.159
  • IP blocks 66.230.128.0~66.230.191.255 and 64.111.192.0~64.111.223.255 were responsible for over 100,000 spam ads, occupying the bottleneck of the spam double-funnel. The researchers say this may prove to be the best layer for attacking the search spam problem.
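For readers who want to test an address against the ranges quoted above, here is a minimal sketch using Python's standard `ipaddress` module. The range endpoints are copied from the article. The names `REPORTED_BLOCKS`, `in_reported_block`, and `as_networks` are hypothetical, and the study does not describe its own tooling in these terms.

```python
import ipaddress

# Range endpoints as reported in the article (first address, last address).
REPORTED_BLOCKS = [
    ("209.8.25.150", "209.8.25.159"),
    ("66.230.128.0", "66.230.191.255"),
    ("64.111.192.0", "64.111.223.255"),
]


def in_reported_block(ip):
    """Return True if the IPv4 address falls inside any reported range."""
    addr = ipaddress.IPv4Address(ip)
    return any(
        ipaddress.IPv4Address(first) <= addr <= ipaddress.IPv4Address(last)
        for first, last in REPORTED_BLOCKS
    )


def as_networks(first, last):
    """Express an inclusive address range as a list of CIDR networks."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
        )
    )


for first, last in REPORTED_BLOCKS:
    print(first, "~", last, "=", [str(n) for n in as_networks(first, last)])
```

The two larger ranges each collapse to a single CIDR block, while the ten-address range does not align to one block and needs two.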

The researchers conclude their study by saying:

"By exposing the end-to-end search spamming activities, we hope to educate users not to click spam links and spam ads, and to encourage advertisers to scrutinize those syndicators and traffic affiliates who are profiting from spam traffic at the expense of the long-term health of the web."

 

 

  • Mark

    Are you suggesting advertisers stay away from LookSmart and FindWhat? BTW, FindWhat changed its name to MIVA over 18 months ago.
    With that in mind, is the data fresh? Both LookSmart and MIVA claimed to have sworn off bad neighbors over two years ago.
    Interesting.

  • Pete

    The problem with this research is that it totally dismisses how advertisers and users feel about the results they receive. If advertisers weren’t happy with the results they would not use these services and the same goes for users – if ads aren’t compelling they won’t click. The REAL story here is why Microsoft is going after second-tier engines like MIVA, 7Search and LookSmart. Could it be that Microsoft was listing all of the blogspot domains and they were trying to figure out how to clean up their disaster of a database? Probably, well, that in relation to them launching a PPC program makes sense to me.
