Interview: More Than 50% of Site Traffic May Be Hackers or Spammers

    March 14, 2012
    Abby Johnson

How secure is your website? It’s a scary thought, especially given the rise in attacks in recent years. Last year, the websites of well-known organizations such as Sony and Epsilon, and even government sites, were hacked.

The sad part is that, according to Marc Gaffan, the Co-founder and VP of Marketing and Business Development for web security firm Incapsula, since these well-known companies are falling victim to attacks, small to medium-sized businesses are that much more susceptible to breaches. As he explained to us, hundreds and thousands of SMBs are attacked on a daily basis.

“You’re no longer too small to be a target,” he said.

“That’s the big change or the shift in the paradigm that we’ve seen over the last year,” he added.

What are you doing to make sure your site is protected? Let us know.

Incidentally, Incapsula released a report that found 51 percent of traffic to websites is non-human. While some of this non-human traffic comes from Google, Yahoo, and other credible sources, the report found that 31 percent of all traffic could be harmful to businesses. In other words, this 31 percent of traffic could be hackers, scrapers, spammers, spies, and other unwanted visitors.

Most businesses rely on Google Analytics to find out where their traffic comes from. However, Gaffan said that businesses need to be aware that these other statistics exist as well, and that the traffic behind them could hurt both the businesses and their customers.

The question “Is all traffic good traffic?” also becomes an issue, since some of the traffic may be unwanted but not directly harmful. Gaffan compares this scenario to a person who comes into your backyard repeatedly but doesn’t bother anything. Since most people would be uncomfortable in such a situation, Gaffan believes businesses should feel the same way about their websites.

He told us that businesses should put measures in place to prevent this unwanted traffic from coming to their site.

“They should have tools in place that would let the Googles and Bings and Yahoos and all the other legitimate services that need to access their website, access their website,” he said. “Anyone that has no legitimate business at their website, should be kept out.”

Incapsula takes security issues very seriously and tries to help businesses by providing a set of tools that helps with both direct and indirect breaches. Through its Bot Access Control capability, the company enables businesses to see and control whoever comes to their site.

Gaffan said the tools allow businesses to keep their websites clean of any harmful and potentially harmful visitors.

Incapsula also recently partnered with the Payment Card Industry (PCI) and security service providers in order to help online merchants. This is an important move for businesses because, if they don’t meet the PCI regulations, they could lose their ability to process credit card payments.

Gaffan believes this effort is another step in helping small to medium-sized businesses protect themselves and their customers.

  • Marq

    I absolutely agree that up to 50% of the traffic hitting web sites is UNPRODUCTIVE traffic. Scammers, web site scrapers, content rippers, phishers, lame hack attempts and a thousand other NON-BENEFICIAL visitors make page requests against people’s web sites. And naturally, the higher the profile of your web site, the higher the amount of non-productive traffic that will be trying to scoop up your web pages.

    What web masters need to keep in mind is that when some no-name and pointless scraper is chewing up the bandwidth to your web site, it is interfering with ‘normal human traffic’. Your ‘normal human visitor’ is experiencing slow speeds because they are competing against a bot when trying to view a web page.

    Webmasters need to strictly enforce a ‘humans get first priority’ policy for their web sites. In my case, I strictly enforce a no-apologies ban on any IP or IP range that has proven itself to be an inconsiderate bandwidth hog, and on any IP range from which malicious attack attempts have originated.

    This requires that web masters actively monitor the server logs of their web site. Using tools as simple as a text editor and a log analyser, you can spot malicious or inconsiderate trends. And this is something that must be done several times per day, with each review of the site traffic resulting in instant bans of the offending IPs.

    Sadly, most web masters are totally incompetent at monitoring the actual logs generated for their web site traffic. And sadder still, those who rely on the lame and highly uninformative Google Analytics are wide open to continued wasteful traffic interfering with the genuine ‘human’ visitors to the web site. Google’s lame attempt to protect the IPs of a site’s visitors by not revealing them to the web master also unfortunately protects the attackers.

    And I should note that you can not only ban abusers by the IPs they are using, but also target specific things like user agents, keywords found in the page request strings, and keywords found in the page referring string (plus other things).

    So let me just say that your article is 100% correct about the fact that most sites suffer from at least 50% wasteful, pointless or non-beneficial traffic. And depending on the profile of your web site, that useless traffic could represent up to 80% of the site traffic.

    A web master who values his ‘human’ traffic… will work hard to keep the crap traffic down to less than 5%, leaving the other 95% of site bandwidth available to humans.
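
The log review Marq describes, spotting the IPs that eat the most bandwidth, can be roughed out in a few lines of Python. This is only a minimal sketch under stated assumptions, not Marq’s actual tooling: it assumes the server writes Apache-style Combined Log Format, and the names `summarize` and `heavy_hitters`, along with the 10 percent default threshold, are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Assumption: access log lines follow the Apache-style Combined Log Format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (requests per IP, bytes per IP) for the lines that parse."""
    hits, volume = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # ignore lines written in some other format
        ip = match.group("ip")
        size = match.group("size")
        hits[ip] += 1
        volume[ip] += 0 if size == "-" else int(size)
    return hits, volume


def heavy_hitters(volume, max_share=0.10):
    """List IPs whose share of all bytes served exceeds max_share.

    The 0.10 default is an arbitrary illustrative threshold.
    """
    total = sum(volume.values())
    if total == 0:
        return []
    return sorted(ip for ip, sent in volume.items() if sent / total > max_share)


if __name__ == "__main__":
    # Hypothetical log file name; point this at your own access log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        _, volume = summarize(log)
    for ip in heavy_hitters(volume):
        print(ip)
```

The output is a list of candidates for a human to review, not an automatic ban list: a busy office behind one shared IP or a legitimate search crawler can look like a bandwidth hog by byte count alone.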

  • http://life-after-addiction.com/ Doug Wilson

    Agreed. I was just checking logs and thought I’d search “What percentage of traffic is … “.

    As Marq said, things like “keywords found in the page request strings” are a great way to weed out critters.

    AWStats lists “grabber” in its user agents column; you will find cURL and others there.

    Be careful, though: “LinkedIn,” for example, uses Jakarta, which many people block.

    (How many “people” use MSIE 5 or 6?)
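
The filtering both commenters mention (user agents, keywords in the request string, keywords in the referring string) and Doug’s caution about blocking too broadly can be combined in one small rule check. The sketch below is illustrative only: every rule list is a made-up placeholder, the function name `classify` is hypothetical, and the LinkedIn user agent string in the usage example is an assumed sample rather than a verified one. Agents on the allowlist are checked before the agent blocklist, so a wanted crawler that happens to mention “Jakarta” is not banned by accident.

```python
# Hypothetical rule lists for illustration; build real ones from your own logs.
ALLOWED_AGENTS = ("googlebot", "bingbot", "linkedinbot")
BLOCKED_AGENTS = ("curl", "jakarta", "msie 5", "msie 6")
BLOCKED_REQUEST_KEYWORDS = ("/etc/passwd", "union select", "../")
BLOCKED_REFERRER_KEYWORDS = ("spam-referrer.example",)


def _first_match(text, keywords):
    """Return the first keyword contained in text (case-insensitive), or None."""
    lowered = text.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword
    return None


def classify(agent, request, referrer):
    """Return a (decision, reason) pair, where decision is 'allow' or 'block'."""
    # Suspicious request strings are blocked no matter who claims to send them.
    hit = _first_match(request, BLOCKED_REQUEST_KEYWORDS)
    if hit:
        return ("block", "request keyword: " + hit)
    hit = _first_match(referrer, BLOCKED_REFERRER_KEYWORDS)
    if hit:
        return ("block", "referrer keyword: " + hit)
    # The allowlist takes precedence over the agent blocklist.
    hit = _first_match(agent, ALLOWED_AGENTS)
    if hit:
        return ("allow", "agent allowlisted: " + hit)
    hit = _first_match(agent, BLOCKED_AGENTS)
    if hit:
        return ("block", "agent blocklisted: " + hit)
    return ("allow", "no rule matched")


if __name__ == "__main__":
    # Assumed sample user agent strings, for demonstration only.
    print(classify("LinkedInBot/1.0 (compatible; Jakarta Commons-HttpClient/3.1)",
                   "GET / HTTP/1.1", "-"))
    print(classify("Jakarta Commons-HttpClient/3.1", "GET / HTTP/1.1", "-"))
```

User agent and referrer strings are supplied by the client and are trivial to fake, so rules like these only weed out the careless visitors; anything that matters still needs the IP-level review described above.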