Akismet – The Danger of Collective Intelligence

    February 14, 2007

Akismet is a very smart and effective system for controlling comment spam on blogs, and I know thousands of bloggers swear by it, listing it as their number one plugin for their blogging platform.

I am increasingly having problems posting comments on popular blogs such as Problogger. Initially it happened when I included a link to a relevant post, which is within Darren’s comment policy. Recently, comments that didn’t include a link have also entered the moderation queue on new posts. Maybe I am going to have to start commenting without entering an optional URL (I know that affects Spam Karma).

It is well known among those who visit Darren’s blog on a regular basis that Automattic use Problogger as a yardstick – the site attracts a great deal of spam.

Collective Intelligence

No one outside Automattic knows the inner workings of Akismet, but it is known that it collects data from all the blogs using the service, and uses an algorithm to determine if a comment is spam. Such a system can be very effective. If someone spams one blog, they in some way get a black flag on all blogs, although how much weighting is transferred from one blog to another is unknown.
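
As a rough illustration of how a black flag might travel from one blog to another, here is a toy model. It is not Akismet's algorithm, which is not public. The class name, the cross-blog weight and the threshold are all hypothetical values chosen only to show the effect.

```python
from collections import defaultdict

# Hypothetical numbers: the real weighting between blogs is unknown.
CROSS_BLOG_WEIGHT = 0.5   # how much a flag on another blog counts here
SPAM_THRESHOLD = 1.0      # score at which a comment is held back


class SharedReputation:
    """Toy shared blacklist: spam flags from every blog feed one score."""

    def __init__(self):
        # commenter -> blog -> number of times flagged as spam on that blog
        self.flags = defaultdict(lambda: defaultdict(int))

    def flag_as_spam(self, commenter, blog):
        self.flags[commenter][blog] += 1

    def score(self, commenter, blog):
        per_blog = self.flags.get(commenter, {})
        local = per_blog.get(blog, 0)
        # Flags raised on other blogs still count, at a reduced weight.
        remote = sum(n for b, n in per_blog.items() if b != blog)
        return local + CROSS_BLOG_WEIGHT * remote

    def is_held(self, commenter, blog):
        return self.score(commenter, blog) >= SPAM_THRESHOLD
```

In this sketch, two spam flags from a single blog are enough to hold the same commenter's comments on a blog where they have never been flagged at all. The blog owner who raised the flags never has to justify them.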

Rogue Data

I know Matt Mullenweg wouldn’t flag criticism on his blog as spam, but that might not be the case for every blogger. The same could be true for trackbacks – instead of just deleting a trackback that was in some way critical, or from a competitor joining in the conversation, it would be easy to flag it as spam.

The Danger of Collective Intelligence

Collective Intelligence isn’t just used for blog comment spam; it is also used for email spam. Every day I pull emails out of my spam folder in Gmail. Some of them are highly important, such as data sent to me from this domain, contact form results, Spam Karma results etc. Often those reports contain "naughty" words – one day Gmail will learn that I really need this information.

I also pull lots of email from marketers out of the spam bin. If I have signed up for the mailing list, it shouldn’t be in the spam bin, even if I might not read it every time. If you don’t take this action, the email filters have no data to work with, or might conclude, because you left the email in the spam bin, that they made the correct decision.

My Own Commenting

Whilst I am sure there are more active blog commenters than me, I am fairly active, and always try to add value. I am a repeat commenter on many blogs, and 95% of my comments appear without any problems, even on sites running Akismet.

Here are some things I have noticed:-

  • Links in the Body – If you include a link in the body of a comment, you have a high risk of being flagged for the moderation queue. Even when the owner of the site has asked for a link to be posted, I avoid it.
  • Optional Links – If you use the optional link to your site, which is actively encouraged, to point not to your root domain but to highly relevant deep content, the comment has a higher chance of being flagged as spam.
  • Long comments – If you make a longer comment, adding true value to the blog where you are posting it, you are more likely to be flagged as spam. I am not sure if that is because you have a higher chance of snagging a particular word filter, but I am less inclined to write long comments. I write lots of comments on blogs discussing monetization, and I avoid using words such as "money".
  • Constructive Criticism – I avoid linking / pinging blogs that are averse to criticism, or don’t show trackbacks. If a trackback doesn’t show up, it suggests to me that there is a chance my trackback is remaining in the moderation queue, or even worse is being flagged as spam.

A blog which isn’t comment / trackback friendly becomes a "bad neighbourhood" for me. In much the same way as linking to a grey boxed website might damage your ranking in search engines, commenting on such a blog, or linking to it with a trackback, might damage your ability to comment on other sites.

Require Signups to Comment?

No matter how much you think this helps with spam, it deters constructive comments. I have declined commenting on 2 blogs today simply because they required me to sign up to their blog to place a comment.

I have just deleted the broken trackback from one of them after I wrote a long comment, only to discover after I hit submit that I needed to sign up to comment. I didn’t, however, flag it as spam, as many would.

Liability and Reputation Management

Reputation management is often discussed – I frequently get comments from the owners of various products and services I review, whether the post was positive or negative. I always try to be constructive in my opinions, and even when what I say is critical in some way, generally the time I have spent looking at something in depth is appreciated.

Due to collective intelligence, the actions of a rogue webmaster who flags critical comments and trackbacks as spam could prevent legitimate commenters from voicing their opinion on hundreds of other blogs. Who would ultimately be responsible, Akismet or the webmaster who flagged comments as spam that were just voicing a different opinion?

Who knows, maybe that is why I am having increasing difficulties posting comments on Problogger, and other high traffic blogs using Akismet. I am fairly certain that Darren has never flagged one of my comments as spam, and I have been leaving comments there for some time. Surely Akismet should have learned by now?

Spammers are actively working to improve their Akismet reputation by posting comments containing absolutely no links. If a comment gets approved or slips through, many comment systems would give a significant bonus to any future comments, as the author would no longer be a first time commenter.

GeoTargeting Blacklists

I am not sure if GeoTargeting is used in some way for blog spam blacklists. I really hope it isn’t. I am based in Poland, and maybe that is having an increasing effect on my ability to post comments. Blogging is a global community.

Tips For Akismet Users

Many people who use Akismet as a way to control comment spam believe that it saves them a lot of time. In many cases that is true, but I have certainly read an increasing number of posts about false positives.

This post isn’t intended to deter people from using Akismet, but to make them aware of the effects of collective intelligence and how they can help to improve it.

  • Use delete – Don’t flag critical comments and trackbacks as spam – you could always leave them, or answer them in a constructive manner – the latter shows real class
  • Monitor your spam frequently – don’t just assume everything Akismet catches is spam – not taking action could be harming your customers’ ability to give feedback
  • Actively recover comments even if they are critical – you can always delete the comments afterwards – I don’t know if comments detected as spam and not recovered affect the collective intelligence, but having briefly looked at the Akismet WP plugin code, white flag signals are being sent.
  • Upgrade Akismet – newer versions of Akismet plugins use different functions to check for spam, and sometimes remove functions that were determined to cause too many false positives.

The Automattic guys are constantly working on Akismet to improve it. They rely in part on the data you provide them, both flagging comments as spam, and recovering comments from the sin bin.
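
To make the two kinds of feedback concrete, here is a minimal sketch of what a recovery or spam report could look like as an HTTP request. It is not the plugin's code. The endpoint names (submit-ham, submit-spam), the host and the field names are assumptions based on Akismet's public REST API documentation and should be checked against the current docs. The helper build_report is hypothetical, and it only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL of the Akismet REST API; verify against the current docs.
AKISMET_API = "https://rest.akismet.com/1.1"


def build_report(action, api_key, blog_url, comment):
    """Build (but do not send) a report telling Akismet it got a comment wrong.

    action is "submit-ham" to recover a false positive (the "white flag")
    or "submit-spam" to report spam that slipped through.
    """
    if action not in ("submit-ham", "submit-spam"):
        raise ValueError("unknown action: " + action)
    fields = {"api_key": api_key, "blog": blog_url}
    fields.update(comment)  # e.g. user_ip, comment_author, comment_content
    body = urlencode(fields).encode("utf-8")
    return Request(AKISMET_API + "/" + action, data=body, method="POST")


# Example: recover a comment that was wrongly held as spam.
request = build_report(
    "submit-ham",
    "YOUR_API_KEY",  # placeholder, not a real key
    "https://blog.example/",
    {
        "user_ip": "203.0.113.7",
        "comment_author": "A regular reader",
        "comment_content": "A long, critical, but constructive comment.",
    },
)
```

Passing the request to urllib.request.urlopen would send it. The point of the sketch is that recovering a comment and flagging one are the same kind of message with opposite meanings, so both shape the shared data.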

Maybe one day I will happily use Akismet, but currently I am uncomfortable with the number of times my own comments end up in someone’s moderation queue, only to sometimes appear a few hours or days later when a good webmaster does some housekeeping. It interrupts the flow of conversation.



About the Author

Andy Beard – Niche Marketing – Blog search engine performance, WordPress and general niche and affiliate marketing tips