We've all gotten them. The emails from someone purporting to have a lot of money who needs help navigating the tricky waters of international banking. The ones that need us (me!) to do the important work of banking for a prince or a deceased magnate. Those messages from Nigeria. Those scam emails.
But why do those scammers write that they are from Nigeria or some other third-world country? They most often aren't, and the notion makes the whole thing seem ridiculous. Also, why do they send emails riddled with terrible grammar and spelling? Couldn't they at least proofread them once? The answer to these questions, of course, is that we are not the intended targets of the emails. Someone would have to be very gullible to fall for one of those scams, and that's exactly the point.
Cormac Herley, a principal researcher in the machine learning department at Microsoft Research, has actually crunched the numbers to show that's the case. His paper, titled "Why do Nigerian Scammers Say They are from Nigeria?", looks at the scam from the scammer's point of view. The scammers have a limited amount of time to scam, and must single out the most gullible victims quickly if they want to make money. Herley frames this problem as one of binary classification: will a specific mark be profitable? If the scammers guess wrong, they either spend valuable time on a mark who never pays, or dismiss what could have been a profitable one.
To solve the problem, Herley places all of the variables into a mathematical model of how a scammer might act. He then uses a Receiver Operating Characteristic (ROC) curve, which, he says, is how the trade-off between two types of error is usually graphed. From there, he is able to determine how a scammer should choose which people to scam. The answer, of course, is to find a way to accurately pick out, from a large population, the few people who will actually fall for the scam. From the paper:
The initial email is effectively the attacker’s classifier: it determines who responds, and thus who the scammer attacks (i.e., enters into email conversation with). The goal of the email is not so much to attract viable users as to repel the non-viable ones, who greatly outnumber them. Failure to repel all but a tiny fraction of non-viable users will make the scheme unprofitable. The mirth which the fabulous tales of Nigerian scam emails provoke suggests that it is mostly successful in this regard. A less outlandish wording that did not mention Nigeria would almost certainly gather more total responses and more viable responses, but would yield lower overall profit. Recall, that viability requires that the scammer actually extract money from the victim: those who are fooled for a while, but then figure it out, or who balk at the last hurdle are precisely the expensive false positives that the scammer must deter.
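To make that trade-off concrete, here is a minimal Python sketch of the arithmetic. It is not Herley's actual model: the function name, the two example emails, and every number in it are invented for illustration. It only assumes that each viable victim who responds yields a fixed gain and each non-viable responder costs the scammer a fixed amount of wasted effort.

```python
def expected_profit(viable, non_viable, tpr, fpr, gain, cost):
    """Profit from one email campaign under a toy model.

    viable / non_viable: how many recipients would or would not pay in the end
    tpr: fraction of viable recipients who respond (true positive rate)
    fpr: fraction of non-viable recipients who respond (false positive rate)
    gain: money extracted from each viable responder
    cost: effort wasted on each non-viable responder
    """
    return viable * tpr * gain - non_viable * fpr * cost


# Hypothetical population: 1,000,000 recipients, only 100 of them viable.
viable, non_viable = 100, 999_900
gain, cost = 2000.0, 20.0  # made-up dollar amounts

# A believable email hooks most viable victims, but also many skeptics
# who will waste the scammer's time before backing out.
plausible = expected_profit(viable, non_viable, tpr=0.9, fpr=0.05,
                            gain=gain, cost=cost)

# An outlandish "Nigerian prince" email loses some viable victims,
# but repels nearly everyone who would never pay.
outlandish = expected_profit(viable, non_viable, tpr=0.5, fpr=0.0005,
                             gain=gain, cost=cost)

print(f"plausible email:  {plausible:,.0f}")
print(f"outlandish email: {outlandish:,.0f}")
```

With these invented numbers the believable email loses money, because the non-viable responders vastly outnumber the viable ones, while the outlandish email turns a profit despite reaching only half of the viable victims.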
It seems like common sense, but now at least there is evidence that these scammers aren't all actually Nigerians with poor English. Herley's paper can be read (and understood if you enjoy math) in PDF form over on the Microsoft Research website. One question that remains, though, is whether the "Nigerian" scam began as a deliberate way to filter out all of the non-gullible people, or whether it really did originate with Nigerian scammers and just happened to catch on because of this unintended effectiveness.