MSN Search Improving Spelling Correction System

    December 7, 2004

The MSN Search team is focused on improving the system its search engine uses to correct searchers' spelling mistakes.

A blog post by Oliver Hurst-Hiller, MSN Search Program Manager, explains:

“Doing a good job of helping Search users to correct misspelled queries is super important for two main reasons: a) 5 billion crawled docs and a bleeding edge ranking algorithm can’t do much if the query isn’t spelled right and b) more than 10% of all searches are misspelled! So we made sure our new search engine included a revamped spelling correction system that’s much better than our old one.

To improve the speller we worked with Silviu Cucerzan and Eric Brill from Microsoft Research’s Text Mining, Search and Navigation Group. Silviu and Eric have developed some novel techniques for using search query statistics and iterative transformation of query strings to improve spell correction. Their published paper on this topic, ‘Spelling correction as an iterative process that exploits the collective knowledge of web users’, goes into much more detail on some of the technical thinking that inspired the spelling correction system we built.”
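To give a rough sense of what "search query statistics and iterative transformation of query strings" could mean in practice, here is a minimal Python sketch. It is not MSN Search's implementation and not the algorithm from the paper: the word counts in TOY_COUNTS are invented, the function names are hypothetical, and the rule used (replace each word with a more frequent word within one edit, then repeat until nothing changes) is a simplifying assumption made for illustration.

```python
from typing import Dict, List

# Invented counts standing in for query-log statistics (an assumption,
# not real data).
TOY_COUNTS: Dict[str, int] = {
    "britney": 5000,
    "britny": 120,
    "spears": 4000,
    "speers": 30,
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev: List[int] = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_neighbor(word: str, counts: Dict[str, int], max_dist: int = 1) -> str:
    """Return the most frequent known word within max_dist edits of `word`,
    or `word` itself if no strictly more frequent neighbor exists."""
    best, best_count = word, counts.get(word, 0)
    for cand, count in counts.items():
        if count > best_count and edit_distance(word, cand) <= max_dist:
            best, best_count = cand, count
    return best


def correct_query(query: str, counts: Dict[str, int],
                  max_dist: int = 1, max_iters: int = 10) -> str:
    """Rewrite the query one small step at a time until it stops changing."""
    words = query.lower().split()
    for _ in range(max_iters):
        new_words = [best_neighbor(w, counts, max_dist) for w in words]
        if new_words == words:
            break  # fixed point: no more frequent neighbor for any word
        words = new_words
    return " ".join(words)


print(correct_query("brittny speers", TOY_COUNTS))
```

In this toy data, "brittny" is two edits away from "britney", so a single one-edit pass only reaches "britny"; the second pass then reaches "britney". That is the appeal of iterating. The same rule would also rewrite a rare but correctly spelled word that happens to sit one edit from a popular one, so a real system would need more safeguards than this sketch has.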

He continues …

“Despite being a significant improvement for MSN Search users, the new spelling correction system is still far from perfect. Looking through user feedback, there are still a few areas in which we have room for improvement. For example, we often suggest corrections for correctly spelled names, and we are seeing this a lot during our beta phase since many users query their own name to evaluate our new search engine.

An accurate search spelling correction system is essential to helping users find the information they’re looking for. So in the coming months we’ll be working to increase the accuracy of the spelling correction system. We’ll also be expanding the system to include many more languages, since the English and German-speaking users of our search engine aren’t the only imperfect spellers on the planet. :)”

Rich Ord is the CEO of iEntry, Inc., which publishes over 200 websites and email newsletters.

Rich also publishes his blog, WebProBlog, which focuses on internet business and marketing trends.