Will Microsoft Search Kill Apache?

    January 5, 2004

Microsoft dominates the software world because they are better at strategy than anyone else. From the way they beat everyone to the punch with Altair BASIC to their decision to embrace the PC platform, over and over we see very sharp strategy from Microsoft. MSN is already a very popular search service; Microsoft is also an ISP and a major portal. About a third of web users use MSN for one thing or another (not necessarily search), so MSN is already big, and it serves about 12-15% of the searches on any given day.

Microsoft has some things that nobody else in the search world has:

  • A dominant position in desktop operating systems, with minimal competition.
  • A dominant position in corporate network platforms, with some competition in certain segments.
  • A dominant position in server operating systems, with stiff competition from Linux.
  • A web server (IIS) with a (declining) 21% share of the market (according to Netcraft’s latest survey).
  • The ability to build and deploy pretty much anything, as quickly as they want to.
  • A giant pile of cash, enough to buy anything that’s for sale.

    Microsoft wants to win the server wars. Most of the industry is aligning around Linux, and losing the server wars to Linux/Apache would lead to erosion in corporate networks. Beat back Linux on the server front, and Linux will never threaten the desktop. Microsoft has embraced XML, and they are serious about their .NET strategy.

    The purpose of the MSNbot project is to get real-world experience in crawling and indexing the web, to learn how to do it efficiently, and to build up a database of web content that they can test search algorithms against. This is similar to the Stanford WebBase repository, on which Google was built, and which is still used for research projects at Stanford and elsewhere.
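    Webmasters watching their logs can already see this experiment underway: MSNbot identifies itself in the standard way, and it honors the Robots Exclusion Protocol like other major crawlers. A minimal robots.txt entry to keep it out of a section of a site (the path here is just an example) would look like:

```
# Applies to Microsoft's crawler, which sends the user-agent token "msnbot"
User-agent: msnbot
Disallow: /private/

# All other crawlers: no restrictions
User-agent: *
Disallow:
```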

    Microsoft is crawling the web to learn how to crawl the web, but the web is not the only thing their customers want to search. Their customers want to search their own computers, their own networks, trusted information sources, documentation, etc. Microsoft knows this. Microsoft also knows something about natural language searching – if you don’t believe me, fire up MS Office and ask the paperclip how to do something.

    Crawling the web is inefficient, expensive, and slow, and the data you get from it is stale by the time you have it. I don’t believe that Microsoft’s strategy will be centered around crawling the web – crawling will be part of it, but crawling is not the future; it’s the past.

    Microsoft will not be trying to create another knockoff search engine; they will want to make something that is clearly better. Microsoft doesn’t build things just because someone else has them – new products are always part of a larger strategy.

    Microsoft’s focus on search technology will include:

  • desktop search (content of documents)
  • network search (intranet, servers, network storage)
  • trusted source indexing and search (intranet and extranet)
  • automated indexing of trusted web content
  • spider-crawled web content

    Understand the strategy, and you can see how Microsoft could become the biggest player very quickly. Web servers are the only segment where Microsoft is losing badly in the market today. Badly, as in a decline from 35% to barely 20% in 18 months. But could they leverage the MSN portal, and their search services, to change that?

    What if Microsoft built an XML-based (.NET) indexing service into a future release of IIS, and/or released a .NET indexing service for current releases? This would give IIS servers a built-in site search (with Microsoft search technology), which Apache servers don’t have. This would also allow them to accept trusted index feeds from IIS servers and show that content on MSN search, which Apache can’t do. They’d be able to get any changed content indexed almost instantly. They’d be able to capture huge amounts of dynamic content that you can’t get by crawling – the “hidden” web.
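    To make the "trusted index feed" idea concrete, here is a sketch of what a push-based feed from an IIS server to MSN's index might look like. This format is entirely invented for illustration – Microsoft has published no such specification, and every element name below is hypothetical:

```xml
<!-- Hypothetical trusted index feed; format and element names invented
     for illustration, not a published Microsoft specification. -->
<indexFeed site="https://www.example.com">
  <document url="/products/widgets.aspx"
            lastModified="2004-01-04T18:30:00Z">
    <title>Widgets</title>
    <content>Full text of the page, pushed by the server the moment it
    changes, rather than discovered by a crawler days or weeks later.
    Dynamic pages that a crawler would never find could be included
    the same way.</content>
  </document>
</indexFeed>
```

    The key point is the direction of the data flow: instead of a crawler pulling pages on its own schedule, the web server pushes changed content as it happens, which is what would make the index nearly instantaneous.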

    If adopted by all IIS users, this would allow Microsoft to have an up-to-date index of 20% of the web. Google doesn’t even have 10% of the web, and although their index is fresher than anyone else’s, it’s nowhere near as current as what Microsoft could create. As use of Microsoft/MSN search increases, more websites will run on IIS, and enable the indexing service, because of the additional search-related traffic it helps to generate. A snowball effect.

    MSN would have a fresher (and larger) index of the web, and therefore the ability to deliver better search results. Teoma would probably have better results than Google if they had a comparable index; you could possibly say the same about Looksmart’s Wisenut. By using the same indexing system that they are developing for Longhorn, Microsoft would be able to build a seamless search system to drive search from the desktop to the farthest reaches of the Internet, and build it into 95% of the desktops in the world, and 95% of the browsers.

    How many Apache users would then be willing to pay for a Microsoft server OS and switch to IIS? Not all of them, but the number is greater than zero.

    Dan Thies is a well-known writer and teacher on search engine marketing. He offers consulting, training, and coaching for webmasters, business owners, SEO/SEM consultants, and other marketing professionals through his company, SEO Research Labs. His next online class will be a link building clinic beginning March 22.