SES – Meet The Crawlers

Representatives from major crawler-based search engines cover how to submit and feed them content, with plenty of Q&A time to cover issues related to ranking well and being indexed.

Moderator:

  • Danny Sullivan, Conference Co-Chair, Search Engine Strategies San Jose

Speakers:

  • Peter Linsley, Sr. Product Manager, Ask.com
  • Evan Roseman, Software Engineer, Google, Inc.
  • Sean Suchter, Yahoo! Search
  • Eytan Seidman, Microsoft

First to speak is Eytan Seidman from Microsoft. He gives a presentation on Microsoft’s Live Webmaster Portal, which explains how Microsoft’s crawler will index your site. The portal supports site map submissions and lets webmasters view their site’s statistics. Microsoft has several search engine crawlers, and all of their names begin with "MSNBot":

  • web search
  • news
  • academic
  • multimedia
  • user agent

Microsoft also supports "NOODP" and "NOCACHE" tags.
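As a rough illustration of how such tags can be checked, here is a minimal Python sketch that lists the directives found in a page's robots meta tag. It assumes "NOODP" and "NOCACHE" are written as comma-separated values in a `<meta name="robots">` element; the class name, function name and sample page are hypothetical and not taken from the presentation.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().upper())


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noodp, nocache">
<title>Example</title></head><body>Hello</body></html>"""

print(sorted(robots_directives(SAMPLE_PAGE)))
```

A check like this only shows what a page declares; how each engine honors a given directive is up to that engine.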

Next is Yahoo! Search’s Sean Suchter, who also has a presentation about Yahoo!’s crawler.

He covers dynamic URL rewriting via Site Explorer and the "robots-nocontent" tag. Yahoo! has also made crawler load improvements (reduction and targeting): the new Yahoo! search engine crawler targets better and crawls at a comparatively low volume.

Google’s Evan Roseman steps up to explain and discuss Webmaster Central’s features. He recommends taking advantage of Webmaster Central’s "submit a site" option so that Google’s search engine crawler can index all your content.

Next up is Ask.com’s Peter Linsley, who discusses catering to the search engine robot, which is often forgotten when a site is built for the human visitor. One common problem is requiring cookies. He notes that Ask does accept site map submissions but would rather be able to crawl sites naturally.

Peter uses the Adobe site to demonstrate some issues that they may have with multiple domains and duplicate content. He then uses the Mormon.org site to show a robots.txt file that disallows crawlers from indexing the root page, which creates problems with crawling.
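A webmaster can test for this kind of problem before a crawler runs into it. The sketch below uses Python's standard `urllib.robotparser` module to ask whether a given user agent may fetch a URL. The two robots.txt strings and the `example.com` URLs are hypothetical and are not the actual files shown in the session; the first one uses the simplest rule that blocks the root, which also blocks everything beneath it.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: the first blocks every path starting at the root,
# the second blocks only one folder and leaves the root crawlable.
BLOCKS_EVERYTHING = "User-agent: *\nDisallow: /\n"
BLOCKS_ONE_FOLDER = "User-agent: *\nDisallow: /private/\n"

for label, rules in [("blocks everything", BLOCKS_EVERYTHING),
                     ("blocks one folder", BLOCKS_ONE_FOLDER)]:
    allowed = can_crawl(rules, "MSNBot", "http://example.com/")
    print(label, "-> root page crawlable:", allowed)
```

Running the same check with each crawler's user agent name is a quick way to confirm that the home page is not blocked by accident.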

Q & A

  • Q: The first question is for the Google rep: will Google let users see supplemental results within Webmaster Central now that they are no longer tagged in search results?
  • A: Evan states that being in the supplemental index is not a penalty, but he does not give a definite answer as to whether Google will let users find out whether or not results are supplemental.

    Danny interjects that all engines have a two-tier system and Eytan, Sean and Peter confirmed that. So… they all have supplemental indices but people only seem to be concerned with Google’s, most likely because they used to identify them as such in the regular search results.

  • Q: What can a competitor actually do if anything to hurt your site?
  • A: Evan says it is possible for a competitor to hurt your site, but that it is extremely difficult. Hacking and domain hijacking are some of the things that can occur.
  • Q: The question relates to republishing content on places such as eBay, where the sites you republish to rank better than the original. How can a webmaster identify the original source of the information?
  • A: Peter answers that one could try to get the places that republish the content to use robots.txt to block spidering of it. Another option is to have them link back to the original site. However, on a site such as eBay, that is not always possible. In that case, the answer is to create unique content for the sites where the content is being republished.
  • Q: Robert Carlton asks if all engines are moving towards having things like Webmaster Centrals. Also asks how they treat 404s and 410s.
  • A: As for 404s and 410s, Ask, Google and Yahoo! treat them the same. Robert points out that they should treat them differently as a 410 indicates the file is gone whereas 404 is an error.
  • Q: Question regarding getting content crawled more frequently.
  • A: Evan suggests using the Site Map feature in Webmaster Central and keeping it up to date. He also suggests promoting the content by placing a link to it on the site’s home page.
  • Q: How can one use site maps more effectively for very large sites whose information changes on a regular basis? The questioner also asks how to get more pages indexed when only a portion are being indexed.
  • A: Submitting a site map to Google is not going to cause other URLs to not be crawled. Evan also points out that Google is not going to be able to crawl and include ALL the pages that are out there. He again suggests that webmasters promote their pages, for example by listing them on the home page. However, when dealing with hundreds of thousands of pages, that is not always feasible.
  • Q: How do engines interpret things like AJAX, JavaScript, etc.?
  • A: Eytan answers that if webmasters want things interpreted, they have to present them in a format the engine can understand, and AJAX and JavaScript are currently not among those formats.
  • Q: A question about rankings in Yahoo! disappearing for three weeks and then coming back. Is this due to an update?
  • A: Sean answers that it certainly could be and suggests using Site Explorer to see if there is some kind of issue.
  • Q: How many links will engines actually crawl per page? How much is too much?
  • A: Peter says there is no hard and fast rule but keep the end user in mind. Evan echoes the same feeling.
  • Q: Do the engines use meta descriptions?
  • A: All of the engines read them and may use them if the algorithm considers them relevant.
  • Q: For sites that are designed completely in Flash, can you use content in a "noscript" tag or would that be considered as some type of cloaking?
  • A: Sean says IP delivery is a no-no, but if the content is the same as in the Flash, he’d rather see content in noscript than traditional cloaking. Evan suggests avoiding sites built completely in Flash and using Flash components instead.
  • Q: Is meta keywords tag still relevant?
  • A: Microsoft – no, Yahoo! – not really, Google – not really, and Ask – not really. All read it, but it has very little bearing. For a really obscure keyword that appears only in the keywords tag and nowhere else on the web, Yahoo! and Ask are the only ones that will show a search result based on it.
  • Q: How do engines view automated submission/ranking software?
  • A: Evan – don’t use them.
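Several of the answers above come back to keeping a site map up to date. As a minimal sketch of what that can look like, the Python snippet below builds a site map in the sitemaps.org XML format using only the standard library. The function name, URLs and dates are hypothetical placeholders, and this is not a description of any engine's own tooling.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified is a 'YYYY-MM-DD' string supplied by the caller.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates, for illustration only.
EXAMPLE_PAGES = [
    ("http://example.com/", "2007-08-01"),
    ("http://example.com/products?id=1&view=full", "2007-08-15"),
]

print(build_sitemap(EXAMPLE_PAGES))
```

Regenerating the file whenever pages change keeps the `lastmod` values current. For very large sites the list of pages would typically come from a database or the publishing system rather than being written by hand.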

About Navneet Kaushal
Nav is the founder and CEO of PageTraffic, a premier search engine company known for its assured SEO service, web design and development, copywriting and full time SEO professionals.

Navneet has wide experience in natural search engine optimization, internet marketing and PPC campaigns. He is a prolific writer and his articles can be found in the "Best Articles" section of many websites and article banks. As a search engine analyst, he has over 9 years of experience, and that knowledge is applied here. WebProNews Writer