For a while now, webmasters have fretted over why not all of their website's pages are indexed. As usual, there doesn't seem to be any definite answer. But some things are definite, if not automatic, and some things seem like pretty darn good guesses.
So, we scoured the forums, blogs, and Google's own guidelines for ways to increase the number of pages Google indexes, and came up with our (and the community's) best guesses. The running consensus is that a webmaster shouldn't expect to get all of their pages crawled and indexed, but there are ways to increase the number.
PageRank
It depends a lot on PageRank. The higher your PageRank, the more pages will be indexed. PageRank isn't a blanket number for all your pages; each page has its own PageRank. A high PageRank gives the Googlebot more of a reason to return. Matt Cutts confirms, too, that a higher PageRank means a deeper crawl.
Links
Give the Googlebot something to follow. Links (especially deep links) from a high PageRank site are golden as the trust is already established.
Internal links can help, too. Link to important pages from your homepage. On content pages link to relevant content on other pages.
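One way to check your internal linking is to count how many clicks each page sits from the homepage. The sketch below is only an illustration: the `site_links` map and its URLs are made up, and it says nothing about how the Googlebot actually prioritizes pages. It walks the link graph breadth-first and flags pages that no internal link reaches.

```python
from collections import deque

def click_depths(links, home):
    """Breadth-first walk over internal links.

    Returns a dict mapping each reachable page to its click distance from home.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site structure: page -> pages it links to.
site_links = {
    "/": ["/products", "/blog"],
    "/products": ["/products/widget"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/products/widget"],
    "/orphan": ["/"],
}

depths = click_depths(site_links, "/")
# Pages that exist but cannot be reached by following links from the homepage.
unreachable = sorted(set(site_links) - set(depths))

print(depths)
print("Not reachable from the homepage:", unreachable)
```

Pages that turn up deep in the list, or in `unreachable`, are candidates for a link from the homepage or from a related content page.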
Sitemap
A lot of buzz around this one. Some report that a clear, well-structured Sitemap helped get all of their pages indexed. Google's Webmaster guidelines recommend submitting a Sitemap file, too:
· Tell us all about your pages by submitting a Sitemap file; help us learn which pages are most important to you and how often those pages change.
That page has other advice for improving crawlability, like fixing violations and validating robots.txt.
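A quick way to sanity-check robots.txt is to test your most important URLs against it. The sketch below uses Python's standard `urllib.robotparser`; the rules and URLs are invented for illustration, and the module's matching is simpler than Google's own parser, so treat the result as a first pass and not as a verdict.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; substitute the text of your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]

# Hypothetical pages you want indexed.
important = [
    "http://www.example.com/",
    "http://www.example.com/products/widget",
    "http://www.example.com/search?q=widget",
]

blocked = blocked_urls(ROBOTS_TXT, important)
print("Blocked by robots.txt:", blocked)
```

Any page you want indexed that shows up in `blocked` points to a rule worth revisiting.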
Some recommend having a Sitemap for every category or section of a site.
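A per-section setup might look roughly like the sketch below: one Sitemap per section plus a Sitemap index file that lists them, following the sitemaps.org 0.9 format. The URLs, dates, and priority values are placeholders, and `build_sitemap` and `build_index` are helper names made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build a <urlset> document from dicts with loc and optional hint fields."""
    urlset = ET.Element("urlset", xmlns=NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # priority and changefreq are hints about importance and update rate.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")

def build_index(sitemap_urls):
    """Build a <sitemapindex> document that points at each section's Sitemap."""
    index = ET.Element("sitemapindex", xmlns=NS)
    for loc in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = loc
    return ET.tostring(index, encoding="unicode")

# Hypothetical blog section of a site.
blog_sitemap = build_sitemap([
    {"loc": "http://www.example.com/blog/", "changefreq": "daily", "priority": 0.8},
    {"loc": "http://www.example.com/blog/post-1", "lastmod": "2009-01-15"},
])

index_xml = build_index([
    "http://www.example.com/sitemap-blog.xml",
    "http://www.example.com/sitemap-products.xml",
])

print(blog_sitemap)
print(index_xml)
```

The index file is the one you would submit; each section's Sitemap can then be regenerated on its own schedule.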
Speed
A recent O'Reilly report indicated that page load time and the ease with which the Googlebot can crawl a page may affect how many pages are indexed. The logic is that the faster the Googlebot can crawl, the more pages can be indexed.
This could involve simplifying the structure and/or navigation of the site. The spiders have difficulty with Flash and Ajax, so a text version should be added in those instances.
Google's crawl caching proxy
Matt Cutts provides diagrams of how Google's crawl caching proxy works at his blog. This was part of the Big Daddy update to make the engine faster. Any one of three indexes may crawl a site and send the information to a remote server, which is accessed by the remaining indexes (like the blog index or the AdSense index) instead of the bots for those indexes physically visiting your site. They will all use the mirror instead.
Verify
Verify the site with Google using the Webmaster tools.
Content, content, content
Make sure content is original. If a page is a verbatim copy of another page, the Googlebot may skip it. Update frequently to keep the content fresh. Pages with an older timestamp might be viewed as static, outdated, or already indexed.
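Before worrying about copies elsewhere on the web, it can help to look for verbatim copies within your own site. The sketch below is a rough self-check, not a description of how Google detects duplicates: it fingerprints each page's text after lowercasing and collapsing whitespace, then groups pages that share a fingerprint. The page texts are invented.

```python
import hashlib
from collections import defaultdict

def fingerprint(text):
    """Hash the text after lowercasing and collapsing runs of whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def verbatim_duplicates(pages):
    """Group URLs whose normalized text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

# Hypothetical extracted page text, keyed by URL.
pages = {
    "/widgets": "Our  widgets are   the best.",
    "/widgets-copy": "our widgets are the best.",
    "/about": "We have sold widgets since 1999.",
}

duplicates = verbatim_duplicates(pages)
print(duplicates)
```

This only catches exact copies; pages that are reworded or partly copied would need a fuzzier comparison.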
Staggered launch
Launching a huge number of pages at once could set off spam signals. One forum suggests that a webmaster launch a maximum of 5,000 pages per week.
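If you do stagger a launch, the schedule is simple to work out. The sketch below splits a list of new URLs into weekly batches. The 5,000 figure is the forum suggestion mentioned above, not a documented Google limit, and the URL pattern is made up.

```python
def weekly_batches(urls, per_week=5000):
    """Split a list of new URLs into consecutive batches of at most per_week."""
    return [urls[i:i + per_week] for i in range(0, len(urls), per_week)]

# Hypothetical catalog of 12,000 new pages.
new_pages = ["/catalog/item-%d" % n for n in range(12000)]

schedule = weekly_batches(new_pages)
for week, batch in enumerate(schedule, start=1):
    print("Week %d: publish %d pages" % (week, len(batch)))
```

With 12,000 pages and the default cap, the launch spreads over three weeks.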
Size matters
If you want tens of millions of pages indexed, your site will probably have to be on an Amazon.com or Microsoft.com level.
Know how your site is found, and tell Google
Find the top queries that lead to your site and remember that anchor text helps in links. Use Google's tools to see which of your pages are indexed, and whether there are violations of some kind. Specify your preferred domain so Google knows what to index.
85 Comments
Good article
useful info, thanks!
good information
Thanks dude for such nice information. This will really help an SEO like me.
PageRank
I can confirm that PageRank will affect how many pages are indexed. The higher your PageRank, the deeper Googlebot will crawl. That applies both on a site-wide basis and on a per-page basis.
Great Article
Jason,
Great article. You are very right, at the end of the day it is all about content.
Indexing
As a newbie in SEO, I still have a lot to learn. It is easier said than done. I am still experimenting to see what works for page indexing. But thanks for this post. Got something new.
SEO Friendly Web Design
Good tips! Designing a search engine friendly site is the key. A lot of people think that spiders can sort out what is content and what is HTML, JavaScript or CSS code. It is true, but if you make the job easier for spiders, your site gets indexed much better. One very useful technique is to create good internal links that guide spiders from one page to another. This is often ignored by most web designers. A sitemap can help. However, if a site is well designed for spiders to crawl, you may not need the sitemap if the site is relatively small and not a huge ecommerce site.
great
That's some great info Jason.
IT freelance
"Some blogger claimed he could get a post indexed in 10 minutes, not sure how that works though!"
In my experience new articles are indexed within 10 minutes with a WordPress blog. The blog software "pings out" (not sure on the correct term) that a new article has been posted -> Google arrives 10 minutes later.
Not really
I had a WordPress blog at wp.com before. In fact I had 2 accounts. The first one got PR 2 within 2 months. Whenever I published a new post, I saw it in Google within 5 minutes. The blog on my second account, on the other hand, did not have any PR (not even PR0), and whenever I published a new post, it took about an hour to reach the search engine. And it did not even get very far in the SERPs because of the ranking (it had just 2 or 3 backlinks), so it was just a waste of time anyway.
Fast ways to make cash
Pagerank?
So many people say the pagerank means nothing, yet it's interesting that Google keeps it around. You may be correct that it's used to determine the depth that they crawl a site, thanks!
Google index
I have uploaded a sitemap for my website in Google Webmaster Tools. My sitemap can be found at www.example.com/sitemap.xml. There are 1938 URLs in my sitemap. Now the sitemap status shows only 6 indexed URLs. Why does Google not index my sitemap fully? Does anyone know how to solve this problem?
Check my page
Hi,
I am new to this website.
What I need to know is how to promote my website or business.
http://www.professional-mover.co.uk
Can anyone help me by forwarding my website to your friends and family?
I need to promote my business.
Any help will always be remembered. If you think I need to change anything so that more people like my page, kindly advise.
Best regards
thanks for post
thanks for post
External Links
Launching a large number of links and content all at once doesn't seem to have as large an impact as continuous, steady additions.
Excellent piece
The elements covered here are core to good SEO practice, especially the inclusion of a sitemap. Sitemaps seem to have waned in popularity recently. Not sure why, as along with relevant contextual backlinks, they still play their part.
Thanks
Page size and clutter is definitely important
The less junk that Google has to cut through in the HTML, the better. From my experience, this is especially true of sites that have been handled by multiple designers and not run through validators recently. Too much clutter = inability to crawl.
Internal link is important
I think internal linking is important to guide the crawler through the site.
thanks for your article.
Thanks for your article. It helped me a lot. I will visit the WebProNews site more often. :) Fantastic
I had submitted my sitemap
Good article
Very useful article.
thank
thanks for your article.
Thanks for your article. It helped me a lot.
Thanks Jason
Thanks for the great info about increasing pages indexed. Look forward to reading more of your articles.
It's definitely all about
It's definitely all about unique content.
great article jason
Another great article jason, I have enjoyed the reading.
Indexing in 10 minutes
Those tips look good. I just started a Blogger blog yesterday, added the URL to Google, submitted a sitemap, and am awaiting some action. I think I will follow the pointer about submitting a couple of new post URLs, and also get on to Yahoo and other search engines.
Some blogger claimed he could get a post indexed in 10 minutes, not sure how that works though!
Tips
A nice list of tips. It will be really useful to me from here on!
Well Done
A lot of key suggestions.. really a nice guide!
some good advice here thanks
some good advice here thanks