Wednesday, March 31, 2004

Disabling Google and Other Search Engines From Crawling a Site

17 Comments

No Follow Tag

Please advise: I have read that a nofollow link is used to keep the Google robot from following it.

Is it rel='nofollow'?

Robots.txt

What do I have to do if I want to stop Google from following links? I mean, I provide some links and I want to keep Google from going there.

Thanks Shari

Thanks for sharing. Some good points that should be helpful for all of us. Keep writing.

I have a Question

In the article you discuss bandwidth. Do you mean that Google uses our site's bandwidth?

Great post ..... it's helpful

Great post ..... it's helpful for new guys …

________________
Mark
Moving pictures

I like the article

I like the NoFollow tags. I knew about them, but the article explained them and removed some doubts. Great.

Robots.txt and NoFollow

Robots.txt is for folders, and the NoFollow tag is for outgoing links... It's pretty clear now. Thanks.
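
To illustrate the link-level half of that distinction, here is a minimal Python sketch that scans a snippet of HTML and reports which links carry rel="nofollow". The class name LinkCollector, the function collect_links, and the example.com URLs are made up for this illustration; only the standard-library html.parser module is used. It shows how the attribute can be read from a page, not how any particular search engine treats it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow external"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def collect_links(html):
    """Return a list of (href, is_nofollow) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical sample markup: one normal link and one nofollow link.
SAMPLE_HTML = (
    '<a href="http://example.com/a">A</a> '
    '<a rel="nofollow" href="http://example.com/b">B</a>'
)

if __name__ == "__main__":
    for href, is_nofollow in collect_links(SAMPLE_HTML):
        print(href, "nofollow" if is_nofollow else "followed")
```

The point of the sketch is that nofollow is set per link in the page markup, whereas robots.txt rules apply to paths on the whole site.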

(WPN reader)

Nice way to disallow robots. Thanks for sharing, ma'am.

I block google from

I block Google from accessing pages on my site such as "about us". It doesn't really help me in any way for those pages to get indexed.

Also, some webmasters who post duplicate content can benefit from not allowing Googlebot to index those pages. You can have thousands of pages on your site with duplicate content but place AdSense on them. So when people reach your site through the pages that are indexed, they can still access the other pages with duplicate content on them.

Good for test sites.

It's very wise to block Google and other (or all) search engines from crawling your 'test' site to avoid duplicate content, which may hurt your search engine rankings.

Using robots.txt helps fine-tune which sections of your website get crawled, keeping out those that may not be of interest to readers, and it also saves bandwidth.

For a test site, put the following in a robots.txt file and place it in your site's root directory:

User-agent: *
Disallow: /

Then go to any free online sitemap generator such as xml-sitemaps.com and generate a sitemap for your site. Check and see if it works. The sitemap should not return any links from your site.
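
As an alternative local check, the two rules above can be fed to Python's standard urllib.robotparser module. This is only a sketch: the helper name is_allowed and the test.example.com URLs are assumptions made for the illustration, and it tests what the rules say, not whether a given crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The same two lines suggested above for a test site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a test host; every crawler should be refused.
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://test.example.com/page.html"))
```

Every call should report that fetching is not allowed, since "Disallow: /" under "User-agent: *" covers all paths for all robots that honor the file.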

Beaded Table Placemats, Beaded Purse, Beaded Pencil Case and Borneo Crafts & Gifts

Privacy

There are lots of reasons a site owner might want to keep a search engine out of part of his site. Thanks for the great tutorial.

Disabling Google and Other Search Engines From Crawling a Site

I also don't understand why someone would want Google not to index his web site, when Google is one of the best sources of traffic.

Crawling of a site

Is it possible to do this for the spider of one specific search engine only, such as Google?
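
For what it's worth, robots.txt rules are grouped by User-agent, so a group can name a single crawler. The sketch below is an illustration only: the rule text, the helper blocked_agents, the agent name ExampleBot, and the www.example.com URL are all made up for the example. It blocks Googlebot and leaves every other robot unrestricted, then checks the result with Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: refuse Googlebot everywhere, allow everyone else.
# An empty "Disallow:" line means nothing is disallowed for that group.
GOOGLE_ONLY_BLOCK = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def blocked_agents(robots_txt, agents, url):
    """Return the subset of agents that the rules refuse for the given url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(
        GOOGLE_ONLY_BLOCK,
        ["Googlebot", "ExampleBot"],
        "http://www.example.com/index.html",
    ))
```

Swapping the two Disallow values would do the reverse: let Googlebot in and keep other compliant robots out.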

Robots.txt

Robots.txt is the preferred way of disabling search engines from crawling a site. But for blogs, where most of the time access to the root directory is not provided, the job becomes tedious. This is especially true if you have to hide some of your pages and show others.

Get more information on Organic search Engine optimization and how to get more hits on a blog through this ?

Sandy

How to increase Adsense income?

Great topic

I was a bit confused on the topic (I am still a bit confused). Why on earth would a webmaster choose not to have his pages crawled when Google is one of the best sources of traffic? I read somewhere that if you have several forms of the same content (such as an HTML version and a print version), then you apply it, but applying robots.txt to the whole site is absurd, I think.

Delhi india & Delhi travel guide

What legal right does Google have to crawl my website?

Hi,

While most discussion on the web is about how to get a website IN to Google, I recently had a client who did NOT want to be listed in Google (for whatever reason!).

So, he asked me, "Does Google have the legal right to crawl my website and list information in its search engine?"

My response was to tell him that once a website is created online it becomes part of the public domain and therefore search engines are entitled to (and do) visit the site with their crawlers, spiders, robots, etc.

Is this the correct legal answer?

Tony
