I have summed up all the robots.txt info I read here: http://forums.digitalpoint.com/showthread.php?t=1259401
It's a way to tell Google how to index your site, and it is especially used to keep certain folders or pages on the site from being crawled or indexed. The Google Webmaster Tools help page gives a clear picture with an example:
google.com/support/webmasters/bin/answer.py?hl=en&answer=40360 Basically, it's a file that lives in the root of your website that friendly crawlers use to determine which URLs on your site they should NOT index. By default, if it does not exist, they assume any page they can find on your site is available for indexing. NOTE: Bad crawlers will frequently ignore your robots.txt and index whatever they can find.
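For example, a minimal robots.txt that asks all friendly crawlers to skip one folder might look like this (the /private/ path is just an illustrative name):

```
User-agent: *
Disallow: /private/
```

Any crawler that honors the file will skip every URL under /private/ while remaining free to index the rest of the site.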
A robots.txt file restricts access to your site by search engine robots that crawl the web. These bots are automated, and before they access the pages of a site, they check to see if a robots.txt file exists that prevents them from accessing certain pages. (All respectable robots will respect the directives in a robots.txt file, although some may interpret them differently. However, a robots.txt file is not enforceable, and some spammers and other troublemakers may ignore it. For this reason, we recommend password-protecting confidential information.)

You only need a robots.txt file if your site includes content that you don't want search engines to index. If you want search engines to index everything on your site, you don't need a robots.txt file (not even an empty one).

While Google won't crawl or index the content of pages blocked by robots.txt, Google may still index the URLs if it finds them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.

In order to use a robots.txt file, you'll need access to the root of your domain (if you're not sure, check with your web host). If you don't have access to the root of a domain, you can restrict access using the robots meta tag. I hope you can understand the robots.txt file and its use.
A robots.txt file is a set of instructions for visiting robots, or spiders, that index the content of a site.
I guess this one could help http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
A simple .txt file, made with a Notepad-like editor, that gives direction to Google and other search engines about which pages to crawl and which not to.
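Those directions can also be checked programmatically. Here's a small sketch using Python's standard urllib.robotparser module (the rules shown are just an example, not from the thread):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration
rules = """User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A friendly crawler asks before fetching each URL
print(parser.can_fetch("*", "/index.html"))      # True: not disallowed
print(parser.can_fetch("*", "/private/a.html"))  # False: under /private/
```

This is the same check that well-behaved search engine bots perform before crawling a page.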
Robots.txt tells the search engines which parts of your website not to index. It basically asks crawlers to stay out of places like your website database, shopping cart, and other private areas you don't want available to the general public (though, as noted above, it doesn't actually block access). I am sure you have gotten very good recommendations earlier; however, I hope this helps further. Br Seo_genius
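As a sketch of that use case, the file would carry one Disallow line per private area (these paths are made up; yours will differ):

```
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /admin/
```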
I don't see any value in a separate file for instructing robots. All you need to do is use the robots META tag on every page that you want to restrict. Since you usually want over 99% of your pages indexed, setting the robots META tag to nofollow or noindex on your restricted pages should not be cumbersome at all. I have a large site (over 4,000 pages), and only about 200 of those pages need to be restricted. Adding a simple META tag to those pages cured all my Google problems. However, neither robots.txt nor a META tag will stop rogue bots.
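A minimal sketch of that META tag approach, placed in the <head> of each restricted page:

```html
<!-- keep this page out of the index and don't follow its links -->
<meta name="robots" content="noindex, nofollow">
```

Unlike robots.txt, the noindex tag also keeps the page's URL itself out of the results, since the crawler has to fetch the page to see the tag.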