What is robots.txt?

icare Peon

#1

How do I get or create one for my site?

Please advise

icare, Feb 12, 2006 IP

Smyrl Tomato Republic Staff

#2

Do a Google search for "robots.txt tutorial". Your robots.txt file can be created with any text editor. It spells out which files may or may not be indexed. There are many disobedient robots out there, but Google, Yahoo, and MSN all obey your robots.txt rules.

These two lines allow all robots to index every page
User-agent: *
Disallow:

These two lines keep all robots out.
User-agent: *
Disallow: /
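
For anyone who wants to check rules like these before uploading the file, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot" and the example.com URL are made up for illustration; they are not from this thread.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two rule sets from the post above.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical crawler name and URL, used only for the check.
URL = "http://www.example.com/page.html"
print(can_fetch(ALLOW_ALL, "ExampleBot", URL))  # True
print(can_fetch(BLOCK_ALL, "ExampleBot", URL))  # False
```

This only shows how a rule-following parser reads the file. It says nothing about robots that ignore robots.txt.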

Smyrl, Feb 12, 2006 IP

icare Peon

#3

Smyrl said:

Do a Google search for "robots.txt tutorial". It can be created with any text editor. Your robots.txt file spells out for obedient robots which files may be indexed.

These two lines allow all robots to index every page
User-agent: *
Disallow:

These two lines keep all robots out.
User-agent: *
Disallow: /

Even if I Google it, a DP page shows up at the very top, so why not ask here? I tried searching for this on DP but couldn't find an answer that explains it...

icare, Feb 12, 2006 IP

Smyrl Tomato Republic Staff

#4

Here is the number one listing in Google.

http://www.searchengineworld.com/robots/robots_tutorial.htm

Smyrl, Feb 12, 2006 IP

Cristian Mezei Notable Member

#5

I have this one in my bookmarks, together with this one.

It might do you good to read them.

Cristian Mezei, Feb 12, 2006 IP

dashboard Peon

#6

you can also use <meta name=robots.txt content=index,nofollow>

dashboard, Feb 12, 2006 IP

seoaddict Peon

#7

Create a plain text file and save it as robots.txt.
In it you can allow and disallow crawlers.

seoaddict, Feb 13, 2006 IP

mariush Peon

#8

I've added a robots.txt file just to keep out the 404 not found errors. It annoyed me because I was seeing them in awstats.

My robots.txt is actually:
User-agent: *
Disallow: /cgi-bin/

mariush, Feb 13, 2006 IP

JEET Notable Member

#9

dashboard said:

you can also use <meta name=robots.txt content=index,nofollow>

That's not robots.txt. It's a meta tag.
It isn't right either: you cannot specify a file name in a meta tag.
<meta http-equiv="robots" content="index,follow" />
is the right tag for the content and links on that particular page.

Robots.txt is a simple text file that "good" crawler bots read to see which folders or files they are allowed to index and which they are not.
It is placed in the site's root folder, inside "public_html".

User-agent: *
Disallow: /images

will keep search engines out of your images folder.
If you want everything to be available for indexing, create an empty "robots.txt" and put it in the "public_html" folder.
A blank Notepad file named "robots.txt" is enough.

If you don't have a "public_html" folder, then your host probably already has a robots.txt and you need not do anything. Your site is a folder inside "his public_html", which already has a robots.txt.

But if you are getting a 404 Not Found error for robots.txt, ask your host whether he has that file. If not, ask him to put one there.

This is what I have noticed from my logs.
Hope that's right.

Regards
Jeet
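
A rough sketch of the images example and the empty-file case from the post above, in Python with the standard urllib.robotparser module. The helper names build_robots_txt and allowed, the bot name "ExampleBot", and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_paths, agent="*"):
    """Return robots.txt text with one Disallow line per path for `agent`."""
    rules = [f"Disallow: {path}" for path in disallowed_paths]
    if not rules:
        # An empty Disallow value means nothing is disallowed.
        rules = ["Disallow:"]
    return "\n".join([f"User-agent: {agent}"] + rules) + "\n"


def allowed(robots_txt, url, agent="ExampleBot"):
    """Report whether the (hypothetical) crawler `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


rules = build_robots_txt(["/images"])
print(allowed(rules, "http://www.example.com/images/logo.gif"))   # False
print(allowed(rules, "http://www.example.com/index.html"))        # True
# Disallow values are matched as path prefixes, so this is blocked too:
print(allowed(rules, "http://www.example.com/images-old/a.gif"))  # False
# A blank robots.txt blocks nothing:
print(allowed("", "http://www.example.com/images/logo.gif"))      # True
```

Because the match is by prefix, writing "Disallow: /images/" with a trailing slash would limit the rule to that folder alone.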

JEET, Feb 13, 2006 IP

lionstarr Peon

#10

JEET said:

That's not robots.txt. It's a meta tag.
It isn't right either: you cannot specify a file name in a meta tag.
<meta http-equiv="robots" content="index,follow" />
is the right tag for the content and links on that particular page.


I know it as
<meta name="robots" content="index, follow">
The first value is index or noindex: allow search engines to index the page, or don't.
The second is follow or nofollow: let search engines follow the links on the page, or stop them so the page doesn't give away your PageRank.
greetings,
lionstarr
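
To make the two values concrete, here is a small sketch of how a crawler might read the name="robots" form of the tag, using Python's standard html.parser module. The class name RobotsMetaParser, the function robots_directives, and the sample page are made up for illustration. Real crawlers may handle the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated; spacing and case do not matter here.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for the check.
page = '<html><head><meta name="robots" content="index, nofollow"></head></html>'
directives = robots_directives(page)
print("noindex" not in directives)   # True: the page may be indexed
print("nofollow" not in directives)  # False: its links should not be followed
```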

lionstarr, Feb 21, 2006 IP

minstrel Illustrious Member

#11

lionstarr, the meta tag you mention is not as good a solution as robots.txt for most websites:

1. it has to be used on a page by page basis, i.e., for spiders that read and honor that meta tag, it only applies to the page that contains it

2. it does not have the capability for excluding specific spiders or entire directories

The only time one normally would use the meta tag is if you are on free hosting that won't allow you to place a robots.txt file in the root directory.
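
As a rough illustration of per-spider and per-directory rules in robots.txt, here is a sketch checked with Python's standard urllib.robotparser module. The bot names "BadBot" and "GoodBot", the /private/ directory, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut one spider out entirely, and keep
# every other spider out of a single directory.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("BadBot", "http://www.example.com/index.html"))       # False
print(parser.can_fetch("GoodBot", "http://www.example.com/index.html"))      # True
print(parser.can_fetch("GoodBot", "http://www.example.com/private/a.html"))  # False
```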

minstrel, Feb 22, 2006 IP

lionstarr Peon

#12

Of course it's not as good as a robots.txt!
I only saw JEET posting about <meta http-equiv and thought I'd tell you that I know it as <meta name="robots">. Maybe I'm wrong and I learn something, or he's wrong and he learns something!

lionstarr, Feb 23, 2006 IP
