How much traffic would I lose if I kept this robots.txt file on my site? It's a shame it won't show up in the stats right away, so I am trying to figure out how many of the smaller crawlers would drop the site from their index. Before this, my file only disallowed specific bots, but that list was growing too long.

User-agent: Mediapartners-Google*
Disallow:

User-Agent: ArchitextSpider # Excite
User-Agent: Ask Jeeves
User-Agent: FAST-WebCrawler
User-Agent: Freecrawl # euroseek.net
User-Agent: Googlebot
User-Agent: Googlebot-Mobile
User-Agent: Googlebot-Image
User-Agent: Adsbot-Google
User-Agent: Gulliver # Northern Light
User-Agent: ia_archiver
User-Agent: InfoSeek
User-Agent: Lycos
User-Agent: msnbot
User-Agent: Scooter
User-Agent: Slurp
Disallow:

User-Agent: *
Disallow: /
Thanks trichnosis, I changed it back to what it was before. BUT I thought all the crawlers listed at the top were allowed, and only the final "Disallow: /" rule blocked everything else.
But the line before that says ALL with the * wildcard. So you are basically saying "allow all the ones I have listed", but then disallowing everything. I assume you are trying to keep all other robots from visiting. Remember that only "good" robots will listen to a robots.txt, and those are the ones you have listed. If a certain bot is causing you issues, just ban it in your .htaccess.
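Something along these lines usually does the trick (just a rough sketch, assuming Apache with mod_rewrite enabled; "BadBot" is a placeholder for whatever user agent is giving you trouble):

# Block any request whose User-Agent contains "BadBot" (case-insensitive)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]

The [F] flag sends back a 403 Forbidden, so the bot gets an error page instead of your content, whether or not it bothers to read robots.txt.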
This is not correct. "User-Agent: *" means all other robots, i.e. those not mentioned in any other rule, so the bots listed by name still follow their own (empty) Disallow rule. This is explained in the original robots.txt specification.

Jean-Luc
Thanks guys. I am following this thread, and for now I have gone back to my previous version of the robots.txt file, which only lists the robots I want to disallow. I am picking up more knowledge about this! The file in this thread was in place for only two weeks, and I noticed a 5% decrease in organic traffic; it is hard to say why, as it could well just be the arrival of the summer months.
It doesn't work that way. Try it for yourself: set up a robots.txt file like the example given, then use one of those spider simulators or page crawlers that let you set the user agent to whatever you wish, and take note of what happens. You are also forgetting that only "good" bots like the ones listed will comply with a robots.txt command. The ones that cause trouble have to be blocked via .htaccess.
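If you would rather test it locally than trust an online tool, Python's built-in robotparser can run the same kind of check (just a quick sketch; the rules below are a trimmed-down version of the file from the first post, and example.com is a placeholder):

import urllib.robotparser

# Cut-down version of the robots.txt in question:
# the named crawlers get an empty Disallow, everyone else gets "Disallow: /".
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Pretend to be different crawlers and note what can_fetch() reports for each.
for agent in ("Googlebot", "Slurp", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, "http://www.example.com/some-page.html"))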
I rely on the spec, not on the (possibly flawed) design of a spider simulator. I am not forgetting that. I fully agree with you on the need to use .htaccess for ill-intentioned bots.

Jean-Luc
There are a lot of them out there; are they all invalid? When I disallow all, the tools cannot crawl. Considering they aren't Googlebot, MSN, or Slurp, it shouldn't allow them... but it does. The spec you are relying on is not in control of any of the bots out there. It is a guide to how "well-behaved" bots work.