Hi there... I was wondering if anyone has a sample of a good robots.txt file to put on my server to make sure no pages get spidered (by as many search engines as possible). I had a development site set up and somehow it got indexed (the dev site wasn't at index.html). I think it got linked from a forum post, so that's how it got spidered. Also, once I put this up, will Google delist the site the next time it crawls? Thanks.
DOH! I knew that. I don't know what I was thinking when I typed that in. The wildcard isn't used in the Disallow line; the URL or directory path is. Thanks for correcting me on that silly mistake. So all you need in robots.txt is User-agent: * followed by Disallow: /
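Right, and since the original post asked for a sample file, here's the whole thing written out. Save it as robots.txt in the site's document root (the comment line is optional):

# Block all compliant crawlers from the entire site
User-agent: *
Disallow: /

The * matches every user agent that honors the standard, and Disallow: / covers every path on the site.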
Not all robots obey the robots.txt file. I have a dev site too that's listed on no-name search engines. It gets traffic too lol
That is true. Not all robots will obey 'commands' like Disallow or even rel="nofollow". There is really no way to absolutely guarantee 100% that your pages will not be crawled.
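If you want an extra layer on top of robots.txt, you can also put a robots meta tag in the <head> of each page; the major engines honor it, though the same caveat applies, since rogue bots ignore it too:

<meta name="robots" content="noindex, nofollow">

One subtlety worth knowing: a crawler that is blocked by robots.txt never fetches the page at all, so it will never see this tag. The two are really alternative mechanisms rather than a combination.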
That's what I kind of figured on my own, after checking the file a million times. Thanks for confirming.
Google will listen to the robots.txt. However, it may take quite a while until the site gets dropped from the index.