There are a lot of files on my forum that I don't want any search engine spiders to visit, so I have them listed as:

User-agent: *
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/

Additionally, there is one file that I want all other spiders to crawl, but not Googlebot. So I have added something like:

User-agent: googlebot
Disallow: /arcade.php

So my robots.txt file may look something like this:

User-agent: *
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/

User-agent: googlebot
Disallow: /arcade.php

Does that mean that Googlebot will not spider anything in /admincp/, /attachments/, /clientscript/ or arcade.php, or will it only obey what is specified directly for Googlebot? In other words, will it only skip arcade.php?
Not quite. A spider that obeys the robots.txt directives (the reputable ones do, but not all of them do) follows only the most specific group that matches it, and the User-agent: * group only applies to spiders that don't have a group of their own. So with the file above, Googlebot will obey only its googlebot-specific group: it will skip arcade.php, but it is still free to crawl /admincp/, /attachments/ and /clientscript/. If you want Googlebot kept out of those directories as well, repeat those Disallow lines under the googlebot group.
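For reference, here is a minimal sketch of what the file could look like if you want Googlebot blocked from those directories as well as from arcade.php; it simply repeats the paths already listed in the question, nothing new is assumed:

User-agent: *
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/

User-agent: googlebot
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/
Disallow: /arcade.php

Every other spider still reads the * group, while Googlebot reads only its own group, so both sets of rules end up enforced.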
No, that's incorrect. What that does is tell ALL spiders not to index anything (it disallows everything). That isn't what was asked for, and frankly I can't imagine anyone wanting a robots.txt file like that, except perhaps for a private, members-only site.
I thought he was asking about Google, so I added the Google image bot (Googlebot-Image) too so he could add it to his list. The best thing to do is list all the spiders that you don't want by name. Some are good, some are not. I posted a full list of bots on one of these forums.
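A rough sketch of what listing spiders by name could look like; the bot names below (Googlebot, Googlebot-Image, ia_archiver) are just common examples used for illustration, not the list mentioned above:

User-agent: Googlebot
Disallow: /arcade.php

User-agent: Googlebot-Image
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/

Each named spider obeys only its own group; every other spider falls through to the * group at the end.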
He was asking about Google. But look at the first line of the robots.txt file you posted: User-agent: *, followed by Disallow: /. That instructs all spiders to disallow everything.
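To make the difference concrete: a file that starts like this blocks the whole site for every spider that obeys robots.txt:

User-agent: *
Disallow: /

whereas the rules from the original question only block the listed directories and leave the rest of the site crawlable:

User-agent: *
Disallow: /admincp/
Disallow: /attachments/
Disallow: /clientscript/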