spider specific and general rules
new
Jan 12th 2006, 7:21 pm
My robots.txt has two parts, one for Googlebot and one for spiders in general. Will Googlebot also follow the rules in the section for all bots, or will it only follow the ones defined specifically for it?
Basically, I want to define some global rules and some rules only for Google, but I also want Google to follow the global rules. I am worried that once I define rules specifically for Googlebot, it will discard the rest of the file.
thanks
minstrel
Jan 12th 2006, 8:31 pm
http://www.robotstxt.org/wc/faq.html#robotstxt
You can read the whole standard specification (http://www.robotstxt.org/wc/norobots.html) but the basic concept is simple: by writing a structured text file you can indicate to robots that certain parts of your server are off-limits to some or all robots. It is best explained with an example:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /logs
The first two lines, starting with '#', specify a comment.
The first paragraph specifies that the robot called 'webcrawler' has nothing disallowed: it may go anywhere.
The second paragraph indicates that the robot called 'lycra' has all relative URLs starting with '/' disallowed. Because all relative URLs on a server start with '/', this means the entire site is closed off.
The third paragraph indicates that all other robots should not visit URLs starting with /tmp or /logs. Note the '*' is a special token, meaning "any other User-agent"; you cannot use wildcard patterns or regular expressions in either User-agent or Disallow lines.
Two common errors:
Wildcards are _not_ supported: instead of 'Disallow: /tmp/*' just say 'Disallow: /tmp/'.
You shouldn't put more than one path on a Disallow line.
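The FAQ example can be checked with a short script. This is only a minimal sketch using Python's standard-library urllib.robotparser, which implements the original standard and is not the code any particular search engine runs. The robot name SomeOtherBot and the URL paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example file from the FAQ, one record per robot.
FAQ_ROBOTS_TXT = """\
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: webcrawler
Disallow:

User-agent: lycra
Disallow: /

User-agent: *
Disallow: /tmp
Disallow: /logs
"""


def may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical robot names and paths, chosen only to exercise each record.
    checks = [
        ("webcrawler", "/tmp/cache.html"),
        ("lycra", "/index.html"),
        ("SomeOtherBot", "/tmp/cache.html"),
        ("SomeOtherBot", "/logs/today.txt"),
        ("SomeOtherBot", "/index.html"),
    ]
    for agent, path in checks:
        print(agent, path, may_fetch(FAQ_ROBOTS_TXT, agent, path))
```

With this parser, 'webcrawler' is allowed everywhere, 'lycra' is allowed nowhere, and any other robot is kept out of paths starting with /tmp or /logs only.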
new
Jan 12th 2006, 8:55 pm
/\
But this does not answer my question?
minstrel
Jan 12th 2006, 9:19 pm
It does if you read it carefully.
Do this:
User-agent: *
Disallow: {insert global rules here}

User-agent: Googlebot
Disallow: {insert global rules here}
Disallow: {insert Googlebot specific rules here}
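Just copy the global rules into the Googlebot section and add any Googlebot-specific extras there. A robot that finds a section naming it follows only that section and ignores the * section, so the global rules have to be repeated.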
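To see why the global rules need to be copied, here is a rough sketch with Python's standard-library urllib.robotparser. It shows how that parser picks a section, on the assumption that it treats a named section the same way the standard describes; it is not Googlebot's actual code. The paths /private/ and /google-only/ and the robot name OtherBot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Global rule only in the * section: Googlebot never sees it.
WITHOUT_COPY = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /google-only/
"""

# Global rule repeated inside the Googlebot section, as suggested above.
WITH_COPY = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /private/
Disallow: /google-only/
"""


def can_crawl(robots_txt: str, agent: str, path: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    page = "/private/report.html"  # hypothetical path
    # Googlebot matches its own section, so the * rule is not applied to it.
    print("without copy:", can_crawl(WITHOUT_COPY, "Googlebot", page))
    # After copying the global rule, Googlebot is blocked as intended.
    print("with copy:   ", can_crawl(WITH_COPY, "Googlebot", page))
    # A robot with no section of its own falls back to the * section.
    print("other bot:   ", can_crawl(WITHOUT_COPY, "OtherBot", page))
```

In the first file Googlebot is still allowed into /private/, because only its own section applies to it. In the second file it is blocked there.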
new
Jan 12th 2006, 9:49 pm
Thanks, now it does clarify what I wanted to know :)