You have to allow Google, MSN, and Yahoo to crawl the whole site and disallow all other bots from the root down. Here is the sample code:
    User-agent: google
    Disallow:

    User-agent: yahoo
    Disallow:

    User-agent: msn
    Disallow:

    User-agent: *
    Disallow: /
The code above matches the user agent by checking for substrings of each robot's name, so it will work fine. Peace!
Wrong info. I know about Google. It should be:
    User-agent: googlebot
    Disallow:
I am not sure about the Yahoo and MSN bots.
Hi, the only problem with using the robots.txt file is that many bots don't follow the rules and are there simply to scrape the site. You need a script (in Perl, for example) that serves your HTML to ordinary users and to MSN, Google, and Yahoo, but traps the others and either gives them a 500 server error or redirects them somewhere else. I have written many scripts to do just that. Thanks, lhughes33309
Check this. It will help with a basic understanding of robots.txt: http://seocrazy.blogspot.com/2008/04/robotstxt-stop.html
I think this will help all:
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Allow: /

    User-agent: Yahoo-slurp
    Disallow:

    User-agent: Msnbot
    Disallow:
You want to disallow everything other than Google, Yahoo, and MSN:
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Allow: /

    User-agent: Yahoo-slurp
    Allow: /

    User-agent: Msnbot
    Allow: /
Don't you think this is the right one?
Right. With .htaccess you can redirect links to any target you want. This option works when you have a problem with multiple links.
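Yes, but you can also block a bot by IP in .htaccess. For example, if the bot has the IP 66.249.71.xxx, the directives in .htaccess would be:
    <Limit GET POST>
    order allow,deny
    allow from all
    deny from 66.249.71.xxx
    </Limit>
With order allow,deny, a matching deny rule overrides allow from all, so only the listed address is refused. With order deny,allow, the allow from all line would win and nothing would be blocked.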
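If the check has to happen in a script rather than in .htaccess, the same idea can be sketched in Python with the standard ipaddress module. This is only an illustration: it assumes the "66.249.71.xxx" placeholder means the whole 66.249.71.0/24 block, and the names BLOCKED_NETWORKS and is_blocked are made up for the example.

```python
import ipaddress

# Assumption: "66.249.71.xxx" is read as the whole 66.249.71.0/24 block.
BLOCKED_NETWORKS = [ipaddress.ip_network("66.249.71.0/24")]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any of the blocked networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    for ip in ("66.249.71.15", "66.249.72.15"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Before blocking a range, it is worth checking who owns it, since blocking a search engine's own crawler addresses would defeat the purpose.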
This one is correct:
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Disallow:

    User-agent: Yahoo-slurp
    Disallow:

    User-agent: Msnbot
    Disallow:
Here is a version that is correct, compact, and more complete:
    User-agent: Googlebot
    User-agent: Slurp
    User-agent: msnbot
    User-agent: Mediapartners-Google*
    User-agent: Googlebot-Image
    User-agent: Yahoo-MMCrawler
    Disallow:

    User-agent: *
    Disallow: /
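One way to sanity-check a file like this before uploading it is Python's standard urllib.robotparser. The sketch below feeds it the rules from the post above and asks which agents may fetch a page. Note that it only shows how Python's parser reads the file, and real crawlers may interpret edge cases differently. The function name check_agents and the agent "SomeScraper" are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, one directive per line.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /
"""


def check_agents(robots_txt, agents, url="/index.html"):
    """Return {agent: bool} telling whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # "SomeScraper" is a hypothetical name standing in for any unlisted bot.
    agents = ["Googlebot", "Slurp", "msnbot", "SomeScraper"]
    for agent, ok in check_agents(ROBOTS_TXT, agents).items():
        print(agent, "allowed" if ok else "blocked")
```

An empty Disallow: line means "nothing is disallowed", which is why the listed bots come back as allowed while everything else falls through to the catch-all group.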
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Disallow:

    User-agent: Yahoo-slurp
    Disallow:

    User-agent: Msnbot
    Disallow:
This one blocks all bots except Google, MSN, and Yahoo.
You can use robots.txt, but that's not guaranteed to work because some bots might not honor it. The sure-fire approach is to detect the user agent and redirect unwanted bots programmatically.
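A minimal sketch of that kind of user-agent check, in Python, might look like the following. Everything here is an assumption for illustration: the token lists, the function name response_status, and the choice of a 403 status instead of a redirect or a 500. It also cannot catch a scraper that sends a browser-like User-Agent header, since the header is trivial to fake.

```python
# Substrings of the crawlers that should be let through (lowercase).
ALLOWED_BOT_TOKENS = ("googlebot", "slurp", "msnbot")

# Hypothetical hints that a client identifies itself as some kind of bot.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def response_status(user_agent):
    """Return the HTTP status to send for a given User-Agent header value."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in ALLOWED_BOT_TOKENS):
        return 200  # a search engine we want to keep
    if any(hint in ua for hint in BOT_HINTS):
        return 403  # a self-declared bot that is not on the allow list
    return 200  # looks like an ordinary visitor


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "EvilBot/1.0",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0",
    ]
    for sample in samples:
        print(response_status(sample), sample)
```

The allow list is checked first so that a name like "Googlebot", which also contains "bot", is never rejected by the generic hints.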