psbot and gigabot won't stop even when banned from robots.txt
SaN-DeeP
Feb 28th 2006, 1:31 pm
Are there any possible solutions to get rid of these bots permanently?
I find around 100+ spiders crawling the entire server at any time!
mkeen
Feb 28th 2006, 2:03 pm
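They ignored my robots.txt too.
I ended up putting this at the top of my sites. It's PHP, by the way. Hope it helps.
// The User-Agent header may be missing, so fall back to an empty string.
$checknaughty = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Case-insensitive substring match, so any psbot version string is caught.
if (stripos($checknaughty, 'psbot') !== false) {
    echo "PISS OFF PSBOT, STOP IGNORING robots.txt. You're clearly disallowed, now learn to read";
    die();
}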
SaN-DeeP
Feb 28th 2006, 2:24 pm
Thanks Matt,
I'm asking myself what the use of robots.txt is if spiders don't honor it...
wkd
Mar 1st 2006, 11:41 am
For Gigablast you could also add this to the page:
<meta name="gigabot" content="noindex,nofollow" />
SaN-DeeP
Mar 1st 2006, 2:56 pm
Gigabot has completely stopped after I disallowed it in robots.txt, but psbot still sucks big time, I should say!
Thanks for all your comments above :)
Nintendo
Mar 1st 2006, 5:18 pm
You can't ban anyone using robots.txt. It's only a suggestion, like saying 'Please don't go here, but we can't stop you.'
Blocking them in the script or in .htaccess is probably the only way to actually ban them.
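For the .htaccess route, something like this should work. It's only a minimal sketch, assuming Apache with mod_rewrite enabled and .htaccess overrides allowed for the site. The two bot names are taken from this thread; it returns 403 Forbidden to any request whose User-Agent contains them.

```apache
# Assumes mod_rewrite is loaded and AllowOverride permits rewrite rules here.
RewriteEngine On
# Match the User-Agent header case-insensitively ([NC]); [OR] chains the two conditions.
RewriteCond %{HTTP_USER_AGENT} psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gigabot [NC]
# "-" means no substitution; [F] answers with 403 Forbidden and stops processing.
RewriteRule .* - [F]
```

Keep in mind a bot can change its User-Agent string, so this only stops the ones that identify themselves honestly.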
Carl Sarnstrand
Jun 19th 2007, 5:30 am
Picsearch search spiders always respect robots.txt and we will immediately address any problems that you inform us about. Please send an e-mail to info@picsearch.com and we will see that any problems get handled as soon as possible.
Please go to http://www.picsearch.com to try our service.
Picsearch takes robots.txt seriously and has a short text on our website at http://www.picsearch.com/menu.cgi?item=Psbot.
Best Regards
Carl Sarnstrand
Communications Manager
Picsearch
trichnosis
Aug 6th 2007, 7:07 am
Blocking the IPs of those bots is a better option.
Neither of them follows robots.txt.
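A rough .htaccess sketch of IP blocking, again assuming Apache with mod_rewrite enabled. The addresses below are placeholders from the reserved documentation ranges, not the bots' real IPs; replace them with the addresses that show up in your own access logs.

```apache
# Assumes mod_rewrite is loaded; the addresses below are hypothetical placeholders.
RewriteEngine On
# Block one whole /24 range (192.0.2.x) and one single address.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\. [OR]
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.45$
# Answer matching requests with 403 Forbidden.
RewriteRule .* - [F]
```

The downside is that crawler IPs change over time, so the list needs checking against the logs now and then.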