psbot and gigabot won't stop even when banned in robots.txt


SaN-DeeP
Feb 28th 2006, 1:31 pm
Any possible solutions to get rid of these bots permanently?
I find around 100+ spiders crawling the entire server at any given time!

mkeen
Feb 28th 2006, 2:03 pm
They ignored my robots.txt also.

I ended up putting this at the top of my sites. It's PHP, btw. Hope it helps.

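// Read the User-Agent header; it may be missing, so fall back to an empty string.
$checknaughty = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Match on a substring rather than the full string, so a version bump in the bot's name is still caught.
if (stripos($checknaughty, 'psbot') !== false) {
    echo "PISS OFF PSBOT, STOP IGNORING robots.txt. You're clearly disallowed, now learn to read.";
    die();
}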

SaN-DeeP
Feb 28th 2006, 2:24 pm
Thanks Matt,
I'm asking myself what's the use of robots.txt if spiders don't honor it...

wkd
Mar 1st 2006, 11:41 am
For Gigablast you could also add this to the page:

<meta name="gigabot" content="noindex,nofollow" />

SaN-DeeP
Mar 1st 2006, 2:56 pm
Gigabot has completely stopped after I disallowed it in robots.txt, but psbot sucks big time, I should say!

Thanks for all your comments above :)

Nintendo
Mar 1st 2006, 5:18 pm
You can't ban anyone using robots.txt. That's only a suggestion, like saying 'Please don't go here, but we can't stop you.'

Blocking them in the script or in .htaccess are probably the only ways to ban them.
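
A minimal .htaccess sketch of a user-agent ban, assuming Apache with mod_rewrite enabled and .htaccess overrides allowed for the directory. The `psbot` and `gigabot` substrings are taken from the bot names in this thread; adjust the pattern to whatever actually shows up in your logs.

```apache
# Assumes mod_rewrite is available and permitted in .htaccess.
RewriteEngine On
# Case-insensitive substring match on the User-Agent header.
RewriteCond %{HTTP_USER_AGENT} (psbot|gigabot) [NC]
# Leave the URL untouched and answer 403 Forbidden.
RewriteRule ^ - [F]
```

This only catches bots that identify themselves honestly; a crawler that fakes its User-Agent will pass straight through.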

Carl Sarnstrand
Jun 19th 2007, 5:30 am
Picsearch search spiders always respect robots.txt and we will immediately address any problems that you inform us about. Please send an e-mail to info@picsearch.com and we will see that any problems get handled as soon as possible.

Please go to http://www.picsearch.com to try our service.

Picsearch takes robots.txt seriously and has a short text on our website at http://www.picsearch.com/menu.cgi?item=Psbot.

Best Regards

Carl Sarnstrand
Communications Manager
Picsearch

trichnosis
Aug 6th 2007, 7:07 am
Blocking the IPs of those bots is a better option.

Neither of them follows robots.txt.
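
A rough sketch of what an IP ban could look like in .htaccess, assuming Apache 2.4 with authorization overrides allowed. The addresses below are placeholder documentation ranges, not the real addresses of psbot or gigabot; you would have to pull the actual ones from your own access logs.

```apache
# Apache 2.4 syntax (mod_authz_core / mod_authz_host).
# 192.0.2.0/24 and 198.51.100.17 are placeholders, NOT real bot addresses.
<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except the listed range and single address.
    Require not ip 192.0.2.0/24
    Require not ip 198.51.100.17
</RequireAll>
```

On Apache 2.2 the same idea is written with `Order Allow,Deny`, `Allow from all` and one `Deny from` line per address. Keep in mind that crawlers can change IPs, so the list needs maintaining.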