So I was skimming through weblogs to check bots we are blocking and I see a *ton* of these...

67.212.163.122 - - [04/Nov/2009:16:31:58 -0800] "GET /forumdisplay.php?f=35&page=29 HTTP/1.1" 403 - "-" "PHP/5.2.9"
67.212.163.122 - - [04/Nov/2009:16:32:00 -0800] "GET /forumdisplay.php?f=35&page=18 HTTP/1.1" 403 - "-" "PHP/5.2.9"
67.212.163.122 - - [04/Nov/2009:16:32:00 -0800] "GET /forumdisplay.php?f=35&page=17 HTTP/1.1" 403 - "-" "PHP/5.2.9"
67.212.163.122 - - [04/Nov/2009:16:32:00 -0800] "GET /forumdisplay.php?f=35&page=16 HTTP/1.1" 403 - "-" "PHP/5.2.9"

So being curious about who might be spidering thousands of pages of this site with PHP, I did a reverse DNS lookup on the IP address:

b3091196.crawl.yahoo.net

Going further to check ownership of the IP address, it's indeed Yahoo...

http://ws.arin.net/whois/?queryinput=67.195.112.124

So why in the hell are they running (at least a portion of) their spiders on PHP without even bothering to change the user agent? Come on guys... at least take the time to do this:

ini_set ('user_agent', 'Yahoo! Slurp/4.0; We are a sucky search engine, help Microsoft!');
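
For anyone who wants to script the reverse DNS check instead of doing it by hand, here is a rough sketch of a forward-confirmed reverse DNS lookup in PHP. This is the generic technique, not anything Yahoo documents for this crawler as far as I know. The function name `isVerifiedCrawler` and the `.crawl.yahoo.net` suffix list are my own assumptions (the suffix is just taken from the hostname above), and the two optional resolver arguments only exist so the logic can be tried without touching the network.

```php
<?php
// Sketch only: forward-confirmed reverse DNS check for a crawler IP.
// isVerifiedCrawler() and the default suffix list are hypothetical names/values.
function isVerifiedCrawler(
    string $ip,
    array $allowedSuffixes = ['.crawl.yahoo.net'],
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    // Default to PHP's built-in resolvers; callers may inject stubs instead.
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    // Step 1: reverse lookup. gethostbyaddr() returns the IP unchanged when
    // there is no PTR record, and false on malformed input.
    $host = $reverse($ip);
    if ($host === false || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    // Step 2: the hostname has to end in one of the allowed suffixes.
    $suffixMatches = false;
    foreach ($allowedSuffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $suffixMatches = true;
            break;
        }
    }
    if (!$suffixMatches) {
        return false;
    }

    // Step 3: forward lookup. The hostname must resolve back to the same IP,
    // otherwise anyone controlling their own PTR record could fake the name.
    $addresses = $forward($host);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

Calling `isVerifiedCrawler($_SERVER['REMOTE_ADDR'])` would run both lookups with the real resolvers. DNS lookups can be slow, so on a busy site you would probably want to cache the result per IP rather than check on every request.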