Here is a technique that has been known to work pretty well. It takes a bit to set up, but it TRAPS bots that do not respect or check the robots.txt file. If this is done properly, you should not need a ridiculously long robots.txt file. http://danielwebb.us/software/bot-trap/ Good post, folks... there is nothing more frustrating than seeing a robot on your site sucking up bandwidth and finding out it was created in some college computer science course as a tutorial! -- ZZ
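For anyone who has not seen this kind of trap before, here is a rough Python sketch of the general idea, not bot-trap's actual code. The trap path `/bot-trap/`, the in-memory `banned_ips` set, and both function names are made up for illustration: robots.txt disallows a directory, polite crawlers skip it, and anything that requests it anyway gets banned.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical trap path; the real bot-trap layout may differ.
TRAP_PATH = "/bot-trap/"

# robots.txt tells every crawler to stay out of the trap directory.
ROBOTS_TXT = f"""\
User-agent: *
Disallow: {TRAP_PATH}
"""

# Stand-in for the ban list (assumed in-memory here for simplicity).
banned_ips = set()


def polite_crawler_may_fetch(path):
    """What a well-behaved crawler concludes after reading robots.txt."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("PoliteBot", path)


def handle_request(ip, path):
    """Return an HTTP-style status code for a request from `ip`."""
    if ip in banned_ips:
        return 403  # already caught earlier
    if path.startswith(TRAP_PATH):
        banned_ips.add(ip)  # ignored robots.txt, so ban it
        return 403
    return 200


if __name__ == "__main__":
    print(polite_crawler_may_fetch("/bot-trap/index.php"))
    print(handle_request("10.0.0.1", "/bot-trap/index.php"))
    print(handle_request("10.0.0.1", "/index.html"))
```

A crawler that checks robots.txt never requests the trap, so only the rude ones end up in the ban list. That is why the robots.txt file itself can stay short.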
Yes, bot-trap is an interesting one and I like the idea, though the last FAQ entry is a little bad. I'm going to install it and let you know whether it works. Has anyone tested it already?
I am using bot-trap. It works well, but you have to be careful with it. Add this near the top of your .htaccess file (comments are on their own lines because Apache does not allow a comment on the same line as a directive):

    Allow from 127.0.0.1
    # MSN
    Allow from 65.55
    # Google
    Allow from 66.249
    # Yahoo!
    Allow from 67.195
    Allow from 72.30
    Allow from 74.6
    # Baidu
    Allow from 122.152.129.15

This will keep you from banning IP addresses that you really don't want to ban.
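If you want to double-check which addresses that list actually covers, here is a small Python sketch. It assumes Apache's documented behaviour that a partial address such as `65.55` matches on the leading bytes (so it is equivalent to 65.55.0.0/16); the helper names `to_network` and `is_whitelisted` are just for illustration.

```python
import ipaddress

# Prefixes copied from the Allow lines in the post.
ALLOW_PREFIXES = [
    "127.0.0.1",
    "65.55",           # MSN
    "66.249",          # Google
    "67.195",          # Yahoo!
    "72.30",           # Yahoo!
    "74.6",            # Yahoo!
    "122.152.129.15",  # Baidu
]


def to_network(prefix):
    """Turn a partial dotted address into the network it stands for."""
    octets = prefix.split(".")
    padded = octets + ["0"] * (4 - len(octets))
    # Each octet that was given contributes 8 bits to the prefix length.
    return ipaddress.ip_network(f"{'.'.join(padded)}/{8 * len(octets)}")


ALLOWED = [to_network(p) for p in ALLOW_PREFIXES]


def is_whitelisted(ip):
    """True if `ip` falls inside any of the allowed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED)


if __name__ == "__main__":
    for ip in ("66.249.66.1", "74.60.1.1", "122.152.129.15"):
        print(ip, is_whitelisted(ip))
```

Matching on whole bytes matters: a plain string comparison would wrongly treat 74.60.1.1 as part of the `74.6` entry, while the network check does not.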