I am really amazed by the stats in Google Webmaster Tools. I had a crawl-delay instruction in my robots.txt file, but this is the output from Google's tool:
Parsing results
Value: Crawl-delay: 20
Result: Rule ignored by Googlebot
My understanding was that reputable companies like Google, Yahoo and MSN obey robots.txt instructions, but that isn't happening here.
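For comparison, here is a minimal sketch of how a parser that does support the directive reads it, using Python's standard `urllib.robotparser`. Only the `Crawl-delay: 20` line comes from the post above; the `User-agent: *` line and the bot name `ExampleBot` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: only the Crawl-delay value is taken from the post.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 20
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
delay = parser.crawl_delay("ExampleBot")
print(delay)
```

A crawler that honours the rule would wait that many seconds between requests. Whether a given crawler does so is up to the crawler, which is exactly what the tool output above reports for Googlebot.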
I've noticed the same thing. I'm starting to think that Google and all the others just make up rules and then bend them whenever they feel like it.
Not good. How can webmasters stop their websites from being hammered by crawling bots, then? I had one script from Robert Plank on WMW, anticrawl, but I could not find anything for ASP websites.
I have the same issue on nodp.info, except I haven't got a robots.txt. I am hammered every day, and it lags the server so badly that pages take twice as long to load. Rob
Unfortunately, they probably have some bugs in their software (who doesn't?). There have been some big issues with rules in robots.txt being ignored.
We can't do anything about Google, Yahoo and MSN, but the others I really want to block. Does someone have a nice solution for ASP-based websites? Please send the URL of the software or application. We can spend around $200 to stop this on our site.
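This is not an ASP product, but the basic throttling idea is small enough to sketch: count each client's recent requests and refuse once it exceeds a limit. The Python below is only an illustration of that logic, which would need porting to ASP. The class name `SlidingWindowLimiter`, its parameters and the sample IP are all hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of allowed hits

    def allow(self, ip, now=None):
        # "now" can be passed in explicitly, which makes the logic easy to test.
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller could answer with HTTP 429 or 503
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
results = [limiter.allow("203.0.113.5", now=t) for t in (0, 1, 2, 3, 10)]
print(results)
```

In a real site the check would run at the start of every request, and well-behaved search engine crawlers could be exempted by user agent or IP range if desired.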
Google ignored my robots.txt as well. I purchased the domain and immediately put a Disallow on / (the whole site), as it was a private site for my mother and 20 of her friends. I installed the forum software, and a week later Google was trying to crawl the calendar and getting the not-logged-in error. I confirmed it with AWStats and wasn't impressed.
Not every bot is required to follow your robots.txt, and many spammy bots ignore it. To block those bots, you can find their IP addresses in your traffic logs and block them by IP using .htaccess...
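As a rough sketch of that approach, the Python below counts requests per IP in an access log and prints `.htaccess` deny rules for the heaviest clients. It assumes the log is in Apache's Common or Combined Log Format (client IPv4 address first on each line) and uses the older `Order`/`Deny from` directive style; newer Apache versions use `Require` directives instead. The function names, the threshold and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches an IPv4 client address at the start of a log line.
IP_AT_START = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")


def count_requests(log_lines):
    """Return a Counter mapping client IP -> number of requests."""
    counts = Counter()
    for line in log_lines:
        match = IP_AT_START.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def htaccess_deny_rules(counts, threshold):
    """Build deny rules for every IP with at least `threshold` requests."""
    heavy = sorted(ip for ip, n in counts.items() if n >= threshold)
    lines = ["Order Allow,Deny", "Allow from all"]
    lines += [f"Deny from {ip}" for ip in heavy]
    return "\n".join(lines)


# Invented sample data; in practice, read the lines from your access log file.
sample_log = [
    '203.0.113.5 - - [01/Jan/2008:10:00:00 +0000] "GET /a HTTP/1.1" 200 512',
    '203.0.113.5 - - [01/Jan/2008:10:00:01 +0000] "GET /b HTTP/1.1" 200 512',
    '203.0.113.5 - - [01/Jan/2008:10:00:02 +0000] "GET /c HTTP/1.1" 200 512',
    '198.51.100.7 - - [01/Jan/2008:10:05:00 +0000] "GET / HTTP/1.1" 200 1024',
]

counts = count_requests(sample_log)
rules = htaccess_deny_rules(counts, threshold=3)
print(rules)
```

Before denying anything, check that the busy addresses are not legitimate search engine crawlers or your own monitoring, since a raw request count cannot tell them apart.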