Yahoo and MSN have gone absolutely nuts on my site: www.thegoodbook.co.uk. A real pain. They don't seem to obey my robots.txt either...

Here are my site stats for June (rank, user agent, requests, share):

1 Yahoo Robot (www.yahoo.com) 18844 56.97%
2 MSN Robot (search.msn.com) 5458 16.50%
3 Internet Explorer 6.0 2377 7.19%
4 Internet Explorer 7.0 2269 6.86%
5 Google Robot (www.google.com) 1278 3.86%
6 Mozilla 5 824 2.49%
7 Firefox 2.0 709 2.14%
8 Safari 338 1.02%
9 Ask Jeeves Robot (www.ask.com) 253 0.76%

Kind of comic, huh? Except my site is slowing right down, and wretched Yahoo is to blame in good part. Unless it is my bad programming - it wouldn't be the first time, sadly. But I've looked and looked and can see no wrong. Maybe it's my blindness, so I thought peer help would be best...

Here is my robots.txt in full:

# Robots.txt
# slow down the mad MSN bot and the madder Yahoo bot
User-agent: Slurp
Crawl-delay: 99

User-agent: msnbot
Crawl-delay: 99
Disallow: /*.jpg$
Disallow: /*.mp3$

# Disallow directory /productimages
User-agent: *
Disallow: /bookcovers/
Disallow: /productfiles/
Disallow: /*.jpg$
Disallow: /*.mp3$

[Note on the Slurp Crawl-delay, not in the file: I've had this at 5, then 10, then 60, then 120 with apparently NO effect whatsoever. I looked on the Yahoo site just now and they document it as "crawl-delay: xx", so I thought maybe it should just be two figures, hence the current value.]

Any help would be very, very gladly appreciated.

Yours in SEO,
tom
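One way to sanity-check a file like the one above is to parse it locally and see which group each crawler ends up in. The sketch below uses Python's standard urllib.robotparser. The file contents are copied from the post (the blank lines between groups are an assumption about the original layout), and the user-agent tokens passed in are just the short names used in the file. This parser does plain prefix matching, so it does not interpret the /*.jpg$ wildcard patterns, and it only shows how one parser reads the file, not what Yahoo or MSN actually do.

```python
from urllib.robotparser import RobotFileParser

# robots.txt as posted; blank lines between groups are assumed.
ROBOTS_TXT = """\
# Robots.txt
# slow down the mad MSN bot and the madder Yahoo bot
User-agent: Slurp
Crawl-delay: 99

User-agent: msnbot
Crawl-delay: 99
Disallow: /*.jpg$
Disallow: /*.mp3$

# Disallow directory /productimages
User-agent: *
Disallow: /bookcovers/
Disallow: /productfiles/
Disallow: /*.jpg$
Disallow: /*.mp3$
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Crawl delay each named crawler would read from its own group.
slurp_delay = parser.crawl_delay("Slurp")
msnbot_delay = parser.crawl_delay("msnbot")
# Googlebot has no group of its own, so it falls back to "*",
# which sets no Crawl-delay.
google_delay = parser.crawl_delay("Googlebot")

# Hypothetical path, used only to show which group applies.
test_path = "/bookcovers/cover.jpg"
slurp_allowed = parser.can_fetch("Slurp", test_path)
google_allowed = parser.can_fetch("Googlebot", test_path)

print("Slurp delay:", slurp_delay, "| may fetch bookcovers:", slurp_allowed)
print("msnbot delay:", msnbot_delay)
print("Googlebot delay:", google_delay, "| may fetch bookcovers:", google_allowed)
```

Under the usual reading of robots.txt, a crawler that has its own User-agent group follows only that group, so the Disallow lines under * would not apply to Slurp or msnbot here. If the aim is to keep them out of /bookcovers/ and /productfiles/ too, those lines may need repeating in each named group.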
Can anyone help? Am I an idiot? Is there something basic wrong with my robots.txt? I'm asking Yahoo atm and they are hopeless... didn't even read my email at first... Cheers for any, ANY help, even if it's: "No, you are not mad, your robots.txt looks like it SHOULD be working..."
ah ha! now it could have been Mac OS Roman text encoding the file - which is default on my Macs, however, I recall having this as a major problem with .htaccess files, so I reckon 'tis possible it is the cause of this little beast. Anyway, have changed it to Windows Latin encoding, and Unix LF - will let you know if this tames the tiger... Peace, t
I am a bit confused by your post. You seem to want to know about bots, but you have browser info, including browser versions, in your stats, i.e.

3 Internet Explorer 6.0 2377 7.19%
4 Internet Explorer 7.0 2269 6.86%
6 Mozilla 5 824 2.49%
7 Firefox 2.0 709 2.14%

:-S Is this meant to be in?
Perhaps you're right. On reflection, I didn't need to include them - it was quicker though, and you can just see how strikingly the robots are dominating my bandwidth usage...
Checked line endings, no joy. Encoding a red herring apparently. Man, Slurp! is vexing me most exceedingly.

A friend and I read online that Yahoo says it accepts crawl delay values up to a MAXIMUM value of 10. So I had another eureka moment, and thought, oh MAYBE this is it! Changed values to 10 just to see if it works. And what happens? Over the last few days Yahoo goes up to its highest ever reading of my site. It is absolutely driving me crazy at the minute, people ring up saying can't access your site, blah blah, is slow. And it is that DoG of search engine robots, Slurp, eating my bandwidth (perhaps, or you can at least see why it gets the blame).

Total request stats:

Friday 21st:
1 Yahoo Robot 2580 71.53%
3 MSN Robot 339 9.40%
5 Google Robot 102 2.83%

Saturday 22nd:
1 Yahoo Robot 2901 74.52%
2 MSN Robot 401 10.30%
5 Google Robot 108 2.77%

Sunday 23rd:
1 Yahoo Robot 2165 69.79%
2 MSN Robot 352 11.35%
4 Google Robot 122 3.93%

Madness! Will get on to Yahoo again directly, and try and get them to check their software/robot... My robots.txt is OK and should work, shouldn't it? Someone say "yes", otherwise I still might think I'm insane! lol
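For anyone wanting to reproduce per-crawler shares like the ones above from their own logs, here is a rough sketch. It assumes you have already pulled the User-Agent string out of each request. The marker substrings and display names in BOT_MARKERS, and the sample strings, are illustrative assumptions, not what any particular stats package uses.

```python
from collections import Counter

# Hypothetical substrings used to label crawlers; adjust to your own logs.
BOT_MARKERS = {
    "slurp": "Yahoo Robot",
    "msnbot": "MSN Robot",
    "googlebot": "Google Robot",
}


def label(user_agent):
    """Map a raw User-Agent string to a crawler name, or 'Other'."""
    ua = user_agent.lower()
    for marker, name in BOT_MARKERS.items():
        if marker in ua:
            return name
    return "Other"


def bot_share(user_agents):
    """Return {name: (requests, percent of all requests)}."""
    counts = Counter(label(ua) for ua in user_agents)
    total = sum(counts.values())
    return {
        name: (n, round(100.0 * n / total, 2))
        for name, n in counts.items()
    }


# Made-up sample: one User-Agent string per request.
sample = [
    "Mozilla/5.0 (compatible; Yahoo! Slurp)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp)",
    "msnbot/1.0",
    "Mozilla/4.0 (compatible; MSIE 6.0)",
]
shares = bot_share(sample)
for name, (n, pct) in sorted(shares.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: {n} requests ({pct}%)")
```

Running this day by day, before and after each robots.txt change, gives a like-for-like way to see whether a Crawl-delay value has any visible effect.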
I was about to whinge about the same thing! Yahoo is also going crazy with my sites. Another really irritating thing is that it constantly changes IP address, like some Chinese scraper bot, excuse my stereotyping, which is really annoying for trying to track certain unique visitor trends... I notice Yahoo bot hits have quadrupled since April!! What gives, did they up their capacity or something? On the plus side, I also see a huge increase in Yahoo referrals, so I suppose I can't grumble too much...
Well now. No replies from Yahoo. And they are still going crazy on my site, in spite of a clear robots.txt telling them not to. MSN is also unchanged.

July 2nd:
1 Yahoo Robot 1876 62.08%
3 MSN Robot 378 12.51%
5 Google Robot 85 2.81%

3rd:
1 Yahoo Robot 1919 61.98%
3 MSN Robot 381 12.31%
5 Google Robot 99 3.20%

4th:
1 Yahoo Robot 2070 65.94%
3 MSN Robot 339 10.80%
4 Google Robot 125 3.98%

I'm now putting rel="nofollow" on loads and loads of my links that are irrelevant, hoping this may help curtail their wildness... will post back in time no doubt... Cheers
Incidentally, the fact that they are still going crazy means they are apparently almost completely unaffected by the crawl-delay setting: whether it is on 0 or 10 appears to have no impact. Just useful stuff, FYI.
OK guys, this is me giving up. Yahoo is unstoppable. I can't ban them totally (cos I want to appear in their results). They won't contact me. Oh, yes, I suppose I'll keep trying. What is the point in a dedicated server if Yahoo robots devour your processing power and bandwidth? Though maybe this sluggishness is caused by other things, but I'm sure it's not helped by Yahoo... Later on, if anyone has any bright ideas, please please POST...
When I put this in robots.txt:

User-Agent: msnbot
Crawl-Delay: 100

User-agent: Slurp
Crawl-delay: 100

User-agent: slurp
Crawl-delay: 100

User-agent: msnbot-products
Crawl-Delay: 100

User-agent: msnbot-news
Crawl-Delay: 100

User-agent: msnbot-media
Crawl-Delay: 100

They stopped being a pain. Why shouldn't that work out for you?