Odd Yahoo crawling?


aleco
Oct 27th 2004, 7:39 am
I've been checking through my logs and noticed the Yahoo spider has been requesting some URLs that either don't exist or have been rewritten with mod_rewrite. The last one I found was particularly strange, as it uses the .html extension for a page that doesn't exist. Whenever I rewrite a URL or make a normal static one, I end it in either .php or .htm

It's also tried to access my main URL, but with PREMIAIRSESSID/5bc9f0744126394912fb7/p and /h/9198rgd.html added onto the end of it! :confused:

The weirdest one is /index.phtml?command=further-details&resultRow=2004-08-04|JMC||BRS|BRISTOL|KGS|Kos/Kalymnos|Self+Catering|7|25

Actually, as I write this I notice even more!
/portal/templates/ms_daydream2blue/css/template_css.css
/resources/books/index.asp
/PCL-740-rackmount-computer-8011.html
...etc.! :eek:

Any ideas what's going on here? In some ways it's as if the spider can read the non-static URLs rather than the rewritten ones - I didn't think that was possible...is it? It's a brand new site, so there are no old pages linking to these non-existent URLs, if that helps!

It's from the IP range 66.196.91.xxx, which I presume is the Slurp range, but I haven't double-checked.

:confused:
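One way to double-check whether an address really belongs to a search engine's crawler is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the hostname sits under a domain the crawler's operator owns, then resolve that hostname back and confirm it returns the same IP. Below is a rough Python sketch of that idea. The function name `verify_crawler_ip` and the `allowed_suffixes` values are my own placeholders; the actual hostname suffixes Yahoo uses for Slurp aren't given in this thread, so you would need to fill them in from Yahoo's own documentation.

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a crawler IP.

    allowed_suffixes is a list of domain suffixes you trust for the
    crawler (placeholder values; not taken from this thread).
    reverse/forward can be swapped out, e.g. for offline testing.
    """
    # Step 1: reverse lookup, IP -> hostname.
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False

    # Step 2: the hostname must be the suffix itself or a subdomain of it.
    host = hostname.rstrip(".").lower()
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False

    # Step 3: forward lookup, hostname -> IPs, must include the original IP.
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

Calling something like `verify_crawler_ip("66.196.91.10", ["example-crawler.test"])` (both arguments made up here) does live DNS lookups, so the answer depends on what the resolver returns at the time. The reverse lookup alone isn't enough, since whoever controls the reverse zone for an IP can put any name in it; the forward lookup is what confirms the match.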

Redleg
Oct 28th 2004, 12:34 am
Have you just registered the domain name??

If you have, then it's possible that it's been used in the past and Yahoo is crawling old links.

Have you checked on www.archive.org ??

aleco
Oct 29th 2004, 4:41 am
Thanks for the reply and suggestion - I've just checked that site now but it's not come up with any results for my new domain! :confused:

Any other possibilities? Has no-one else noticed this on their sites?
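
For anyone wanting to check their own logs for the same thing, here is a small Python sketch that pulls out every path requested from a given IP prefix, along with the status code returned. It assumes the log is in Apache's common or combined format, which is a guess on my part since the thread doesn't say how the logs are configured; the function name `crawler_requests` and the file name in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache common/combined log format, e.g.
# 1.2.3.4 - - [27/Oct/2004:07:39:00 +0100] "GET /page.html HTTP/1.0" 404 209
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def crawler_requests(lines, ip_prefix="66.196.91."):
    """Count (path, status) pairs requested from IPs starting with ip_prefix.

    Lines that don't match the assumed log format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("ip").startswith(ip_prefix):
            key = (match.group("path"), int(match.group("status")))
            counts[key] += 1
    return counts
```

To use it, open the log and pass the file object straight in, for example `crawler_requests(open("access.log"))`, then look at the entries with a 404 status to see which non-existent URLs the spider is asking for. Matching on an IP prefix only narrows things down to that range; it doesn't prove the requests came from Yahoo.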