Help! Yahoo crawler is sucking my bandwidth - 8,779 queries in last 3 days!

Discussion in 'robots.txt' started by electroze, Jul 20, 2011.

  1. #1
    Yahoo is just pummeling my site, causing my host to throttle my bandwidth, which can scare normal visitors away. I noticed the problem before and posted about it (www.webmastertools.bz), but decided to measure it again to see if it had declined. It hasn't. It also queries the same pages over and over, e.g. it queried 'A' 180 times.

    Does anyone know if Yahoo's bot 67.195.112.38 is supposed to obey this robots.txt command?

    User-agent: Slurp
    Crawl-delay: 5

    It's not slowing it down one bit. The Bing site says to set the crawl delay from 1 to 10, with 10 meaning extremely slow. Another user said the value is actually in minutes, so you might want that number in the hundreds.

    Does anyone know:
    1. Whether Yahoo obeys that command and, if not, what the right command is?
    2. If it does, is the number in minutes or on a scale from 1 to 10?

    Thanks!
     
    electroze, Jul 20, 2011
  2. #2
    @electroze, the Crawl-delay parameter is the number of seconds to wait between successive requests to the same server. Since you have set it to 5, it doesn't tell the Yahoo crawler to wait very long. Increase the number to 300 or more so that the crawler waits at least 5 minutes before making another request. That stops the crawler from making unlimited requests and lets you limit it. Hope this helps.
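
To make the arithmetic behind that suggestion concrete, here is a rough sketch using Python's standard `urllib.robotparser`. It assumes the delay really is in seconds, as described above, and that the crawler honors it, which this thread does not confirm. The helper names (`crawl_delay_for`, `max_requests_per_day`) and the constants are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

SECONDS_PER_DAY = 24 * 60 * 60

# The rule from the first post, and the same rule with the suggested value.
CURRENT_RULES = "User-agent: Slurp\nCrawl-delay: 5\n"
SUGGESTED_RULES = "User-agent: Slurp\nCrawl-delay: 300\n"


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay a standard parser reads for user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def max_requests_per_day(delay_seconds):
    """Upper bound on daily requests if the bot waits delay_seconds between hits."""
    return SECONDS_PER_DAY // delay_seconds


if __name__ == "__main__":
    for label, rules in (("current", CURRENT_RULES), ("suggested", SUGGESTED_RULES)):
        delay = crawl_delay_for(rules, "Slurp")
        print(label, "delay:", delay, "s, ceiling:", max_requests_per_day(delay), "requests/day")
```

Under the seconds assumption, a delay of 5 still allows up to 17,280 requests a day, while 8,779 queries in 3 days is roughly 2,926 a day. So a 5-second delay would not be expected to slow this crawl even if Yahoo obeys it. A delay of 300 caps it at 288 requests a day.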
     
    manish.chauhan, Aug 1, 2011