What bots ignore robots.txt?

Discussion in 'robots.txt' started by Jon12345, Aug 16, 2005.

  1. #1
    Ok, so there are various bots and crawlers parading around the internet. I am using this...

    User-agent: *
    Disallow: /red.php

    ...in my robots.txt file. I presume this is correct if I don't want any bots to visit or follow links to the red.php page, yes?
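One way to sanity-check a rule like this is to feed it to Python's standard `urllib.robotparser` and ask whether a well-behaved crawler would be allowed to fetch the page. This is only a sketch: it assumes the rule is written with a leading slash (`Disallow: /red.php`), which is the form the robots.txt convention expects for paths, and the host `example.com`, the agent name `ExampleBot`, and the helper `is_allowed` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set: same as the one in the post, with a leading slash on the path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /red.php
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and ExampleBot are placeholders, not real values from the thread.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/red.php"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://example.com/index.html"))
```

This only tells you what a compliant crawler would do. It says nothing about bots that never read robots.txt in the first place.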

    But what percentage of bots actually ignore such a request? Any idea?

    Also, should I use a no-follow tag instead?

    Thanks,

    Jon
     
    Jon12345, Aug 16, 2005
  2. Willy

    #2
    All reputable, major bots honor robots.txt. If a crawler doesn't honor it, it's likely to ignore no-follow as well, so I don't think you need to bother about that.

    You can set up a spider trap to catch bad-mannered bots: http://www.fleiner.com/bots/
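As a rough illustration of the spider-trap idea (not necessarily how the linked page does it): pick a path that robots.txt disallows and that no human would visit, then look for clients that request it anyway. The sketch below scans access-log lines for such requests. The trap path `/bot-trap/`, the Common Log Format layout, the function name `find_trap_visitors`, and the sample log lines are all assumptions made for the example.

```python
import re

# Hypothetical trap path: disallowed in robots.txt and linked only invisibly.
TRAP_PATH = "/bot-trap/"

# Matches the start of a Common Log Format line and captures
# the client IP (group 1) and the requested path (group 2).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def find_trap_visitors(log_lines, trap_path=TRAP_PATH):
    """Return the set of client IPs that requested the trap path."""
    offenders = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_path):
            offenders.add(match.group(1))
    return offenders


# Made-up log lines using documentation-reserved IP addresses.
sample_log = [
    '203.0.113.5 - - [16/Aug/2005:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '198.51.100.7 - - [16/Aug/2005:10:00:05 +0000] "GET /bot-trap/ HTTP/1.1" 200 64',
]
print(find_trap_visitors(sample_log))
```

Any IP this reports fetched a page that compliant crawlers were told to stay away from, so it is a reasonable candidate for blocking.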
     
    Willy, Aug 16, 2005
  3. lorien1973

    #3
    I think Ask Jeeves ignores robots.txt, but I'm not sure.
     
    lorien1973, Aug 16, 2005
  4. minstrel

    #4
    No. They claim they honor it:

    http://sp.ask.com/docs/about/aj/teoma.htm#6

     
    minstrel, Oct 15, 2005