Blocking robots?

Discussion in 'robots.txt' started by l234244, May 30, 2005.

  1. #1
    Does anyone have a good list of email harvesters and other bad bots that should be blocked in a robots.txt file?
     
    l234244, May 30, 2005 IP
  2. crazyhorse

    crazyhorse Peon

    Messages:
    1,137
    Likes Received:
    19
    Best Answers:
    0
    Trophy Points:
    0
    #2
    I don't think you can block them with a robots.txt file. Do you check your log files? If so, look up their IPs and ban the harvesters. That will help at first, but most of the time they use multiple IPs, so you will have to ban them all.
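
    To make the log-checking step concrete, here is a minimal Python sketch, assuming the server writes Combined Log Format lines with the user agent as the last quoted field. The sample log lines, the IP addresses, and the `suspicious_ips` helper are hypothetical, and the keyword list is only an illustration, not a vetted list of harvesters.

```python
import re
from collections import Counter

# Captures the client IP (first field) and the last quoted field,
# which is the user agent in Combined Log Format.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"$')

def suspicious_ips(lines, agent_keywords):
    """Count hits per IP for requests whose user agent contains any keyword."""
    keywords = [k.lower() for k in agent_keywords]
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        if any(k in agent for k in keywords):
            hits[match.group("ip")] += 1
    return hits

# Hypothetical log excerpt using documentation-only IP ranges.
SAMPLE_LOG = [
    '192.0.2.10 - - [30/May/2005:10:00:00 +0000] "GET /contact.html HTTP/1.1" 200 512 "-" "EmailSiphon"',
    '192.0.2.10 - - [30/May/2005:10:00:01 +0000] "GET /about.html HTTP/1.1" 200 734 "-" "EmailSiphon"',
    '198.51.100.7 - - [30/May/2005:10:00:02 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    for ip, count in suspicious_ips(SAMPLE_LOG, ["emailsiphon"]).most_common():
        print(ip, count)
```

    A harvester that fakes a browser user agent will slip past a keyword match like this, so request rate per IP is probably worth checking as well.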
     
    crazyhorse, May 30, 2005 IP
  3. l234244

    l234244 Peon

    Messages:
    1,225
    Likes Received:
    50
    Best Answers:
    0
    Trophy Points:
    0
    #3
    There are crap loads to look through. I was just wondering if anyone had a standard list that they use for every site.
     
    l234244, May 30, 2005 IP
  4. crazyhorse

    crazyhorse Peon

    #4
    You mean something like this:
     
    crazyhorse, May 30, 2005 IP
    l234244 likes this.
  5. TechEvangelist

    TechEvangelist Guest

    Messages:
    919
    Likes Received:
    140
    Best Answers:
    0
    Trophy Points:
    133
    #5
    That list won't do anything unless the spider recognizes and adheres to the robots.txt rules, which I suspect most e-mail harvesters ignore.

    robots.txt is just a text file that needs to be read by the spider. It is a voluntary method for excluding spiders.
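
    To illustrate why it is voluntary: the check happens inside the crawler, not on the server. Below is a small sketch using Python's standard `urllib.robotparser`, which is what a well-behaved spider might run before fetching a page. The robots.txt content, the `is_allowed` helper, and the example.com URL are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one named bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    url = "http://example.com/contact.html"
    print(is_allowed(ROBOTS_TXT, "EmailSiphon", url))  # False
    print(is_allowed(ROBOTS_TXT, "Googlebot", url))    # True
```

    Nothing stops a harvester from skipping this check and requesting the page anyway, which is why the file alone does not block anything.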

    crazyhorse is right. You need to block them using their IP addresses.
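
    As a rough sketch of what an IP block amounts to, here is a check written with Python's standard `ipaddress` module. In practice this would normally be done in the web server or firewall configuration; the addresses below are documentation-only ranges and the `build_blocklist` and `is_blocked` helpers are hypothetical.

```python
import ipaddress

def build_blocklist(entries):
    """Turn strings like '192.0.2.10' or '198.51.100.0/24' into network objects."""
    return [ipaddress.ip_network(entry) for entry in entries]

def is_blocked(client_ip, blocklist):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in blocklist)

if __name__ == "__main__":
    # A single address and a whole range, since harvesters often rotate IPs.
    blocklist = build_blocklist(["192.0.2.10", "198.51.100.0/24"])
    print(is_blocked("198.51.100.77", blocklist))  # True
    print(is_blocked("203.0.113.5", blocklist))    # False
```

    Blocking a whole range catches rotating addresses but may also lock out legitimate visitors on the same network, so ranges are best kept narrow.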
     
    TechEvangelist, May 30, 2005 IP
  6. TommyD

    TommyD Peon

    Messages:
    1,397
    Likes Received:
    76
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Kinda still on topic...

    Why would you block anything from Micro$oft?

    thx,

    tom
     
    TommyD, May 30, 2005 IP
  7. crazyhorse

    crazyhorse Peon

    #7
    To save bandwidth :D The Microsoft listing above isn't the normal MSN user agent. It uses a different name.
     
    crazyhorse, May 30, 2005 IP
  8. l234244

    l234244 Peon

    #8
    As long as I stop the majority of bots, I'm happy :)
     
    l234244, May 30, 2005 IP
  9. TommyD

    TommyD Peon

    #9
    Just to ponder an idea: rather than blocking specific robots, how about blocking all robots and only allowing the few major bots through?
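
    A sketch of what that allowlist might look like, generated and checked in Python with the standard `urllib.robotparser`. The bot names are only examples of user agents one might allow, and the `build_allowlist_robots` helper and example.com URL are hypothetical. As with any robots.txt, this only affects spiders that choose to obey it.

```python
from urllib.robotparser import RobotFileParser

def build_allowlist_robots(allowed_agents):
    """Build robots.txt text that allows the named agents and disallows all others."""
    blocks = [f"User-agent: {agent}\nDisallow:\n" for agent in allowed_agents]
    blocks.append("User-agent: *\nDisallow: /\n")
    # Blank lines between blocks separate the records.
    return "\n".join(blocks)

def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    robots_txt = build_allowlist_robots(["Googlebot", "msnbot"])
    print(robots_txt)
    url = "http://example.com/index.html"
    print(can_crawl(robots_txt, "Googlebot", url))     # True
    print(can_crawl(robots_txt, "SomeOtherBot", url))  # False
```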

    thx,

    tom
     
    TommyD, May 30, 2005 IP
  10. Will.Spencer

    Will.Spencer NetBuilder

    Messages:
    14,789
    Likes Received:
    1,040
    Best Answers:
    0
    Trophy Points:
    375
    #10
    Will.Spencer, May 30, 2005 IP
    Crazy_Rob likes this.