Bots that you disallow because they eat bandwidth

Discussion in 'robots.txt' started by websiteideas, Mar 3, 2006.

  1. #1
    Are there any bots that you disallow because all they do is eat up your bandwidth and deliver you no visitors? If so, which ones do you disallow and what does your robots.txt file look like? If you would, please post the code for your robots.txt file here.
     
    websiteideas, Mar 3, 2006 IP
  2. mjamesb

    mjamesb Member

    #2
    I block a few bots. One example:
    User-agent: linksmanager_bot
    I did link exchanges with some sites that use this service. Sorry, but a few links aren't worth that spider hitting my site a thousand times a week.
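
    For anyone copying this, note that a User-agent line by itself does nothing - it needs at least one rule beneath it. A minimal stanza would look like this (the Disallow line is my assumption; the post only names the agent):

```
User-agent: linksmanager_bot
Disallow: /
```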
     
    mjamesb, Mar 4, 2006 IP
  3. websiteideas

    websiteideas Well-Known Member

    #3
    I'm surprised that bot would hit your entire site. What's its purpose? Is it trying to verify that you have reciprocated the link?
     
    websiteideas, Mar 4, 2006 IP
  4. tpn87

    tpn87 Well-Known Member

    #4
    I only let the major engines in, the ones that are going to provide traffic to my site.


    User-agent: Googlebot
    Disallow:
    User-agent: MSNBot
    Disallow:
    User-agent: Inktomi Slurp
    Disallow:
    User-agent: Slurp
    Disallow:
    User-agent: Teoma
    Disallow:
    User-agent: *
    Disallow: /
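
    If you go the allowlist route, it's worth sanity-checking the file before deploying it. Python's standard urllib.robotparser module can do this offline; a minimal sketch (the agent names and URL path here are just examples):

```python
from urllib.robotparser import RobotFileParser

# An allowlist-style robots.txt: named crawlers get everything,
# the wildcard stanza shuts everyone else out.
robots_txt = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.modified()  # mark the data as "fetched" - some Python versions refuse
               # to answer lookups until this timestamp is set
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("Googlebot", "/page.html"))     # True: empty Disallow allows all
print(rp.can_fetch("SomeOtherBot", "/page.html"))  # False: caught by the wildcard
```

An empty `Disallow:` means "nothing is disallowed" for that agent, which is why the first stanza lets Googlebot through.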
     
    tpn87, Mar 13, 2006 IP
  5. MatthewN

    MatthewN Well-Known Member

    #5
    None at the moment. I run a dedicated server with plenty of bandwidth, so no worries there.
     
    MatthewN, Mar 13, 2006 IP
  6. minstrel

    minstrel Illustrious Member

    #6
    1. That's sort of a self-fulfilling prophecy, isn't it? Since you disallow the other search engine bots, how are they ever going to provide you with traffic in the future? A rather short-sighted solution.

    2. The major problem bots aren't going to pay any attention to your robots.txt file - what you've just done is turn away some well-behaved bots to make more room for the badly behaved ones.

    In general, I don't think you gain anything by banning bots in robots.txt - you should really only be using it to disallow indexing of certain folders or files.
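
    For reference, that conventional use - keeping all crawlers out of particular folders rather than banning agents - looks like this (the directory names are just placeholders):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /private/
```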
     
    minstrel, Mar 19, 2006 IP
    RectangleMan likes this.
  7. tradealoan

    tradealoan Peon

    #7
    Yes, allow only the major search engines, as they are the ones who deliver the traffic.
     
    tradealoan, Mar 22, 2006 IP
  8. minstrel

    minstrel Illustrious Member

    #8
    That's bad advice. Re-read my previous post.
     
    minstrel, Mar 22, 2006 IP
  9. websiteideas

    websiteideas Well-Known Member

    #9
    I've found that the following three lines in my robots.txt file can save my server from being loaded down by a bot that never brings me any traffic.

    User-agent: BecomeBot
    Crawl-Delay: 30
    Disallow: /cgi-bin

    Anyone else notice this bot just eats their bandwidth and doesn't ever bring them traffic?
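
    As a side note, Crawl-delay is a nonstandard directive (Google ignores it, for instance), but recent Python versions (3.6+) can read it via urllib.robotparser, which makes a quick offline check of a stanza like the one above possible. A minimal sketch:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: BecomeBot
Crawl-delay: 30
Disallow: /cgi-bin
"""

rp = RobotFileParser()
rp.modified()  # mark the data as "fetched"; crawl_delay() returns None otherwise
rp.parse(robots_txt.splitlines())

print(rp.crawl_delay("BecomeBot"))  # 30
```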
     
    websiteideas, Mar 30, 2006 IP
  10. RectangleMan

    RectangleMan Notable Member

    #10
    BecomeBot is the worst...
     
    RectangleMan, Apr 6, 2006 IP
  11. tpn87

    tpn87 Well-Known Member

    #11

    Bandwidth eater....
     
    tpn87, Apr 7, 2006 IP
  12. websiteideas

    websiteideas Well-Known Member

    #12
    Does anyone find that allowing this bot actually brings them more traffic?
     
    websiteideas, Apr 14, 2006 IP
  13. websiteideas

    websiteideas Well-Known Member

    #13
    Here's a few more lines you might want to add:


    User-agent: WebStripper
    Disallow: /

    User-agent: WebCopier
    Disallow: /

    User-agent: Offline Explorer
    Disallow: /
     
    websiteideas, Apr 26, 2006 IP
  14. tbfilly

    tbfilly Peon

    #14
    Does anyone know anything about the Alexa bot? It has shown up to crawl my site, Knowledge Creates Power. I'll be on the lookout for some of the ones everyone has mentioned. Thanks!
    Cheers
     
    tbfilly, Jun 16, 2006 IP
  15. minstrel

    minstrel Illustrious Member

    #15
    What are you asking - what the Alexa bot is, or whether you should ban it?

    I don't understand the rush to ban bots. First, on an average site (let's face it - most members' sites here are not Microsoft or even Webmasterworld), how much of a problem can it be? In North America, at least, bandwidth these days is pretty cheap. And most of the suggested bot-banning robots.txt files I've seen ban bots that simply should not be banned (not long ago, someone asked me to look at his site to find out why Google wasn't indexing it - pre-Big Daddy - and it turned out he'd been banning user agents and had inadvertently included Googlebot and Slurp, and probably MSNBot).
     
    minstrel, Jun 16, 2006 IP
  16. websiteideas

    websiteideas Well-Known Member

    #16
    The problems usually arise when you have a huge site with thousands of pages like an Amazon clone or a huge article directory. For most people, you're right - banning bots is not something they need to rush to do.
     
    websiteideas, Jul 15, 2006 IP