Unknown robot eating up bandwidth

Discussion in 'Traffic Analysis' started by seowebmaster, Sep 1, 2009.

  1. #1
    Hello,

    I am watching my AWStats data, and the following entries are eating up my bandwidth heavily. What are they? Can I stop them from consuming this unnecessary bandwidth?

    Unknown robot (identified by empty user agent string)
    Unknown robot (identified by 'robot')
    Unknown robot (identified by 'spider')

    Thanks in advance for any help.
     
    seowebmaster, Sep 1, 2009 IP
  2. wilderness

    #2
    "(identified by empty user agent string)"

    An empty user-agent string is the first criterion on which every webmaster should deny visitors.
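
    As a minimal sketch (assuming your host runs Apache with mod_rewrite enabled and allows .htaccess overrides), the following denies any request arriving with an empty or missing User-Agent header:

        RewriteEngine On
        # Match requests whose User-Agent header is empty or absent
        RewriteCond %{HTTP_USER_AGENT} ^$
        # Refuse them with 403 Forbidden ("-" means no substitution)
        RewriteRule .* - [F,L]

    Keep in mind that a few legitimate tools (some monitoring services and older proxies) also send no user agent, so watch your logs after enabling this.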
     
    wilderness, Sep 2, 2009 IP
  3. seowebmaster

    #3
    Thanks for your reply, but could you explain in more detail?
     
    seowebmaster, Sep 2, 2009 IP
  4. whitefire

    #4
    You need to make an .htaccess rule. Google is your friend.
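
    As a rough sketch of such a rule (assuming Apache with mod_rewrite; AWStats flags these visitors simply because their user-agent strings contain "robot" or "spider"):

        RewriteEngine On
        # Case-insensitive match on user agents containing "robot" or "spider"
        RewriteCond %{HTTP_USER_AGENT} robot [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} spider [NC]
        # Deny with 403 Forbidden
        RewriteRule .* - [F,L]

    Be careful: legitimate crawlers such as Baiduspider also contain "spider", so a blanket rule like this can block search engines you may want.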
     
    whitefire, Sep 4, 2009 IP
  5. wilderness

    #5
    First, in this instance (and most others), the AWStats reports provided by a host are worthless, because they only give collective group stats.
    You need to review your raw visitor logs and gather the IP ranges these bots are using.

    As "whitefire" has previously suggested, htaccess provides access control to your websites (AKA firewall (at least for lack of a better word)).
    It should be noted that there are no across-the-board configurations for access by visitors, rather each webmaster must determine what is beneficial or detrimental to their own site (s).

    There are currently two methods widely used by webmasters (both sketched below):
    1) black-listing: denying access to known offenders and allowing everyone else
    2) white-listing: denying access to ALL and allowing only those visitors (specified in detail) you desire
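
    Rough sketches of both, assuming Apache 2.2-style access directives; the bot names and IP range below are placeholders for illustration, not known offenders, so substitute what your own logs show:

        # 1) Black-list: allow everyone, deny known offenders
        SetEnvIfNoCase User-Agent "BadBot" bad_bot
        SetEnvIfNoCase User-Agent "EvilScraper" bad_bot
        Order Allow,Deny
        Allow from all
        Deny from env=bad_bot

        # 2) White-list: deny everyone, allow only ranges you trust
        # (use one approach or the other, not both at once)
        #Order Deny,Allow
        #Deny from all
        #Allow from 192.0.2.0/24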

    In addition, anybody should keep in mind the following (at least before running off and pasting masses of copied code into your .htaccess):

    1) A single syntax error can generate a 500 error and prevent your entire site(s) from functioning.

    2) Many people are clueless when it comes to writing effective .htaccess code (including many webhosts who work with Apache daily), though that cluelessness doesn't stop them from publishing their lame code on web pages and passing it on to others.

    3) Over time there have been Apache updates, and modules may or may not be installed consistently (or in the same configuration order) across different hosts.
    As a result, some .htaccess examples from older years (often still on long-active web pages) may not function as intended today.
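
    One habit that helps with point 3: wrap module-specific directives in an <IfModule> guard, so hosts lacking the module skip the block instead of throwing a 500. A minimal sketch, reusing the empty-user-agent rule from above:

        <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond %{HTTP_USER_AGENT} ^$
        RewriteRule .* - [F,L]
        </IfModule>

    The trade-off is silence: if mod_rewrite is missing, the rules simply do nothing, so verify the block actually fires rather than assuming it does.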
     
    wilderness, Sep 8, 2009 IP