Unknown bots crawling? Who are they and what do they want?

Discussion in 'robots.txt' started by tb1234, Jun 7, 2006.

  1. #1
    Is anyone else facing the problem of unidentified "ghost" bots crawling the site and eating up bandwidth? I got some entries in my log like these:

    Unknown robot (identified by 'spider'): 4070 pages crawled, 106 MB

    Unknown robot (identified by 'crawl'): 1806 pages crawled, 50 MB

    Does anyone have any idea what these are?
     
    tb1234, Jun 7, 2006
  2. weppos

    weppos Well-Known Member

    #2
    It looks like an AWStats report.
    It's impossible to gather more information about these spiders without checking their individual user agents.
     
    weppos, Jul 1, 2006
  3. Jean-Luc

    Jean-Luc Peon

    #3
    If this data comes from AWStats, you also get the date and time of the last visit of these robots. You can use that information to check in your log file. Of course, "Unknown robot (identified by 'spider')" could be several different robots.
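
    As a rough sketch of that log check: the script below tallies the user agents whose name contains 'spider' or 'crawl', the two substrings shown in the report above. It assumes the server writes Apache's combined log format, where the user agent is the last quoted field; the function name and the log path are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: combined log format, user agent is the last double-quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')
# Substrings taken from the two "Unknown robot" entries in the report.
KEYWORDS = ("spider", "crawl")


def count_bot_agents(lines, keywords=KEYWORDS):
    """Count requests per user agent for agents containing any keyword."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group(1)
        if any(k in agent.lower() for k in keywords):
            counts[agent] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python bot_agents.py access.log
    with open(sys.argv[1], errors="replace") as log:
        for agent, hits in count_bot_agents(log).most_common():
            print(f"{hits:6d}  {agent}")
```

    Each distinct agent string gets its own line, which should show whether one "unknown robot" entry is really several different robots.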

    Jean-Luc
     
    Jean-Luc, Jul 1, 2006
  4. Carl29

    Carl29 Active Member

    #4
    Carl29, Apr 29, 2011
  5. manish.chauhan

    manish.chauhan Well-Known Member

    #5
    These might be spam bots. If they are eating much of your bandwidth, check which IP range they originate from and block that range using .htaccess.
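
    A minimal sketch of turning a list of offending ranges into .htaccess rules. It assumes Apache 2.2-style access directives (`Order`, `Allow from`, `Deny from`); Apache 2.4 uses `Require not ip` instead. The function name is hypothetical and the addresses are documentation-only placeholders, not real bot ranges.

```python
import ipaddress


def htaccess_deny_rules(ranges):
    """Build Apache 2.2-style deny rules for the given IPs or CIDR ranges."""
    lines = ["Order Allow,Deny", "Allow from all"]
    # strict=False accepts ranges written with host bits set, e.g. 192.0.2.5/24
    networks = [ipaddress.ip_network(r, strict=False) for r in ranges]
    # collapse_addresses merges overlapping or adjacent networks
    for net in ipaddress.collapse_addresses(networks):
        lines.append(f"Deny from {net}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder ranges for illustration only
    print(htaccess_deny_rules(["192.0.2.0/25", "192.0.2.128/25", "198.51.100.7"]))
```

    Adjacent ranges are merged before the rules are written, which keeps the file short when one bot comes from many neighbouring addresses.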
     
    manish.chauhan, May 3, 2011