List of Bad IP / Scrapers / Content stealers!

Discussion in 'All Other Search Engines' started by Ballz, Aug 13, 2006.

  1. #1
    Hi all,

    My bandwidth was being guzzled by one scraper, and after checking my logs closely I realized there were a few others...

    So here is the BIG idea: why don't all of us here at DP start a list of bad IPs which should be blocked?

    Here's something to start with:

    Deny this IP prefix belonging to McColo, as it's a scraper:

    208.66.195.


    At the end of every week, I will collate the data and add it all up in one thread, and keep moving from there. I hope everyone supports this venture.
     
    Ballz, Aug 13, 2006 IP
  2. FOX LORE

    #2
    Hey, that's a good idea. These two belong to the same person:
    66.77.136.123
    62.194.10.83
     
    FOX LORE, Aug 13, 2006 IP
  3. ian_ok

    #3
    This isn't really an SE-related question...

    A good idea, but what you consider a bad bot, others may not.

    I check all the suspicious IPs with dnsstuff and a Google search, and then ban if I feel I'm right in doing so.

    Ian
     
    ian_ok, Aug 13, 2006 IP
  4. Ballz

    #4
    Deny this IP prefix belonging to McColo, as it's a scraper:
    208.66.195.

    Submitted by FOX LORE:
    66.77.136.123
    62.194.10.83
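
    For anyone who would rather check the list in a script than paste it into a server config, here is a minimal sketch. It assumes the partial entry "208.66.195." means the whole 208.66.195.0/24 range, which is my reading and not something confirmed above. The names BLOCKED_NETWORKS and is_blocked are made up for the example.

```python
import ipaddress

# Entries collated so far in this thread.
# Assumption: the partial address "208.66.195." stands for 208.66.195.0/24.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("208.66.195.0/24"),
    ipaddress.ip_network("66.77.136.123/32"),
    ipaddress.ip_network("62.194.10.83/32"),
]


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any listed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)
```

    Storing single addresses as /32 networks keeps one lookup path for both exact IPs and ranges.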
     
    Ballz, Sep 16, 2006 IP
  5. Enrio

    #5
    Be careful, this is a dangerous approach... it's very easy to abuse!
     
    Enrio, Sep 16, 2006 IP
  6. amnezia

    #6
    38.99.203.110
     
    amnezia, Sep 16, 2006 IP
  7. sachin410

    #7
    62.194.1.235 belongs to the same group.
     
    sachin410, Sep 16, 2006 IP
  8. clancey

    #8
    These projects are always interesting. Unfortunately, they hurt the innocent.

    Most people do not know the difference between a bad and a benign visitor. It is impossible to know that someone is scraping your site just because they have visited all its pages. Consider the plight of new search engines, people with extremely short attention spans, and those with incredibly fast fingers and minds. The net result is that the list is bound to contain IPs which do not deserve to be blacklisted.

    A more fundamental problem is that your visitor could have a dynamic IP rather than a static IP. The next person who gets the IP is banned from your site for no reason. Meanwhile the scraper returns from another IP address.

    If I were doing this I would use some form of honey pot to determine the difference between good and bad robots and their IP addresses. I would time limit the ban to take into account dynamic IPs.
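
    A rough sketch of the honeypot-plus-expiry idea, under stated assumptions: HONEYPOT_PATH is a hypothetical URL that is disallowed in robots.txt and never linked visibly, so only misbehaving robots should request it, and the 24-hour ban length is an arbitrary placeholder. The class name TimedBanList is invented for this example.

```python
import time

HONEYPOT_PATH = "/trap/"      # hypothetical trap URL, hidden from humans
BAN_SECONDS = 24 * 60 * 60    # hypothetical ban length (24 hours)


class TimedBanList:
    """Bans IPs that hit the honeypot and lets each ban expire."""

    def __init__(self, ban_seconds=BAN_SECONDS, clock=time.time):
        self.ban_seconds = ban_seconds
        self.clock = clock      # injectable so the expiry can be tested
        self._expires = {}      # ip -> timestamp when the ban ends

    def record_request(self, ip, path):
        """Register a request; return True if it should be refused."""
        if path.startswith(HONEYPOT_PATH):
            self._expires[ip] = self.clock() + self.ban_seconds
        return self.is_banned(ip)

    def is_banned(self, ip):
        expiry = self._expires.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Ban has run out: a new holder of a dynamic IP starts clean.
            del self._expires[ip]
            return False
        return True
```

    Because the ban is recorded at request time, this is the kind of check a real-time daemon or request handler could run, rather than a list compiled from logs afterwards.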

    This should be done in real time by a daemon. Banning throwaway IP addresses after the fact is pointless.

    Another technique used is to limit the number of pages people can visit per minute and assume anyone viewing more pages per minute is a robot. If your pages are picture rich this will catch robots which ignore pictures and those with the temerity to view your site with Lynx.
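
    The pages-per-minute limit could look something like the sliding-window sketch below. The threshold of 30 pages per 60 seconds is a made-up default, not a recommendation, and PageRateLimiter is a hypothetical name. It counts only page requests, so image fetches are assumed to be filtered out before allow() is called.

```python
import time
from collections import defaultdict, deque


class PageRateLimiter:
    """Allows at most max_pages page views per IP in a sliding window."""

    def __init__(self, max_pages=30, window_seconds=60.0, clock=time.monotonic):
        self.max_pages = max_pages
        self.window_seconds = window_seconds
        self.clock = clock                  # injectable for testing
        self._hits = defaultdict(deque)     # ip -> timestamps of allowed views

    def allow(self, ip):
        """Return True if this page view is within the limit."""
        now = self.clock()
        hits = self._hits[ip]
        # Drop page views that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_pages:
            return False                    # over the limit: treat as a robot
        hits.append(now)
        return True
```

    Refused requests are not counted, so a visitor who slows down is let back in as soon as older page views age out of the window.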

    An alternative to banning, of course, is to require users to log in to view your best content.
     
    clancey, Sep 16, 2006 IP
  9. fatinfo guy

    #9
    fatinfo guy, Oct 2, 2006 IP
  10. surchin

    #10
    Wrong section for this thread, but interesting nonetheless.
     
    surchin, Oct 2, 2006 IP