Twiceler won't stop even when banned from robots.txt

Discussion in 'robots.txt' started by deluxdon, Jun 24, 2007.

  1. prepress forum

    prepress forum Guest

    #21
    Sure, I read it and emailed the abuser, and they finally stopped after I jacked around with it forever. I do not see how this abuse of internet traffic is so benign. Where in the rules of acceptable use of robots.txt does it say that a content-harvesting abuser of my site should keep abusing it for 7 days after robots.txt bans him?

    I also do not see how this could be considered anything but an abusive bad spider when they won't stop at an Apache deny block; they switch IPs and IDs and go right around it. Why should I have to email the abuser? Why is he blowing by my security with no respect at all? This auto-hacking spider needs to be shut down, IMO.

    What legitimate use is there for that spider to abuse all our sites and bandwidth so badly? What is this guy doing with our content that is sucking more bandwidth than the real search engines? It says it's an "experimental crawler." What are they experimenting with, a new way to abuse all the webmasters for no purpose? They have no search capability to give us traffic, so why do they have 22 IP addresses hammering the crap out of our servers for no benefit to us?
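
    For reference, here is a rough sketch of what a well-behaved crawler is expected to do with a robots.txt ban. It uses Python's standard urllib.robotparser; the robots.txt contents, the example.com URL, and the is_allowed helper are made up for illustration and are not taken from anyone's actual site. Keep in mind that robots.txt is only advisory: a crawler that ignores it has to be blocked at the server instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ban one crawler by name, allow everyone else.
ROBOTS_TXT = """\
User-agent: Twiceler
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Twiceler", "Googlebot"):
        print(agent, is_allowed(agent, "http://example.com/page.html"))
```

    A compliant crawler would run a check like this before every request and skip any URL for which it returns False.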
     
    prepress forum, Dec 15, 2007 IP
  2. stevenh

    stevenh Peon

    #22
    According to Cuil, which by the way went live today, there have been impostor spiders acting as if they are Twiceler, and Cuil has posted the IPs that its legitimate spiders will be coming from. You can choose to deny access to these IPs to keep from being spidered, but I'd guess that they are finished with their robots.txt-ignoring scrape of the entire web. They claim to have completed the goal of spidering more pages (780 billion) than anyone else, including Google.

    38.99.13.121 38.99.44.101 64.1.215.166 208.36.144.6
    38.99.13.122 38.99.44.102 64.1.215.162 208.36.144.7
    38.99.13.123 38.99.44.103 64.1.215.163 208.36.144.8
    38.99.13.124 38.99.44.104 64.1.215.164 208.36.144.9
    38.99.13.125 38.99.44.105 64.1.215.165 208.36.144.10
    38.99.13.126 38.99.44.106
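
    If you would rather check requests against this list in code than in a server config, something like the following might work. It is only a sketch using Python's standard ipaddress module; the is_blocked helper and the 192.0.2.1 test address are assumptions for illustration, and the addresses are simply copied from the list above.

```python
import ipaddress

# Addresses copied from the list in this post.
CRAWLER_IPS = """
38.99.13.121 38.99.44.101 64.1.215.166 208.36.144.6
38.99.13.122 38.99.44.102 64.1.215.162 208.36.144.7
38.99.13.123 38.99.44.103 64.1.215.163 208.36.144.8
38.99.13.124 38.99.44.104 64.1.215.164 208.36.144.9
38.99.13.125 38.99.44.105 64.1.215.165 208.36.144.10
38.99.13.126 38.99.44.106
"""

# Parse once into a set so each lookup is a cheap membership test.
BLOCKED = {ipaddress.ip_address(token) for token in CRAWLER_IPS.split()}


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip is one of the listed crawler addresses."""
    return ipaddress.ip_address(client_ip) in BLOCKED


if __name__ == "__main__":
    print(len(BLOCKED), "addresses loaded")
    print(is_blocked("38.99.13.121"))  # an address from the list
    print(is_blocked("192.0.2.1"))     # documentation address, not in the list
```

    This only matches the exact addresses listed, so it will not catch a spider that switches to a new IP.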

    Cheers
     
    stevenh, Jul 28, 2008 IP
  3. seoorganizers

    seoorganizers Peon

    #23
    Thanks for the solutions. I will email them and ask them to stop. Thanks for saving me time.
     
    seoorganizers, Aug 11, 2008 IP
  4. seoorganizers

    seoorganizers Peon

    #24
    Hi, I got a response from them. They have changed how often they visit my site to once a week.
     
    seoorganizers, Aug 13, 2008 IP
  5. elladrone

    elladrone Peon

    #25
    Thanks for the IP list. I needed this to keep them out as well.
     
    elladrone, Dec 28, 2009 IP