http://www.cuill.com/twiceler/robot.html what's this bot?

Discussion in 'All Other Search Engines' started by Audiomad, Jun 18, 2007.

  1. #1
    Audiomad, Jun 18, 2007 IP
  2. SticKer

    SticKer Well-Known Member

    #2
    I just read something similar in some other threads. I guess lots of new bots have been appearing recently.
     
    SticKer, Jun 18, 2007 IP
  3. Audiomad

    Audiomad Peon

    #3
    I suppose but should I block it?
     
    Audiomad, Jun 18, 2007 IP
  4. tsanko

    tsanko Peon

    #4
    Write to the owners of the bot. If they don't answer you in 2-3 days, ban it!
     
    tsanko, Jun 18, 2007 IP
  5. 2KTown

    2KTown Peon

    #5
    That bot used to hammer my server, spidering hundreds of pages at a time. It obeys the robots.txt file, but it often doesn't start with this file, so banning it that way is useless. I just block the IP. This works, but it still skews the visitor statistics. More and more bots! Just block the IP of the offending bot in your website control panel or .htaccess file.
     
    2KTown, Jun 18, 2007 IP
  6. Claudek

    Claudek Well-Known Member

    #6
    The spider may take a week before it obeys any changes in robots.txt to restrict/block it. It does obey it but keeps the robots.txt in cache for seven days from what I've been told.

    Rather than messing about with your .htaccess files to block the IPs the robot is coming from, you can put the blocks in your robots.txt file and see the changes in seven days.

    Alternatively, email
    and ask him to block the spidering of your site. He responded fast to my email and was very easy to deal with.

    Hope that helps.
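
For anyone who wants to check a robots.txt block before waiting out the seven-day cache, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It assumes the bot matches the user-agent token `Twiceler`; the robots.txt content and the `is_allowed` helper are illustrative, not something taken from Cuil's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Twiceler, leave every other crawler alone.
ROBOTS_TXT = """\
User-agent: Twiceler
Disallow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print("Twiceler allowed: ", is_allowed("Twiceler", "/members/profile.php"))
    print("Googlebot allowed:", is_allowed("Googlebot", "/members/profile.php"))
```

This only tells you what a well-behaved parser would conclude from the file. It says nothing about when the bot will actually re-read it.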
     
    Claudek, Jun 18, 2007 IP
  7. Qryztufre

    Qryztufre Prominent Member

    #7
    I banned the twiceler bot. It was sucking FAR too much bandwidth for my liking and I could not figure out why it was looking...

    Is it actually related to a search engine?
     
    Qryztufre, Jun 18, 2007 IP
  8. paidhosting

    paidhosting Peon

    #8
    Omg u banned a bot, now all the bots will take revenge for sure for banning their brother bot.

    But how much bw was it using?
     
    paidhosting, Jun 18, 2007 IP
  9. Qryztufre

    Qryztufre Prominent Member

    #9
    I don't remember how much it was actually using, but it was more than MSN & YAHOO combined (which isn't that much either, really).

    I just don't like bots that I can not identify, or that don't seem to be attached to anything I can recognize. I would not mind unbanning it, if I could figure out just what it was doing.

    I even did a search (Google) on it and didn't come up with anything constructive. So *shrug*
     
    Qryztufre, Jun 18, 2007 IP
  10. CboY

    CboY Peon

    #10
    He is right, follow the steps and the problem disappears.
     
    CboY, Jun 18, 2007 IP
  11. ErectADirectory

    ErectADirectory Guest

    #11
    New robot = new search engine = increased traffic

    And you guys say ban him??? I think I might be lost ... or dreaming. Did you just say you ban a bot who is more aggressive than googlebot?

    MSN & Y! bots are terrible. What good is getting 50 - 500 pages a month? That is just the tip of the iceberg for most of my sites. Give me a bot that slurps down pages and pages and satisfies my lust to be read .. in my entirety!

    I'll even use the engine when released, especially if they give me access to that big ass cache of the internet so I can program around it. I bet they already have 5x the amount of pages that Y! & MSN have combined.

    Don't worry, bot bandwidth is cheap. A couple thousand pages are only 20MB in my logs; that's cuill [pronounced cool] with me.
     
    ErectADirectory, Aug 30, 2007 IP
    Qryztufre likes this.
  12. buymp3players

    buymp3players Peon

    #12
    This SE is a startup in CA that split off from Google.
     
    buymp3players, Aug 31, 2007 IP
  13. PinoyIto

    PinoyIto Notable Member

    #13
    I am not sure what this bot can give to my sites, because the site seems dead, and it really does bring my dedicated server down from time to time...

    How can I block this robot from my entire IP?
     
    PinoyIto, Sep 4, 2007 IP
  14. Claudek

    Claudek Well-Known Member

    #14
    If you had bothered to actually read this thread, you would have found that emailing them results in a quick stop to that bot visiting your site.

    Try it.

     
    Claudek, Sep 4, 2007 IP
  15. muppethunter

    muppethunter Guest

    #15
    I blocked the bot's IP with .htaccess a few weeks ago. It's back now with a different IP and doesn't seem to follow robots.txt. It constantly hits the members' profiles for some reason, but it does crawl regular pages. What is this bot after? I'm gonna let it ride a little while before I block it again.
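
Since the bot can come back on a new address, one way to keep an IP block current is to pull the addresses out of the access log by user agent. The Python sketch below assumes the server writes Combined Log Format and that the bot's user agent contains "Twiceler". The `ips_for_agent` helper, the user-agent string, and the sample log lines are all made up for illustration, and the addresses are documentation-only ranges.

```python
import re
from collections import Counter

# Combined Log Format: the client IP comes first and the user agent is the
# last quoted field on the line.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .* "(?P<agent>[^"]*)"$')


def ips_for_agent(lines, needle="twiceler"):
    """Count requests per client IP whose user agent contains needle."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and needle in match.group("agent").lower():
            hits[match.group("ip")] += 1
    return hits


# Assumed user-agent string and invented log lines, for illustration only.
UA = "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"
SAMPLE_LOG = [
    f'192.0.2.10 - - [09/Sep/2007:10:00:00 +0000] "GET /members/1 HTTP/1.1" 200 512 "-" "{UA}"',
    f'192.0.2.10 - - [09/Sep/2007:10:00:02 +0000] "GET /members/2 HTTP/1.1" 200 498 "-" "{UA}"',
    f'192.0.2.77 - - [09/Sep/2007:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "{UA}"',
    '198.51.100.5 - - [09/Sep/2007:10:00:07 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "msnbot/1.0"',
]

if __name__ == "__main__":
    # Emit one "Deny from" line per matching address, ready to paste into .htaccess.
    for ip, count in sorted(ips_for_agent(SAMPLE_LOG).items()):
        print(f"# {ip}: {count} matching request(s)")
        print(f"Deny from {ip}")
```

Matching on the user agent only catches a bot that identifies itself consistently, so treat the output as a starting list rather than a complete one.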
     
    muppethunter, Sep 9, 2007 IP
  16. sveha

    sveha Greenhorn

    #16
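    I blocked it this way:

    ## USER IP BANNING
    # Apache 2.2-style access control: Allow rules are evaluated first, then Deny,
    # so everyone is let in except the listed address.
    <Limit GET POST>
    Order Allow,Deny
    Allow from all
    Deny from 94.127.144.38
    </Limit>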
     
    sveha, Nov 10, 2009 IP
  17. WebshoppeSolutions

    WebshoppeSolutions Peon

    #17
    I don't know what all of the fuss here is about.

    Twiceler is a legitimate parsing agent for the Cuil (clustered) Search Engine.

    It is so legitimate, in fact, that just last month I wrote them into the configuration settings on my robots.txt generator tool: http://www.webshoppesolutions.com/bottxt_generator.htm

    As far as search results go, Cuil offers no better or worse results than does Bing IMO .. and no one ever gives a second thought to Bing.

    Cuil uses the same parsing agent with the same name consistently. Bing does not. Sometimes Microsoft comes in to the domain with absolutely no useragent ID at all, and when it does send one, it uses all other kinds of junk LIBWWW types of agents besides the regular MSNBOT ID.

    Don't worry about Cuil .. they'll be just fine. Start-ups will often sputter and do hit and miss with their search results .. it's to be expected.
     
    WebshoppeSolutions, Nov 10, 2009 IP