Over the last few weeks I have noticed this bot crawling my site: http://www.cuill.com/twiceler/robot.html but the link goes to a dead end; there is no information there. Any ideas?
I just read something similar in some other threads. I guess lots of new bots have been appearing recently.
That bot used to hammer my server, spidering hundreds of pages at a time. It obeys the robots.txt file, but it often doesn't start with this file, so banning it that way is useless. I just block the IP. This works, but it still skews the visitor statistics. More and more bots! Just block the IP of the offending bot in your website control panel or .htaccess file.
The spider may take a week before it obeys any changes in robots.txt to restrict/block it. It does obey it, but from what I've been told it keeps robots.txt in cache for seven days. Rather than mess about with your .htaccess files to block the IPs the robot is coming from, you can put the blocks in your robots.txt file and see the changes in seven days. Alternatively, email and ask him to block the spidering of your site. He responded fast to my email and was very easy to deal with. Hope that helps.
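A minimal sketch of the robots.txt approach described above, checked with Python's standard urllib.robotparser. The user-agent token "twiceler" is an assumption taken from the bot's name in this thread, and example.com is a placeholder; confirm the exact token against the user-agent string in your own logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one bot, leave everyone else alone.
# The token "twiceler" is assumed from the bot name; check your logs.
ROBOTS_TXT = """\
User-agent: twiceler
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Twiceler", "Googlebot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "http://example.com/members/1")
        print(agent, "allowed" if allowed else "blocked")
```

This only tells you what a well-behaved crawler should do with the file. It says nothing about when the bot will re-read it, so the seven-day delay mentioned above still applies.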
I banned the twiceler bot. It was sucking FAR too much bandwidth for my liking and I could not figure out why it was looking... Is it actually related to a search engine?
Omg u banned a bot, now all the bots will take revenge for sure for banning their brother bot. But how much bw was it using?
I don't remember how much it was actually using, but it was more than MSN & YAHOO combined (which isn't that much either, really). I just don't like bots that I cannot identify, or that don't seem to be attached to anything I can recognize. I would not mind unbanning it if I could figure out just what it was doing. I even did a search (Google) on it and didn't come up with anything constructive. So *shrug*
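For anyone who wants an actual number for the bandwidth question, here is a rough sketch that totals response bytes per user agent from an access log. It assumes the Apache "combined" log format; the sample lines, addresses, and sizes are made up for illustration and are not taken from anyone's real logs.

```python
import re
from collections import Counter

# Matches the Apache "combined" log format (assumed; adjust for your server).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
    r'(?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.10 - - [01/Jan/2010:00:00:01 +0000] "GET /a HTTP/1.1" 200 5120 "-" "Twiceler"',
    '192.0.2.10 - - [01/Jan/2010:00:00:02 +0000] "GET /b HTTP/1.1" 200 2048 "-" "Twiceler"',
    '192.0.2.20 - - [01/Jan/2010:00:00:03 +0000] "GET /a HTTP/1.1" 304 - "-" "msnbot"',
    'this line is not in the expected format',
]


def bytes_by_agent(lines):
    """Sum response sizes per user-agent string, skipping unparseable lines."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        size = match.group("size")
        # A size of "-" means no body was sent (e.g. a 304 response).
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals


if __name__ == "__main__":
    for agent, total in bytes_by_agent(SAMPLE_LINES).most_common():
        print(f"{agent}: {total} bytes")
```

To run it on a real log, pass an open file object instead of SAMPLE_LINES, since the function accepts any iterable of lines.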
New robot = new search engine = increased traffic. And you guys say ban him??? I think I might be lost ... or dreaming. Did you just say you banned a bot that is more aggressive than Googlebot? The MSN & Y! bots are terrible; what good is getting 50 - 500 pages crawled a month? That is just the tip of the iceberg for most of my sites. Give me a bot that slurps down pages and pages and satisfies my lust to be read .. in my entirety! I'll even use the engine when released, especially if they give me access to that big ass cache of the internet so I can program around it. I bet they already have 5x the amount of pages that Y! & MSN have combined. Don't worry, bot bandwidth is cheap. A couple of thousand pages are only 20MB in my logs; that's cuill [pronounced cool] with me.
I am not sure what this bot can give to my sites, because its site seems dead, and it really does bring my dedicated server down from time to time... How can I block this robot from my entire server (IP)?
If you had bothered to actually read this thread, you would have found that emailing them results in a quick stop to that bot visiting your site. Try it.
I blocked the bot's IP with .htaccess a few weeks ago. It's back now with a different IP and doesn't seem to follow robots.txt. It constantly hits the members' profiles for some reason, but it does crawl regular pages too. What is this bot after? I'm gonna let it ride a little while before I block it again.
I blocked it this way:

## USER IP BANNING
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from 94.127.144.38
</Limit>
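Since the bot reportedly comes back from different addresses, a small helper can regenerate that block from a list of IPs. This is only a sketch: the function name build_deny_block is made up for illustration, it reuses the Apache 2.2-style Order/Allow/Deny directives from the post above, and the second address in the example is a documentation placeholder, not a known bot address.

```python
import ipaddress


def build_deny_block(addresses):
    """Build an .htaccess <Limit> block that denies the given IPs or CIDR ranges.

    Raises ValueError if any entry is not a valid address or network.
    """
    lines = ["<Limit GET POST>", "Order Allow,Deny", "Allow from all"]
    for raw in addresses:
        # ip_network validates both single addresses and CIDR ranges.
        network = ipaddress.ip_network(raw, strict=False)
        if network.num_addresses == 1:
            target = str(network.network_address)
        else:
            target = network.with_prefixlen
        lines.append(f"Deny from {target}")
    lines.append("</Limit>")
    return "\n".join(lines)


if __name__ == "__main__":
    # First address is the one quoted in the thread; the range is a placeholder.
    print(build_deny_block(["94.127.144.38", "192.0.2.0/24"]))
```

Like the original snippet, this only covers GET and POST requests, because that is what the <Limit GET POST> wrapper restricts.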
I don't know what all of the fuss here is about; Twiceler is a legitimate parsing agent for the Cuil (clustered) Search Engine. It is so legitimate, in fact, that just last month I wrote it into the configuration settings on my robots.txt generator tool: http://www.webshoppesolutions.com/bottxt_generator.htm As far as search results go, Cuil offers no better or worse results than Bing does, IMO .. and no one ever gives a second thought to Bing. Cuil uses the same parsing agent with the same name consistently. Bing does not. Sometimes Microsoft comes in to the domain with absolutely no user-agent ID at all, and when it does send one, it uses all kinds of other junk LIBWWW-type agents besides the regular MSNBOT ID. Don't worry about Cuil .. they'll be just fine. Start-ups will often sputter and be hit and miss with their search results .. it's to be expected.