I've built some of my own stats software. Now, my site is quite niche; no visitor would ever, ever come to my site every day for a whole month. I'm seeing some odd behavior. Take last month as an example:
65.36.241.75 - 30 days, 707 page hits total. Only ever hit my index page.
wfp2.almaden.ibm.com - 30 days, 30 hits total. Only ever hit my index page.
66.155.231.209 - 22 days, 44 hits total. Only ever hit my index page.
66-194-6-84.gen.twtelecom.net - 16 days, 17 hits total. Only ever hit my index page.
All the above IPs are in the US; I'm in the UK. Who are they, and why are they coming to my site? Why should I let them continue if they don't serve a purpose? I assume they're wasting bandwidth. Any advice or suggestions?
I can't say for sure, but they look like spoofed IPs to me; for example, the subdomain wfp2.almaden.ibm.com. I don't know the purpose. Perhaps someone else can shed some light?
The almaden one is a legit bot that indexes information for businesses and intranets. The others are ISPs, I think - not bots but visitors? Several different visitors?
65.36.241.75
66.155.231.209
66.194.6.84
Maybe not what they were looking for? My girlfriend uses AOL. When she starts it up, it loads a portal page that features certain hot news items and sites. It's possible (guessing here, because I don't use those ISPs, if that's what they are) that subscribers are clicking on a link, getting to the home page, and deciding it's not what they're interested in pursuing?
OK, this is perhaps shedding more light (?)
65.36.241.75 - - [04/Oct/2005:00:51:05 +0100] "HEAD / HTTP/1.1" 200 0 "-" "InternetSeer.com"
66.155.231.209 - - [04/Oct/2005:04:54:01 +0100] "GET /robots.txt HTTP/1.1" 302 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; MSIECrawler)"
66.155.231.209 - - [04/Oct/2005:04:54:02 +0100] "GET / HTTP/1.1" 200 21047 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; MSIECrawler)"
Can't find the last one in my logs... InternetSeer? MSIEbot? Smells dodgy....
<edit> Wow, msiebot is when someone adds to their favourites, awesome! http://www.webmasterworld.com/forum11/2360.htm
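For anyone wanting to pull the same "days / hits / agent" picture straight out of raw logs, here is a rough Python sketch. It assumes the log is in Apache's Combined Log Format, as the lines quoted above appear to be; the regex and the `summarize` helper are made up for illustration and are not part of anyone's stats package.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) "[^"]*" "(?P<agent>[^"]*)"'
)

def summarize(lines):
    """Return {host: {"hits": int, "days": set, "agents": set}}."""
    stats = defaultdict(lambda: {"hits": 0, "days": set(), "agents": set()})
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed format
        entry = stats[m["host"]]
        entry["hits"] += 1
        entry["days"].add(m["day"])      # e.g. "04/Oct/2005"
        entry["agents"].add(m["agent"])  # the User-Agent string as sent
    return dict(stats)

# The three log lines quoted in the post above.
sample = [
    '65.36.241.75 - - [04/Oct/2005:00:51:05 +0100] "HEAD / HTTP/1.1" 200 0 '
    '"-" "InternetSeer.com"',
    '66.155.231.209 - - [04/Oct/2005:04:54:01 +0100] "GET /robots.txt HTTP/1.1" '
    '302 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; '
    '.NET CLR 1.1.4322; MSIECrawler)"',
    '66.155.231.209 - - [04/Oct/2005:04:54:02 +0100] "GET / HTTP/1.1" 200 21047 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; '
    '.NET CLR 1.1.4322; MSIECrawler)"',
]

stats = summarize(sample)
for host, info in stats.items():
    print(host, info["hits"], "hits on", len(info["days"]), "day(s)", info["agents"])
```

Grouping by host and looking at the agent strings is usually enough to tell a monitoring service or crawler apart from ordinary browsers, though an agent string can of course be faked.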
InternetSeer is a website uptime/downtime checker - you must have subscribed to the free service at one time.
Will this hurt me getting indexed if I use this in my robots.txt:
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Disallow: /
Because that bot is killing my bandwidth.
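One way to sanity-check the indexing worry is to feed the proposed rule to a robots.txt parser and ask whether a search engine's robot name is still allowed. The sketch below uses Python's standard-library `urllib.robotparser`; it only shows how that one parser reads the rule. Real crawlers do their own matching, and a client that never reads robots.txt is not affected by it at all.

```python
from urllib.robotparser import RobotFileParser

# The rule exactly as proposed in the post above.
ROBOTS_TXT = """\
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask whether a robot announcing itself as "Googlebot" may fetch the home page.
googlebot_allowed = parser.can_fetch("Googlebot", "/")
print("Googlebot allowed:", googlebot_allowed)
```

The rule names a full browser string rather than a robot name, so a robot that identifies itself as Googlebot does not fall under it in this parser.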
I don't know... all I know is that this agent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)" has been killing my site's bandwidth the last couple of days. Is there any way to block it without blocking visitors or the MSN, G, or Y! bots? Fairly new to robots.txt related issues, so bear with me.
But how do you know it's a bot, or any single "visitor"? It may be several human visitors using a version of MSIE, no?
'Cause it's all under one host:
Host: 68.58.242.24
Should I just block that IP? If so, how would I do so?
http://www.whois.sc/68.58.242.24 It's an ISP, not a bot. What kind of numbers are you getting from there and what files are being requested? If the number of hits is extraordinary, you might want to contact the ISP and ask them what's going on with one or more of their subscribers.
They've requested thousands of basic content website pages today... Should I block them? Is that robots.txt thing I did above valid, or will it hurt me? Thanks for your help.
Are you on a Linux/Unix server running Apache? Can you create or edit an .htaccess file? If so, add these lines to the top of the .htaccess file:
<Limit GET POST>
order allow,deny
deny from 68.58.242.24
allow from all
</Limit>
But I'd still suggest you contact Comcast and alert them that something is up and going through their IP address.
If they are killing your bandwidth, they will probably not obey robots.txt. Just reject the IP address; that's what I do. No single user will get 1000s of pages from my site in a couple of hours, and if it cannot identify itself as a nice robot (Googlebot et al.), I ignore that IP address.
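The "no single user gets 1000s of pages in a couple of hours" rule of thumb can be sketched as a sliding-window count per IP. This is only an illustration in Python: the `heavy_hitters` function, the limit, the window, and the demo traffic are all made up (192.0.2.10 is a documentation-only address), and a real setup would also want to skip known good robots before flagging anything.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def heavy_hitters(requests, limit, window):
    """Return the IPs that made more than `limit` requests within any `window`.

    `requests` is an iterable of (ip, datetime) pairs, sorted by time.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ip, when in requests:
        q = recent[ip]
        q.append(when)
        # Drop timestamps that have fallen out of the window.
        while when - q[0] > window:
            q.popleft()
        if len(q) > limit:
            flagged.add(ip)
    return flagged

# Made-up demo traffic: one IP requests a page every second,
# another makes two requests a minute apart.
start = datetime(2005, 11, 1, 20, 44, 17)
demo = [("68.58.242.24", start + timedelta(seconds=i)) for i in range(5)]
demo += [("192.0.2.10", start + timedelta(seconds=s)) for s in (2, 62)]
demo.sort(key=lambda pair: pair[1])

flagged = heavy_hitters(demo, limit=3, window=timedelta(seconds=10))
print(flagged)
```

With real logs the limit and window would be far larger (thousands of pages over a couple of hours, say); the flagged addresses are then candidates for a deny rule.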
Yeah... I have an .htaccess in use right now 'cause I have a mod_rewrite going. How can I "reject"/block the IP address? This is what I've been dealing with:
/directory-forclosure-AR.html Http Code: 200 Date: Nov 01 20:44:17 Http Version: HTTP/1.1 Size in Bytes: 23387 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-forclosure-AK.html Http Code: 200 Date: Nov 01 20:44:19 Http Version: HTTP/1.1 Size in Bytes: 23282 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-VA.html Http Code: 200 Date: Nov 01 20:44:20 Http Version: HTTP/1.1 Size in Bytes: 26761 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-UT.html Http Code: 200 Date: Nov 01 20:44:21 Http Version: HTTP/1.1 Size in Bytes: 26649 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-TX.html Http Code: 200 Date: Nov 01 20:44:22 Http Version: HTTP/1.1 Size in Bytes: 23073 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-TN.html Http Code: 200 Date: Nov 01 20:44:25 Http Version: HTTP/1.1 Size in Bytes: 23179 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/city-carloan-IL-Waukegan.html Http Code: 200 Date: Nov 01 20:44:25 Http Version: HTTP/1.1 Size in Bytes: 29412 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-SC.html Http Code: 200 Date: Nov 01 20:44:27 Http Version: HTTP/1.1 Size in Bytes: 23309 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
/directory-apartment-RI.html Http Code: 200 Date: Nov 01 20:44:29 Http Version: HTTP/1.1 Size in Bytes: 26859 Referer: - Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
But there are thousands of them just today!
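To put a number on how fast those requests are arriving, here is a small Python sketch that takes the nine timestamps from the listing above and works out the gaps between them. Only the times of day are used, since every entry is dated Nov 01; the variable names are just for illustration.

```python
from datetime import datetime

# Times of day copied from the nine entries listed above (all dated Nov 01).
times = [
    "20:44:17", "20:44:19", "20:44:20", "20:44:21", "20:44:22",
    "20:44:25", "20:44:25", "20:44:27", "20:44:29",
]

stamps = [datetime.strptime(t, "%H:%M:%S") for t in times]

# Seconds between each request and the next one.
gaps = [(later - earlier).total_seconds()
        for earlier, later in zip(stamps, stamps[1:])]

span = (stamps[-1] - stamps[0]).total_seconds()
mean_gap = sum(gaps) / len(gaps)

print(f"{len(times)} requests in {span:.0f} s, mean gap {mean_gap:.1f} s")
```

A steady page request every second or two, each for a different directory page and with no referer, looks a lot more like an automated crawl than like someone clicking around in a browser.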