I've got the same on my forum. My site is private, so Google (and the public) can only see a couple of pages, but it's been on the site for over 9 hours now and it's still there. Jane
I don't believe this, Googlebot is back in my forums. After over 9 hours today, why would it leave and then come back again?
Trust me, it's not a big deal with a forum. Only worry if you do not get them. Have a look at who is online on this forum.
Sam go to the menu bar on Digital Point and click on quick links, then click on who's online and go through all the pages you see. You will find that Google and Yahoo and a lot of other bots live here at Digital Point all the time. I have tried to get Shawn to kill them but for some reason he likes the spiders and they like him. Go figure Sam
MSIE 6.0; Windows 98; YComp 5.0.8.6; yplus 1.0)"
66.249.66.133 - - [22/Dec/2004:12:35:20 -0800] "GET /v-web/bulletin/bb/index.php?sid=7bab24f6d2c0f8eb5f0739b7a3be08cc HTTP/1.1" 200 31769 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.140.226.193 - -

ha ha googlebot uses windows 98 lol
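For anyone reading raw access logs like the one quoted above, here is a rough sketch that splits one entry into its fields, so each user agent string gets matched to the request it belongs to. It assumes the server writes the common Apache "combined" log format, which is what the quoted line looks like. The names `LOG_PATTERN` and `parse_entry` are made up for this example.

```python
import re

# Assumed layout (Apache "combined" format):
# ip identity user [time] "request" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# The complete entry from the log excerpt quoted in the thread.
SAMPLE = (
    '66.249.66.133 - - [22/Dec/2004:12:35:20 -0800] '
    '"GET /v-web/bulletin/bb/index.php?sid=7bab24f6d2c0f8eb5f0739b7a3be08cc HTTP/1.1" '
    '200 31769 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

entry = parse_entry(SAMPLE)
print(entry["ip"], entry["status"], entry["agent"])
```

Parsed this way, the request from 66.249.66.133 carries the Googlebot/2.1 agent string. The "Windows 98" text sits at the tail end of the entry before it, which got cut off in the paste.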
There is a worm going around that defaces message boards. G crawled and cached lots of these defaced pages, which have since been fixed. They probably dumped any page from their cache that had a forum-like URL and are re-indexing them all to get clean copies. Just a theory of course. If it's a problem for you, you can email G and ask them to rate-limit their crawls of your site.
Guest  Fri Dec 2004 11:55 am  Fri Dec 2004 11:55 am  Viewing FAQ  66.249.66.133
Guest  Fri Dec 2004 11:54 am  Fri Dec 2004 11:54 am  Viewing FAQ  66.249.66.133
Guest  Fri Dec 2004 11:54 am  Fri Dec 2004 11:54 am  Viewing FAQ  66.249.66.133
Guest  Fri Dec 2004 11:53 am  Fri Dec 2004 11:53 am  Viewing FAQ  66.249.66.133
Guest  Fri Dec 2004 11:53 am  Fri Dec 2004 11:53 am  Viewing FAQ  66.249.66.133
Guest  Fri Dec 2004 11:52 am  Fri Dec 2004 11:52 am  Viewing FAQ  66.249.66.133

OK, 3 days is too much to be indexing a BB FAQ page and it's still doing it. How do I make a robot.txt to stop it going to the junk pages and get it to read the topics? It hasn't looked at one topic in 3 days, it just sits there looking at ???? crap. It keeps trying to log on too.
Using a text editor (like Notepad), create a file named robots.txt and add these lines:

User-agent: *
Disallow: /faq.php

Add other Disallow lines as required to exclude other files (e.g., /login.php). Save and upload to the root directory of your site. Each Disallow path starts with a slash and is relative to the root of your site, so if faq.php is not in your root directory, add the appropriate folder to the path.

But the real question for me is why is this happening on your site? I also have a phpBB forum and this doesn't happen here... something unusual happening with your site or server, I think.
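If you want to check the rules before uploading, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are written with a leading slash, as the robots exclusion convention expects. The `www.example.com` URLs and the helper name `is_allowed` are placeholders for illustration, not anything from your site, and Python's parser is only an approximation of how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Example rules: keep every robot out of faq.php and login.php in the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /faq.php
Disallow: /login.php
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs, for illustration only.
for url in (
    "http://www.example.com/faq.php",
    "http://www.example.com/login.php",
    "http://www.example.com/viewtopic.php?t=1",
):
    print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

With these rules the FAQ and login pages come back as blocked, while topic pages stay open to the crawler.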
Googlebot (Google) 3396 111.90 MB 24 Dec 2004 - 10:46

It's used about 70 MB in the last 3 days, on a BB that is less than 5 MB lol. I did try to upgrade phpBB 2.0.6 to phpBB 2.0.11 but it failed. I never removed it and had it in a different folder, so I have a feeling it's showing a mix from both the new and the old, and I don't want to lose it by removing one of them. I'll try with the robot.txt, thanx for your help. Can you get it to read the topics on the forums using the .txt file? sammie xox
If you didn't remove the files, you should do so (at least the install and contrib folders -- security vulnerabilities). If it didn't work, there's nothing there for Google to "read" anyway. Then get some help installing the upgrade as soon as possible, before your site gets attacked. Googlebot will index anything on your site NOT "Disallowed".
"at least the install and contrib folders -- security vulnerabilities" << I did remove them, and the robot.txt works great, ty. No bot for hours snooping in my nooks and crannies, that's great. ty sweetie
Update on this: I now have 1600+ 404s on my site. All come from Google and are trying to get to the old FAQ page it spent 3 days stuck in. Proof that googlebot is male if you ask me, just seems to wanna f*** sammie any way it can
Samantha -- did you remove your robots.txt file? I just tried to check it now and got an error message. It MUST be titled robots.txt (note that robots is plural) and it MUST be placed in the root directory of your site (i.e., at http://www.asksam2.com/robots.txt), otherwise it isn't doing any good. Also... are you using frames on your site? Spiders sometimes have trouble with those...
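To show what "root directory" means here, a small sketch that works out where a crawler would look for robots.txt, given any page on the site. It only builds the address and does not fetch anything. The function name `robots_url` is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop any query or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.asksam2.com/phpBB/faq.php"))
```

Whatever folder the forum lives in, the file is looked up at the top of the host, so a copy saved inside the forum folder would not be picked up.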
Also remove these three lines from your <HEAD> </HEAD> section:

<meta name="revisit-after" content="14 days">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Expires" CONTENT="Thu, 1 Jan 1970 01:00:00 GMT">
OK, I got rid of all of them, and I got rid of the bot.txt because Googlebot never came back. But I have changed the forums, and it's sending 1,000s of people to my site to read the phpBB/FAQ.php page that it spent 3 days in, so now I just get 1,000s of FAQ 404s lol