View Full Version : googlebot stuck?


samantha pia
Dec 22nd 2004, 12:42 am
is the googlebot stuck in my forums? i have 27 members and it's been on there for the last 3 hours, and it never seems to get anywhere.

Guest 21 Dec 2004 11:39 pm 21 Dec 2004 11:39 pm Forum index 66.249.66.133
Guest 21 Dec 2004 11:38 pm 21 Dec 2004 11:38 pm Logging on 66.249.66.133
Guest 21 Dec 2004 11:36 pm 21 Dec 2004 11:36 pm Forum index 66.249.66.133


Guest 22 Dec 2004 01:34 am 22 Dec 2004 01:34 am Forum index 66.249.66.133
Guest 22 Dec 2004 01:33 am 22 Dec 2004 01:33 am Viewing FAQ 66.249.66.133
Guest 22 Dec 2004 01:32 am 22 Dec 2004 01:32 am Forum index 66.249.66.133
Guest 22 Dec 2004 01:31 am 22 Dec 2004 01:31 am Logging on 66.249.66.133

anton-io!
Dec 22nd 2004, 12:48 am
Had the same thing the other day ... have a script that sends me an email when googlebot visits a page ... well ... over 200 emails later ...

samantha pia
Dec 22nd 2004, 12:57 am
Googlebot (Google) 523 12.14 MB 21 Dec 2004 - 23:31

my DB for the forums is 567kb and the whole site is only 13mb. should it take this long?
i don't get it. why 3 hours on a forum with less than 500 posts on it? why keep trying to log on when it can't make an account? is it really that stupid?

crazyhorse
Dec 22nd 2004, 3:21 am
Now i know it never showed up on mine , its stuck on yours. :p

T0PS3O
Dec 22nd 2004, 3:23 am
Haha yeah thanks Sam, you just cut out the whole internet of their daily caching. This is going to cost us millions in lost revenue...

samantha pia
Dec 22nd 2004, 4:23 am
Guest 22 Dec 2004 05:20 am 22 Dec 2004 05:20 am Forum index 66.249.66.133
Guest 22 Dec 2004 05:19 am 22 Dec 2004 05:19 am Forum index 66.249.66.133
Guest 22 Dec 2004 05:18 am 22 Dec 2004 05:18 am Forum index 66.249.66.133
Guest 22 Dec 2004 05:17 am 22 Dec 2004 05:17 am Forum index 66.249.66.133
Guest 22 Dec 2004 05:16 am 22 Dec 2004 05:16 am Forum index 66.249.66.133
Guest 22 Dec 2004 05:16 am 22 Dec 2004 05:16 am Forum index 66.249.66.133


it's still stuck. this is about 6 hours now, i think it started yesterday :eek:

anthonycea
Dec 22nd 2004, 4:37 am
GoogleBot can't get enough of your love, Bad Company fans understand this :p :)

samantha pia
Dec 22nd 2004, 6:20 am
Guest 22 Dec 2004 07:15 am 22 Dec 2004 07:15 am Logging on 66.249.66.133
Googlebot (Google) 914 20.65 MB 22 Dec 2004 - 05:16

8 hours later, 20mb off a forum that only has 500 posts on it and a database of 565kb. what's going on? should i boot googlebot off my forum or just leave it? i guess they have more than just one bot huh?

SEbasic
Dec 22nd 2004, 6:27 am
No, Don't boot it off...

You don't want to be doing that ;)

T0PS3O
Dec 22nd 2004, 6:32 am
Can you quickly stick a few links up to 2 new sites of mine? :)

centrium
Dec 22nd 2004, 6:35 am
hehehe guess what, I've got the same issue. Google has been on my site since before 8 this morning, and it's still going strong!!!

hehe if anyone wants to sign up and have their links in the signature whilst the bot's about, go for it!!! hehe :)

samantha pia
Dec 22nd 2004, 6:58 am
just go add links to a thread i dont mind, you dont even need to register to post on my site, the bots still doing its stuff i think he locked himself in and cant find the key to get out again

Guest 22 Dec 2004 07:56 am 22 Dec 2004 07:56 am Forum index 66.249.66.133
Guest 22 Dec 2004 07:55 am 22 Dec 2004 07:55 am Searching forums 66.249.66.133
Guest 22 Dec 2004 07:53 am 22 Dec 2004 07:53 am Searching forums 66.249.66.133

see, that or he found my stash of toys

the funny thing is, in the last 8+ hours the bot has not got to any of the topics yet. it's been in the faq's and everywhere, just not in a topic yet

Fishing Forum
Dec 22nd 2004, 7:05 am
google's just doing its thing, had 13 this morning going over everything

Just wait till yahoo comes along - it hits en masse

Remember there are a lot of pages to a forum script so it will be trying to index everything, not just the posts

centrium
Dec 22nd 2004, 7:21 am
LOL SAMANTHA!!!

Yes I guess both of us can be grateful it's spending such a long time on our forums!!! Guess I only noticed it since I made my forum google friendly, it looks like it really has made friends now!!!

samantha pia
Dec 22nd 2004, 7:32 am
mmm i wonder if the bots having a kinky turn with the scripts??? :rolleyes:

centrium
Dec 22nd 2004, 7:46 am
ha ha

As long as they don't mess up anything when they're up to mischief!!!! Still the bot is welcome to play around all day long in my forums!! I had a wee look at your forums Sam

samantha pia
Dec 22nd 2004, 8:02 am
as you see not very much to have a bot in them for over 8 hours,

centrium
Dec 22nd 2004, 8:07 am
Yes, but I wouldnt complain really!

Yea, I'm now thinking though, what if the sysadmin guys at work see what I went to in the weblogs!!! I can see the jokes now about going to a site on that topic!!! hehe Ach well, it's nearly XMAS!!

samantha pia
Dec 22nd 2004, 8:08 am
9 hours+ and the bots just left

younghistorians
Dec 22nd 2004, 9:04 am
Had the same thing the other day ... have a script that sends me an email when googlebot visits a page ... well ... over 200 emails later ...

Care to share that script? :)

janecompersnews
Dec 22nd 2004, 10:47 am
I've got the same on my forum. My site is private, google (and the public) can only see a couple of pages but it's been on site for over 9 hours now and it's still there.

Jane

samantha pia
Dec 22nd 2004, 1:30 pm
i don't believe this, the googlebot is back in my forums. after over 9 hours today, why would it leave and then come back again?

anthonycea
Dec 22nd 2004, 1:37 pm
Second times a charm? :confused:

Once is not enough? :o

You are a great cook and GoogleBot is still hungry? :D

Fishing Forum
Dec 22nd 2004, 1:38 pm
Trust me, it's not a big deal with a forum

Only worry if you do not get them ;)

Have a look at who is online on this forum

anthonycea
Dec 22nd 2004, 1:41 pm
Sam go to the menu bar on Digital Point and click on quick links, then click on who's online and go through all the pages you see.

You will find that Google and Yahoo and a lot of other bots live here at Digital Point all the time. :confused:

I have tried to get Shawn to kill them but for some reason he likes the spiders and they like him. :mad:

Go figure Sam :confused:

samantha pia
Dec 22nd 2004, 2:00 pm
MSIE 6.0; Windows 98; YComp 5.0.8.6; yplus 1.0)" 66.249.66.133 - - [22/Dec/2004:12:35:20 -0800] "GET /v-web/bulletin/bb/index.php?sid=7bab24f6d2c0f8eb5f0739b7a3be08cc HTTP/1.1" 200 31769 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.140.226.193 - -

ha ha googlebot uses windows 98 lol
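
The pasted log text above appears to run three entries together: the tail of one visitor's user-agent string, one complete Googlebot request, and the start of the next entry. A minimal Python sketch, assuming the server writes the Apache "combined" log format (the regular expression and the name `parse_entries` are illustrative, not taken from the thread), pulls out only the complete entry so the user agent can be read for the right request:

```python
import re

# Apache "combined" format:
# ip identity user [time] "request" status size "referer" "user agent"
LOG_ENTRY = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_entries(text):
    """Return one dict per complete log entry found anywhere in text."""
    return [m.groupdict() for m in LOG_ENTRY.finditer(text)]

# The text exactly as pasted in the post, split only for readability.
pasted = (
    'MSIE 6.0; Windows 98; YComp 5.0.8.6; yplus 1.0)" '
    '66.249.66.133 - - [22/Dec/2004:12:35:20 -0800] '
    '"GET /v-web/bulletin/bb/index.php?sid=7bab24f6d2c0f8eb5f0739b7a3be08cc HTTP/1.1" '
    '200 31769 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" '
    '66.140.226.193 - -'
)

entries = parse_entries(pasted)
for entry in entries:
    print(entry["ip"], entry["status"], entry["agent"])
```

Read this way, the "Windows 98" fragment belongs to the entry before the Googlebot one, and the request from 66.249.66.133 carries the Googlebot/2.1 user agent.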

DarrenC
Dec 22nd 2004, 2:19 pm
just be glad googlebot doesn't use windows 3.1!

younghistorians
Dec 22nd 2004, 2:40 pm
just be glad googlebot doesn't use windows 3.1!

Why? That wouldn't matter...

Refrozen
Dec 22nd 2004, 2:42 pm
No, he isn't stuck, he's just having a little too much fun. :D

symetrix
Dec 22nd 2004, 2:51 pm
There is a worm going around that defaces message boards. G crawled and cached lots of these defaced pages which have since been fixed. They probably dumped any page from their cache that had a forum-like URL and are re-indexing them all to get clean copies.

Just a theory of course.

If it's a problem for you, you can email G and ask them to rate-limit their crawls of your site.

samantha pia
Dec 24th 2004, 11:00 am
Guest Fri Dec 2004 11:55 am Fri Dec 2004 11:55 am Viewing FAQ 66.249.66.133
Guest Fri Dec 2004 11:54 am Fri Dec 2004 11:54 am Viewing FAQ 66.249.66.133
Guest Fri Dec 2004 11:54 am Fri Dec 2004 11:54 am Viewing FAQ 66.249.66.133
Guest Fri Dec 2004 11:53 am Fri Dec 2004 11:53 am Viewing FAQ 66.249.66.133
Guest Fri Dec 2004 11:53 am Fri Dec 2004 11:53 am Viewing FAQ 66.249.66.133
Guest Fri Dec 2004 11:52 am Fri Dec 2004 11:52 am Viewing FAQ 66.249.66.133


ok 3 days is too much to be indexing a bb FAQ page and it's still doing it. how do i make a robot.txt to stop it going to the junk pages, and get it to read the topics? it hasn't looked at one topic in 3 days, it just sits there looking at ???? crap. it keeps trying to log on too. :mad:

minstrel
Dec 24th 2004, 11:30 am
Using a text editor (like notepad), create a file named robots.txt and add these lines:

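User-agent: *
Disallow: /faq.php


Add other Disallow lines as required to exclude other files (e.g., Disallow: /login.php).
Save and upload to the root directory of your site.

If faq.php is not in your root directory, then add the appropriate path information, starting from the site root with a leading slash (going by the log line you posted, that would be Disallow: /v-web/bulletin/bb/faq.php).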
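
One way to check a rule set before uploading it is Python's standard `urllib.robotparser`. The sketch below is only an illustration: the forum path is taken from the log line posted earlier in the thread, `www.example.com` is a placeholder host, and `allowed` is a hypothetical helper name. It also shows that, in this parser at least, a Disallow path written without the leading slash does not match anything.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_lines, agent, url):
    """Return True if the given robots.txt lines let agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)

rules = [
    "User-agent: *",
    "Disallow: /v-web/bulletin/bb/faq.php",
    "Disallow: /v-web/bulletin/bb/login.php",
]

# Same intent, but the path is missing its leading slash.
rules_without_slash = [
    "User-agent: *",
    "Disallow: faq.php",
]

base = "http://www.example.com"  # placeholder host
faq_blocked = not allowed(rules, "Googlebot", base + "/v-web/bulletin/bb/faq.php")
topics_open = allowed(rules, "Googlebot", base + "/v-web/bulletin/bb/viewtopic.php?t=1")
slashless_rule_ignored = allowed(rules_without_slash, "Googlebot", base + "/faq.php")

print(faq_blocked, topics_open, slashless_rule_ignored)
```

Disallow paths are matched as prefixes of the URL path, so a rule for faq.php also covers faq.php?sid=... requests.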

But the real question for me is why is this happening on your site? I also have a phpBB forum and this doesn't happen here... something unusual is happening with your site or server I think.

samantha pia
Dec 24th 2004, 12:05 pm
Googlebot (Google) 3396 111.90 MB 24 Dec 2004 - 10:46

it's used about 70mb in the last 3 days, on a bb that is less than 5mb lol. i did try to upgrade from phpBB 2.0.6 to phpBB 2.0.11 but it failed. i never removed it and had it in a different folder, and i have a feeling it's showing a mix from both the new and the old, and i don't want to lose it by removing one of them.

i'll try with the robot.txt thanx for your help. can you get it to read the topics on the forums? using the .txt file?

sammie xox

samantha pia
Dec 24th 2004, 12:08 pm
maybe googlebot is trying to work out its sexuality thats what my sites about lol

minstrel
Dec 24th 2004, 11:26 pm
it's used about 70mb in the last 3 days, on a bb that is less than 5mb lol. i did try to upgrade from phpBB 2.0.6 to phpBB 2.0.11 but it failed. i never removed it and had it in a different folder, and i have a feeling it's showing a mix from both the new and the old, and i don't want to lose it by removing one of them.
If you didn't remove the files, you should do so (at least the install and contrib folders -- security vulnerabilities). If it didn't work, there's nothing there for Google to "read" anyway. Then get some help installing the upgrade as soon as possible before your site gets attacked.

i'll try with the robot.txt thanx for your help. can you get it to read the topics on the forums? using the .txt file?
Googlebot will index anything on your site NOT "Disallowed".

samantha pia
Dec 25th 2004, 1:59 am
at least the install and contrib folders -- security vulnerabilities <<i did remove them, and the robot.txt works great ty, no bot for hours snooping in my nooks and crannies
thats great ty sweetie

samantha pia
Jan 9th 2005, 5:15 am
update on this, i now have 1600+ 404's on my site, all come from google and are trying to get to the old FAQ's page it spent 3 days stuck in. :(
proof that googlebot is male if you ask me, just seems to wanna f*** sammie anyway it can :D

minstrel
Jan 9th 2005, 12:09 pm
Samantha -- did you remove your robots.txt file? I just tried to check it now and got an error message.

It MUST be titled robots.txt (note that robots is plural) and it MUST be placed in the root directory of your site (i.e., at http://www.asksam2.com/robots.txt), otherwise it isn't doing any good.
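
As a rough illustration of the "root directory" rule, this Python sketch works out where a crawler would look for the file, given any page URL on the site. The function name `robots_url` is made up for the example, and the sample forum path is the one from the log line posted earlier.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Only the scheme and host are kept; path and query are discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

location = robots_url("http://www.asksam2.com/v-web/bulletin/bb/faq.php?sid=abc")
print(location)
```

However deep the forum sits in the directory tree, the result is always the same top-level address, which is why a copy placed inside the forum folder is not picked up.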

Also... are you using frames on your site? Spiders sometimes have trouble with those...

minstrel
Jan 9th 2005, 12:22 pm
Also remove these three lines from your <HEAD> </HEAD> section:

<meta name="revisit-after" content="14 days">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Expires" CONTENT="Thu, 1 Jan 1970 01:00:00 GMT">

samantha pia
Jan 10th 2005, 5:43 am
Also remove these three lines from your <HEAD> </HEAD> section:

<meta name="revisit-after" content="14 days">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
<META HTTP-EQUIV="Expires" CONTENT="Thu, 1 Jan 1970 01:00:00 GMT">
ok i got rid of all them, and the bot.txt i got rid of because google bot never came back. but i have changed the forums, and its sending 1,000's of people to my site to read the phpBB/FAQ.php page that it spent 3 days in. so now i just get 1,000's of faq 404's lol

minstrel
Jan 10th 2005, 7:31 am
i have changed the forums, and its sending 1,000's of people to my site to read the phpBB/FAQ.php page that it spent 3 days in. so now i just get 1,000's of faq 404's lol
But that's not good -- you should redirect those to either a custom 404 page or to your index page.

samantha pia
Jan 11th 2005, 6:04 am
i have a custom 404 page with the sites logo and menu, how would i send them to it?

URL Error Referers Hits
/v-web/bulletin/bb/faq.php 615 -
/v-web/bulletin/bb/viewtopic.php 127
/v-web/bulletin/bb/viewtopic.php?p=132&sid=76f314b903e458a2da56cb7b3926aea1

i get lots like that, 1600+ in 1 day :(
whats the code to do it? where do i put it? and whats best to do? send them to the custom 404 page or the index page :(
sammie xox

minstrel
Jan 11th 2005, 8:23 am
As I understand it, you've now renamed the FAQ page to something else, correct?

First, re-upload that robots.txt page to exclude spiders from indexing the pages you don't want indexed. Then we can check it for errors.

Second, to direct to a custom 404 error page, you need an entry in your .htaccess file -- if your host allows this. It would look like this:

ErrorDocument 404 /error.htm
where for "error.htm" you would substitute whatever you've called your custom error page.

If your host doesn't allow you to edit the .htaccess file (if you're using FrontPage with FP extensions, be careful: this line needs to be added to the end of the existing file), or if you're not on a *nix/Apache server, then you should ask the host to enable the custom error page for you.

But note that of the "error hits" you mention:

/v-web/bulletin/bb/faq.php 615 -
/v-web/bulletin/bb/viewtopic.php 127
/v-web/bulletin/bb/viewtopic.php?p=132&sid=76f314b903e458a2da56cb7b3926aea1
the 2nd and 3rd ones are looking for specific forum posts, possibly posts that have been deleted?

samantha pia
Jan 11th 2005, 11:57 am
i have access to the .htaccess files. google bot never came back once i put the robot.txt file on the server. i have installed a newer phpbb forum, but not in the old v-web dir where the old one was. the old one has now been moved and stored on the server, but is not in use anymore. the .htaccess file is full of lines like these:
Redirect 301 /page3.html http://www.asksam2.com/abusedepression/index.php
Redirect 301 /childabuse/index.html http://www.asksam2.com/abusedepression/childabuse.php
and this one:
ErrorDocument 404 /404.htm

can i just redirect all of them to the new forums?

minstrel
Jan 11th 2005, 8:14 pm
Sure -- redirect them to the new forum and then put in a new 404 page and a new robots.txt file.

phrozen_ra
Jan 12th 2005, 3:38 am
Had a forum once.... it was stuck on my pages too... at least that's what i believed... but NO

But tell me this... are you running google ads on your site? if so... then it's not the google bot, but the bot that figures out what kind of content you have

If I am right... then you can filter it out of your logs... but don't, by any means, boot it

samantha pia
Jan 12th 2005, 5:43 am
Sure -- redirect them to the new forum and then put in a new 404 page and a new robots.txt file.
how? i tried to do the redirect by adding this line:

Redirect 301 /v-web/bulletin/bb/faq.php http://www.asksam2.com/index.php

just to send them back to the main page of the site, but there are 2 buttons in the .htaccess thingy, one says update file, the other says reset. i hit the update one and nothing happens :( and i think if i hit the reset one it will delete all the stuff in it :( so i haven't tried that yet. :o


and i dont have any ads on my site what so ever, so it was just the googlebot stuck for 3 days, i just keep getting 404's now from the old forums, 2,400+ to date, thats 800 in one day.

minstrel
Jan 12th 2005, 8:19 am
Sounds like you're using CPanel to do this. Contact your host to ask why it isn't working.

samantha pia
Jan 12th 2005, 8:33 am
the problem is it is working with the other 100/150 301 redirects, its just not doing it with the new one i tried to do, it shows an old custom 404 page. i guess its in a script somewhere, thats doing it

minstrel
Jan 12th 2005, 8:37 am
Yikes. Maybe there's a limit? All those redirects -- I'm no expert on servers but that must be slowing things down a tad... why do you need so many?

samantha pia
Jan 12th 2005, 8:55 am
the old site and new site work differently. the old one was just html with a java left hand menu, the new site is all php and has been set up much better than the old site. so to carry on getting the old site's SE traffic, each old page was redirected to the new page. after a few months of this, i think it would be safe to delete the old site. but the new site has new forums and a chatroom, which could not use the old v-web and had to be placed in a dir in another place on the site.

minstrel
Jan 12th 2005, 9:07 am
You can replace the 150 lines with just one.

Put this line in the .htaccess file at the old site:

redirect 301 / http://www.newsite.com/
where "http://www.newsite.com/" is of course replaced with the URL for your new site.
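
To see why one line can stand in for many, here is a simplified Python model of how a prefix redirect of this kind behaves: any request path that begins with the given prefix is sent to the target, with the rest of the path appended. This is only a sketch of the idea, not Apache's actual code, and `apply_redirects` is a made-up name.

```python
def apply_redirects(path, rules):
    """Return the redirect target for path, or None if no rule matches.

    rules is a list of (url_prefix, target) pairs; the first match wins.
    """
    for prefix, target in rules:
        base = prefix.rstrip("/")
        # Match whole path segments only: "/old" matches "/old/x", not "/older".
        if path == base or path.startswith(base + "/"):
            remainder = path[len(base):]
            return target.rstrip("/") + remainder if remainder else target
    return None

catch_all = [("/", "http://www.newsite.com/")]

print(apply_redirects("/page3.html", catch_all))
print(apply_redirects("/v-web/bulletin/bb/faq.php", catch_all))
```

Note the assumption built into this approach: the old and new pages live on different host names. If both share one host, a catch-all rule for "/" would keep pointing requests back at the same site, and per-page rules are the safer choice.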

samantha pia
Jan 12th 2005, 9:17 am
the new site was rebuilt 6 weeks ago. the old site is still on the same server and the url is the same for both, but the index page is now for the new site. the old site's pages still get hits, so they are redirected to the new pages in the new dir's for the new site.

samantha pia
Jan 12th 2005, 9:24 am
i know i am hard work. wanna see inside my head? not much in there you know, lol
sorry if i am not explaining it right, its just so confusing.

minstrel
Jan 12th 2005, 6:06 pm
:D My point is the phenomenon you saw where Googlebot seemed to be trapped may have something to do with all those redirects and having it run back and forth and everywhere.

1. delete all the redirects and just put in the one above -- that will redirect all requests for an old forum page to the new forum -- then you can delete the old forum
2. get someone to help you update your forum to 2.0.11 -- I know the upgrade installation program is buggy but if you don't it's only a matter of time until you lose your forum
3. once it's installed, delete the install folder and the contrib folder

samantha pia
Jan 13th 2005, 11:57 am
ok minstrel, i can't redirect them like you said, but taking in what you said made me think big time, enough to get a headache. but i found a way round the redirect problem with them, ie i get 5 main 404's from the following links:
/v-web/bulletin/bb/faq.php 627 404's
/v-web/bulletin/bb/login.php 388 404's
/v-web/bulletin/bb/viewtopic.php 127 404's
/v-web/bulletin/bb/index.php 392 404's
/v-web/bulletin/bb/profile.php 65 404's

now i fixed them all, i fixed them by placing a new site page on female sexuality with the main and sub menus on it, so people will see the following if they enter the site via any of the above 404's
http://asksam2.com/v-web/bulletin/bb/viewtopic.php

but without your help, i would never have sat down all day and thought about it, and i am very chuffed i fixed it,
so a great big kiss and thank you from sammie xoxox

minstrel
Jan 13th 2005, 6:51 pm
Uh-oh. I'm really sorry about the headache -- not what I intended but apparently I've had that effect on women before :D

I'm not sure I understand why you can't do the one redirect but if it's working, that's what matters, right?

Good luck with Googlebot in the future :eek:

And thanks for the Sammie kiss :D

minstrel
Jan 13th 2005, 7:05 pm
Actually, looking at that "Sammie kiss" thing, I'm reminded of a website a friend created called "Sammy Pies" -- they are dog treats :eek:

samantha pia
Jan 22nd 2005, 12:08 pm
lol i just love giving kisses http://asksam2.com/phpBB2/images/smiles/icon_kiss.gif