Help, Googlebot is "cache"ing my php?=file.htm


kyle422
Feb 27th 2005, 5:13 pm
Google has been deep crawling my site today and I noticed something that scares me. Since I implemented a rewrite to send mysite.com to www.mysite.com, none of my normal pages have been cached. It's only "cache"ing my passthru pages. Sample below.
http://64.233.161.104/search?q=cache:http%3A//www.myfloridahomesforsale.com//passthru.php%3Ffile%3Dinternational.htm
I'm worried that my normal pages are no longer going to be cached, will therefore be dropped from Google's index, and that I will ultimately lose my PR. Am I correct in thinking this? Is there anything I can do? Help! :confused:
My .htaccess code is below.

AddHandler application/x-httpd-php .htm .html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} ^(.*).htm [NC,OR]
RewriteCond %{REQUEST_FILENAME} ^(.*).html [NC]
RewriteRule ^(.*) /passthru.php?file=$1
ErrorDocument 404 /custom404.htm
</IfModule>

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.myfloridahomesforsale\.com [NC]
RewriteRule ^(.*) http://www.myfloridahomesforsale.com/$1 [L,R=301]
RewriteRule ^index.htm$ http://www.myfloridahomesforsale.com [L,R=301]

honey
Feb 27th 2005, 5:51 pm
I would personally recommend not redirecting domain.com to www.domain.com! A big, nice web directory was banned/penalized in Google for this. My 2c: stay away from this for now.

digitalpoint
Feb 27th 2005, 5:54 pm
I *really* doubt it. It's pretty common practice. Even google.com does it.

ResaleBroker
Feb 27th 2005, 6:17 pm
Using a 301 redirect will not cause a penalty.

kyle422
Feb 27th 2005, 6:18 pm
Sorry I posted in the wrong topic area of the forum Shawn.
Does anyone else have input on this situation?

exam
Feb 27th 2005, 6:37 pm
And google.com uses a 302 to do it: HTTP/1.1 302 Found :)

kyle422
Feb 27th 2005, 7:11 pm
Is anybody reading the initial post??? I was asking for help.

ResaleBroker
Feb 27th 2005, 9:50 pm
Kyle,
I looked at the pages on Google, then went to your site and looked at the individual page caches, and didn't see this happening except on your "International" page.

Are you thinking the "passthru.php" file is creating a problem with Google?

kyle422
Feb 28th 2005, 5:53 am
YES
Here are just a few examples, but I have about 90 pages all being cached with the passthru.php. Before implementing the passthru, my index page was cached every day as well. The last cache was on the 18th of February.
February 27, 2005, 9:24 am
Googlebot IP Address: 66.249.64.38
Googlebot Domain: crawl-66-249-64-38.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=palm-bay-florida.htm
February 27, 2005, 9:42 am
Googlebot IP Address: 66.249.64.28
Googlebot Domain: crawl-66-249-64-28.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=merritt-island-florida.htm
February 27, 2005, 9:48 am
Googlebot IP Address: 66.249.64.33
Googlebot Domain: crawl-66-249-64-33.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=home-buyers.htm
February 27, 2005, 9:49 am
Googlebot IP Address: 66.249.64.38
Googlebot Domain: crawl-66-249-64-38.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=suntree-florida.htm
February 27, 2005, 9:57 am
Googlebot IP Address: 66.249.71.28
Googlebot Domain: crawl-66-249-71-28.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=rockledge-florida.htm
February 27, 2005, 10:09 am
Googlebot IP Address: 66.249.64.66
Googlebot Domain: crawl-66-249-64-66.googlebot.com
Crawler Type: Unknown Crawler
Url Visited: http://myfloridahomesforsale.com/passthru.php?file=lake-washington-florida.htm

Diamondbacks
Feb 28th 2005, 7:30 am
I can't see how adding a rewrite will cause a problem. :confused:

Diamondbacks
Feb 28th 2005, 9:55 am
Has anyone else had this occur with the Coop?

Kyle, you might need to start a thread in the Co-op Advertising Network (http://forums.digitalpoint.com/forumdisplay.php?f=34) forum asking for help. :confused:

kyle422
Feb 28th 2005, 1:34 pm
I started this in the Google forum area and Shawn moved it here. Oh well. I am hoping someone will see it here and give me some opinions. :)
Any super cool and super smart mod_rewrite guys or girls have any feedback? Please.

J.D.
Feb 28th 2005, 9:58 pm
Kyle, I can see only one page from your website containing passthru (Google this: inurl:www.myfloridahomesforsale.com/passthru) - international.htm

Looking at your rewrite ruleset, I don't see how anyone would even see passthru.php; it's completely hidden from the outside world. So the question is, how did Google find out about passthru.php in the first place? I can only think of a link you had on your website for a short time (a typo or something) that Google picked up. Another possibility is that there's a link somewhere else (like here) that Google will pick up, so it ends up indexing your internal implementation page.

J.D.
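
A minimal sketch of a ruleset ordering that should keep passthru.php hidden. It rests on an assumption, not something confirmed in this thread: that the passthru URLs leaked because the internal rewrite to passthru.php ran before the non-www redirect and had no [L] flag, so the 301 carried the already rewritten URL to the outside. Only the two rules at issue are shown (the AddHandler, ErrorDocument, and index.htm lines are left out), and the file is assumed to sit in the document root.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# 1. Canonical host first, so the external redirect always sees the URL
#    the visitor actually requested, never the rewritten one.
RewriteCond %{HTTP_HOST} !^www\.myfloridahomesforsale\.com$ [NC]
RewriteRule ^(.*)$ http://www.myfloridahomesforsale.com/$1 [R=301,L]

# 2. Internal rewrite last. The escaped dot matches only real .htm/.html
#    requests, and [L] stops any later rule from seeing passthru.php.
RewriteRule ^(.+\.html?)$ /passthru.php?file=$1 [L]
</IfModule>
```

With this ordering, a request for http://myfloridahomesforsale.com/international.htm should first get a 301 to the www address, and only the follow-up request is mapped to passthru.php internally.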

kyle422
Mar 1st 2005, 7:54 pm
Thanks for the input, J.D. You are correct that the only page cached with the passthru was the international page. In the meantime, I think I worked out a solution with a mod_rewrite change.
Kyle

www.ebro-fishing.com
Dec 30th 2005, 9:37 am
How long does Google take to crawl new sites and index them? I've heard rumors that sites can be crawled but not listed for months. Is this true?

Thanks,
John.
http://www.ebro-fishing.com