Help, Googlebot is "cache"ing my php?=file.htm

Discussion in 'Apache' started by kyle422, Feb 27, 2005.

  1. #1
    Google has been deep crawling my site today and I noticed something that scares me. Since I implemented a rewrite to send mysite.com to www.mysite.com, none of my normal pages have been cached. It's only "cache"ing my passthru pages. Sample below.
    http://64.233.161.104/search?q=cach...sale.com//passthru.php?file=international.htm
    I'm worried that my normal pages are no longer going to be cached, that they will therefore be dropped from Google's index, and that I'll ultimately lose my PR. Am I correct in thinking this? Is there anything I can do? Help! :confused:
    My .htaccess code is below.

    AddHandler application/x-httpd-php .htm .html
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} ^(.*).htm [NC,OR]
    RewriteCond %{REQUEST_FILENAME} ^(.*).html [NC]
    RewriteRule ^(.*) /passthru.php?file=$1
    ErrorDocument 404 /custom404.htm
    </IfModule>
    
    RewriteEngine on
    RewriteCond %{HTTP_HOST} !^www\.myfloridahomesforsale\.com [NC]
    RewriteRule ^(.*)  http://www.myfloridahomesforsale.com/$1 [L,R=301]
    RewriteRule ^index.htm$  http://www.myfloridahomesforsale.com [L,R=301]

     
    kyle422, Feb 27, 2005 IP
  2. honey

    honey Prominent Member

    #2
    I would personally recommend not redirecting domain.com to www.domain.com! A big, nice web directory was banned / penalized in Google for this. My 2c. Stay away from this for now.
     
    honey, Feb 27, 2005 IP
  3. digitalpoint

    digitalpoint Overlord of no one Staff

    #3
    I *really* doubt it. It's pretty common practice. Even google.com does it.
     
    digitalpoint, Feb 27, 2005 IP
  4. ResaleBroker

    ResaleBroker Active Member

    #4
    Using a 301 Redirect will not cause a penalty.
     
    ResaleBroker, Feb 27, 2005 IP
  5. kyle422

    kyle422 Peon

    #5
    Sorry, Shawn, I posted in the wrong topic area of the forum.
    Does anyone else have input on this situation?
     
    kyle422, Feb 27, 2005 IP
  6. exam

    exam Peon

    #6
    And google.com uses a 302 to do it
    :)
     
    exam, Feb 27, 2005 IP
  7. kyle422

    kyle422 Peon

    #7
    Is anybody reading the initial post??? I was asking for help.
     
    kyle422, Feb 27, 2005 IP
  8. ResaleBroker

    ResaleBroker Active Member

    #8
    Kyle,
    I looked at the pages on Google, then went to your site and looked at the individual pages' caches, and didn't see this happening except on your "International" page.

    Are you thinking the "passthru.php" file is creating a problem with Google?
     
    ResaleBroker, Feb 27, 2005 IP
  9. kyle422

    kyle422 Peon

    #9
    YES
    Here are just a few examples, but I have about 90 pages all being cached with the passthru.php. Before implementing the passthru, my index page was cached every day as well. The last cache was on February 18.
    February 27, 2005, 9:24 am
    Googlebot IP Address: 66.249.64.38
    Googlebot Domain: crawl-66-249-64-38.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=palm-bay-florida.htm
    February 27, 2005, 9:42 am
    Googlebot IP Address: 66.249.64.28
    Googlebot Domain: crawl-66-249-64-28.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=merritt-island-florida.htm
    February 27, 2005, 9:48 am
    Googlebot IP Address: 66.249.64.33
    Googlebot Domain: crawl-66-249-64-33.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=home-buyers.htm
    February 27, 2005, 9:49 am
    Googlebot IP Address: 66.249.64.38
    Googlebot Domain: crawl-66-249-64-38.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=suntree-florida.htm
    February 27, 2005, 9:57 am
    Googlebot IP Address: 66.249.71.28
    Googlebot Domain: crawl-66-249-71-28.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=rockledge-florida.htm
    February 27, 2005, 10:09 am
    Googlebot IP Address: 66.249.64.66
    Googlebot Domain: crawl-66-249-64-66.googlebot.com
    Crawler Type: Unknown Crawler
    Url Visited: http://myfloridahomesforsale.com/passthru.php?file=lake-washington-florida.htm
     
    kyle422, Feb 28, 2005 IP
  10. Diamondbacks

    Diamondbacks Peon

    #10
    I can't see how adding a rewrite will cause a problem. :confused:
     
    Diamondbacks, Feb 28, 2005 IP
  11. Diamondbacks

    Diamondbacks Peon

    #11
    Has anyone else had this occur with the Coop?

    Kyle, you might need to start a thread in the Co-op Advertising Network forum asking for help. :confused:
     
    Diamondbacks, Feb 28, 2005 IP
  12. kyle422

    kyle422 Peon

    #12
    I started this in the Google forum area and Shawn moved it here. Oh well. I am hoping someone will see it here and give me some opinions. :)
    Do any super cool and super smart mod_rewrite guys or girls have any feedback? Please.
     
    kyle422, Feb 28, 2005 IP
  13. J.D.

    J.D. Peon

    #13
    Kyle, I can see only one page from your website containing passthru (Google this: inurl:www.myfloridahomesforsale.com/passthru) - international.htm

    Looking at your rewrite ruleset, I don't see how anyone would even see passthru.php - it's completely hidden from the outside world. So, the question is, how did Google find out about passthru.php in the first place? I can only think of a link you had for a short time on your website (like a typo or something) that Google picked up. Another possibility is that there's a link somewhere (like here) that Google will pick up, ending up indexing your internal implementation page.

    J.D.
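
    A note on the ruleset in post #1, offered as a sketch rather than a confirmed diagnosis: the internal rewrite to passthru.php carries no [L] flag, so mod_rewrite keeps processing and the later host redirect can fire on the already rewritten path. For a request to the non-www host, that would send a 301 to www.myfloridahomesforsale.com/passthru.php?file=..., which is one way Googlebot could learn the internal URL. Running the host redirect first and ending the internal rewrite with [L] avoids that. The THE_REQUEST condition on the index.htm rule and the single-pattern form of the last rule are assumptions added here, not part of the original post; the AddHandler and ErrorDocument lines are assumed to stay as posted.

    ```apache
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /

    # 1. Canonical host first: 301 any non-www host to www, then stop.
    RewriteCond %{HTTP_HOST} !^www\.myfloridahomesforsale\.com [NC]
    RewriteRule ^(.*)$ http://www.myfloridahomesforsale.com/$1 [R=301,L]

    # 2. Fold /index.htm into the site root, but only when the client asked
    #    for it directly (assumption: avoids a loop with DirectoryIndex).
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index\.htm[?\s]
    RewriteRule ^index\.htm$ http://www.myfloridahomesforsale.com/ [R=301,L]

    # 3. Last: hand .htm/.html requests to the internal script and stop,
    #    so no later rule can redirect to the rewritten path.
    RewriteRule ^(.+\.html?)$ /passthru.php?file=$1 [NC,L]
    </IfModule>
    ```

    With this ordering, a visitor or crawler is only ever redirected to a clean URL, and passthru.php appears only in internal rewrites.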
     
    J.D., Feb 28, 2005 IP
  14. kyle422

    kyle422 Peon

    #14
    Thanks for the input, J.D. You are correct that the only page cached with the passthru was the international page. In the meantime, I think I worked out a solution with a mod_rewrite change.
    Kyle
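
    The thread does not record what that change was. Separately, for URLs already indexed under passthru.php, a common mod_rewrite idiom is to 301 direct client requests for the script back to the clean page URL, keyed on THE_REQUEST so that internal rewrites are not affected. A minimal sketch, not Kyle's actual fix, assuming file is the first query argument and that these lines sit above the internal rewrite rule in an .htaccess where RewriteEngine is already on:

    ```apache
    # Match only what the client literally requested, e.g.
    # "GET /passthru.php?file=international.htm HTTP/1.1".
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s/+passthru\.php\?file=([^&\s]+)
    # %1 is the captured file name; the trailing "?" drops the query string.
    RewriteRule ^passthru\.php$ http://www.myfloridahomesforsale.com/%1? [R=301,L]
    ```

    Because THE_REQUEST holds the original request line and never changes during internal rewrites, the rule should not fire when passthru.php is reached through the site's own rewrite.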
     
    kyle422, Mar 1, 2005 IP
  15. www.ebro-fishing.com

    www.ebro-fishing.com Guest

    #15
    How long does Google take to crawl new sites and index them? I've heard rumors that sites can be crawled but not listed for months. Is this true?

    Thanks,
    John.
    http://www.ebro-fishing.com
     
    www.ebro-fishing.com, Dec 30, 2005 IP