double and triple forward slashes in Googlebot requests

Hello,

I am getting some odd crawls from Googlebot. It is requesting URLs like domain.tld//folder/file.html and domain.tld///folder/file.html.

I think the cause is older files where I used relative links (../file.html and ../folder/file.html) instead of absolute ones. I am fixing those as fast as possible.

But in the meantime, I need to either stop Googlebot from crawling these extra-forward-slash URLs or 301 them back to the correct location.

Here's why: http://www.google.com/search?q=allinurl:www.netmidwest.com

If you go to the second page and click on the omitted results link, things look a little better, but it is obvious I am being penalized for it as we speak.

My own attempts at mod_rewrite failed. Searches in other forums (there is some talk at WMW) did not seem to fit my needs. At least one guy got left hanging. One disallowed the ///files.html in robots.txt, but I am not sure how Google would see that, especially since I have not seen this behavior before.

I don't want to go into httpd.conf and risk messing up other sites... things were fine until recently. It seems Google has changed something in the crawl.

I am using a standard rewrite for non-www to www:

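RewriteEngine on
Options +FollowSymLinks
<IfModule mod_rewrite.c>
# Only act when a Host header is present...
RewriteCond %{HTTP_HOST} .
# ...and it is not already the www host
RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
# Permanently redirect to the same path on the www host
RewriteRule ^(.*) http://www.domain.com/$1 [R=301,L]
</IfModule>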

and it seems as though I should be able to add something easy to it to stop this.
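
For illustration, the kind of addition meant here might look like the sketch below. It is untested against this site and rests on two assumptions: that the rules live in the site's root .htaccess (not httpd.conf), and that in that per-directory context the path RewriteRule captures already has repeated slashes merged into one, which is the commonly reported behaviour. www.domain.com is the same placeholder host as in the rule above.

```apache
# Hypothetical addition, placed before the non-www rule.
# THE_REQUEST holds the raw request line, e.g.
#   GET //folder/file.html HTTP/1.1
# The condition matches only when the path part (before any "?")
# contains two or more consecutive slashes.
RewriteCond %{THE_REQUEST} \s[^?\s]*//
# Redirect to the path as RewriteRule sees it (assumed slash-merged),
# so //folder/file.html would become /folder/file.html.
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
```

If that assumption about merged slashes does not hold on a given server, the redirect target would still contain the extra slashes and the rule would loop, so it should be tried on a single test URL first.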

Anyone have a solution?


NetMidWest Peon
