I'm having a problem where URLs with extra text after a trailing slash are loading real pages. The real page is www.somewhere.com/nothere.html, but somehow the search engines picked up a bad link and are indexing URLs like www.somewhere.com/nothere.html/whatever. That URL loads the nothere.html page instead of returning a 404 Not Found error, and it is causing duplicate content issues on the site. I would like to turn this feature off, but I am using Apache 1.3.37, and the AcceptPathInfo directive is only available in 2.0 and later. How can I stop this from happening in Apache 1.3.37? I would like Apache to return 404 Not Found instead of loading the page. I'll put up a link to your site for a solid solution (clean links only: no gambling, porn, etc.). Thanks!
Create a robots.txt file and put this in it:

    User-agent: *
    Disallow: /nothere.html/whatever

Or, if you have mod_rewrite installed, you could use a rule like this:

    RewriteRule ^nothere\.html/whatever$ http://www.somewhere.com/nothere.html [R=301,L]
The mod_rewrite rule is probably the better option, because it sends both visitors and search engine bots to the correct location with a permanent (301) redirect.
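If an error response is preferred over a redirect, as the original question asked, one option is mod_rewrite's G flag, which answers with 410 Gone. The following is only a sketch: it assumes a .htaccess file in the document root, and it returns 410 rather than 404 because, as far as I know, the R flag in 1.3 only accepts 3xx status codes.

```apache
RewriteEngine On
# Any request that continues past "nothere.html/" gets 410 Gone
# instead of the page content. "-" means "no substitution".
RewriteRule ^nothere\.html/ - [G]
```

Search engines generally treat 410 much like 404 for the purpose of dropping a URL, but the redirect has the advantage of passing visitors on to the real page.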
Thanks, but this needs to be a global solution, since the HTML pages, and the text that comes after the real page name, vary greatly.
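A site-wide version of the redirect could be sketched with a capture group instead of a fixed page name. This assumes the rule lives in a .htaccess file in the document root, that every real page ends in .html, and that www.somewhere.com is the canonical host; none of that is confirmed in the thread.

```apache
RewriteEngine On
# $1 captures everything up to and including ".html";
# whatever follows the next slash is dropped by the redirect.
RewriteRule ^(.+\.html)/ http://www.somewhere.com/$1 [R=301,L]
```

With this in place, a request for /nothere.html/whatever should be redirected to http://www.somewhere.com/nothere.html. If the rule goes in httpd.conf rather than .htaccess, the pattern would need a leading slash: ^/(.+\.html)/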
Thanks amnezia, this seems to work very well. Can you make an adjustment so that it does not depend on the .html extension? Please PM me with the link.
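One way to drop the dependence on .html might be to match any file extension followed by a slash. This is a rough sketch under the same assumptions as before (.htaccess in the document root, www.somewhere.com as the canonical host), plus one more: that no real directory on the site has a dot in its name.

```apache
RewriteEngine On
# $1 captures a path ending in ".<extension>" (letters and digits);
# anything after the following slash is dropped by the redirect.
RewriteRule ^(.*\.[A-Za-z0-9]+)/ http://www.somewhere.com/$1 [R=301,L]
```

A request such as /nothere.php/whatever should then be redirected to /nothere.php. The trade-off is that a real directory like /docs.v2/ would also match and be redirected wrongly, so the pattern would need tightening on a site that has such directories.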