Server returns 200 instead of 404 for non-existent URLs

Discussion in 'Apache' started by PokerBonus, Jul 30, 2011.

    I have a problem with one of my sites. If someone requests one of my pages in a subfolder with something extra appended to the URL, such as another file name or even a new folder, the server returns the original page with a 200 status.

    For example

    www.example.com/folder/thepage.html

    is correct, but if someone types in

    www.example.com/folder/thepage.html/anotherpage.html
    or
    www.example.com/folder/thepage.html?someweirdcode

    it returns a copy of the original 'thepage'. Where the errant URL contains an extra '/', the copy is served without its SSI includes, because the page is effectively in a different folder and the relative paths break.
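
    To make the behaviour concrete, here is a rough Python sketch of what the server appears to be doing, and of the stricter behaviour I am after. It is only an illustration, not how Apache is actually implemented: the function names `split_path_info` and `expected_status` and the `existing_files` set are made up for this example. Note that a query string never changes which file is matched, so a 200 for the `?someweirdcode` URL is ordinary static-file behaviour.

```python
from urllib.parse import urlsplit


def split_path_info(url, existing_files):
    """Split a URL into (matched_file, trailing_path, query).

    existing_files is a hypothetical set of paths that really exist on the site.
    The longest leading part of the path that matches a real file wins, and
    whatever follows it is returned as the trailing path.
    """
    # urlsplit needs a scheme to recognise the host name, so add one if missing.
    parts = urlsplit(url if "://" in url else "http://" + url)
    segments = parts.path.split("/")

    # Try the full path first, then drop one trailing segment at a time.
    for end in range(len(segments), 0, -1):
        candidate = "/".join(segments[:end])
        if candidate in existing_files:
            rest = "/".join(segments[end:])
            trailing = "/" + rest if rest else ""
            return candidate, trailing, parts.query

    # Nothing on the site matches any leading part of the path.
    return None, parts.path, parts.query


def expected_status(url, existing_files):
    """Status code wanted from the server: 404 whenever extra path is appended."""
    matched, trailing, _query = split_path_info(url, existing_files)
    if matched is None or trailing:
        return 404
    return 200


if __name__ == "__main__":
    site = {"/folder/thepage.html"}
    for u in (
        "www.example.com/folder/thepage.html",
        "www.example.com/folder/thepage.html/anotherpage.html",
        "www.example.com/folder/thepage.html?someweirdcode",
    ):
        print(u, split_path_info(u, site), expected_status(u, site))
```

    In this sketch the second URL matches `/folder/thepage.html` with `/anotherpage.html` left over, which is the case where I want a 404 instead of a copy of the page.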

    How can I force the server to return a 404 error when a non-existent URL is requested, rather than effectively making the page up?

    This is causing significant problems with search engines, which are treating these pages as duplicates of the original. Some of these ghost pages have now been cached, and Google is crawling them repeatedly. I believe these URLs stem from scraper sites, as there are so many of them. I have had a sharp decrease in traffic recently, and I think Google is penalizing the site for duplicate content.

    Can anyone point me in the right direction to sort this out?
     