Hello all. In Google Webmaster Tools, for a relatively large site (over 300K pages), I get a 404 "not found" error for hundreds of pages. The URLs that return the 404 have capital letters in them: domain/Folder/Page/Category. This page doesn't get crawled or indexed by Google, even though it has over 60 incoming links. The interesting part is that the same page, written in lower case, is crawled and indexed: domain/folder/page/category. There is no duplicate-content issue here, since only one of the two versions exists in Google. The tests I've run include the cache, inurl, and quoted "text" search commands. The site is hosted on a Windows server; it used to be ASP and is now ASPX. Any ideas? I haven't found any reference to this error before.
Because the folder names are case sensitive. Did you create the subfolders in lower case or upper case?
There might be some pages on your site that link to the capitalized version. This happened to me as well; I removed the caps from the linking pages and it was all OK.
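If it helps to hunt those links down, here is a rough sketch in Python that lists every link in a page whose path contains a capital letter. The names `LinkCollector` and `find_mixed_case_links` are made up for this example, and the sample HTML is only an illustration; you would feed it the source of your own pages.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_mixed_case_links(html):
    """Return the hrefs whose URL path contains an upper-case letter."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        path = urlparse(href).path
        # Only the path is checked; host names are case-insensitive anyway.
        if path != path.lower():
            flagged.append(href)
    return flagged


# Illustrative input only, modelled on the URLs in the question.
sample = (
    '<a href="/Folder/Page/Category">caps</a> '
    '<a href="/folder/page/category">lower</a>'
)
print(find_mixed_case_links(sample))
```

Only the path part of each link is compared, so a capitalized host name alone is not flagged.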
While you figure out what's going on, you can always create symbolic links from the missing folders to the existing folders, if you are on a Linux/UNIX box.
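A minimal sketch of that stopgap in Python, assuming a Linux/UNIX box with a case-sensitive filesystem. The helper name `link_missing_case` and the folder names are hypothetical, and the demo runs in a temporary directory rather than a real web root.

```python
import os
import tempfile


def link_missing_case(parent, existing_name, missing_name):
    """Make parent/missing_name a symlink to parent/existing_name."""
    target = os.path.join(parent, existing_name)
    link = os.path.join(parent, missing_name)
    # Skip if something already answers to that name (for example on a
    # case-insensitive filesystem, where the folder already matches).
    if not os.path.lexists(link):
        os.symlink(target, link, target_is_directory=True)
    return link


# Demo in a throwaway directory: "folder" exists, "Folder" is requested.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "folder"))
    link = link_missing_case(root, "folder", "Folder")
    print(os.path.isdir(link))
```

This only papers over the problem for one folder at a time, so it is best treated as temporary until the links themselves are fixed.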