Hey all, I have redesigned my site for Google's sake, but I have a problem. It is a music website with two columns: the left one lists the artists and the right one lists the songs of the selected artist. Across the top there is an A-Z alphabet. Clicking a letter keeps the selected artist but changes the letter and the left column. For example, if I was on www.domain.com/E/Eminem and clicked on A, the URL would become www.domain.com/A/Eminem. This brings me to my question: if Google crawled the site, it would find 26 different URLs for the same content. Is there any way to have the pages indexed via the sitemap only, without being crawled?
I think you should block the duplicate content with robots.txt. That will protect you from duplicate content problems.
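As a rough sketch of what those rules might look like, assuming the /LETTER/Artist URL pattern from the question and that the correct letter is the artist's first letter. The artist names and the `disallow_rules` helper are made up for illustration, and the check uses Python's standard `urllib.robotparser`, which only does plain prefix matching:

```python
import string
from urllib.robotparser import RobotFileParser


def disallow_rules(artists):
    """Build robots.txt lines that block every /LETTER/Artist URL
    whose letter is not the artist's own first letter (assumed canonical)."""
    lines = ["User-agent: *"]
    for artist in artists:
        correct = artist[0].upper()
        for letter in string.ascii_uppercase:
            if letter != correct:
                lines.append(f"Disallow: /{letter}/{artist}")
    return lines


# Hypothetical artist list; a real site would read this from its database.
rules = disallow_rules(["Eminem", "Adele"])

parser = RobotFileParser()
parser.parse(rules)

blocked = parser.can_fetch("*", "http://www.domain.com/A/Eminem")
allowed = parser.can_fetch("*", "http://www.domain.com/E/Eminem")
print(len(rules) - 1, blocked, allowed)
```

Note that this produces 25 Disallow lines per artist, so the file grows quickly as the catalogue grows.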
Fix the script itself. The same page really shouldn't be available at both locations; that points to poor coding and a promiscuous URL rewrite. Is it returning a 404 header when a page is not found at all, or is your script telling the spiders that the page exists when it actually doesn't? The root of the problem appears to be in the script itself. Fixing the script so that it simply works properly in the first place would save you days of writing robots.txt rules for every incorrect combination, and would save you from an over-bloated robots.txt file.
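A minimal sketch of the kind of check the script could make, not the actual site code. It assumes the canonical letter is the artist's first letter; `KNOWN_ARTISTS` and `resolve` are hypothetical names. Here a mismatched letter gets a 301 redirect to the one correct URL; returning 404 for mismatches would be the stricter alternative:

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the site's artist table.
KNOWN_ARTISTS = {"Eminem", "Adele"}


def resolve(letter: str, artist: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect location) for a /LETTER/Artist request."""
    if artist not in KNOWN_ARTISTS:
        # The page does not exist, so say so instead of rendering something.
        return 404, None
    canonical = artist[0].upper()
    if letter != canonical:
        # Wrong letter for this artist: send crawlers to the single valid URL.
        return 301, f"/{canonical}/{artist}"
    return 200, None


print(resolve("E", "Eminem"))
print(resolve("A", "Eminem"))
print(resolve("E", "Nobody"))
```

With something like this in place, only one URL per artist ever answers with 200, so there is nothing left to block in robots.txt. The A-Z links would then need to point at letter-only pages rather than carrying the selected artist along.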