Robots File

Erind Peon

Messages: 663

Likes Received: 15

Best Answers: 0

Trophy Points: 0

#1

Hey All,

For the sake of Google, I have redesigned my site, but I have run into a problem.

I have a music website with two columns. The left column lists the artists and the right column lists the songs of the selected artist. Across the top there is an A-Z alphabet bar. Clicking a letter keeps the selected artist but changes the letter and the left column. For example, if I was on www.domain.com/E/Eminem and clicked on A, the URL would become www.domain.com/A/Eminem. This brings me to my question:

If Google crawled the site, it would find 26 different URLs for the same content. Is there any way to get the pages indexed via the sitemap only, without them being crawled?

Erind, May 22, 2007

trichnosis Prominent Member

Messages: 13,785

Likes Received: 333

Best Answers: 0

Trophy Points: 300

#2

I think you must block the duplicate content with robots.txt. It will protect you from duplicate content problems.
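A rough sketch of what the robots.txt approach could look like, assuming the URL scheme is /<letter>/<artist> as in the first post. The artist list and the `build_robots_txt` helper are made up for illustration; the real site would pull artists from its own data. It generates one Disallow rule for every letter that is not the artist's initial, then checks the result with Python's standard `urllib.robotparser`.

```python
from string import ascii_uppercase
from urllib.robotparser import RobotFileParser

# Hypothetical artist list; the real site would read this from its database.
ARTISTS = ["Eminem", "Adele"]


def build_robots_txt(artists):
    """Disallow every /<letter>/<artist> URL whose letter is not the artist's initial."""
    lines = ["User-agent: *"]
    for artist in artists:
        for letter in ascii_uppercase:
            if letter != artist[0].upper():
                lines.append(f"Disallow: /{letter}/{artist}")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(ARTISTS)

# Parse the generated rules and test two URLs from the original question.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("*", "http://www.domain.com/E/Eminem"))  # the one real URL
print(parser.can_fetch("*", "http://www.domain.com/A/Eminem"))  # a duplicate URL
print(robots_txt.count("Disallow:"), "rules for", len(ARTISTS), "artists")
```

Note that this needs 25 rules per artist, so the file grows quickly as the catalogue grows.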

trichnosis, May 24, 2007

MaxPowers Well-Known Member

Messages: 264

Likes Received: 5

Best Answers: 1

Trophy Points: 120

#3

Fix the script itself. The page really shouldn't be available at both locations; the fact that it is points to poor coding and a promiscuous URL rewrite. Is it returning 404 headers when the page is not found at all, or is your script telling the spiders that the page exists when it actually doesn't?

This appears to be the root of the problem... in your script itself. Fixing the script so that it simply works properly in the first place would save days of writing robots.txt rules for every possible incorrect combination, and would save you from an over-bloated robots.txt file.
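As a minimal sketch of what fixing the script might mean, assuming URLs of the form /<letter>/<artist>: the function below treats the artist's own initial as the only valid letter. The `resolve` function and `KNOWN_ARTISTS` set are hypothetical names, and answering a mismatched letter with a 301 redirect (rather than a 404) is just one possible choice, not something the site is known to do.

```python
# Hypothetical set of artists; the real script would query its own data store.
KNOWN_ARTISTS = {"Eminem", "Adele"}


def resolve(path):
    """Return (status, location) for a request path of the form /<letter>/<artist>."""
    parts = path.strip("/").split("/")
    # Anything that is not exactly /<letter>/<known artist> does not exist.
    if len(parts) != 2 or parts[1] not in KNOWN_ARTISTS:
        return 404, None
    artist = parts[1]
    canonical = f"/{artist[0].upper()}/{artist}"
    # A wrong letter (or stray slash) is sent to the single canonical URL.
    if path != canonical:
        return 301, canonical
    return 200, None


for requested in ("/E/Eminem", "/A/Eminem", "/E/Nobody"):
    print(requested, resolve(requested))
```

With something like this in place, the A-Z bar could link to a plain letter page and each artist would only ever be served from one URL, so no per-combination robots.txt rules would be needed.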

MaxPowers, Jun 6, 2007
