Will using sitemap mean that G will index duplicate content?

Discussion in 'Google Sitemaps' started by LineOfSight, Dec 21, 2005.

  1. #1
    I run a site which has a 'text only' link in the toolbar. Users can then browse the site dynamically in a 'text only' version changing their viewing preferences to suit.
    I've created my sitemap, but looking through the file I can see links to all the 'normal' pages within the site and also lots of references to the accessibility parser which produces the 'text only' version of the site. The folder this sits in is blocked by the robots.txt file, but I'm concerned that G is going to come along and index a duplicate version of the site in 'text only' mode by following its sitemap.

    Any ideas or thoughts?
     
    LineOfSight, Dec 21, 2005 IP
  2. stephfoster

    stephfoster Well-Known Member

    #2
    I'm pretty sure Google will still obey your robots.txt file.
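
    One way to see which sitemap entries fall under the robots.txt block is to run them through a robots.txt parser. The following is only a rough sketch using Python's standard `urllib.robotparser`; the `/textonly/` folder, the example.com URLs and the `split_sitemap_urls` helper are all made up for illustration, since the thread doesn't give the real paths.

```python
from urllib.robotparser import RobotFileParser


def split_sitemap_urls(robots_txt, urls, user_agent="Googlebot"):
    """Split URLs into (allowed, blocked) according to robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, blocked = [], []
    for url in urls:
        if parser.can_fetch(user_agent, url):
            allowed.append(url)
        else:
            blocked.append(url)
    return allowed, blocked


# Hypothetical robots.txt and sitemap entries (not the poster's real ones).
ROBOTS_TXT = """\
User-agent: *
Disallow: /textonly/
"""

SITEMAP_URLS = [
    "http://www.example.com/index.html",
    "http://www.example.com/textonly/index.html",
]

allowed, blocked = split_sitemap_urls(ROBOTS_TXT, SITEMAP_URLS)
print("keep in sitemap:", allowed)
print("blocked by robots.txt:", blocked)
```

    Anything that ends up in the blocked list could simply be left out of the sitemap file, so the sitemap and robots.txt don't contradict each other.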
     
    stephfoster, Dec 21, 2005 IP
  3. LineOfSight

    LineOfSight The Doctor

    #3
    So even though the path appears in the G sitemap, when it comes to indexing it should ignore the 'text only' links?
     
    LineOfSight, Dec 21, 2005 IP
  4. mdvaldosta

    mdvaldosta Peon

    #4
    Put nofollow tags on the links pointing to whichever version you don't want seen, and put noindex in the head of the pages you don't want looked at. Those, along with the robots.txt, should ensure the pages don't get cached.
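
    As a rough illustration of the nofollow part, here is a small Python sketch that adds rel="nofollow" to links pointing into the text-only folder. The `/textonly/` prefix and the `add_nofollow` helper are assumptions for the example, and the regex only handles anchors where href is the first attribute, so treat it as a starting point rather than a complete HTML rewriter.

```python
import re


def add_nofollow(html, blocked_prefix="/textonly/"):
    """Add rel="nofollow" to <a href="..."> links under blocked_prefix.

    Only matches anchors whose first attribute is a double-quoted href.
    """
    pattern = re.compile(
        r'<a\s+href="(' + re.escape(blocked_prefix) + r'[^"]*)"'
    )
    return pattern.sub(r'<a rel="nofollow" href="\1"', html)


# Hypothetical toolbar markup with one text-only link and one normal link.
toolbar = (
    '<a href="/textonly/about">Text only</a> '
    '<a href="/about">About</a>'
)
print(add_nofollow(toolbar))
```

    Only the first link is changed; links to the normal pages are left alone.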
     
    mdvaldosta, Dec 21, 2005 IP
  5. LineOfSight

    LineOfSight The Doctor

    #5
    In an ideal world this is what I'd do. Unfortunately, the 'text only' option is a dynamic interpretation of the static page, stripping out all the HTML and presenting the page back in an accessible format. However, thinking it through, there is no reason why those tags cannot be inserted during the page creation process. Thanks.
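
    Inserting the tag during page creation might look something like the sketch below. The thread doesn't say how the accessibility parser is built, so `render_text_only` and its arguments are purely hypothetical; the point is just that the robots meta tag is written into the head every time a text-only page is generated. Bear in mind that a crawler can only see a noindex tag on a page it is allowed to fetch.

```python
from html import escape

# Standard robots meta tag asking crawlers not to index the page
# or follow its links.
NOINDEX_META = '<meta name="robots" content="noindex, nofollow">'


def render_text_only(title, paragraphs):
    """Build a minimal text-only page with the noindex tag in its head."""
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f"{NOINDEX_META}\n"
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>"
    )


page = render_text_only("About us", ["Hello & welcome to the site."])
print(page)
```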
     
    LineOfSight, Dec 21, 2005 IP