mod_rewrite to avoid duplicate content

Discussion in 'Apache' started by web 24 7, Dec 5, 2005.

  1. #1
    Can someone help me? Google is counting a poll page as a duplicate page:

    http://www.domain.com/folder/page1.html?action=results&poll_ident=4

    http://www.domain.com/folder/page1.html

    http://www.domain.com/folder/page2.html?action=results&poll_ident=4

    http://www.domain.com/folder/page2.html

    I don't even have the poll on my site anymore, so I am really confused about how Google still follows it.

    How can I permanently redirect the dynamic page to the normal static page?

    Do I have to do it for all these pages, or is there a command that would cover anything with ?action=results&poll_ident=4 and redirect to the index.html?

    Thanks
     
    web 24 7, Dec 5, 2005 IP
  2. Alexander

    Alexander Peon

    Messages:
    13
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Maybe this comes from the previous domain owner?
    You can check with the Wayback Machine at www.archive.org.
     
    Alexander, Dec 6, 2005 IP
  3. web 24 7

    web 24 7 Peon

    Messages:
    313
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Hi,

    No, I used to have a poll on my pages, but viewing the results reloaded the page with very little change (only the poll results). I was concerned this would be seen as a duplicate page, so I removed the poll.

    Now, three months later, Google still indexes it and I don't get why, but it now counts as a duplicate page.

    ******.html?action=results&poll_ident=4

    I would like to 301 redirect any page with ?action=results&poll_ident=4

    back to the .html page, which would probably halve the number of indexed pages and remove the duplicate content.

    Any redirect or .htaccess help on this?
     
    web 24 7, Dec 6, 2005 IP
  4. Alexander

    #4
    Sorry, I misread 'anymore' as 'never'; English is not my native language.

    As for Google: you can request any static blablabla.html page with any parameters you can imagine, and the result will be the same if the page doesn't use the parameters. So Google keeps following the old links and gets a valid result, so why would it drop them? If you really think the duplicates are bad, you need a 'moved permanently' (301) redirect for these requests.
     
    Alexander, Dec 7, 2005 IP
  5. web 24 7

    #5
    Yes, I need to know how to permanently redirect these dynamic pages, which are just duplicates of my static HTML pages.

    So any page with ?action=results&poll_ident=4 (there is no longer a link to them, but Google still requests them and indexes them) should be permanently redirected to its .html page.

    I need to set up a rule in the .htaccess that redirects every 'poll' page to its actual page.

    domain.com/folder/page1.html?action=results&poll_ident=4

    domain.com/folder/page1.html

    Any help?
     
    web 24 7, Dec 7, 2005 IP
  6. Nintendo

    Nintendo ♬ King of da Wackos ♬

    Messages:
    12,890
    Likes Received:
    1,064
    Best Answers:
    0
    Trophy Points:
    430
    #6
    Does each poll have its own page#.html file, or is the poll_ident=4 part the only part that changes?
     
    Nintendo, Dec 7, 2005 IP
  7. web 24 7

    #7
    Hi,

    Yeah, what I had was a poll in my footer, which was a PHP include.

    So most .html pages got this extra part added to the URL only when the link to the poll results was followed. Of course the engines crawl it and they never forget!

    So now there is no poll and no link to follow, but I see the Google robot still requests it and records it as crawled. So it is an exact duplicate of the .html page but with the stuff at the end.

    I am hoping to modify my .htaccess to say: if anything with this query string is requested, 301 redirect to the .html-only version.

    Is that clear or do you want to see the site?
     
    web 24 7, Dec 8, 2005 IP
  8. Nintendo

    #8
    If it's from

    domain.com/folder/page1.html?action=results&poll_ident=4
    to
    domain.com/folder/page1.html

    with the 1 being what changes...

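    Options +Indexes
    Options +FollowSymLinks
    RewriteEngine On
    RewriteBase /
    # RewriteRule only sees the path, so the query string is tested in a RewriteCond
    RewriteCond %{QUERY_STRING} ^action=results&poll_ident=4$
    # The trailing ? on the target drops the old query string from the redirect
    RewriteRule ^folder/page([^.]+)\.html$ http://www.domain.com/folder/page$1.html? [R=301,L]

    If the 4 in the URL changes and isn't in the new URL, change the 4 to [0-9]+.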
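
mod_rewrite matches a RewriteRule pattern against the URL path only; the query string is available separately as %{QUERY_STRING}. The following is a minimal Python sketch of that split, not Apache itself: it uses Python's `re` module as a stand-in for Apache's regex engine, and the variable names are made up for illustration.

```python
import re
from urllib.parse import urlsplit

url = "http://www.domain.com/folder/page1.html?action=results&poll_ident=4"
parts = urlsplit(url)

# In a root .htaccess, the rule pattern is tested against the path
# without its leading slash; the query string is kept separately.
path = parts.path.lstrip("/")   # folder/page1.html
query = parts.query             # action=results&poll_ident=4

# One pattern that tries to cover path and query string together.
combined = re.compile(r"^folder/page([^.]+).html?action=results&poll_ident=4$")
# Separate patterns for the path and for the query string.
path_only = re.compile(r"^folder/page([^.]+)\.html$")
query_only = re.compile(r"^action=results&poll_ident=4$")

combined_matches = combined.search(path) is not None
split_matches = (path_only.search(path) is not None
                 and query_only.search(query) is not None)

print("path:", path)
print("query:", query)
print("combined pattern matches the path:", combined_matches)
print("separate path and query patterns match:", split_matches)
```

The combined pattern cannot match because the text after `.html` never reaches the rule, which is why the query string test belongs in its own condition.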
     
    Nintendo, Dec 8, 2005 IP
  9. web 24 7

    #9
    Great, this is on track.

    What if the whole page.html changes, not just the 1 in page1? For example:

    story.html
    article.html
    sales.html

    all getting the same poll suffix.

    And what if the folder names differ as well?

    This situation applies to all pages across all folders.

    None are in the main directory; all are one subfolder deep.
     
    web 24 7, Dec 8, 2005 IP
  10. Nintendo

    #10
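    You don't need a line for every single URL if the file name is captured as a group. RewriteRule only matches the path, so the query string goes in a RewriteCond, and the trailing ? on the target drops it from the redirect.

    RewriteCond %{QUERY_STRING} ^action=results&poll_ident=4$
    RewriteRule ^folder/([^/.]+)\.html$ http://www.domain.com/folder/$1.html? [R=301,L]

    For folders...

    RewriteCond %{QUERY_STRING} ^action=results&poll_ident=4$
    RewriteRule ^([^/]+)/([^/.]+)\.html$ http://www.domain.com/$1/$2.html? [R=301,L]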
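
Before putting a catch-all rule on a live site, it can help to check which URLs it would redirect. Here is a rough Python sketch of the intended behavior for pages one folder deep. It is only a simulation: the function name `redirect_target` and the two pattern names are hypothetical, and it assumes any numeric poll_ident should be redirected.

```python
import re
from urllib.parse import urlsplit

# Hypothetical stand-ins for a query string condition and a path rule.
POLL_QUERY = re.compile(r"^action=results&poll_ident=[0-9]+$")
ONE_FOLDER_PAGE = re.compile(r"^([^/]+)/([^/.]+)\.html$")


def redirect_target(url):
    """Return the static URL a poll URL should 301 to, or None if it is left alone."""
    parts = urlsplit(url)
    # Only requests carrying the poll query string are redirected.
    if not POLL_QUERY.search(parts.query):
        return None
    # Only pages exactly one folder deep are covered.
    match = ONE_FOLDER_PAGE.search(parts.path.lstrip("/"))
    if match is None:
        return None
    folder, page = match.groups()
    # Rebuild the URL without the query string.
    return f"{parts.scheme}://{parts.netloc}/{folder}/{page}.html"


for url in [
    "http://www.domain.com/folder/page1.html?action=results&poll_ident=4",
    "http://www.domain.com/news/story.html?action=results&poll_ident=4",
    "http://www.domain.com/folder/page1.html",
]:
    print(url, "->", redirect_target(url))
```

A URL without the poll query string returns None, which corresponds to the static page being served as usual with no redirect loop.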
     
    Nintendo, Dec 8, 2005 IP
  11. web 24 7

    #11
    OK, thanks. Looks like it's a complicated problem.
     
    web 24 7, Dec 9, 2005 IP
  12. xponse

    xponse Peon

    Messages:
    93
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    0
    #12
    xponse, Dec 9, 2005 IP
  13. Nintendo

    #13
    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^sports\-categoryid\-([^.]+)\-productid\-([^.]+)\.html$ sports.html?categoryid=$1&productid=$2 [L]
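
To see what this kind of rule does, here is a small Python sketch of the same mapping from a static-looking file name to the real query string. It only imitates the pattern with Python's `re` module; the function name `internal_target` and the sample IDs 3 and 17 are made up for illustration.

```python
import re

# Same shape as the rule pattern: two captured values inside a flat file name.
PRETTY = re.compile(r"^sports-categoryid-([^.]+)-productid-([^.]+)\.html$")


def internal_target(path):
    """Map a static-looking path to the script URL it stands for, or None."""
    match = PRETTY.search(path)
    if match is None:
        return None
    category_id, product_id = match.groups()
    return f"sports.html?categoryid={category_id}&productid={product_id}"


print(internal_target("sports-categoryid-3-productid-17.html"))
print(internal_target("sports.html"))
```

Because this rule has no R flag, the mapping happens inside the server and the visitor keeps seeing the static-looking URL.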
     
    Nintendo, Dec 9, 2005 IP
  14. xponse

    #14
    Thanks Nintendo,

    I changed the code as per your advice, but the problem is not solved. :confused:

    I am trying to find a solution. If you have any other suggestions, please advise.
     
    xponse, Dec 9, 2005 IP
  15. Nintendo

    #15
    Nintendo, Dec 10, 2005 IP