I have old duplicate pages that are no longer linked from anywhere on the site, but they are still indexed because the site runs on a script and the old URLs still work. Now that I've made changes, this is causing duplicate content: the same pages are being indexed under slightly different URLs. Will the old pages ever be de-indexed, given that nothing on the site links to them anymore? Will I be penalized?
Can you work with this? http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
As I understand it, you won't be penalized, but they won't index the pages. WordPress blogs are actually set up so that the same content can show up at different URLs within a blog. It's best if you remove those pages.
Thank you joebert, that helped a lot. My one worry is that I can't remove these duplicate pages with different URLs. They will always be active; the site no longer links to them, but Google still uses them. Will they EVER go away? I mean, eventually they should be de-indexed, since the PR flow will only go to the pages the site still links to, and those will take over? What do you think will happen?
From Google Webmaster Central Blog: "We (Google) now support a format that allows you to publicly specify your preferred version of a URL. If your site has identical or vastly similar content that's accessible through multiple URLs, this format provides you with more control over the URL returned in search results. It also helps to make sure that properties such as link popularity are consolidated to your preferred version."
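For illustration, the format that post describes is a `<link rel="canonical">` element placed in the page's `<head>`. Below is a minimal sketch of how a script might emit it; the function name `canonical_link_tag` and the example URL are hypothetical and not taken from anyone's actual site.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>.

    preferred_url is the one URL you want search engines to treat as
    the main version of the page (hypothetical input for this sketch).
    """
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/page.php?id=7"))
```

Every duplicate URL that serves the same content would output the same tag, pointing at the one preferred URL.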
I saw that, but I'm using a script and don't think I can make the correct URL appear in this new meta trick. In my case, the script would generate the URL from whatever page you're on. What you are suggesting is usually good for PHP scripts that add SESSIDs to URLs, which Google can pick up on. Thank you though!
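One possible way around the "URL from whatever page you're on" problem, sketched under assumptions: instead of echoing the requested URL, the script could reduce it to a single preferred form first, for example by dropping session parameters and putting the remaining query parameters in a fixed order. The parameter names in `IGNORED_PARAMS` and the function name `preferred_url` are hypothetical; which differences actually separate the old URLs from the new ones depends on the script in question.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that do not change the page content.
IGNORED_PARAMS = {"PHPSESSID", "sid"}


def preferred_url(requested_url: str) -> str:
    """Reduce a requested URL to one preferred form (a sketch, not a full normalizer)."""
    parts = urlsplit(requested_url)
    # Keep only the parameters that matter, in sorted order, so that
    # ?b=2&a=1 and ?a=1&b=2 map to the same URL.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    # Lowercase the host and drop any #fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(preferred_url("http://Example.com/page.php?PHPSESSID=abc&id=7"))
```

The result could then be used as the `href` of the canonical link, so that every variant of a page reports the same preferred URL.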