My question is, for those in the know, what does Google consider "duplicate content"? For example, I'm converting one of my old sites to a CMS, which requires me to painstakingly move my content from static HTML pages over to their PHP counterparts. This leaves me with two nearly identical pages of content (the original and the PHP version) on my site. I haven't deleted the old HTML pages yet, and in truth, I don't want to, as they're still getting good traffic from Google. Would Google consider two versions of the same content on one site to be duplicate content? In all, I would say there are about 1600 pages. Give me your opinions.
Yeah. What you should do is put a 301 (permanent) redirect from each HTML page to its PHP page. Google will eventually pick up on this.
I'm sure you'll find an answer here: http://forums.digitalpoint.com/forumdisplay.php?f=49 Maybe you should ask the member NINTENDO there. He is very helpful.
Okay, the 301 redirect seems to be the way to go. But I'm still not getting an answer to my question: Does THIS count as duplicate content and risk a penalty? Having two copies of the same file (1600 of them) on one site?
Interesting... I have the same issue with one of my sites. However, I added the 301 redirects to the .htaccess file a couple of days ago, and now all the "outdated" .htm files have vanished from the Google SERPs (as confirmed by a Google search for site:thedomaininquestion.zzz). I was, just like you, worrying about a duplicate content penalty. It seems like I was punished for the redirects though, as the site dropped from the top 10 to the 11-20 results page (page 2 with default settings). D'oh! On the other hand, that could be a consequence of the algorithm or PR updates that are the topic of several recent posts here (kind of plausible, as the PR should be transferred to the pages that the 301s point to).
Yes, it is. Definitely do a redirect with .htaccess from the old pages to the new ones. If you leave it as it is, the new pages will ultimately go into the Supplemental results and you will never get traffic to them.
Yeah, I'm doing re-directs now. So the question now is: How long to leave the re-directs? Remember, I have over 1600 pages to move... And what do I do with the old pages? Delete them? I guess I should delete them since they're no longer necessary...
You are doing the right thing; redirects are the only way! But don't delete the old pages too soon. Let the new pages get indexed and the old ones get de-indexed, and only then delete the old ones. Wait a few months, as these things take time to happen.
Also try to update all internal links that point to the .htm pages so that they point to the new pages. Did you find a pattern for the redirection, or did you have to write 1600 rules?
Are you keeping the file names the same and just changing the extension from .htm to .php? If so, redirecting 1600 pages with a 301 would not be a problem at all. And if you use a 301, there wouldn't be any duplicate content, since the old URLs would no longer serve a copy of the page.
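For what it's worth, if only the extension changes, a single pattern rule along these lines is one way to cover all 1600 pages. Treat it as a rough sketch, not a drop-in config: it assumes Apache with mod_rewrite enabled, an .htaccess file in the site root, and old pages ending in .htm or .html. Try it on a handful of URLs before trusting it with the whole site.

```apache
# Assumption: Apache with mod_rewrite enabled, and this file sits in the site root.
RewriteEngine On

# Match any requested path ending in .htm or .html and send a permanent (301)
# redirect to the same path with a .php extension.
# Assumption: file names are unchanged apart from the extension.
RewriteRule ^(.+)\.html?$ /$1.php [R=301,L]
```

Any page whose name changed during the move would not fit the pattern and would need its own rule placed above the general one, for example `RewriteRule ^old-name\.htm$ /new-name.php [R=301,L]` (hypothetical file names).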
Just leave the redirects in place for quite some time. How long it takes Google to update its index depends on page strength: your stronger pages, such as ones with a direct link from the index page, can update within 24 hours, while your deep sub-pages may take a few weeks. I hope you managed to find one rule to rewrite all your pages.