My site used to rank #1 for lots of good keywords. Over the past 18 months, all my pages dropped out of the search index, and now only my homepage ranks (mostly for my site name). I have narrowed the reason down to duplicate content. I have placed a meta nofollow tag on thousands of pages, leaving only the top 15 pages of content on my site. I want to test whether duplicate content is really what is affecting my site. How long does it usually take for the duplicate content penalty to be removed?
Duplicate content is not applied as a penalty, as many people think. It used to be, but it has changed quite a bit. Here is a good article that explains how duplicate content is currently treated: http://www.associatedcontent.com/article/1248069/myths_about_google_duplicate_content.html?cat=15
I had a URL canonicalization issue with my Joomla site: the dynamic URLs gave each page multiple Itemid values, which confused the search engines into seeing duplicate content. I set up a URL rewriting script that solved the issue and sent a reconsideration request to Google through Webmaster Tools. About 3 weeks after sending the request, we gained about 30 pages in the SERPs for our main keyword. I never got any messages back from Google about the issue, so I'm left guessing whether the URL rewriting alone did the trick or whether the reconsideration request helped.
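To illustrate the Itemid problem described above, here is a minimal Python sketch that collapses URLs differing only in an Itemid query parameter into one canonical form. It is not the rewriting script the poster used, which is not shown in the thread. The function name `canonicalize`, the `drop_params` argument, and the example.com URLs are all assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonicalize(url, drop_params=("itemid",)):
    """Return the URL with the named query parameters removed.

    Parameter names are compared case-insensitively, so "Itemid" and
    "itemid" are both dropped. The names in drop_params are an assumption.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in drop_params
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Two hypothetical URLs for the same article, differing only in Itemid.
a = canonicalize("http://example.com/index.php?option=com_content&id=5&Itemid=12")
b = canonicalize("http://example.com/index.php?option=com_content&id=5&Itemid=99")
print(a == b)
```

Running a crawl export through a function like this and counting how many distinct URLs map to the same canonical form gives a rough idea of how many duplicates a search engine might be seeing.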
With a Meta Nofollow tag it will take ages. What you need to do is add the Meta Noindex tag. If desired, you can set this to Follow so that Google quickly finds all the pages you want deindexed. However, this process can take months on very large sites. If you have specific pages you want to remove from Google's index, then use the URL Removal Tool in Webmaster Tools. Yes, it is time-consuming to submit hundreds or even a thousand URLs. But if it is hurting you in the wallet, then it is best to use the URL Removal Tool so that your earnings come back quickly. The time needed to recover from duplicate content depends on how quickly the duplicate pages are removed from Google's index. Check the cache date on some of the pages that contain duplicate content. That should give you an idea of how long it will take Google to crawl those pages. That also assumes you replace the Nofollow meta tag with Noindex.
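Since mixing up Nofollow and Noindex is easy, a quick check of what a page actually declares can help. The following is a minimal sketch using only Python's standard library; it reads the robots meta tag from an HTML string and returns its directives. The class and function names are hypothetical, and the sample HTML is made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives declared in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks to be dropped from the index but still crawled.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print("noindex" in robots_directives(page))
```

A page carrying only `nofollow` would return a set without `noindex`, which is the situation the advice above warns about.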
Your site probably just didn't rank well with Google's algorithm. After all, they do change it quite often.
Reconsideration requests are used for actual penalties, and only after the issues have been cleaned up. Duplicate content is not a "penalty" that requires a Google Webspam employee to verify that everything has been corrected, although sitting at position 950 for primary keywords may certainly seem like a penalty. This is a common problem, especially for sites that used to hold many top rankings. Once the duplicate content is removed from Google's index, the good ranks normally return immediately.
Thank you for the informative post. Actually, I did use a Noindex tag instead of Nofollow (my mistake). Now here is the problem. I looked up the last cache date for the pages I want removed, and it was November 2. Who knows when Google will crawl them again. So I would like to remove the URLs using Google Webmaster Tools. Here's the problem: I have thousands of pages that need removing (it's an image gallery with over 10,000 images). I have about 250 directories, and the only content I want to keep is the index.html file in each directory. The Google URL removal tool lets you remove full directories, but is there a way I can tell it to keep the index page? Again, thanks for your post. I'm feeling good about getting back in the search engine!!!
Once you remove the duplicate content and put up fresh content, Google will crawl the new content and the duplicate penalty will be removed.
Unfortunately, Google's URL removal tool does not support wildcards. You must enter each URL individually to solve your problem. Trust me, I've been in the same boat with about 1,500 pages, and it's not fun. I can only imagine 10,000 pages. Depending on the income being lost, you may just want to wait it out. Good luck.
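For the gallery case above (about 250 directories where only index.html should stay), the list of URLs to enter one by one can at least be generated rather than typed by hand. The sketch below is a rough Python illustration that only builds the list; it does not submit anything to Google. The function name, the base URL, and the sample paths are assumptions, and the relative paths are assumed to come from somewhere like a local copy of the site.

```python
from posixpath import basename
from urllib.parse import quote, urljoin


def urls_to_remove(base_url, relative_paths, keep_name="index.html"):
    """Build full URLs for every path except files named keep_name.

    relative_paths use forward slashes and are relative to base_url,
    which should end with a slash.
    """
    return [
        urljoin(base_url, quote(path))
        for path in relative_paths
        if basename(path) != keep_name
    ]


# Hypothetical gallery layout: keep each directory's index.html.
paths = [
    "cats/index.html",
    "cats/001.html",
    "dogs/index.html",
    "dogs/big dog.html",
]
for url in urls_to_remove("http://example.com/gallery/", paths):
    print(url)
```

Writing the result to a text file gives a checklist to work through, which also makes it easier to see how many submissions are involved before deciding whether to wait for a recrawl instead.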
This is everything that you will ever want to know about duplicate content. http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html