This article is based on my earlier piece on how to eliminate duplicate content, originally written and published at dnseo.net. I see this topic being discussed over and over with mixed ideas, so I thought I would publish the article here as well, with some slight modifications. I will try to avoid techy talk, first because not everyone is technical and knows the terms, and second because I like speaking to normal people in normal language (my English is not perfect, so feel free to correct me).

What's Duplicate Content

We all know what the word duplicate means, but what some might be missing is how important it is to remain unique, whether on the Internet or in real life. People like unique things, people like unique behavior, even the five fingers of the same hand are not the same... so why not be unique on the Internet too, where that is most appreciated and valued?

Now we know that we love unique things (wouldn't you want a pair of shoes that no one else has, so you could stand out from the crowd?). Look at the shoes Rand Fishkin wears, and ask why: so he can stand out from the rest and get the attention. The same applies to content.

From the search engines' point of view (and justifiably so), original content gets more value. This does not mean that they ban the duplicate content, but the original definitely gets the benefits and the attention of the search engines. So if search engines already protect the original author, why shouldn't the original author protect himself from himself?

WordPress and Duplicate Content

The WordPress blogging platform is being used more and more, and not just as a blog but even as a CMS, thanks to its great features, its easy-to-use system and its plugins. But WordPress has a major issue with duplicate content. Let's take the case that you are showing the full article on the homepage and not an excerpt.
Once you publish the article on your WordPress blog, you will automatically find the full article in several places: the homepage, the articles page, the category page and the archives page. Even when the post gets pushed onto a secondary page of those listings, it is still shown there as a full post. So we have automatically created 4 duplicate pages without even being aware of it. This will 'confuse' search engines as to which of the pages holding the content should be given more value.

The Cure

What's a blog without comments? I guess a dead blog. A lack of comments does not necessarily mean that you have no visitors, nor that your content is not good. It simply means that we (people) are pretty lazy about walking through the blog to find the comment box; we want everything served now and ready on the plate. That said, you want your readers to land on the right page from the search engines, and not on the archives page or a category page if they haven't searched for the archives or the category.

So what you need to do is eliminate the duplicate content from all around your site and tell the search engines that the post's URL is the one that should receive all the attention and rank for the keywords you are targeting.

I have seen many people advise you to noindex the category pages or the archives pages. I agree that archives may be noindexed and kept out of the search engines' index (and even off the website), since I barely see users browsing by archives; they browse by categories or tags instead. But I have to tell you that you should not noindex your categories, absolutely not. The reason? Categories are keyword rich, so each category URL gives you the possibility to rank for its respective keywords. Secondly, why not have more indexed pages in the search engines which will provide backlinks to your homepage?

But let's get to the point and start curing your WordPress blog of duplicate content. We will start from the homepage.
Homepage

Whether to keep the full post on the homepage or not is up to each blogger. Personally I don't want the full posts on the homepage. Instead, with the Optional Excerpt feature that WordPress has built in, I would write a second teaser (a mini article) that will appear on the homepage. By default WordPress uses the opening of your original article as the excerpt (roughly the first 200-300 characters). With the Optional Excerpt field (which you can find lower down on the writing page) you can write a completely new excerpt the way you want, and keep the article on your post page 100% original.

Categories and Archives

As I mentioned earlier, I would prefer to have the categories indexed. As far as archives are concerned, I would noindex them all the time (I don't even use the archive links on the homepage). Using the excerpt in the categories as well will again preserve the originality of the post page and its content.

What might pop into your head is that the homepage content and the category page content will now be duplicates of each other. The fact is that the homepage displays all the posts, while a category page displays only the posts assigned to that category. In addition, there is a small hack written by Joost de Valk that you could use to keep the homepage content and the category page content completely different; with it, the category page shows the first 200-300 characters of your post's content.

By implementing these hacks and options you will protect the unique content of your post page and lead the search engines to the right page, the one that deserves the attention. Of course there is also the RSS feed duplicate issue, but you can simply noindex or nofollow the feeds.

I might well have missed something, so please feel free to complete the list.
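To make the policy above a little more concrete, here is a minimal sketch in Python (not WordPress code). The page type names, the `ROBOTS_POLICY` table and the `robots_meta` and `teaser` helpers are all hypothetical names chosen for illustration. The sketch only assumes what the article recommends: posts, the homepage and categories stay indexable, date archives are noindexed, and a hand-written excerpt is preferred over an automatic one.

```python
# Assumed policy, following the article: posts, homepage and categories
# stay indexable; date archives are kept out of the index.
ROBOTS_POLICY = {
    "post": "index,follow",
    "home": "index,follow",
    "category": "index,follow",
    "archive": "noindex,follow",
}


def robots_meta(page_type: str) -> str:
    """Return a robots meta tag for a (hypothetical) page type."""
    content = ROBOTS_POLICY.get(page_type, "index,follow")
    return f'<meta name="robots" content="{content}">'


def teaser(post: dict, limit: int = 300) -> str:
    """Prefer the hand-written excerpt; otherwise cut the body at a space."""
    manual = post.get("excerpt", "").strip()
    if manual:
        return manual
    body = post["content"].strip()
    if len(body) <= limit:
        return body
    # Cut at the last space inside the limit so no word is split in half.
    return body[:limit].rsplit(" ", 1)[0] + "..."
```

With a hand-written excerpt, listing pages never repeat the post's own opening text, which is the point of the Optional Excerpt advice. The automatic fallback still copies the start of the post, so it only shortens the duplicated text rather than removing it.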
Let me see what I can do to help, ok? I'll do my best to go easy on you (since you're a respected member of the community here).

Actually, they will de-list the other versions. That's what they do - pick one, and throw away the rest. The one they pick will be the one that gets indexed (and thus referred to searchers); the others won't exist as far as the search engines are concerned.

The original author won't always be protected - even from himself. Sometimes the original content will be mislabeled as being a duplicate, which sucks.

As it very well would. And I'm glad you made this point (not everybody listens to me when I tell them about it - it's like looking at a deer that's staring into a set of headlights late at night). YOU HEAR THAT EVERYONE? HE'S RIGHT!!!

From a social networking point of view, you would be correct. However, if the content is good enough to get people to link to it, then I'd have to beg to differ.

I do tell people to noindex their archives and tags pages though. However, I tell them to do it from their robots.txt file. Any legitimate search engine spider will look for the robots.txt file before it even starts crawling the site in question. By telling the spider where it cannot go, you can better focus it on the pages you want it to crawl and index - in this case your categories and blog posts.

Another thing I'd do (and not for SEO purposes either) is write the first 2-300 words (200-300 for those who prefer that format and may be confused that I'm saying "first two words to the first three hundred words") to be as enticing as possible, so that readers want to click the link to the full article and read the rest.
That's a copywriting tip though, so I won't go into details here other than to say: lay it all out on the line, tell them what the article or entry is about, and then write it so that if they want to know the rest they'll have to read more (in other words, take the "five paragraph essay" format you learned in school and turn it on its head).

I don't include archives either - it's just not worth it. However, a link to an archives page can come in handy for those who prefer to search the archives rather than categories (say they remember when an article was written, but they can't remember the title off-hand).

That's good if you prefer to set up your blog that way. I personally prefer the excerpts on the home page and category index, so I can serve more types of content to my users - and then of course have the link directly to the article itself so people can get to it.

As a matter of fact, you did miss something: the /category/ base problem. Those who are not running WordPress 2.5 are going to have to use the Category Base Killer plugin to remove /category/ from ALL of their internal category-based links. Modifying the custom URL structure isn't going to be enough (it will work on some links, but not all), because the internal links (like those on your category and archives pages) will still show /category/ in the URL structure. Using the plugin will take care of this (and is far better than hacking WordPress directly).

WordPress 2.5 users apparently can go into the Dashboard and remove the category base directly - though I have yet to personally confirm this (a medical emergency in the family forced me to stay away from home, so I haven't been able to upgrade yet). This way you can remove the /category/ URL while still leaving your categories available for indexing. Who'd have thought?
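For anyone who wants to try the robots.txt approach described above, here is a rough sketch that checks a rule set with Python's standard `urllib.robotparser` module. The `/tag/` and `/archives/` paths, the `example.com` host and the `is_crawlable` helper are assumptions made for illustration; the real paths depend on your own permalink structure. Keep in mind that `Disallow` tells spiders not to crawl a path, which is a different mechanism from a noindex meta tag.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: keep spiders out of tag and archive listings while
# leaving category pages and individual posts crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /archives/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path: str) -> bool:
    """Check a site-relative path against the rules above."""
    # example.com is a placeholder host; only the path matters here.
    return parser.can_fetch("*", "http://example.com" + path)


for path in ("/category/seo/", "/my-post/", "/tag/seo/", "/archives/2008/04/"):
    print(path, "crawlable" if is_crawlable(path) else "blocked")
```

Running a check like this before uploading the file is a cheap way to confirm that the rules do not accidentally block category pages or the posts themselves.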