Recently, Google was awarded a patent that addresses duplicate content issues. The patent is called 'Methods and apparatus for estimating similarity' and the abstract reads as follows:
I wonder if this is a problem for some bloggers ... I guess the question is: what happens if a site is determined to be similar to another? And possibly worse, what happens if yours is similar to someone else's, but because they are indexed by Google more frequently your site is marked as the duplicate? Would love to know ... Anita
Let's just say that this technology is not rock solid yet. I have a purely automated blog, and it has 4300 pages indexed in Google...
I am going to get ALL OVER that patent later this week... he he... I just finished up with Phrase Based Indexing and Retrieval ( 5 patents in all) http://forums.digitalpoint.com/showthread.php?t=237718 SO I shall get a good run through this one very soon...
What do you think they want to detect PRIMARILY? Duplicated pages on the same site, or duplicated pages on different sites? My vote is for option 2.
Actually I read about their policy on dup content on the Google blog. From what I read, if the content is written in many languages, then Google does not consider it dup content. Weird, I know. They listed other criteria for dup content on the Google blog as well.
1. Corollary: If they aren't indexed, they certainly won't rank for anything. 2. Duplicate content is about NOT indexing duplicate pages, so if his site is "automated" (presumably meaning that it consists of feeds from other sources) and he has 4300 pages showing in Google, that would seem to imply 4300 duplicate pages missed by Google's duplicate content filters. That said, I do believe Google is getting better at finding and eliminating duplicate content.
What about different sites promoting the same product with the same description from the supplier? In such a case you can hardly avoid duplicate content issues.
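To make the supplier-description case concrete, here is a minimal sketch of one generic way to score how similar two pages are: compare their sets of overlapping word triples ("shingles") with Jaccard similarity. This is an assumption for illustration only, not the method in the patent and not whatever Google actually runs; the function names `shingles` and `jaccard` and the sample product texts are made up.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of the shingle sets of two texts, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical product pages: two shops reuse the supplier's description,
# a third writes its own.
supplier = "This stainless steel kettle boils one litre of water in under three minutes"
shop_a = "Buy now: " + supplier
shop_b = supplier + " with free shipping"
unique = "Our kettle is made of steel and heats a litre of water very quickly indeed"

print(jaccard(shop_a, shop_b))  # high: both pages are mostly the supplier text
print(jaccard(shop_a, unique))  # low: only a stray phrase overlaps
```

Wrapping the supplier text in a few words of your own barely moves the score, which is why a unique description is the safer route where you can manage it.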
Well, if you want exact details then read the Official Google Webmaster Central Blog - http://googlewebmastercentral.blogspot.com/2006/12/deftly-dealing-with-duplicate-content.html Reading through the blog will answer most of the questions you may have, and at the bottom of the page they have links covering further concerns. It is pretty clear on the blog. All their links are Google blogs on just about everything. Here is something else on Dup Content written at Alaxandra -
Then generally the authority site wins .... the rest are filtered down the results. We ALWAYS encourage clients (where possible) to use their own unique descriptions... if it's a feed.. well that can be a drag... Dood.. that useless piece of TRIPE was why I wrote my article.... as with many things Google, it doesn't tell U enough.... READ the entire piece I posted... it is FAR more detailed
Ok, last time I checked I am not a dude. Second - I am taking it straight from the source. Official Google Webmaster Central Blog - http://googlewebmastercentral.blogspot.com/2006/12/deftly-dealing-with-duplicate-content.html Thirdly - Cheers to you for giving your clients a much more detailed plan. I do realize many may be confused or lost about what Google means by duplicate content.
Well Matt perfected (and taught Adam) the fine art of Google-Speak Circle Talk. So, unfortunately they are not always the best advice... it's partial pictures.... I went through Matt's blog and other sources compiling it... G just doesn't fill in all the blanks .... another reason I am a Patent Hound... it's something to hold onto at least... he he .... and I use dOOd all the time.. U should put yer Beautiful Mug as an Avatar.. it would brighten up the place I'd even sing 4 U -
Ok well all things considered - as soon as I am out of my Avatar contract, I just might possibly flash you a pic. Because things are getting a little hot and heated. You will sing 4 me - . You will have me in your hands like beautiful putty to work with and mold. I will follow Master of all lol - wink, wink
hee hee
Today U have touched, more than my SEO passion
Woeful is me, selling Avatars is the fashion
Maybe some day our pixels might meet
Sweeter than chocolate, my what a treat
Until that day, these memories shall not go
..but enough of this now, it's back to S-E-O
weeeeeeeeeeeeeeee
Dude, you need a spidey suit! As for duplicate content... like product descriptions, I need to provide prescription drug information in the interests of consumer safety. The info is published on an Australian government website (it's copyright free), so how do you think Google will react if I use that content on my site?