Hi all... I was searching for some information about a specific area in France and I found dozens of sites with the same general information. When I take a paragraph and paste it into the Google search bar (in quotation marks), I get many other sites with the same text! Isn't it strange that those sites are doing that? Aren't people afraid of getting caught? But what makes me wonder most is: doesn't Google do anything about it? How can so many pages have the same content? Cdx.
Usually, as long as the entire content and design are not copied verbatim, Google doesn't notice (yet), so many sites put a template around the content and maybe mess with it a little. If you looked at the source, the pages may be slightly different at that level, and that's the level search engines work at, so a slight difference = different content. Google may get better at recognizing duplicate content soon though, so these sites may be taking their chances. Now what about this: should Google be penalizing sites that mirror public domain / open content like DMOZ and Wikipedia? It would seem like an unethical choice, since the content is free and someone could be using it for legitimate purposes...
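To illustrate the "slight difference = different content" point, here is a minimal sketch, assuming a checker that only compares an exact fingerprint of the page source. The function name `page_fingerprint` and the sample pages are made up for illustration; this is not a description of what Google actually does.

```python
import hashlib

def page_fingerprint(html: str) -> str:
    """Exact fingerprint: any change to the page source changes the hash."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

# Hypothetical copied paragraph, wrapped in two different templates.
article = "The region is known for its vineyards and medieval villages."
site_a = "<html><body><div class='a'>" + article + "</div></body></html>"
site_b = "<html><body><table><tr><td>" + article + "</td></tr></table></body></html>"

# Both pages contain the same text, but their source-level fingerprints differ.
same_text = article in site_a and article in site_b
same_fingerprint = page_fingerprint(site_a) == page_fingerprint(site_b)
print(same_text, same_fingerprint)  # True False
```

So a naive source-level comparison treats the two pages as unrelated even though the visible paragraph is identical.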
A number of times, the news articles in a lot of newspapers are the same because they come from the same source (e.g. Reuters). So Google cannot directly punish the websites as such. Also, people quote articles and news items in forums and blogs. It's not so easy for an automated program to detect and penalise.
Since DMOZ makes its directory available for anyone to use (including Google, BTW), I really can't see them penalizing for it. Google has recently patented a system for detecting duplicate content, including verbatim content that is mixed with other content, so I would be careful about plagiarism. That said, it seems to me that Google doesn't necessarily add all these bells and whistles to the algo; they just run one as a standalone program every now and then when they see a problem that justifies it.
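For anyone wondering how verbatim text could be spotted even when it is mixed with other content, here is a rough sketch using word shingles (overlapping runs of k words), a well-known general technique. It is not the patented Google system, whose details are not given here; the function names, the choice of k = 5 and the sample texts are all assumptions for illustration.

```python
import re

def shingles(text: str, k: int = 5) -> set:
    """Return the set of overlapping k-word runs in the text (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def containment(source: str, page: str, k: int = 5) -> float:
    """Fraction of the source's k-word shingles that also appear in the page."""
    src = shingles(source, k)
    if not src:
        return 0.0
    return len(src & shingles(page, k)) / len(src)

# Hypothetical texts: the original paragraph, a page that wraps it in
# extra copy, and an unrelated page.
original = "The region is known for its vineyards, medieval villages and long river walks."
copied_page = "Welcome to our travel guide! " + original + " Book your hotel with us today."
unrelated_page = "Our guide covers hotels, restaurants and car hire across the whole country."

print(containment(original, copied_page))     # 1.0
print(containment(original, unrelated_page))  # 0.0
```

A score near 1.0 means most of the original appears word for word somewhere in the page, no matter what template or extra text surrounds it. It cannot tell a credited quote or a licensed wire story from plagiarism, which is why penalizing automatically is hard.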
As some of you know, I recently wrote a two-part article about PageRank. It was picked up and published by a number of sites and newsletters. In each case that I am aware of, they gave me full credit for the article. In turn, if I find what I think is a good article written by someone else, I'll publish it in my InfoPool, again giving the author full credit. This isn't plagiarism. But it does result in duplicate content on the web. I don't see any problem with that.
IMHO it is normal to copy, but you have to give the author full credit and link to the original source.