According to the SEO tutorial I just read, that is the case, yes. My biggest problem is that my site is a repository of information, so 90% of my content is duplicate. I must be getting penalized for this somehow.
If you quote part of the info and then provide a link to the actual site, you should be fine. Just add a few lines about why you think the article is helpful or important and link to it... that's the best way to do it. That's what I do with my sites when I need to put up info that's available from other sites (unless I have something exceptional to edit, add, or critique).
I don't believe that Google is very efficient at finding and penalising duplicate content. In my practical experience, it depends more on your value in Google's algorithm and the quality of your backlinks. If you score well in terms of Google's trust, your duplicate content can rank higher than the site that holds the original. Technically, it's not very easy to check for duplicate content. Google always says that dynamic pages are indexed as easily as static pages, but practical experience shows that's not true. Again, it's all about your value in Google: how much confidence Google has in your site.
This discussion has been going on and on and on... The truth is that with so many aggregators around the net nowadays, it would be very difficult and resource-demanding to work out what is duplicate content and what is aggregated content. I think this has been made into a much bigger deal than it really is. I believe there is no direct penalty for rehashed material; however, basing a site on non-original content will not get you ranked as high anyway.
I don't think Google penalizes sites for common content. I have a bunch of article sites with hardly any original content on them. In fact, two of them have the same content and are on the same server, and all of them still get SE traffic. However, I feel it is more difficult to rank well with duplicate content than with original content.
The easiest duplication for Google to spot is when two pages are byte-for-byte identical and the same length. If you have two URLs that generate the exact same HTML code, there's no mistaking it for anything but duplication. I used to have a test site that was a mirror of my main site, but I've since taken that down. Cryo.
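For that byte-for-byte case, a plain content hash is enough to spot the match. Here's a minimal sketch of the idea (purely illustrative, not a claim about how Google actually does it; the URLs are hypothetical):

import hashlib
import urllib.request

def page_fingerprint(url: str) -> str:
    # Fetch the page and hash its raw HTML bytes.
    with urllib.request.urlopen(url) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

# Hypothetical URLs: if both serve identical HTML, the hashes match exactly.
urls = ["http://example.com/page", "http://example.com/mirror/page"]
fingerprints = [page_fingerprint(u) for u in urls]
if len(set(fingerprints)) < len(urls):
    print("Byte-for-byte duplicates detected")

The catch, of course, is that this only catches exact copies; change one byte of markup and the hashes no longer match.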
Lyrics sites don't get around it; it's just that some lyrics sites are more powerful than others. If a site is more powerful, the "original content" effectively becomes theirs, not the originator's. Yes, but what if you had unique content? Of course you get traffic from Y!/G/MSN, but if you had content that was unique to the web, your traffic might double, or triple, because of that uniqueness.
I think that to get penalised for dupe content, the entire page has to be very similar. There are too many situations where the 'majority' of the content on a page would be similar for this to happen. Imagine two sites where the paragraphs, titles, metas, menus and navigation are all the same; that, I'm sure, would get a duplicate penalty. Now imagine a second case where everything is different apart from the bulk of the paragraphs; I don't believe that would be penalised. Apart from lyrics and article sites, consider RSS syndication: the whole concept would be flawed if every page delivering RSS were penalised.
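To make the "whole page similar" vs. "only one block shared" distinction concrete, here's a rough sketch of scoring two pages' overlap with word shingles and Jaccard similarity. This is just an illustration of the general near-duplicate idea, not anything the engines are confirmed to use; the sample texts and the threshold are made up:

def shingles(text: str, size: int = 5) -> set:
    # Break the text into overlapping runs of `size` words.
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(max(1, len(words) - size + 1))}

def jaccard(a: str, b: str) -> float:
    # Share of shingles the two texts have in common.
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

page_a = "the shared article text plus our own review, navigation and comments"
page_b = "the shared article text wrapped in a completely different page layout"
score = jaccard(page_a, page_b)
# Made-up threshold: only treat pages as duplicates when nearly everything overlaps.
print("near-duplicate" if score > 0.9 else "different enough")

A page that merely syndicates one RSS item among lots of its own content scores low on a measure like this, which is the point being made above.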
I agree that the dup content penalty is more of a myth than a reality. Here's why. When content is copied from one site to another, the original has usually been around for a while and has probably already been indexed and ranked. It's older and more established in the rankings than the newly copied content. If the search engine sees both pages as identical for a particular search on a particular keyword, the tiebreaker would logically be the content that has been indexed longer -- not so much a penalty as a natural and logical choice.
I would think it would take Google a lot of resources to check copy against copy... that's why I would agree with the "dup content is a myth" theory.
If duplicate content couldn't be discovered, then everything would count as unique. Considering the number of pages indexed, checking would be a huge load. But I'm sure there is at least a small effort to recognize duplicate content; what they do when something is discovered is what matters. Ever see the "omitted/similar results" message on Google search results? I imagine it would have to be a dead-smack photocopy in order to drop (penalize) one or the other; otherwise they might be accidentally, automatically penalizing press, news, feeds, and other general informational sites. My 2 cents.