When I search Google for different drugs, I come across many top-ranking sites with the same content, word for word. The drug information published by the National Institutes of Health is replicated exactly on sites like mayoclinic.com. Yet all these sites come up at the top for any drug search. Where do they stand in Google's duplicate-content framework?
Google can't reliably detect every duplicated page. A few small differences can be enough to make a duplicate look fine to the spider's eyes.
Google is pretty good at detecting portions of copied content within pages. At the end of the day it depends on the TrustRank of the site: if it's a respected site with duplicate content, it will still rank, because it's a useful result.
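For what it's worth, detecting "portions of copied content" despite small edits is usually illustrated with shingling: split each page into overlapping word n-grams and compare the sets with Jaccard similarity. Here's a minimal Python sketch of that general technique; nobody outside Google knows what they actually run, and the sample snippets, the 5-word shingle size, and the 0.75 threshold are all arbitrary assumptions of mine:

import re

# Near-duplicate check using word shingles and Jaccard similarity.
# An illustration of the general technique only, not Google's actual system.

def shingles(text, k=5):
    """Return the set of overlapping k-word shingles in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two shingle sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

page_a = "Aspirin is used to reduce fever and relieve mild to moderate pain"
page_b = "Aspirin is used to reduce fever and relieve mild to moderate pain from headaches"

sim = jaccard(shingles(page_a), shingles(page_b))
print(f"similarity = {sim:.2f}")      # ~0.80 for these two snippets
print("near-duplicate?", sim > 0.75)  # the 0.75 threshold is an arbitrary choice

Appending a few extra words barely moves the score, which is why small tweaks alone don't hide a copied page; at web scale this kind of comparison is typically approximated with MinHash rather than computed pairwise.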
Google uses DMOZ listings word for word in its own directory, along with the same category structure. It doesn't make sense that they penalize or filter out other DMOZ clones while their own directory pages rank rather well, sometimes above the DMOZ page for a category. It's as if they pick and choose which sites get filtered based on popularity, age of domain, other content, etc.