I know that duplicate content pages get "flagged" and penalized. But how do Google (or other engines) determine which page is the duplicate and which is the original? The page with the highest PageRank? The page with the older date? Does anyone know more about this? Thanks.
This is a timeless misconception. Dupes are not necessarily a penalty; they are generally a filter. For the most part, the higher-authority site carrying the content wins the day. Information on duplicate content: SNIPPET (I can't post the whole thing, to avoid — you guessed it — duplicate content).
I agree. Almost every second website has at least some content that is identical to others', whether entire paragraphs or short phrases.
Yes, it is a penalty. One of my friends' websites got banned from Google because he was using duplicate content. Duplicate content up to some percentage is tolerated, but beyond that it's not tolerated at all. The quality of your content also matters.
I see, so he got an email from Google saying "WE ARE BANNING YOU FOR DUPLICATE CONTENT"? I doubt it. That's hearsay on your part. NO ONE has EVER been banned for dupes alone. Think about it: duplication happens all the time. Your friend was up to other things, I'm afraid, and has used duplicate content as a scapegoat.
Google has a lot of algorithms for this, but I think high-PR sites usually don't get flagged for duplicate content.
I think duplicate content is a very real problem that most website owners don't understand. It happens at an internal level and an external level.

Internally, you might have 50% of your inbound links going to www.example.com/, 30% going to www.example.com/index.html, and 20% going to example.com/. Google sees all three as different pages, but notices the duplicate content, so it filters the two least powerful versions from the SERPs. The result? Only 50% of your links are counting! The smart SEO understands duplicate content and uses 301 redirects to focus 100% of the link power into their preferred URL — normally www.example.com/. I have brought websites from top 50 to top 10 just by fixing this one problem, so I beg to differ with anyone who says this is not important. The sites I'm referring to had up to 15 different URLs for the homepage, thanks to a poorly written CMS, multiple domain names, and bad navigation.

Duplicate content between you and other sites? Another story, but suffice it to say I wouldn't want more than 10% of my content to be the same as that found on other sites. A site with 90% duplicate content and very little authority? Pretty good chance it's a scraper, or MFA, or something else undesirable.
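For anyone wondering what "using 301 redirects to focus link power into one URL" looks like in practice, here is a minimal sketch of an `.htaccess` file. This assumes an Apache server with mod_rewrite enabled, and example.com is just the placeholder domain from the post above; swap in your own preferred hostname.

```apache
# Hypothetical .htaccess sketch — assumes Apache with mod_rewrite enabled.
# Consolidates the three homepage variants mentioned above
# (example.com/, www.example.com/index.html) into www.example.com/
RewriteEngine On

# 301-redirect the bare domain to the www version
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 301-redirect /index.html to the root URL
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```

The key detail is `R=301` (a permanent redirect) rather than the default `302`: search engines generally consolidate link equity only for permanent redirects.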