Has anyone got any proven/substantiated figures on what percentage is deemed duplicate content? If a site starts with a template for each page, is that a bad start? Is the HTML for the page included in the 'duplication', i.e. table IDs etc.? If you have 10 pages that are 90% the same, are they all ignored, or 9 of them? Thanks in advance for any light you can shed. BTW, in case you need it, this is the tool I have been using: http://www.webconfs.com/similar-page-checker.php
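For anyone wondering what a "percent similar" number like the one that checker reports might be based on, here is a minimal sketch of one common approach (an assumption for illustration, not that tool's actual method): strip the markup, break each page's text into word shingles, and take the Jaccard overlap as a percentage. Running it both with and without strip_tags() gives a feel for how much the shared template HTML alone inflates the score. The page1.html / page2.html file names are just placeholders.

```python
import re
from html.parser import HTMLParser

class _TextOnly(HTMLParser):
    """Collect only the visible text, dropping tags and attributes."""
    def __init__(self):
        super().__init__()
        self.parts = []
    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(html):
    parser = _TextOnly()
    parser.feed(html)
    return " ".join(parser.parts)

def shingles(text, size=3):
    """Break text into overlapping 3-word shingles."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def percent_similar(a, b):
    """Jaccard overlap of the two shingle sets, as a percentage."""
    sa, sb = shingles(a), shingles(b)
    return 100.0 * len(sa & sb) / len(sa | sb) if (sa or sb) else 100.0

page1 = open("page1.html").read()   # placeholder file names
page2 = open("page2.html").read()
print("with markup: %.1f%%" % percent_similar(page1, page2))
print("text only:   %.1f%%" % percent_similar(strip_tags(page1), strip_tags(page2)))
```

Comparing the two scores is a quick way to answer the "is the HTML included?" question for your own pages, whatever threshold Google actually uses.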
As far as I know, having the same title, h1 and body is what triggers duplicate content. A shared template has no effect on duplication; it's the content that matters. PM me if you want to see an example of a page that is on two different sites and being penalized for duplicate content.
I have a couple of Amazon sites where every page is unique; not one piece of similar data except the HTML that holds the content. Literally: h1 = book title, keywords = book title + author, title = book title by author. Google just loves smashing my face in the dirt with that damn filter. I tweak the pages, then it's good for a while again. There is something more to it. I heard that if you have several similar sites and link between them too much, that will trip the 'similar pages' filter, but I don't know. I'm currently pulling my hair out over it, watching weight go to zero overnight and traffic slowly die.
Templates are not an issue, as long as footers etc. are not bulked up with static text. Keep everything around the real body as light as you can. The whole filter shouldn't be an issue for anyone producing quality original content.
What about affiliate sites that pull product data from a CSV feed? What can you do to minimize the risk of tripping a filter? I'm moving into an area like this and I'd be interested to hear from some old hands.
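For what it's worth, one precaution you could take with a feed is to compare the generated page text pair by pair before publishing and flag anything that reads too much alike, so thin or near-identical entries get extra unique copy or a noindex. The sketch below assumes the feed is a file called products.csv with "title" and "description" columns and a 70% cut-off; all of those names and numbers are made up for the example, and this is one idea rather than a guaranteed way to stay out of the filter.

```python
import csv
import re
from itertools import combinations

THRESHOLD = 70.0  # arbitrary cut-off chosen for this illustration

def word_shingles(text, size=3):
    """Break visible text into overlapping 3-word shingles."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def percent_similar(a, b):
    """Jaccard overlap of the two shingle sets, as a percentage."""
    sa, sb = word_shingles(a), word_shingles(b)
    return 100.0 * len(sa & sb) / len(sa | sb) if (sa or sb) else 100.0

# Build one block of body text per feed row (assumes "title" and "description" columns).
pages = {}
with open("products.csv", newline="") as fh:
    for row in csv.DictReader(fh):
        pages[row["title"]] = f"{row['title']} {row['description']}"

# Flag any pair of pages that read too much alike before they go live.
for (t1, body1), (t2, body2) in combinations(pages.items(), 2):
    score = percent_similar(body1, body2)
    if score >= THRESHOLD:
        print(f"{score:.0f}% similar: '{t1}' vs '{t2}' -- add unique copy or noindex one")
```

The comparison deliberately looks only at the feed text, not the surrounding template, which fits the earlier advice to keep the wrapper light and judge pages by their real body content.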