Photographer here. I did a promo giving away free websites. All the websites were the same template but contained different text (about the person) and different images (which Google can't see, of course). I'm 99% sure we're getting hit with the duplicate content filter.

Our solution:
- Rename all the websites' link structures so they are unique.
- Change and reword all text on all pages to make doubly sure no text is duplicated; many of the same words will still have to be used, but in a different order etc.
- Change the names of the .css files and Flash files.
- Change the titles so more than just the person's name is different - unique titles.
- Same as above for the keyword and description tags: stuff and jumble up some info in those so it is unique for each site.

The big question: does Google's duplicate content filter detect website layout? E.g. tables, backgrounds, and many of the "static" layout images like lines, spacer boxes and so on. I know this sounds far-fetched (in my mind the computing power needed would be enormous), but I just wanted to hear your thoughts on it; Google seems to be getting ridiculously picky.

My best guess is that all they are doing is taking some sample of data, assigning a checksum value to it or whatever, then comparing the values. If two values come back the same, it deems the pages duplicate content. Does this sound right? Following that thinking, they must have taken a sample from some generic place in the pages that was the same (rough sketch of the idea below).

Should our solution do the trick? Any other suggestions or thoughts? I'd love to hear from you guys with some experience in this.

Also, is there a penalty for "duplicate content"? Is the penalty simply that links aren't counted, or is it more serious? Once everything is changed, will the sites just pop back into the rankings and things go on as normal? I've noticed this happen a few times before when changing domain names and publishing the same content to a different domain: Google dropped both domains from ranking for any important search terms, and when we redirected the old domain, the other one popped right back into the index within days.

P.S. All the sites and pages are still in the index if we look up site:thedomains.com.

Cheers
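For what it's worth, here is a minimal Python sketch of the kind of sample-and-checksum comparison I'm imagining. The slice length, the slice position and the choice of MD5 are purely my own assumptions for illustration; I have no idea what Google actually hashes.

```python
import hashlib

def sample_checksum(page_text, start=0, length=200):
    # Checksum a fixed slice of the page text. The offsets and the use
    # of MD5 are illustrative assumptions, not anything Google has published.
    sample = page_text[start:start + length]
    return hashlib.md5(sample.encode("utf-8")).hexdigest()

# Two template sites where only the owner's name differs, but the sampled
# region happens to land on boilerplate that is identical on both.
page_a = "<p>Welcome to my photography site. " * 10 + "My name is Alice.</p>"
page_b = "<p>Welcome to my photography site. " * 10 + "My name is Bob.</p>"

print(sample_checksum(page_a) == sample_checksum(page_b))  # True -> flagged as duplicates
```

If that is roughly what happens, then the per-site name changes never even make it into the sampled region, which would explain why the sites all tripped the filter despite being "different".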
If they were using a simple checksum, minor changes would give massively different values (quick demo below). Secondly, if Google tracked link structures, imagine all the people using default WordPress setups syndicating articles (scraped articles still rank). Some argue that fingerprinting doesn't exist at all; in the context of massively automated sites, a lot of people suggest Google fingerprints mostly the template plus the directory structure.

From my experience, however, when you have duplicate content your site isn't penalized per se; it just shows up lower in the results than the page Google believes to be the originating article. If all your sites are using the same template, there's nothing to worry about (otherwise all those subSilver forums would have been penalized long ago). The issue only comes about when the majority of your text content is duplicated, and even then you'll only rank lower for terms that would NORMALLY have brought up your page (the original will be what ranks).

Basically, look at the identifying aspects Google has to go by and try to vary them between sites, especially if your only real content is an image (remember you can use alt text to differentiate pages). I haven't seen any evidence of a domain actually being penalized across the board in the SERPs due to duplicate content. Classic example of people abusing this: http://www.threadwatch.org/node/4091 - if anyone has anything to prove otherwise, I'm all ears!
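A quick demo of that checksum point, plus the sort of fuzzier shingle comparison people assume search engines use instead. This is just an illustrative sketch; MD5 and word-trigram shingles are my own assumptions, not anything Google has confirmed.

```python
import hashlib

a = ("Thanks for visiting my photography studio. I shoot weddings, portraits "
     "and events across the city. Call John on 555-0100 to book a session today.")
b = ("Thanks for visiting my photography studio. I shoot weddings, portraits "
     "and events across the city. Call Jane on 555-0199 to book a session today.")

# Whole-text checksums: change a name and a phone number and the two
# values share nothing, so exact hashes alone can't catch near-duplicates.
print(hashlib.md5(a.encode()).hexdigest())
print(hashlib.md5(b.encode()).hexdigest())

# A shingle-style fingerprint survives small edits: split the text into
# overlapping word n-grams and measure how many shingles the pages share.
def shingles(text, n=3):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

sa, sb = shingles(a), shingles(b)
jaccard = len(sa & sb) / len(sa | sb)
print(f"shingle overlap: {jaccard:.0%}")  # still well over half despite the edits
```

That is why a few swapped names and numbers per page don't buy you much: most of the overlapping shingles are still identical.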
That's why I was figuring on a sample of data taken from the page. The link you provided is kind of a case in point: it was not an exact copy of the webpage, but it still got snagged somehow.

In our case, for example, the contact page on all the sites was exactly the same except for a different name and phone number, and only the name and site name were changed in the title, keyword and description tags. There are actually many sentences that are exactly the same, and in some cases even a few of the small paragraphs were identical. That's what I thought too - just make a few minor changes (which there were, naturally) on every site and things would be OK. But surprise, surprise.

Yeah, I don't have any idea. Going to jumble everything up real good and see what happens. Major pain in the arse though. Any thoughts and experience much appreciated.
I have the same problem with a duplicate site and am beginning to change content, titles, meta tags, etc. How long does it take for that filter to lift so the site ranks better again? Do I really need to change the content on every page, or just the pages I want to rank for?
Does anyone know if having duplicate content affects PageRank? I found someone selling a link on a supposed PR 9 page which is a mirror of an authority site. Checking the backlinks for that page shows that they all point to the corresponding page on the authority site; there are no backlinks pointing to the mirror page itself, which is pretty strange. It seems like they may have used a 302 redirect to intentionally or unintentionally hijack the PageRank from the authority site. Running it through a fake PageRank detection tool reports a fake PageRank and shows that the PageRank belongs to the authority site, which makes it even more suspicious. Is it possible for a mirror site to have a PR 9 like that? This is the page: http://osmirror.com/licenses/afl-3.0.php - you can check its Google backlinks and see that there are in fact no links pointing to that page, but there are links pointing to the authority site: http://www.opensource.org/licenses/afl-3.0.php.
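If anyone wants to check the 302 theory for themselves, here's a minimal sketch. It assumes the third-party `requests` package is installed; the URL is just the mirror page above.

```python
import requests  # assumes the third-party "requests" package is installed

# Fetch the mirror page without following redirects so we can see whether
# it answers directly (200) or bounces somewhere else (301/302).
url = "http://osmirror.com/licenses/afl-3.0.php"
resp = requests.get(url, allow_redirects=False, timeout=10)

print(resp.status_code)              # a 302 here would support the hijack theory
print(resp.headers.get("Location"))  # target of the redirect, if any
```

If it comes back as a plain 200 with no Location header, the fake toolbar PageRank probably came from something other than a live redirect.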