This isn't exactly what I'm doing, but it's an almost identical example. Let's say there is a giant database of baseball players. I am making a site that gives each player in the database his own page (player-name.html). Since there are thousands of players, there will be thousands of pages. Each page will have this title: "Player's Name baseball stats: hits, home runs, stolen bases". "Player's Name" will obviously be that specific player's name, but the rest of the title will be exactly the same on all of the thousands of pages. (The meta tags will work the same way.) The content of each page will consist mostly of a table listing the same stat categories in the same order (hits, home runs, etc.) on every page, along with that player's stats (Hits: 700, Home Runs: 80, etc.). Each page will have a heading of "Player's Name Baseball Stats." There will also be a few paragraphs of text that put that player and his stats into words. For example: "Player X hit X home runs over the course of his X year career. He had X stolen bases, and made the all star team X times." All of the pages would use the exact same text, with only that player's name and stats inserted in place of the X's. My question is: while a human can tell that this site is NOT duplicate content, how exactly would Google view it? And if it would appear to be duplicate content to Google: A) What would happen? (Would the site be banned or penalized, or would the pages seen as dups just not be indexed?) B) Is there anything I can do to keep Google from seeing this site as duplicate content? Thanks.
Good question. I don't think it would hurt you much, if at all. There are other sites out there with the same type of format, and I'm sure they rank well in the SERPs. Are you going to write comments or little stories about many of the players?
Thanks for the reply. Nope, I won't be. First and foremost, this site is really just a searchable database of players' stats. Second, if I were actually using a baseball player database, then yeah, I would definitely put some kind of bio of each player on his page to make each page more unique, but like I mentioned, this whole baseball thing is just a close example of what I'll actually be doing. My database is of something a little different.
Porn stars, eh? Actually, I was thinking a little bio on whatever you're doing could help keep each page unique, if that's a good option.
I don't think that will have a negative effect on your SERPs. I'm doing the same thing, and those pages are getting good traffic.
Duplicate content has more to do with content not being unique across completely separate domains. On one site, Google can determine template parts and content parts, especially with such a large base (1,000+ pages) to work with.
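To make the template-versus-content idea concrete, here is a toy sketch in Python. It is not how Google actually does it (that isn't public). It just assumes that any word appearing on every page is template and everything else is page-specific content. The player names and numbers are made up, and `split_template_and_content` is a hypothetical helper, not anything from a real library.

```python
import re

# Made-up sample rows standing in for the player database.
PLAYERS = [
    {"name": "Alex Stone", "hr": 80, "years": 12, "sb": 150, "allstar": 3},
    {"name": "Ben Rivers", "hr": 212, "years": 15, "sb": 41, "allstar": 6},
    {"name": "Carl Hayes", "hr": 9, "years": 4, "sb": 310, "allstar": 0},
]

# The shared sentence pattern from the original question, with named slots.
TEMPLATE = (
    "{name} hit {hr} home runs over the course of his {years} year career. "
    "He had {sb} stolen bases, and made the all star team {allstar} times."
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_template_and_content(pages):
    """Toy heuristic: tokens found on every page count as template,
    the remaining tokens on each page count as that page's content."""
    token_sets = [set(tokenize(page)) for page in pages]
    template = set.intersection(*token_sets)
    content = [tokens - template for tokens in token_sets]
    return template, content


pages = [TEMPLATE.format(**player) for player in PLAYERS]
template, content = split_template_and_content(pages)

print("template tokens:", sorted(template))
for player, tokens in zip(PLAYERS, content):
    print(player["name"], "->", sorted(tokens))
```

With only three pages the split is already clean: the boilerplate words land in `template`, and each page keeps just its name and numbers. With thousands of pages the shared part stands out even more, which is the point made above about having a large base to work with.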