I think Google's duplicate content penalty is over-rated. Re the example with EB Games and Gamespot, I have seen this type of thing a lot, and it's almost NEVER penalized. Reason - they are established, reputable, large sites. Where it DOES occur is on blogs, article sites etc. which have the same articles posted. Generally it's not so much Google penalizing a site specifically for having this type of content; rather, it sees the two lots of content as being the same and just picks one to index. The sites NOT indexed will usually see it as a penalty, but really it's just Google recognizing the duplicate content and indexing only one copy.

Matt
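To illustrate what "recognizing the duplicate content and indexing only one copy" might look like in principle, here's a rough sketch using word shingles and Jaccard similarity. This is NOT Google's actual algorithm - the URLs, page text and 0.9 threshold are all made up for illustration:

```python
# Toy near-duplicate detection: word shingles + Jaccard similarity.
# Everything here (URLs, text, threshold) is hypothetical.

def shingles(text, k=5):
    """Return the set of k-word shingles for a piece of text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Overlap between two shingle sets, 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

pages = {
    "ebgames.example/review": "The game delivers stunning visuals and tight controls throughout",
    "gamespot.example/review": "The game delivers stunning visuals and tight controls throughout",
}

sets = {url: shingles(text) for url, text in pages.items()}
urls = list(sets)
similarity = jaccard(sets[urls[0]], sets[urls[1]])

if similarity > 0.9:
    # Treat the pages as duplicates and keep only one in the index --
    # arbitrarily the first URL here; a real engine would use other signals.
    canonical = urls[0]
    print(f"Duplicate cluster (similarity {similarity:.2f}); indexing only {canonical}")
```

The point is just that nothing in a scheme like this "punishes" either site - one copy simply wins the slot in the index.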
I'm running a related experiment. It seems like the amount of incoming PageRank a website has determines the maximum number of pages that get indexed - page rank experiment
That's a nice experiment. Don't forget to let us know if the pages get de-indexed (or lose rank) in a couple weeks.
I wasn't talking about the date on the site. I AM talking about the date that Google first indexed the content on each of the sites. They are not going to use index date either, because like I said, different sites have different crawl frequencies. Some only get crawled once per month or so, others get crawled every two weeks, some get crawled once a week, and others get crawled constantly - 24x7. So it would be VERY unfair to those sites that only get crawled once per month, or once every week or two, to use the index date/time to distinguish duplicates. They would never get credit for their content if other sites which get crawled more frequently were copying and republishing it on their sites.

Here's a video of Cutts talking about it at SMX a couple years ago. He specifically mentions how they have to take crawl rate into account when coming up with a solution for figuring out the original from duplicates, to prevent smart blackhats from claiming content from sites that are crawled infrequently. He mentions how dups aren't so much a problem w/ blogs because of pings... but they are w/ traditional web sites. He also mentions adding a link in RSS feeds back to the original version on your site to help them distinguish the original from dups.
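To make the crawl-rate problem concrete, here's a toy calculation (the sites, dates and crawl intervals are all made up, and this obviously isn't how Google models it): a site crawled once a month publishes an article, a scraper that gets crawled daily copies it the next day, and the copy shows up in the index weeks before the original.

```python
from datetime import date, timedelta

# Hypothetical illustration: if "first copy crawled" decided who the
# originator was, a frequently crawled scraper would win on schedule alone.

published = date(2010, 3, 1)            # original article goes live
copied = published + timedelta(days=1)  # scraper republishes the next day

original_site_interval = 30             # small site, crawled about once a month
scraper_site_interval = 1               # large scraper, crawled daily

# Assume both sites were last crawled on the publish date, so the next
# crawl of each page is one full interval later.
original_first_crawled = published + timedelta(days=original_site_interval)
scraper_first_crawled = copied + timedelta(days=scraper_site_interval)

print("Original first crawled:", original_first_crawled)  # 2010-03-31
print("Scraper first crawled: ", scraper_first_crawled)   # 2010-03-03
# The copy is indexed weeks before the original, which is exactly why
# crawl/index date alone can't be the deciding signal.
```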
Sorry, I forgot the link. At 2 min in he specifically mentions how they have to consider crawl rates... at about 2:45ish he mentions putting links back to your copy of the content in RSS feeds to help them know who the originator is. Of course, it's 2 years old. Lots can change between then and now. However, it shows that they don't like to implement "checks" for things that can affect rankings (like duplicate content) unless it can be done in a way that is fair and not spammable. And since using the first copy crawled as the originator would be totally unfair to sites that get crawled infrequently, it's very unlikely that is the determining factor.
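For anyone who wants to follow the RSS advice, here's a minimal sketch of a feed item whose link (and guid) point back to the original article on your own site, so scrapers republishing the feed still carry a pointer to the source. The feed title and URLs are placeholders, not anything from this thread:

```python
# Build a minimal RSS 2.0 item that links back to the canonical article URL.
import xml.etree.ElementTree as ET

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "Example Site Feed"
ET.SubElement(channel, "link").text = "http://www.example.com/"

item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "My Original Article"
# <link> and <guid> both point at the copy hosted on the originating site.
ET.SubElement(item, "link").text = "http://www.example.com/articles/my-original-article"
guid = ET.SubElement(item, "guid", isPermaLink="true")
guid.text = "http://www.example.com/articles/my-original-article"

print(ET.tostring(rss, encoding="unicode"))
```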