Duplicate Content Criteria

Discussion in 'Search Engine Optimization' started by softstor, Jun 22, 2006.

  1. #1
    What criteria does Google use to determine whether a webpage duplicates content from another site?

    Does Google use something similar to Copyscape?
     
    softstor, Jun 22, 2006 IP
  2. seoindiaweb

    seoindiaweb Banned

    #2
    Well, Softstor, as far as I've heard they have their own technology to detect whether a site is using duplicated content,
    but it isn't clear what percentage counts as duplicated.
    I personally think that if it's >50% you will be penalized for duplicate content. It was somewhere on seomoz.com, but I don't remember the link, sorry.
     
    seoindiaweb, Jun 22, 2006 IP
  3. CrankyDave

    CrankyDave Peon

    #3
    My own personal opinion is that it has to be downright identical to be penalized. HOWEVER, the 50% rule is a good one to follow.

    Dave
     
    CrankyDave, Jun 22, 2006 IP
  4. proudlyPinoy

    proudlyPinoy Peon

    #4
    I personally do not believe that Google penalizes duplicate content. I'm speaking from experience as a searcher, though: I've been seeing lots of duplicate content out there.
     
    proudlyPinoy, Jun 22, 2006 IP
  5. softstor

    softstor Active Member

    #5
    I have noticed that if you copy a few sentences from one site, Copyscape will pick this up. How sensitive is Google?
     
    softstor, Jun 22, 2006 IP
  6. Web Gazelle

    Web Gazelle Well-Known Member

    #6
    I know that Google does penalize sites for having duplicate content; I have seen it. I can't post any links, but I do know that duplicate content can get a site knocked out of Google.
     
    Web Gazelle, Jun 22, 2006 IP
  7. proudlyPinoy

    proudlyPinoy Peon

    #7
    This might not be a good example, but how about lyric sites? Don't they basically have the same content? Again, this might not really be a good example, but it shows my point.
     
    proudlyPinoy, Jun 22, 2006 IP
  8. CrankyDave

    CrankyDave Peon

    #8
    Although Google can penalize duplicate content, from what I've seen it has to be darn near identical.

    Look at how many scraper sites contain nothing but duplicate content, identical word for word to the original, that gets and remains indexed.

    Dave
     
    CrankyDave, Jun 22, 2006 IP
  9. Web Gazelle

    Web Gazelle Well-Known Member

    #9
    Duplicate content will not get you banned. You just won't show for more competitive search terms.
     
    Web Gazelle, Jun 23, 2006 IP
  10. wibr

    wibr Peon

    #10
    If anyone can actually figure out, and prove, what the criteria are for duplicate content in Google, I'll send you my next *affiliate check.



    Footnote
    *don't expect to retire
    *you might be able to go to the movies or something though
     
    wibr, Jun 23, 2006 IP
  11. sholiz

    sholiz Active Member

    #11
    I'm curious about this too. I'm opening an article website, and some of my content will be coming from my forum. I wonder how they'd see this ...
     
    sholiz, Jun 23, 2006 IP
  12. Web Gazelle

    Web Gazelle Well-Known Member

    #12
    Just make everything that you have control over original.
     
    Web Gazelle, Jun 23, 2006 IP
  13. redhits

    redhits Notable Member

    #13
    I think that they are looking for phrase similarities :) Hey, just search Google for a normal phrase, only 4-5 words, and you will see how few results you get!

    For example, search for "I want to leave New York":

    http://www.google.com/search?source...:2006-11,GGGL:en&q="I+want+to+leave+New+York"


    see it?! got it?!


    I think that by doing what in programming is called backtracking, they are able to detect any duplicated content on the web, just by building phrases of 4-5 keywords!
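A rough sketch of the 4-5 word phrase idea above, assuming a plain sliding window over the words of a page (usually called shingling rather than backtracking). Nothing here is confirmed to be what Google does; the function names `shingles` and `overlap` and the choice of Jaccard similarity as the score are illustrative assumptions.

```python
def shingles(text, k=4):
    """Return the set of k-word phrases found by sliding a window over text."""
    words = text.lower().split()
    # A text shorter than k words yields an empty set.
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap(a, b, k=4):
    """Jaccard similarity of the two texts' phrase sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "I want to leave New York"
copy = "I want to leave New Jersey"
print(overlap(original, original))  # 1.0
print(overlap(original, copy))      # 0.5
```

Under this sketch, a rule of thumb such as the 50% figure mentioned earlier in the thread would simply be a threshold on the score; where any real search engine draws that line is not known from this discussion.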

    G is very smart and can also detect your website "template" from your pages. So it will only read and count your original page content, not the site-wide text and menus (like a news box) ... everybody saw that?!


    Now, here is a trick to keep Google from thinking your website has duplicated content.

    If you have 1000 pages, then all keyword phrases like word1,word2,word3,word4 and then word2,word3,word4,word5 must be compared.
    If you have 300 words on a page and you take word1,word2,word3,word4 and then word2,word3,word4,word5, etc., there are 300 * 4 = 1200 arrangements.
    If there are, let's say, 300 billion indexed pages on the internet, then Google would have to do
    300 billion * 1000 * 1200 = quite a few calculations :)
    (I also think the difference between pages that have a cache and pages that don't comes from this problem of so many calculations.)
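The arithmetic above can be checked with a few lines. This is only a sketch of the post's own back-of-the-envelope estimate: the 300 billion page count and the 1000-page site are the post's guesses, and `shingle_count` is a hypothetical helper. Note that a window moving one word at a time over 300 words gives 300 - 4 + 1 = 297 four-word phrases, not 1200, so both figures are shown.

```python
def shingle_count(words_on_page, k=4):
    """How many k-word phrases a one-word-at-a-time sliding window yields."""
    return max(words_on_page - k + 1, 0)


INDEXED_PAGES = 300_000_000_000   # the post's guess: 300 billion pages
SITE_PAGES = 1000                 # pages on your own site
WORDS_PER_PAGE = 300

# The post's figure: 300 * 4 = 1200 phrases per page.
post_estimate = INDEXED_PAGES * SITE_PAGES * 1200

# Sliding-window figure: 297 phrases per page.
window_estimate = INDEXED_PAGES * SITE_PAGES * shingle_count(WORDS_PER_PAGE)

print(f"{post_estimate:.2e}")     # 3.60e+17
print(f"{window_estimate:.2e}")   # 8.91e+16
```

Either way the number is huge if every phrase is compared against every page one by one. A common way around that, in general, is to hash each phrase into an index and look it up once, which makes the cost grow with the number of phrases instead of the number of page pairs; whether Google does this is an assumption, not something this thread establishes.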
    I saw that Google uses a quick trick to avoid wasting too much time processing data: it really just looks at the website title + number of links!
    If you have the same title on your pages + the same number of outgoing links, it will usually mark them as duplicated content! Even if you write 300 KB of unique content, it will think it's duplicated!
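To make the claim above concrete, here is a minimal sketch of what a "title + number of links" shortcut could look like. It is purely illustrative of the poster's theory, not a documented Google behavior. `PageSignature` and `signature` are hypothetical names, and for simplicity the sketch counts every `<a href>` link rather than only outgoing ones.

```python
from html.parser import HTMLParser


class PageSignature(HTMLParser):
    """Collect the <title> text and count <a href> links on one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def signature(html):
    """Return (normalized title, link count) as a cheap page fingerprint."""
    parser = PageSignature()
    parser.feed(html)
    parser.close()
    return (parser.title.strip().lower(), parser.links)
```

Two pages with the same title and the same link count get the same fingerprint under this sketch no matter how different their body text is, which is exactly the false positive the post warns about. If the theory holds, giving each page its own title would be the practical way to avoid it.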
     
    redhits, Jun 23, 2006 IP