Where is the threshold for content duplication?

Discussion in 'Search Engine Optimization' started by Alison, Dec 2, 2006.

  1. #1
    Does anyone know how much content duplication is allowed by Google, in terms of word frequency?

    I use Similar page checker to measure the % of similarity, but there is no information on how it computes the value.

    I have a page generator and I need to adjust it for optimal results.
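
The thread does not say how the similarity checker arrives at its percentage, and nothing here reflects how Google measures duplication. As a rough illustration only, the sketch below shows two common ways a "% similarity" between two texts could be computed: cosine similarity over word frequencies, and Jaccard overlap of word shingles. The function names and the shingle size `k=3` are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_frequency_similarity(a, b):
    """Cosine similarity of word-count vectors, as a percentage (0-100).

    Hypothetical helper: ignores word order entirely.
    """
    fa, fb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(fa[w] * fb[w] for w in fa.keys() & fb.keys())
    norm_a = math.sqrt(sum(c * c for c in fa.values()))
    norm_b = math.sqrt(sum(c * c for c in fb.values()))
    norm = norm_a * norm_b
    return 100.0 * dot / norm if norm else 0.0


def shingle_similarity(a, b, k=3):
    """Jaccard similarity of k-word shingles, as a percentage (0-100).

    Hypothetical helper: sensitive to word order, because each shingle
    is a run of k consecutive words.
    """
    def shingles(text):
        words = tokenize(text)
        return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return 100.0 * len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    page_a = "the cat sat on the mat"
    page_b = "mat the on sat cat the"  # same words, shuffled
    print(word_frequency_similarity(page_a, page_b))
    print(shingle_similarity(page_a, page_b))
```

The two measures can disagree sharply: a page that reuses the same words in a different order scores high on word frequency but low on shingles, which is one reason a single percentage from an undocumented tool is hard to interpret.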
     
    Alison, Dec 2, 2006 IP
  2. hans

    hans Well-Known Member

    #2
    As a rule of thumb:

    when you ask yourself such a question, or
    when you have to calculate the similarity of your content, then most likely you have NO self-created unique content.

    In ANY self-created unique content there is never any need to count or verify the keywords or similarity - human creativity ensures that every new page you publish is different.

    Be creative - use and deploy your God-given creative potential - there are billions of unique pages that can be written each day to meet the information hunger and information NEED of the whole globe.
     
    hans, Dec 2, 2006 IP
  3. KC TAN

    KC TAN Well-Known Member

    #3
    Nobody can answer your question except the Google guys. Do not think in the context of search engines; just make sure your users do not get that 'duplicate content' feeling when they are browsing your site.
     
    KC TAN, Dec 2, 2006 IP
  4. Sprouter

    Sprouter Well-Known Member

    #4
    I thought it was 70%.
     
    Sprouter, Dec 2, 2006 IP
  5. blazinCrazy

    blazinCrazy Peon

    #5
    Does 70% of the content have to be different, or can 70% be the same?

    I remember reading something over at seochat.com about the threshold for duplicate content being very low and that you only need 30% original material. However, I don't think it's as cut and dried as a percentage.

    I think Google looks at many factors such as link popularity, grammar, HTML structure, outgoing links, etc.
     
    blazinCrazy, Dec 2, 2006 IP
  6. hazelj80

    hazelj80 Guest

    #6
    They most likely do, but no one knows how much weight each of the factors you mentioned carries.

    Most people seem to think it's all based on text only.
     
    hazelj80, Dec 2, 2006 IP
  7. xc06

    xc06 Notable Member

    #7
    This is a hard research problem. I study statistics, so I know. Who knows Google's algo?
     
    xc06, Dec 2, 2006 IP
  8. vitaminp

    vitaminp Peon

    #8
    Are you implying that images and layout can be classed as duplicate content?

    I can sort of understand images, as Google doesn't want to index layout-related images such as logos and menus... But could this also apply to a gallery, for example, where thumbnails and images are repeated throughout a site?
     
    vitaminp, Dec 3, 2006 IP
  9. solidghost

    solidghost Peon

    #9
    What does Google compare your content with, anyway?
    There is so much information out there on the web, and Google has probably indexed billions of pages.
     
    solidghost, Dec 3, 2006 IP
  10. Nick_Mayhem

    Nick_Mayhem Notable Member

    #10
    Try copyscape.com on other sites and then on your own site. :D

    You will know what to do and what not to do.
     
    Nick_Mayhem, Dec 3, 2006 IP
  11. thegypsy

    thegypsy Peon

    #11
    Hey troops…thought I’d chime in.

    To start with, duplicate content issues are a FILTER per se, not a penalty. So having dupe content is not necessarily going to tank you. You may not rank for the page in question, but that’s another story.
    Now, if your site meets other criteria (dupe content in aggregate, spammy techniques, etc.) then it could certainly become a penalty, but without a large degree of dupes across your site (in aggregate) there is not a lot to worry about really.

    The other topic mentioned here, relating to page structure/layouts, is the ‘page segmentation’ aspect of the algo. They most certainly understand page segmentation, and it can be called into play during the retrieval stages. While not entirely constructed for dupe detection (can you say editorial links?), it certainly has the potential to include those operations.
     
    thegypsy, Dec 3, 2006 IP