Duplicate content

Discussion in 'Google' started by Dimasik, Dec 18, 2007.

  1. #1
    Hi all!
    Does anybody know how Google finds duplicate content?

    For example, I have two identical articles, and Google treats them as duplicate content. What percentage of the article text do I have to change so that they are no longer duplicates?
    I'm interested in Google's algorithm for finding duplicate content.
    Where can I read about it?
     
    Dimasik, Dec 18, 2007 IP
  2. grg

    grg Guest

    #2
    It just compares it, word by word. How much to change? I doubt that anybody can answer...
     
    grg, Dec 18, 2007 IP
  3. jjpmarketing

    jjpmarketing Peon

    #3
    Just like everything Google does, I am sure there is an algorithm involved. Will having duplicate content hurt your efforts? Probably... but not for just 2 articles. All you would need to do is re-word the document. Make it seem like a different writer wrote the article. Make the articles different lengths. If one is 1000 words, then make the other 1200 words. If you add to it and re-word some of it, it should be unique enough that it isn't considered duplicate content.

    But this is just me speculating.
     
    jjpmarketing, Dec 18, 2007 IP
  4. Dimasik

    Dimasik Peon

    #4
    I don't think it's possible to compare millions of pages word by word every day.
    I know that making a new article is easy. You have to add some text, use some synonyms, change some words.

    But does anybody know how many changes you have to make for it to count as unique? I think there's some percentage of the text that has to change.
    And how can I check for duplicate content? Only in the supplemental index, or is there another way?
     
    Dimasik, Dec 18, 2007 IP
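
On the "what percentage" question in #4: nobody in this thread knows what Google actually does, so the following is only a sketch of one common textbook way to put a number on how similar two texts are. It breaks each article into overlapping 3-word "shingles" and measures how many shingles the two articles share (Jaccard similarity). The function names and the shingle size of 3 are assumptions chosen for illustration, not anything Google has confirmed.

```python
import re


def shingles(text, k=3):
    """Return the set of overlapping k-word windows ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(article_a, article_b, k=3):
    """Shared shingles divided by all distinct shingles, from 0.0 to 1.0."""
    set_a = shingles(article_a, k)
    set_b = shingles(article_b, k)
    if not set_a and not set_b:
        return 1.0  # two empty texts are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    original = "the quick brown fox jumps"
    reworded = "the quick brown fox sleeps"
    print(jaccard_similarity(original, reworded))
```

Because each page is reduced to a set of shingles once, two pages can be compared by set overlap instead of rereading both texts word by word. Changing a single word breaks every shingle that contains it, which is why a small edit can lower the score more than expected.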
  5. eyeflare

    eyeflare Peon

    #5
    My take is that as long as it passes a Copyscape check, you've probably reworded enough to avoid a duplicate hit. Anything less than that... who knows?
     
    eyeflare, Dec 18, 2007 IP
  6. grg

    grg Guest

    #6

    So what do you think they do every day, just count their income? :)


    I'd say it's possible to have one or two sentences the same as in the other article, but the more identical sentences there are, the more likely it is to look suspicious.

    The second thing is that they probably have quite sophisticated algorithms for filtering and comparing data in multiple passes, etc... but no one knows them, so it's not easy to answer. If you'd like to know, start a couple of blogs and check it for yourself.
     
    grg, Dec 18, 2007 IP
  7. Candle Making

    Candle Making Peon

    #7
    Sorry if it's a stupid question but what's a Copyscape check?
     
    Candle Making, Dec 18, 2007 IP
  8. eyeflare

    eyeflare Peon

    #8
    Take a look at www.copyscape.com
     
    eyeflare, Dec 18, 2007 IP
  9. mojtata

    mojtata Well-Known Member

    #9
    I think 30-40% uniqueness on dupecop.com is enough.
    It can compare the uniqueness of article A against article B.
     
    mojtata, Dec 18, 2007 IP
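
How dupecop.com arrives at its uniqueness figure isn't stated in this thread, so the following is just a rough sketch of how an "article A vs article B" percentage could be computed with Python's standard `difflib` module. The function name `percent_unique` and the choice to compare whole words are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def percent_unique(article_a, article_b):
    """Estimate how different two articles are, as a percentage (0 to 100).

    Words are compared in order; 0.0 means the word sequences are identical,
    100.0 means no words could be matched at all.
    """
    words_a = article_a.lower().split()
    words_b = article_b.lower().split()
    if not words_a and not words_b:
        return 0.0  # two empty articles are identical
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent items in long sequences.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return round((1.0 - matcher.ratio()) * 100, 1)


if __name__ == "__main__":
    print(percent_unique("a b c d", "a b c e"))
```

A score like this only says how far apart two specific texts are. Whether 25%, 30% or 40% is "enough" for a search engine is still a guess, as several posts above point out.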
  10. codeber

    codeber Peon

    #10
    It's worth trying if you are worried.

    It's better to be overly cautious than not cautious enough.
     
    codeber, Dec 18, 2007 IP
  11. WilsonA

    WilsonA Active Member

    #11
    Normally Google would not penalize you for duplicate content if you link back to the original, if there is one.
    But if you are going to change the text, change about 25-30% of it and you should be OK.
     
    WilsonA, Dec 18, 2007 IP
  12. Candle Making

    Candle Making Peon

    #12
    Thanks for the link eyeflare. I'm pretty new to this whole website promoting thing but I've learned quite a lot on this site already. Thanks :)
     
    Candle Making, Dec 18, 2007 IP
  13. itliberty

    itliberty Peon

    #13
    Are both sites on the same server? Google might treat them as duplicates because they reside on the same IP. I also think whether Google's algorithm calls it duplicate depends on the keyword and the phrases leading up to it and following it.

    So my suggestion would be to analyze the keywords you want to be indexed for and only worry about changing those sentences.

    Copyscape, although great, will drive you nuts if you try to get a "no results found" pass from it, because it will report sites that match only one sentence completely. So you would have to change the whole article to get "no results found".

    Take care!
     
    itliberty, Dec 18, 2007 IP
  14. master_06

    master_06 Peon

    #14
    Google uses some kind of tool to detect duplicate content. Try this one as an example: duplicate content checker tool.

    I have used it and found it very useful in knowing if someone else duped your content or if you have duped your own.

    Reminder:

    The tool is not always that accurate, but it will still give you some results for your query.
     
    master_06, Dec 18, 2007 IP
  15. corlock

    corlock Banned

    #15
    Definitely. When the content is completely identical, even you can figure it out...
     
    corlock, Dec 18, 2007 IP