Why Article Spinning Doesn't Work (aka How Google is Structured)

Discussion in 'Copywriting' started by mariobaez, Jun 3, 2010.

  1. #1
    You can PM me since I will write an article explaining this, but I am here to set the record straight that although you can get some keyword juice by duplicating your articles using automated word-mutating efforts, Google is already ahead of the game.

    As their local engines and the Caffeine process become more complex, article spinning methods will not only be meaningless, but could put you on some not-yet-created blacklist that could hurt your sites for years to come. Let me tell you how I believe their duplicate detection works, based on my own research and an internship with an education database company.


    1. Algorithm to find duplicates

    No one knows it, or else people would be able to bypass it all the time. Some people do, but there is very little science to it. Google has multiple measurements to catch you. Here are some of the parts.

    2. Keyword/phrase point system

    So let's say you reduplicate your article with changes in words or sentence structure. Look at these sentences:

    * I'm going to the Los Angeles bar to get their delicious cheap chicken wings.
    * I'm leaving because there are inexpensive hot wings at the Los Angeles Bar

    So you are trying to link the keywords "Los Angeles Bar" to your site and hide the fact that the content is spun. Google will look at the keywords and assign points to see if the content is duplicated:

    I'm going = idiom of movement = 2 points
    I'm leaving = same = 2 points
    Los Angeles Bar = keyword phrase specific = 50 points
    chicken wings = phrase = 9 points
    hot wings = phrase = 9 points

    It will add up the points, not the keywords, in the article. The closer the point totals, the more likely it was duplicated. The fewer the points, the more likely it is an anomaly or "organic". But how does Google give points to keywords?
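To make the point idea concrete, here is a minimal Python sketch. The point values are the made-up ones from the example above, and the `tolerance` threshold is a pure assumption; nothing here reflects how Google actually scores anything.

```python
# Hypothetical point values copied from the example above; not real Google data.
PHRASE_POINTS = {
    "i'm going": 2,
    "i'm leaving": 2,
    "los angeles bar": 50,
    "chicken wings": 9,
    "hot wings": 9,
}


def point_total(text, phrase_points=PHRASE_POINTS):
    """Sum the points of every known phrase that occurs in the text."""
    lowered = text.lower()
    return sum(points for phrase, points in phrase_points.items() if phrase in lowered)


def looks_duplicated(text_a, text_b, tolerance=5):
    """Flag two texts whose point totals are within `tolerance` of each other."""
    return abs(point_total(text_a) - point_total(text_b)) <= tolerance


original = "I'm going to the Los Angeles bar to get their delicious cheap chicken wings."
spun = "I'm leaving because there are inexpensive hot wings at the Los Angeles Bar"

print(point_total(original), point_total(spun), looks_duplicated(original, spun))
```

Both sentences land on the same total even though the wording changed, which is the whole argument: swapping words does not move the score if the swapped words carry the same points.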

    3. Keyword Meaning System

    Google doesn't give a flying F about keywords. All it does is take the real data, give it value, translate the data into semantic meanings, give those meanings value, and do a bunch of nerd stuff to see what the data stands for.

    Take the words "Tiger Woods", "Chicken Wings" and the letter "r". Let's give some definitions and relations to these words:

    Tiger Woods - golf, superstar, sports, espn, cheater, lion, forest, champion, masters
    chicken wings - food, dinner, spicy, hot, restaurant, 10 cent, chicken, wings
    r - letter, alphabet, ???

    As you can see, Google gives keywords their own "meaning" based on other keywords. This could be automated based on other content, where they take words that keep popping up and attach them to the keyword. Or it could be semi-automated, where they add certain keywords they know will make the formulas work.

    Google automates a point system for the words, and each keyword has an absolute point status. The point statuses of proper nouns are pretty apparent since the word itself cannot be spun (Tiger Woods is Tiger Woods). Letters like r are so ambiguous that any keywords attached to them will probably look senseless, since letters occur so often and gibberish (like hssdknf) occurs so rarely. That is probably why Google gives a default point status to those anomalies.

    But keyword phrases like chicken wings could have the same point status as its synonyms:

    chicken wings - food(2), dinner(2), spicy(5), hot(4), restaurant(7), 10 cent(7), chicken(3), wings(3)
    hot wings - food(2), dinner(2), spicy(5), restaurant(7), 10 cent(7), chicken(3), wings(3)

    As you can see, give semantic keywords a specific number of points and you will come up with similar point totals. I am sure it is way more complex, but they will pretty much add up an article's "point status", not its keywords, and work from that total.
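Here is a small Python sketch of that comparison. The association weights are the illustrative numbers from the two lines above, and `shared_weight` is just one assumed way to compare two phrases; it is not a known Google formula.

```python
# Association weights copied from the example above; purely illustrative.
ASSOCIATIONS = {
    "chicken wings": {"food": 2, "dinner": 2, "spicy": 5, "hot": 4,
                      "restaurant": 7, "10 cent": 7, "chicken": 3, "wings": 3},
    "hot wings": {"food": 2, "dinner": 2, "spicy": 5,
                  "restaurant": 7, "10 cent": 7, "chicken": 3, "wings": 3},
}


def meaning_total(phrase):
    """Add up every association weight attached to a phrase."""
    return sum(ASSOCIATIONS[phrase].values())


def shared_weight(phrase_a, phrase_b):
    """Weight the two phrases have in common, divided by their combined weight."""
    a, b = ASSOCIATIONS[phrase_a], ASSOCIATIONS[phrase_b]
    shared = sum(min(a[key], b[key]) for key in a.keys() & b.keys())
    combined = sum(max(a.get(key, 0), b.get(key, 0)) for key in a.keys() | b.keys())
    return shared / combined


print(meaning_total("chicken wings"), meaning_total("hot wings"))
print(round(shared_weight("chicken wings", "hot wings"), 2))
```

The totals come out at 33 and 29, and the shared weight is 29 out of 33, so under these assumed numbers the two phrases look nearly interchangeable.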

    Also, they might map out the points of each word on a matrix:

    30 4 6 7 8 9 10 55 10 5

    20 4 9 8 9 20 33 99 3 5

    30 4 6 7 22 54 6 88 3 8

    And based on the structure, they can determine if your content is duplicated based on algorithms to compare one mapped data with another.
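One possible way to compare such mapped rows is cosine similarity. That choice is my assumption for this sketch, since the post does not name a method, and the rows are the sample numbers from the matrix above.

```python
import math

# The three sample rows from the matrix above, one row per article.
ROWS = [
    [30, 4, 6, 7, 8, 9, 10, 55, 10, 5],
    [20, 4, 9, 8, 9, 20, 33, 99, 3, 5],
    [30, 4, 6, 7, 22, 54, 6, 88, 3, 8],
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length point vectors."""
    if len(a) != len(b):
        raise ValueError("rows must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Compare every pair of rows; values near 1.0 mean a very similar "shape".
for i in range(len(ROWS)):
    for j in range(i + 1, len(ROWS)):
        print(i, j, round(cosine_similarity(ROWS[i], ROWS[j]), 3))
```

A value close to 1.0 would mean two articles have almost the same point layout. Where to draw the line for "duplicated" is left open here, just as it is in the post.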


    4. Non-Semantic Sentence Logic

    Since two people can simultaneously create original content bashing Tiger Woods, the best way to confirm that spun content is duplicated is sentence logic. They take out the proper nouns and other keyword phrases that occur frequently, and ask the algorithm:

    "Hey, in these two data sets, how closely related is the logic?"

    This is for the unimportant words (such as prepositions, overused adjectives, etc.). They maybe count how many times each is used, how many times one non-keyword sits close to another (such as finding "in the" and "over the"), and so on. The possibilities here are endless, but you can tell how close the syntax is since most people don't write the same way.
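A rough Python sketch of that kind of counting follows. The `FUNCTION_WORDS` list is a short hand-picked sample and the overlap measure is an assumption made for illustration; a real system would presumably use far richer features.

```python
import re
from collections import Counter

# Small hand-picked sample for illustration; a real list would be much longer.
FUNCTION_WORDS = {"the", "a", "an", "in", "over", "to", "at", "of",
                  "and", "because", "there", "are", "their"}


def function_word_profile(text):
    """Count function words and adjacent function-word pairs such as 'in the'."""
    words = re.findall(r"[a-z']+", text.lower())
    profile = Counter(word for word in words if word in FUNCTION_WORDS)
    for first, second in zip(words, words[1:]):
        if first in FUNCTION_WORDS and second in FUNCTION_WORDS:
            profile[first + " " + second] += 1
    return profile


def profile_overlap(text_a, text_b):
    """Shared counts divided by combined counts; 1.0 means identical profiles."""
    a, b = function_word_profile(text_a), function_word_profile(text_b)
    shared = sum((a & b).values())    # Counter & keeps the minimum of each count
    combined = sum((a | b).values())  # Counter | keeps the maximum of each count
    return shared / combined if combined else 0.0


print(function_word_profile("in the house over the hill"))
```

Note that the proper nouns and keyword phrases never enter the profile at all, so two articles about different topics can still score as close if the small words are arranged the same way.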

    Google doesn't penalize as much as it used to because of syndication and blog referencing. But once rel canonical catches on, or they test it enough to know it can be a powerful tool for their indexing, it may make catching spun articles even easier.


    Just my two cents. Any thoughts? I plan to write this as a big article with some graphic examples.
     
    mariobaez, Jun 3, 2010 IP
  2. dyadvisor

    #2
    Mario: This is a very worthy posting

    Closely checking your logic – it sure matches mine. Plus, you did a superior job of explaining it.

    I highly recommend that others who plan to be around copy this post to Notepad, as it applies not only to articles, but to blogs, home pages, and virtually all non-video content.

    Just want to add one point. Spinners fool Copyscape, but not Google. In fact my list of 650+ Google stop words shows that Google is aware of every substitution word that spinning programs use. Example: about = most = almost = nearly. All spin words that change, all stop words that Google ignores.

    Thanks for a very in-depth study, that I hope enough people realize the value of. --------I do------------
     
    dyadvisor, Jun 3, 2010 IP
  3. Perry Rose

    #3
    *rolls eyes*

    You don't even know what you are talking about.


    mariobaez, you are talking about this point system. ... Which Google source is this from?

    No offense, but your article needs a lot of work.

    For starters, it is too technical, even though it is not. For many, it LOOKS technical, which will have many logging off.

    It really does not need to be said anyway.

    You left out the fact that spinners cannot write out more than 2 GOOD articles.

    Or did you not know this?

    And even then, that second article either doesn't make much sense, or it is below par.

    Tell your readers to use this tool to see for themselves: http://www.articlequeen.com/

    Tell your readers that they are better off writing a fresh article by looking at and rewriting the original article, which takes less than an hour anyway.

    They can do, say, a couple a day, everyday.

    Even a piss-poor writer can do this.

    That's the bottom line, and, in a nutshell.

    On and on....

    Anyway, since I despise article spinners because they are such a joke, and so many people are getting ripped off, I wish you luck with it.
     
    Last edited: Jun 3, 2010
    Perry Rose, Jun 3, 2010 IP
  4. mariobaez

    #4
    You are right. Most of my points are not based on anything Google has made known. But I have been part of a database program that translates foreign documents and journals for English use, and the programs find idioms by giving words a specific letter-number combo. This lets specific idioms have their own combination, while other, ambiguous words share the same number combination.

    I was only stating a hypothesis: I feel GOOG pays more attention to the secondary meaning of a keyword, and not the word itself.

    I didn't even mention that I was helping article spinners. Thanks for the criticism, but you sound a little grumpy:(
     
    mariobaez, Jun 3, 2010 IP
  5. phpwnes

    #5
    it all depends how you spin if you just spin and change synonyms then yes google might catch up with that but if you change whole sentences and their composition making them atleast more than 70% unique i doubt that google can do anything about that.
     
    phpwnes, Jun 3, 2010 IP
  6. Perry Rose

    #6
    lol no, not at all.

    Right about now...hungry, actually.

    Time for McDonalds!


    phpwnes, that's just it, spinners cannot do that and put out a good article(s) at the same time.



    I hear a Big Mac calling my name.
     
    Perry Rose, Jun 3, 2010 IP
  7. mirisaamali

    #7
    Well guys, one thing I have learnt in every business is that "Dishonesty sure fails one day". It's better to put more effort into following the regular, allowed path than to go other ways to make quick money. If you look at the end result, it's actually a huge loss to us.
     
    mirisaamali, Jun 3, 2010 IP
  8. SwimFinn

    #8
    It was looking as if you were using string and database language methods, which are probably part of the simpler analysis being done. You didn't mention the multidimensional comparisons which are undoubtedly being done, nor performing entirely different reanalysis of possible matches. They're not going to throw a lot of horsepower at everything, but once a possible match exists they can devote more resources to examining possible matches. The reanalysis could involve anything, including computational linguistic techniques of arbitrary complexity.

    I assume the guys at Google know what they're doing better than I do, and they might do analysis which is at least ten times better than the most complex that I can do, and that they have several fast screens ahead of that to find things worth examining in detail.
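The "fast screen first, expensive analysis only on possible matches" idea can be sketched in Python like this. Both thresholds are invented for the example, and `difflib.SequenceMatcher` only stands in for whatever heavier analysis would really be used.

```python
from difflib import SequenceMatcher


def cheap_screen(text_a, text_b):
    """Fast first pass: fraction of distinct words the two texts share."""
    words_a, words_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def detailed_check(text_a, text_b):
    """Slower second pass; SequenceMatcher is a stand-in for heavier analysis."""
    return SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()


def find_suspects(docs, screen_threshold=0.3, detail_threshold=0.6):
    """Run the detailed check only on pairs that pass the cheap screen."""
    suspects = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cheap_screen(docs[i], docs[j]) < screen_threshold:
                continue  # not worth any more horsepower
            score = detailed_check(docs[i], docs[j])
            if score >= detail_threshold:
                suspects.append((i, j, score))
    return suspects


docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox leaps over the lazy dog",
    "completely unrelated sentence about golf",
]
print(find_suspects(docs))
```

Only the first two documents survive the screen, so the slower comparison runs once instead of three times.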

    What works best is to write for humans -- of course, here we're talking about writing for humans a description of what Google might be doing.
     
    SwimFinn, Jun 3, 2010 IP
  9. SwimFinn

    #9
    We don't know. If they run sentences through a parser like this (and this is a simple parser) then they can do further analysis to try to identify meaning, or to identify what to ignore.

    (NP it) (NP all) (VP depends) (ADVP how) (NP you) (VP spin) (SBAR if) (NP you) (ADVP just) (NP spin) and (VP change) (NP synonyms) then yes (NP google) (VP might catch) up (PP with) (NP that) but (SBAR if) (NP you) (VP change) (NP whole sentences) and their composition (VP making) (NP them) (NP atleast) (NP more than 70 % unique i) (VP doubt) (SBAR that) (NP google) (VP can do) (NP anything) (PP about) (NP that) .

    it - Noun phrase
    all - Noun phrase
    depends - Verb phrase
    how - Adverb phrase
    you - Noun phrase
    spin - Verb phrase
    if - Clause introduced by subordinating conjunction
    you - Noun phrase
    just - Adverb phrase
    spin - Noun phrase
    change - Verb phrase
    synonyms - Noun phrase
    google - Noun phrase
    might catch - Verb phrase
    with - Prepositional phrase
    that - Noun phrase
    if - Clause introduced by subordinating conjunction
    you - Noun phrase
    change - Verb phrase
    whole sentences - Noun phrase
    making - Verb phrase
    them - Noun phrase
    atleast - Noun phrase
    more than 70 % unique i - Noun phrase
    doubt - Verb phrase
    that - Clause introduced by subordinating conjunction
    google - Noun phrase
    can do - Verb phrase
    anything - Noun phrase
    about - Prepositional phrase
    that - Noun phrase
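For anyone who wants to go from the bracketed line to the word list automatically, here is a short Python sketch. It only reads the output format shown above and does not do the parsing itself; the label names are taken from the list in this post.

```python
import re

# Chunk tags and their plain-English names, as used in the list above.
CHUNK_LABELS = {
    "NP": "Noun phrase",
    "VP": "Verb phrase",
    "ADVP": "Adverb phrase",
    "PP": "Prepositional phrase",
    "SBAR": "Clause introduced by subordinating conjunction",
}

# Matches one "(TAG some words)" group; words outside brackets are skipped.
CHUNK_PATTERN = re.compile(r"\((NP|VP|ADVP|PP|SBAR) ([^()]+)\)")


def read_chunks(parsed):
    """Turn bracketed chunker output into (words, label) pairs, in order."""
    return [(words, CHUNK_LABELS[tag]) for tag, words in CHUNK_PATTERN.findall(parsed)]


sample = "(NP it) (NP all) (VP depends) (ADVP how) and (VP might catch) up (PP with)"
for words, label in read_chunks(sample):
    print(words, "-", label)
```

Words that sit outside any bracket ("and", "up") are dropped, which matches the list above.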
     
    SwimFinn, Jun 3, 2010 IP
  10. dyadvisor

    #10
    SwimFinn, your first posting is very true. Plus, the OP did not even get into what is called operation "Mayday" by those being hurt by even newer changes (look up Matt Cutts Google Mayday). The info leaked out, I believe, on 5/21/10.

    As for your second posting, here. If you compare your words with Google stop words, my list has almost every one of them. So yes, good or bad, we have to keep changing. However change means opportunity. Your knowledge of this will certainly pay off ---good comments---
     
    dyadvisor, Jun 3, 2010 IP
  11. SwimFinn

    #11
    I'm aware of the "Mayday" algorithm update. We're discussing duplicate content detection, not long tail search results.

    I'm also aware of Google stop words, which are words which are ignored when parsing while performing searches. Again, we're discussing duplicate content analysis, not search processing. My second posting's list of words was also an example of simple parsing, not of specific content intended to deal with words used in content analysis.
     
    SwimFinn, Jun 4, 2010 IP
  12. dyadvisor

    #12
    SwimFinn

    Here is the subject line, in case you did not read it closely enough: Why Article Spinning Doesn't Work (aka How Google is Structured)

    The OP kindly asked not for pure criticism, but for what may be lacking or other points to consider for his final report. I said your points are valid, just like mine. If you read closely what came from those who questioned Matt of Google, "Mayday" has little effect on long tail search and will have more effect on word structuring. This is where many retail stores are currently seeing their searches (and with them sales and customers) plunge by up to 50%. They must also use the Google word structuring that Mario is referring to. So it affects all business-oriented writing too.

    It is such a complex matter, and it incorporates past and new Google policies. I am sure that when he does his finalized report it will include quite a few areas. My comments were an addition, as were yours. If you have additional insight on word structuring it would also be helpful.

    I think he said report not book, but yet detailed. The new relevancy balance between links and dominant keyword structure is even another tangent.
    Unless the OP states differently, he wanted input about word structure theory plus when and how it comes into play. Not criticism to evolve.
    Speaking only for myself, and also studying this area, I will be very interested in this final report and that is why I highly complimented him. I learn from the people I tutor, no one has everything mastered 100%. I encourage posts from others.

    --so you were swimming perfectly in the right direction, do not suddenly get a long tail in your way-------have a good day----
     
    Last edited: Jun 4, 2010
    dyadvisor, Jun 4, 2010 IP
  13. contentboss

    #13
    Google uses shingling, and shingle ordering, same as the other engines. They also look for syntactical aberration in-page (i.e. without reference to other pages) because it's a computationally VERY cheap method of weeding out stuff that's either VERY badly written, or VERY badly spun, e.g. with a collaborative spinner.

    The OP unfortunately hasn't thought it through - except for about 2% of the language, most English words can't be disambiguated without context. Which also deals with the 'stop word' obsession of the other poster - the stop words are actually essential in order to establish context. So every word would have to effectively be a hyper-dimensional matrix of synonyms where the branes are contextually positioned.

    While syntax checking and shingling work so well, why invent something more complicated?
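For readers who have not met the term, shingling means breaking a text into overlapping runs of words and comparing the resulting sets. Below is a minimal Python sketch using word 3-grams and Jaccard overlap. The shingle size is an arbitrary choice for the example, shingle ordering is not covered, and none of this is claimed to be what any search engine actually runs.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in a text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def resemblance(text_a, text_b, size=3):
    """Shingle both texts and compare the two shingle sets."""
    return jaccard(shingles(text_a, size), shingles(text_b, size))


print(shingles("a b c d"))
print(resemblance("a b c d", "a b c e"))
```

Changing one word breaks every shingle that contains it, so a synonym swap lowers the score less than people expect when most of the surrounding word order is left intact.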
     
    contentboss, Jun 4, 2010 IP
  14. SwimFinn

    #14
    Bingo. Or maybe "singularity" is more appropriate.
     
    SwimFinn, Jun 4, 2010 IP
  15. Perry Rose

    #15
    Jesus, dyadvisor, can you get any more fucked up than you already are right now?


    Good point, CB. Another reason why spinners are a waste of time.

    I only wish there were plenty of well-researched material out there to educate noobs on article spinners so they don't get ripped off.

    There's enough shit out there as it is.
     
    Perry Rose, Jun 4, 2010 IP
  16. SwimFinn

    #16
    So we should all write articles about spinners?
     
    SwimFinn, Jun 4, 2010 IP
  17. joshvelco

    #17
    Very interesting idea. If I were creating an algorithm to check for duplicated content, I would use a dictionary/thesaurus file and check for words that are synonyms of each other. Then I would include a check for keywords which feature in both, e.g. "wings" in your example, as there are some words you just can't change without the text meaning something completely different.
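A toy Python version of that idea might look like this. The thesaurus is a tiny hand-made stand-in for a real dictionary/thesaurus file, and the anchor keyword list is an assumption taken from the "wings" example.

```python
import re

# Tiny hand-made thesaurus for illustration; a real one would be loaded from a file.
SYNONYM_GROUPS = [
    {"cheap", "inexpensive"},
    {"delicious", "tasty"},
    {"get", "buy"},
]

# Map every word to one representative of its synonym group.
CANONICAL = {word: min(group) for group in SYNONYM_GROUPS for word in group}


def normalize(text):
    """Lowercase, split into words, and replace each word by its group representative."""
    words = re.findall(r"[a-z']+", text.lower())
    return [CANONICAL.get(word, word) for word in words]


def synonym_overlap(text_a, text_b):
    """Share of distinct normalized words that the two texts have in common."""
    a, b = set(normalize(text_a)), set(normalize(text_b))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def shared_anchors(text_a, text_b, anchors=("wings",)):
    """Keywords that cannot be swapped out and appear in both texts."""
    a, b = set(normalize(text_a)), set(normalize(text_b))
    return [keyword for keyword in anchors if keyword in a and keyword in b]


print(synonym_overlap("cheap delicious wings", "inexpensive tasty wings"))
print(shared_anchors("cheap delicious wings", "inexpensive tasty wings"))
```

Once synonyms collapse to the same representative, a word-for-word spin scores as identical to its source, and the shared anchor keyword confirms the two texts are about the same thing.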
     
    joshvelco, Jun 5, 2010 IP
  18. contentboss

    #18
    depends on the spinner.

    Collaborative spinners are almost a definition of 'a waste of time'. After all, if I said to you "Give me $80 and I'll get a bunch of semi-literate blackhatter-wannabees to rewrite your articles for you in excruciatingly poor eeeengleeeesh" would you say "wow, great deal" or would you say "*^G ^ &^&^ yourself"?
     
    contentboss, Jun 5, 2010 IP
  19. mjtaylor

    #19
    I would love to know how you know this. I haven't heard the term 'shingling', but any insight into how SE algos might work is very interesting.

    And I had to look up disambiguated. You know you are speaking way over most of the heads here, right? Don't get me wrong, you seem to know things I want to know ... but if I don't know your vocabulary, you can bet most people don't.

    Branes? As in string theory?

    Please come down to earth and teach us. You have so much to offer! :)

    Thanks!!
     
    mjtaylor, Jun 5, 2010 IP
  20. dyadvisor

    #20
    Mjtaylor, I agree the terminology is rather intense, but so are the methods used. After reviewing and doing some additional checking, it could in some fashion be equated to imitating human intelligence while considering the powers of Google.
    -----------------------------------------
    Yes Perry Bouy

    There is still another one here at the Forum that goes way above your stained pant legs.

    With the help of Content Boss, I was able to see some of the complex changes he is referring to.
    Spinning is commonly thought of as simply replacing one word with another.

    I now know of two sources, the superior being the one under Content Boss, that take it much deeper into restructuring not only portions of sentences, but the entire document. I know you hate to hear it, but it has a major impact on enhancing that complexity. The result is more like a total rewrite.

    If the original was of high quality, the next version is of equal quality. Yet if put in with 9 originals, it would be good enough to pass as the 10th original. Of course this technology is not cheap, so sorry Perry that would completely put it out of your range.---------------------------------
     
    dyadvisor, Jun 5, 2010 IP