How can Google beat the spam?

Discussion in 'Search Engine Optimization' started by Obelia, Jun 25, 2006.

  1. #1
    This is following on from the thread about billions of spam pages in Google's index: http://forums.digitalpoint.com/showthread.php?t=97090

    I think it might be helpful to split this topic off rather than hijacking the discussion on an already-massive thread. Just to recap some good ideas that people had:

    Stringerbell suggested Google do a manual check on the top 1000 AdSense publishers, rated according to their "Google juice" (i.e., the best placed overall), and the top 1000 sites by indexed pages.

    SVZ suggested a manual check on sites that reach over 1 million pages indexed.

    I suggested an algorithm that kicks in for sites with lots of pages, to look for misspellings, poor grammar, and sentences without stop words by a percentage of overall text; and also an advanced dupe content filter that takes out the most frequent keywords.
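    As a rough illustration of the stop-word idea, here is a minimal sketch that measures what fraction of a page's words are stop words. The tiny stop-word list, the function names, and the 0.2 threshold are all assumptions made for the example; nothing here reflects how Google actually works.

```python
import re

# A deliberately tiny stop-word list (an assumption for this sketch);
# a real filter would use a much larger, language-specific list.
STOP_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "with", "is", "are", "was", "were", "it", "this", "that",
    "as", "by", "be",
})


def stop_word_ratio(text: str) -> float:
    """Return the fraction of words in `text` that are stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(1 for w in words if w in STOP_WORDS) / len(words)


def looks_keyword_stuffed(text: str, min_ratio: float = 0.2) -> bool:
    """Flag text whose stop-word ratio falls below `min_ratio`.

    The threshold is a guess; it would need tuning against real pages.
    """
    return stop_word_ratio(text) < min_ratio
```

    Natural prose tends to carry a steady share of stop words, so a bare keyword list scores near zero. Spam sliced out of real text can keep its stop words, so this check alone would not catch everything.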

    pjbrunet mentioned dupe-content checkers like turnitin, and the readability tests you can do to figure out the grade-level of text.
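    One widely used readability test is the Flesch-Kincaid grade level: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below computes it with a crude vowel-group syllable counter, which is an approximation chosen for brevity; the function names are made up for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate: count runs of consecutive vowels (min 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Approximate Flesch-Kincaid grade level of `text`.

    Returns 0.0 for text with no words or no sentences.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

    A score far outside the range seen for ordinary pages on the same topic could be one signal worth a closer look, though it proves nothing on its own.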

    Here's an extract of some spam from one of the sites mentioned in that thread. This is to give you some idea of what Google is up against.


    This random text is sliced oddly, as though it's come from someone's porn ebook with every fourth word missing, or some such. I'm not sure that any of the methods so far mentioned would be able to catch it with an algorithm, except perhaps a grammar checker.
     
    Obelia, Jun 25, 2006 IP
  2. TorchedSEO

    TorchedSEO Well-Known Member

    Messages: 369 | Likes Received: 13 | Best Answers: 0 | Trophy Points: 108
    #2
    As long as Googlebot remains an automated crawler, there's nothing they can do about the spam. Blackhat SEO is a cat-and-mouse game and Google is always trying to catch up. As for checking the top 1000 AdSense publishers, that won't really do much, since spam sites don't need to be in the top 1000 to make serious money, or even to use AdSense. The same goes for the number of pages indexed: most people who do blackhat SEO would set up 1000 domains with 10,000 pages each instead of 1 domain with 1 million pages.
     
    TorchedSEO, Jun 25, 2006 IP
  3. CJan_NH

    CJan_NH Peon

    Messages: 61 | Likes Received: 2 | Best Answers: 0 | Trophy Points: 0
    #3
    The irony of your keyword choice in this post floored me :eek:

    I am currently optimizing for Hilton Head Rental/Hilton Head Timeshare :D

    Edited to add: I'm doing it the right way though...
     
    CJan_NH, Jun 25, 2006 IP
  4. Obelia

    Obelia Notable Member

    Messages: 2,083 | Likes Received: 171 | Best Answers: 0 | Trophy Points: 210
    #4
    I got it from one of the spam sites mentioned in the "billion pages" thread; it was one of the first pages to come up on a site: query. Actually, I thought that was a semi-nonsense set of keywords, and not something you would actually optimise for. Shows what I know.

    True. But they have to at least try, or else they might as well shut up shop and start building a directory.

    All it would do is remove the most prominent spam sites, and ever so slightly increase costs for blackhats. So, more of a public relations exercise than anything else.

    I'm beginning to think that the survival of search engines will hinge on just one question: is there any way for a machine to distinguish grammatically correct but random nonsense from real information?
     
    Obelia, Jun 28, 2006 IP
  5. Mr.Dog

    Mr.Dog Active Member

    Messages: 912 | Likes Received: 18 | Best Answers: 0 | Trophy Points: 60
    #5
    Also, just check some major travel sites: you'll find them ranking in the top 10 in Google even though they carry long lists of spammy, keyword-repeating links.
    But Google has kicked out lots of ethical sites. It seems that small businesses, small sites and personal sites get hit especially hard, while the big ones aren't much affected.

    Now I wonder if your post here could be considered as "spam" by Google?
    Is Google capable of telling that we're just talking about spam here and that you're not actually spamming?

    I guess not...

    I guess to fight spam better they have to look at the visitor metrics.

    Signs of spam in Analytics might be:
    -very high bounce rate
    -very low avg. page views
    -very few returning visitors
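
    The three signals above could be combined into a simple check like the sketch below. The `SiteMetrics` fields and every threshold are hypothetical values picked for illustration, not figures from Google Analytics or from this thread.

```python
from dataclasses import dataclass


@dataclass
class SiteMetrics:
    bounce_rate: float              # share of single-page visits, 0.0 to 1.0
    avg_page_views: float           # average pages viewed per visit
    returning_visitor_share: float  # share of visits from returning users, 0.0 to 1.0


def spam_signals(
    m: SiteMetrics,
    max_bounce: float = 0.9,      # assumed cutoff for "very high" bounce rate
    min_page_views: float = 1.2,  # assumed cutoff for "very low" page views
    min_returning: float = 0.05,  # assumed cutoff for "very few" returning visitors
) -> list[str]:
    """Return the names of the spam signals that `m` triggers."""
    signals = []
    if m.bounce_rate > max_bounce:
        signals.append("very high bounce rate")
    if m.avg_page_views < min_page_views:
        signals.append("very low avg. page views")
    if m.returning_visitor_share < min_returning:
        signals.append("very few returning visitors")
    return signals
```

    A site tripping all three would be a candidate for closer review rather than proof of spam, since some legitimate pages (a single-answer reference page, for instance) can show the same pattern.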
     
    Mr.Dog, Nov 4, 2012 IP
  6. KimWilhelm

    KimWilhelm Peon

    Messages: 68 | Likes Received: 0 | Best Answers: 0 | Trophy Points: 0
    #6
    Thanks for sharing the information.
     
    KimWilhelm, Dec 17, 2012 IP
  7. babusaheb144

    babusaheb144 Peon

    Messages: 8 | Likes Received: 0 | Best Answers: 0 | Trophy Points: 0
    #7
    It may help with hijacking, like many other problems that come up in Google.
     
    babusaheb144, Dec 17, 2012 IP
  8. babusaheb144

    babusaheb144 Peon

    Messages: 8 | Likes Received: 0 | Best Answers: 0 | Trophy Points: 0
    #8
    When people spam anything on your site, Google can look at its visitor records.
     
    babusaheb144, Dec 17, 2012 IP
  9. PassGoSEO

    PassGoSEO Member

    Messages: 260 | Likes Received: 24 | Best Answers: 0 | Trophy Points: 35
    #9
    Until this year, that would probably have been about right. It's no longer true, though. Their systems have now gotten close to the level of spam detection a human visitor could manage, and that's enough to weed out 99% of the crap. The other 1% will get spotted through user reports, suspicious AdSense activity, or unnatural link profiles.

    Blackhat... R.I.P.
     
    PassGoSEO, Dec 17, 2012 IP
  10. Salon Alure

    Salon Alure Member

    Messages: 73 | Likes Received: 0 | Best Answers: 0 | Trophy Points: 36
    #10
    I think it is difficult for Google to avoid spam.
     
    Salon Alure, Dec 17, 2012 IP
  11. PassGoSEO

    PassGoSEO Member

    Messages: 260 | Likes Received: 24 | Best Answers: 0 | Trophy Points: 35
    #11
    It's difficult, but they sure seem to be making a better job of it this year than last.
     
    PassGoSEO, Dec 17, 2012 IP
  12. traxport121

    traxport121 Active Member

    Messages: 1,201 | Likes Received: 8 | Best Answers: 1 | Trophy Points: 63
    #12
    Google is certainly getting smarter about that. Previously spam survived unnoticed for quite a long time, but today it's detected fairly quickly. Manual reviews also play an important role here, alongside technological advances and the use of artificial intelligence.
     
    traxport121, Dec 17, 2012 IP
  13. Mr.Dog

    Mr.Dog Active Member

    Messages: 912 | Likes Received: 18 | Best Answers: 0 | Trophy Points: 60
    #13
    Spammers are getting more and more sophisticated. I'm seeing more and more farmed hyper-spam content that's a more elevated version of spam. More words, better-built-up context, but still spam.

    I think Google is having a hard time fighting it.
     
    Mr.Dog, Dec 18, 2012 IP
  14. PassGoSEO

    PassGoSEO Member

    Messages: 260 | Likes Received: 24 | Best Answers: 0 | Trophy Points: 35
    #14
    What makes you say that? Do you have a keyword search that results in spammy SERPs? (A keyword search that could be used to make money, obviously).
     
    PassGoSEO, Dec 18, 2012 IP