New SPAM sites...billions of results!!!!

Discussion in 'Google' started by Nintendo, Jun 17, 2006.

  1. stringerbell

    stringerbell Peon

    Messages:
    78
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #701
    OK, now here's what I don't get... Shouldn't it be much easier to shut these guys down?

    For instance, no one really cares about the MFA sites that are on page one thousand of the SERPs - they care about these types of sites that get 60 of the top 100 places (and have millions and millions and millions of content-less pages).

    So, wouldn't it be easy for Google to make three lists:

    1. A list of all their AdSense publishers. This is the list the others work from.

    2. Taking the first list, find the top 1,000 sites (picking an arbitrary number here) by number of pages indexed.

    3. Make another list of the top 1,000 sites that have the highest results in their placement algorithms (PR, links, etc...), basically a list of sites that their system marks as the most respected/best for their search results (ignoring specific search terms - just an aggregate of all search queries - the sites with the most Google juice across the board).

    Then, every day, Google assigns 1 person to visit every site on each list. It only takes a few seconds to see if a site is just spam - and if it is, they just remove it from the system. Heck, hire 10 people to do this and expand each list to the top 10,000.

    That wouldn't solve the MFA problem, but it would make their top search results about a thousand times better.
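    The three lists above could be sketched roughly like this in Python. This is only an illustration of the idea in the post, not anything Google is known to do: the `Site` record and its fields (`pages_indexed`, `aggregate_score`, `is_adsense_publisher`) are made-up names, and the real data behind them is assumed to exist somewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    domain: str
    pages_indexed: int          # hypothetical: pages currently in the index
    aggregate_score: float      # hypothetical: ranking strength across all queries
    is_adsense_publisher: bool  # hypothetical: membership in list 1


def review_queue(sites, top_n=1000):
    """Return the domains a human should look at, built from the three lists."""
    # List 1: AdSense publishers only; the other two lists work from this one.
    publishers = [s for s in sites if s.is_adsense_publisher]
    # List 2: the top N publishers by number of pages indexed.
    by_pages = sorted(publishers, key=lambda s: s.pages_indexed, reverse=True)[:top_n]
    # List 3: the top N publishers by aggregate ranking strength.
    by_score = sorted(publishers, key=lambda s: s.aggregate_score, reverse=True)[:top_n]
    # Merge both lists, dropping duplicates while keeping first-seen order.
    seen = set()
    queue = []
    for site in by_pages + by_score:
        if site.domain not in seen:
            seen.add(site.domain)
            queue.append(site.domain)
    return queue
```

    The output is just a to-do list for the manual reviewers; the decision to remove a site stays with a person, as in the post.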

    Cheers,

    Bob
     
    stringerbell, Jun 23, 2006 IP
  2. stringerbell

    stringerbell Peon

    Messages:
    78
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #702
    PS. And, I always thought this lapse in Google's logic was brilliant:

    Let's assume there are only two types of sites: black-hat (MFAs, etc...) and white-hat (everyone else).

    Google wants to get rid of the black-hats and keep the white-hats.

    So, what can we assume:

    The black-hats will ALL be designed perfectly - they'll have great SEO, tons of inbound links, etc... All of them! Because, every black-hat will be run by an expert SEO.

    The white-hats will be all over the board. Some will have terrible SEO, some won't. Some will be optimized, some won't.

    So, basically, you have the bad guys doing only A - and you have the good guys doing A and B.

    So, what does Google do - they punish all the sites that do B!
     
    stringerbell, Jun 23, 2006 IP
  3. Obelia

    Obelia Notable Member

    Messages:
    2,083
    Likes Received:
    171
    Best Answers:
    0
    Trophy Points:
    210
    #703
    Not all spam relies on Adsense though.

    Good idea. Or, they could just do a manual check on the top 5000 Alexa results - I think there would be almost the same sites in both lists.
     
    Obelia, Jun 23, 2006 IP
  4. stringerbell

    stringerbell Peon

    Messages:
    78
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #704
    Yeah, I just said AdSense, 'cause Google wouldn't have access to anybody else's publishers. And, isn't that what Google would want? The quality of their results to go up - while everyone else's stays the same? ;o)

    And, the top 5000 from Alexa would work perfectly too.

    So, if in just a matter of minutes, we can figure out two extremely cheap and easy ways to get rid of the majority of the worst offenders - is Google really doing anything about the problem?...
     
    stringerbell, Jun 23, 2006 IP
  5. Nintendo

    Nintendo ♬ King of da Wackos ♬

    Messages:
    12,890
    Likes Received:
    1,064
    Best Answers:
    0
    Trophy Points:
    430
    #705
    Nintendo, Jun 23, 2006 IP
  6. lorien1973

    lorien1973 Notable Member

    Messages:
    12,206
    Likes Received:
    601
    Best Answers:
    0
    Trophy Points:
    260
    #706
    Is my motto this week to change my sites from:

    domain.com/page.html to page.domain.com? Is that the lesson I should take away from life this week?
     
    lorien1973, Jun 23, 2006 IP
  7. stringerbell

    stringerbell Peon

    Messages:
    78
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #707
    That's the first lesson! Once completed, you can move onto lesson #2:

    1.page.domain.com
    2.page.domain.com
    3.page.domain.com
    ...
    1000000000.page.domain.com
    1.page2.domain.com
    2.page2.domain.com
    3.page2.domain.com
    ...
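    Seen from the detection side, the pattern above is easy to spot by counting distinct hostnames under each parent domain. A rough Python sketch, with loud assumptions: `parent_domain` naively takes the last two labels (wrong for suffixes like .co.uk, where a real version would need a public suffix list), and the `threshold` of 1000 is an arbitrary number picked for illustration.

```python
from collections import defaultdict


def parent_domain(hostname):
    # Naive assumption: the parent domain is the last two labels.
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def subdomain_counts(hostnames):
    """Map each parent domain to the number of distinct hostnames seen under it."""
    hosts = defaultdict(set)
    for name in hostnames:
        hosts[parent_domain(name)].add(name.lower().rstrip("."))
    return {domain: len(names) for domain, names in hosts.items()}


def suspicious(hostnames, threshold=1000):
    """Return parent domains with at least `threshold` distinct hostnames, sorted."""
    counts = subdomain_counts(hostnames)
    return sorted(d for d, n in counts.items() if n >= threshold)
```

    A legitimate free-hosting service would trip the same threshold, so a count like this can only flag a domain for a manual look, not prove spam.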
     
    stringerbell, Jun 23, 2006 IP
  8. TheHoff

    TheHoff Peon

    Messages:
    1,530
    Likes Received:
    130
    Best Answers:
    0
    Trophy Points:
    0
    #708
    Personally, I wouldn't. Whatever advantage to indexing speed there was will probably be removed soon. I don't think there was much advantage in the ranking of the sub-subdomains.
     
    TheHoff, Jun 23, 2006 IP
  9. lorien1973

    lorien1973 Notable Member

    Messages:
    12,206
    Likes Received:
    601
    Best Answers:
    0
    Trophy Points:
    260
    #709
    Yeah. I agree it's a loophole waiting to be closed. I'd never make that change for various reasons. Not the least of which is, I'm lazy and it's too much work for little reward :p
     
    lorien1973, Jun 23, 2006 IP
  10. Obelia

    Obelia Notable Member

    Messages:
    2,083
    Likes Received:
    171
    Best Answers:
    0
    Trophy Points:
    210
    #710
    I don't think it's as simple as that. A lot of spammers fly under the radar by having networks of sites that on their own aren't large enough to attract as much attention as the billion-page guy, but nevertheless run to thousands of pages each.

    I think the thing that will stop spam dead in its tracks is when the average price of a domain exceeds the money that can be made from one MFA site. Right now, domains are dead cheap. But for an MFA to break even, it needs thousands of pages, each targeting a different set of keywords. So Google needs to pay extra attention to sites with thousands of pages.

    What I'm thinking is that a special algorithm kicks in at certain page thresholds, to detect spamminess. The sort of algorithm that is probably too processor-intensive to use on all the index. Like this:

    1. An advanced dupe-content detector that parses out the most frequent keywords and examines what's left for similarity to other pages on the rest of the site that have also been stripped of their most frequent keywords.

    2. A gobbledegook detector to search for poor grammar, and flag up pages with a high percentage of ungrammatical sentences.

    3. Most sentences include at least some stop-words. An unusual lack of these suggests spam.

    4. Something to look for misspellings as a percentage of text. A lot of spammy sites have long lists of these, so anything over 20% should trigger a filter.

    On their own, none of these things is necessarily spam, but together they could be used to find sites where it would be a good idea to throttle further indexing and flag them up for manual scrutiny.
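    Checks 1, 3 and 4 above could be sketched roughly like this in Python. It is a toy, not a real filter: `STOP_WORDS` is a tiny illustrative list, the `dictionary` of known words has to be supplied by the caller, the 15% stop-word floor is an invented number (only the 20% misspelling figure comes from the list above), and plain Jaccard overlap stands in for the "advanced" dupe detector. The grammar check (2) is left out because it needs a real parser.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
              "is", "it", "that", "for", "on", "with"}


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def stop_word_ratio(text):
    """Check 3: share of words that are stop-words."""
    words = tokens(text)
    if not words:
        return 0.0
    return sum(w in STOP_WORDS for w in words) / len(words)


def misspelling_ratio(text, dictionary):
    """Check 4: share of words not found in the supplied dictionary."""
    words = tokens(text)
    if not words:
        return 0.0
    return sum(w not in dictionary for w in words) / len(words)


def stripped_similarity(page_a, page_b, top_k=3):
    """Check 1: Jaccard overlap after removing each page's most frequent keywords."""
    def strip(text):
        words = tokens(text)
        content = [w for w in words if w not in STOP_WORDS]
        frequent = {w for w, _ in Counter(content).most_common(top_k)}
        return {w for w in words if w not in frequent}

    a, b = strip(page_a), strip(page_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_page(text, dictionary, min_stop_ratio=0.15, max_misspelled=0.20):
    """Return the reasons a page looks spammy; an empty list means no flag."""
    reasons = []
    if stop_word_ratio(text) < min_stop_ratio:
        reasons.append("few stop-words")
    if misspelling_ratio(text, dictionary) > max_misspelled:
        reasons.append("many misspellings")
    return reasons
```

    Two template pages that differ only in the swapped-in keyword come out with a similarity close to 1 once the keyword is stripped, which is the case check 1 is aimed at. A flag here would only queue the site for throttling and a manual look, in line with the point above.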
     
    Obelia, Jun 23, 2006 IP
  11. Dekker

    Dekker Peon

    Messages:
    4,185
    Likes Received:
    287
    Best Answers:
    0
    Trophy Points:
    0
    #711
    You know what would be simpler? When a site hits 1 million + indexed, just do a freaking manual check
     
    Dekker, Jun 23, 2006 IP
  12. mvandemar

    mvandemar Notable Member

    Messages:
    2,409
    Likes Received:
    307
    Best Answers:
    0
    Trophy Points:
    230
    #712
    How'd you find them Nintendo...? :D

    -Michael
     
    mvandemar, Jun 23, 2006 IP
  13. TheHoff

    TheHoff Peon

    Messages:
    1,530
    Likes Received:
    130
    Best Answers:
    0
    Trophy Points:
    0
    #713
    Whoa hold on there.. Sapo.pt looks legit.

    Loosely translated:

    From 1994 to 2005, something something. Here is the history of the number 1 Portuguese portal.

    They've been in the top 1000 for a long long long time (+5 years), even as high as 200:

    http://www.alexaholic.com/sapo.pt
     
    TheHoff, Jun 23, 2006 IP
  14. mvandemar

    mvandemar Notable Member

    Messages:
    2,409
    Likes Received:
    307
    Best Answers:
    0
    Trophy Points:
    230
    #714
    Sapo is a free hosting service similar to Hostrocket, where each site they offer is a subdomain off of the main one. Nintendo grabbed that from a thread in another forum here (at least I think he probably did). The page count is probably different than mine, it's near the end, post #55.

    The actual subdomain I was referring to is this one. Apparently Sapo gives you unlimited sub-sub domains as well. There are others; that one is just bothering me because it is
    a) cluttering up pages I'm competing with,
    b) one that I myself reported to G for spamming a while back, and
    c) at the time I reported it, taking up 17% of the listed SERPs.

    Since G seems to be relying on posts to find the spammers, I figured mentioning them might actually get something done. :)

    -Michael
     
    mvandemar, Jun 23, 2006 IP
  15. huntz

    huntz Well-Known Member

    Messages:
    694
    Likes Received:
    109
    Best Answers:
    0
    Trophy Points:
    133
    #715
    huntz, Jun 24, 2006 IP
  16. mvandemar

    mvandemar Notable Member

    Messages:
    2,409
    Likes Received:
    307
    Best Answers:
    0
    Trophy Points:
    230
    #716
    Anyone know where Matt said this originally?

    Thanks.

    -Michael
     
    mvandemar, Jun 24, 2006 IP
  17. Nintendo

    Nintendo ♬ King of da Wackos ♬

    Messages:
    12,890
    Likes Received:
    1,064
    Best Answers:
    0
    Trophy Points:
    430
    #717
    Nintendo, Jun 24, 2006 IP
  18. Reginald

    Reginald Peon

    Messages:
    11
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #718
    Uhm, pardon the question but would this in any way be related to domain kiting?
    See www.bobparsons.com
     
    Reginald, Jun 24, 2006 IP
  19. Nintendo

    Nintendo ♬ King of da Wackos ♬

    Messages:
    12,890
    Likes Received:
    1,064
    Best Answers:
    0
    Trophy Points:
    430
    #719
    http://www.bobparsons.com/DomainKiting.html

    These domains were registered in May. I think that's way too long ago for kiting to still work without them having been paid for.

    http://www.bobparsons.com/MayKiting.html

    The SPAM domains have been around way longer than five days!!!
     
    Nintendo, Jun 24, 2006 IP
  20. anthonycea

    anthonycea Banned

    Messages:
    13,378
    Likes Received:
    342
    Best Answers:
    0
    Trophy Points:
    0
    #720
    Too bad we did not work reciprocal link deals with this guy! :rolleyes:
     
    anthonycea, Jun 24, 2006 IP