Agreed. OK, so next step. I'll remove the "legibility/gibberish filter" since Google can't spot that. I'll add high keyword density and subdomains, plus a few more things I see from the spam sites listed here:

1. ".info"
2. thousands of backlinks in a few days
3. all backlinks from blogs and forums
4. thousands of pages generated in a few days
5. high keyword density %
6. hundreds of subdomains under one domain
7. uses 301 / 302 or htaccess redirects (cloaking)
8. "nocache" meta tag for search engines on EVERY page for the domain and subdomain
9. double dashes in domain or subdomain name (for example "9676.10.all--bankruptcy-data.info/")

Now seriously, what legitimate site does that? So that's 9 footprints. And you're right, individually they mean nothing, but sites that match, let's say, 7 out of 9 should certainly draw scrutiny. And if they match 9 out of 9 then they HAVE TO BE SPAM. They should go straight to the "Tar Pit" as KLB so eloquently stated.

That list comes from about 10 minutes of work. And Google has had how many years to deal with this problem??? They have thousands of employees and some of the best minds in the business. And yet they can't spot an SE spam site, or at a minimum not index it? That's troubling. Very troubling.

EDIT: Oh, I almost forgot. #10. They have AdSense or some kind of pay-per-click advertising on every page. Doesn't do any good to spam if you're not getting paid. So there we have it: 10 simple and yet very bright neon signs that scream SPAM!
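For anyone who wants to play with the idea, here is a rough Python sketch of the footprint scoring described above: count how many of the 10 footprints a site matches, flag it for scrutiny at 7 or more, and send it to the "Tar Pit" when it matches all of them. Everything in it is an assumption made for illustration: the `SiteStats` fields, the function names, and every numeric cutoff (1000 for "thousands", 100 for "hundreds", 10% keyword density, 95% of backlinks from blogs and forums). It is not how any search engine actually works.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Hypothetical per-site measurements; all field names are illustrative."""
    hostname: str
    new_backlinks_last_week: int
    blog_forum_backlink_share: float  # 0.0 to 1.0
    new_pages_last_week: int
    keyword_density: float            # 0.0 to 1.0
    subdomain_count: int
    uses_redirects: bool              # 301 / 302 or htaccess redirects
    nocache_on_every_page: bool
    ppc_ads_on_every_page: bool


def footprints(site: SiteStats) -> dict:
    """Map each of the 10 footprints to True/False. Cutoffs are guesses."""
    host = site.hostname.lower().rstrip("/")
    return {
        "info_tld": host.endswith(".info"),
        "backlink_burst": site.new_backlinks_last_week >= 1000,
        "blog_forum_backlinks": site.blog_forum_backlink_share >= 0.95,
        "page_burst": site.new_pages_last_week >= 1000,
        "high_keyword_density": site.keyword_density >= 0.10,
        "many_subdomains": site.subdomain_count >= 100,
        "redirects": site.uses_redirects,
        "nocache_everywhere": site.nocache_on_every_page,
        "double_dash": "--" in host,
        "ppc_everywhere": site.ppc_ads_on_every_page,
    }


def classify(site: SiteStats, review_at: int = 7) -> str:
    """Return 'tar pit', 'manual review', or 'ok' based on footprint count."""
    matched = footprints(site)
    hits = sum(matched.values())  # True counts as 1
    if hits == len(matched):
        return "tar pit"
    if hits >= review_at:
        return "manual review"
    return "ok"


suspect = SiteStats(
    hostname="9676.10.all--bankruptcy-data.info/",
    new_backlinks_last_week=5000,
    blog_forum_backlink_share=1.0,
    new_pages_last_week=8000,
    keyword_density=0.25,
    subdomain_count=400,
    uses_redirects=True,
    nocache_on_every_page=True,
    ppc_ads_on_every_page=True,
)
print(classify(suspect))
```

Since no single footprint means anything on its own, the sketch only ever acts on the count, never on one signal.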
Why not just post a link to the blog instead of the Digg page? I know it's fishing for diggs, but if the point is to point out spam, at least link to the blog instead (or do both).
The average rssgm/sec site earns about $20 a day at its peak, so these subdomain pages would probably earn just as much, if not more. Multiply that by 200 domains and you're in business.
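Spelling out that multiplication as a quick Python sketch, using only the two figures from the post (the variable names are just illustrative):

```python
# Figures quoted in the post: about $20 a day per site at peak, 200 domains.
PEAK_DAILY_EARNINGS_USD = 20
DOMAIN_COUNT = 200

daily_total = PEAK_DAILY_EARNINGS_USD * DOMAIN_COUNT
print(f"${daily_total:,} per day")  # $4,000 per day
```

That is a peak figure, and a floor if the subdomain pages really do earn "just as much if not more".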
I don't get it. What's this all about? They buy domains and then make subdomains? Interlink them or something? Then what? How does this affect the SERPs all of a sudden? And nooo, I am never gonna do any of those things! I just don't get it :s
I wish I knew how they did it.. lol. I have some spare URLs.. but I wouldn't risk getting my AdSense account banned, or my other sites on the same IP.
Google seems to slurp in just about anything it sees. I agree, Google should flag these for a manual review.. it really pollutes the engine..
What's even more odd is how MSN and Yahoo don't have an issue with this, just Google. It appears someone has cracked their engine =o
MSN and Yahoo have even bigger issues when it comes to indexing and ranking sites that use BH methods. It's just that they are affected in different ways, but anyone who knows how to game MSN and Yahoo wouldn't have a problem getting high-volume keywords ranked in the top 10.
Getting ranked in MSN with a legitimate site is easy, they just seem to have better spam filters in place to detect this kind of thing.
Spam itself is the frustration of legitimate webmasters, while search engines tend to look at the revenue they get from a site to determine whether or not they ban it ... unless it gains high publicity, forcing the search engine to take notice so it can be seen as 'proactively fighting against spam'. If it makes them money and stays under the radar of general public outcry, the chances are it will remain in place for a considerable amount of time... if not indefinitely (though it may suffer more minor penalties). Billions of pages, though, is a different matter. That is simply too much to go unnoticed by the search engines, and removing such blatant spam to protect their brand name should be a main priority for them.
Can anyone explain this one? Page one in the SERPs. Watch the weird redirect, then scroll down to the bottom. http://mipagina.americaonline.com.mx/mattress4321/twin-mattress/twin-futon-mattress.html
Right, wrkalot. They obviously have some holes though. And the misspelling thing is how everyone is capitalizing, or the fake afterdomains like www.legitdomain.com.mx
This page ( http://www.mattress.four-corners.info/ ) is loaded in an iframe by a script on this page: http://mipagina.americaonline.com.mx/mattress4321/twin-mattress/twin-futon-mattress.html

The 2nd page is a keyword-stuffed gibberish page, but Google can read it and does. Google does not, however, read the first page inside the iframe. That's for the user.

And just for kicks, here's a sample of the text that Google thinks should rank on page #1 in the SERPs for the phrase "futon mattresses":

" twin futon mattress (yes responsibility playing owner (it died swirls clarinet Ag leftovers cut weak transcendent outstretched aroma common stalked twin futon mattress reputed ascended lub-dubbing freedom Travelin' stove knowing fastballs action half-truth thumps sting partially galloped yet Bash arousing twin futon mattress occupy cubes dipped hadn't wonder; toss horizon interrogation heartbreak wrote breathes strobe-lit ; cabernet direction sales humiliated twin futon mattress (battery pointless hamburgers throws here; hill calls posturing oozing charged-up there's hadn't juncture peculiar acknowledging accepted bulging twin futon mattress Ty-fish crazed crudities eleven portraits) hesitating undergraduates ghostly"

EDIT: To see the whole thing in action, and apparently for the script to work, you have to click on the SERP link that wrkalot posted from Google.
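The gibberish sample above drops the target phrase in at regular intervals among unrelated words. One rough way to put a number on that is phrase density: the share of words on the page that belong to an occurrence of the phrase. The Python sketch below is only an illustration. The function name `phrase_density` and the simple tokenizer are assumptions, and there is more than one definition of keyword density in use, so treat the formula as one possible choice.

```python
import re

WORD_RE = re.compile(r"[a-z0-9'-]+")


def phrase_density(text: str, phrase: str) -> float:
    """Fraction of words in `text` that are part of an occurrence of `phrase`.

    Matching is case-insensitive and works on whole words; matches are
    counted without overlap.
    """
    tokens = WORD_RE.findall(text.lower())
    target = WORD_RE.findall(phrase.lower())
    if not tokens or not target:
        return 0.0

    n = len(target)
    hits = 0
    i = 0
    while i <= len(tokens) - n:
        if tokens[i:i + n] == target:
            hits += 1
            i += n  # skip past this match so matches never overlap
        else:
            i += 1
    return hits * n / len(tokens)


# Short excerpt in the style of the sample: 8 words, 6 of them in the phrase.
excerpt = "twin futon mattress swirls clarinet twin futon mattress"
print(phrase_density(excerpt, "twin futon mattress"))  # 0.75
```

Running it over the full text of a page and comparing against a cutoff would be one way to feed the "high keyword density %" footprint, with the cutoff itself being a judgment call.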
The above spam/cloaked page has now been dropped to #10. Still on page 1 though. To give Google some credit - the other 9 listings on page 1 appear to be right on target.