Google SPAMS itself!

Discussion in 'Google' started by CrankyDave, Jul 10, 2006.

  1. yfs1

    yfs1 User Title Not Found

    Messages:
    13,798
    Likes Received:
    922
    Best Answers:
    0
    Trophy Points:
    0
    #21
I'm not sure what your comment means... Google doesn't ban people for duplicate content.

    They only "penalize" one of the copies (Mad4 hit it on the head in his post)

As for the Spam point, how is it Spam? Certainly you can see someone as big as Google needing the subdomain as well. If people want to get to Google Base, they are going to type in one of two variations: google.com/base or base.google.com.

    They aren't creating the two to manipulate their rankings
     
    yfs1, Jul 11, 2006 IP
  2. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #22
I don't recall saying anything about "banning". They penalize by "de-indexing" the duplicate content. What do you do when every single page is identical? Define this for me...

    Identical content being served up from a folder off the root AND a subdomain off the same root.

    Since when does anything have to be intended to manipulate rankings for it to be SPAM?

How many identical sites in the index are permissible by your standards? How exactly does having two identical sites benefit the searcher?

Why the need to have two sites of more than 2.3 million pages each, containing identical content, in the index?

    Take a look at craigslist.org. Do a Google search for "wedding forum" (minus the quotes) and jump to page 5 or 6. At what point does this become SPAM? Never?

    As far as my comment's meaning...

If Google should decide that subdomains are going to be viewed differently than they are currently (i.e. as a separate site rather than an internal page), then they already have an identical site in the index as a folder.

    Dave
     
    CrankyDave, Jul 11, 2006 IP
  3. mad4

    mad4 Peon

    Messages:
    6,986
    Likes Received:
    493
    Best Answers:
    0
    Trophy Points:
    0
    #23
    De-indexing is the same as a ban. If a site is removed from the index its banned.

    This is NOT what happens with the duplicate content filter. Sites filtered from the serps are NOT banned or de-indexed. They are still present in the index but are not shown in the results for certain queries.
     
    mad4, Jul 11, 2006 IP
  4. xeno

    xeno Peon

    Messages:
    788
    Likes Received:
    22
    Best Answers:
    0
    Trophy Points:
    0
    #24
    He was referring to tony's post

they won't ban themselves, but you can try yours! LOL
     
    xeno, Jul 11, 2006 IP
  5. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #25
No, they are not. Pages considered "duplicate", probably better termed "identical", can be and are removed from the index altogether all the time.

    Dave
     
    CrankyDave, Jul 11, 2006 IP
  6. mvandemar

    mvandemar Notable Member

    Messages:
    2,409
    Likes Received:
    307
    Best Answers:
    0
    Trophy Points:
    230
    #26
Duplicate content does not equal automatic removal. This is an old theory that got twisted from one person's assumption about two years ago, and everyone started repeating it. Seriously, it's a theory that was proposed to someone who asked why Google wasn't indexing their site, and someone on another board replied with something along the lines of:

    "All your pages have very little content and look pretty much the same, why would Google want all of those copies?"

    This made sense, and people started to believe it.

    However, there are tons of examples of Google indexing the same content multiple times.

The dup content theory refers to a possible cause of why some people have trouble getting cookie-cutter sites (or sites with many pages that all have the same header and footer and little middle content) indexed well in Google. It used to be that they wouldn't get indexed. Then they'd get indexed but go supplemental. Then supplementals started happening to non-duplicate pages. Then non-duplicate pages started getting de-indexed as well. Even though these last two facts strongly suggest that the supplemental and de-indexing issues have nothing to do with page content, period, people still spout about a dup content penalty.

Even though the dup content theory made sense and had a certain logic to it, it was then, and remains today, a theory, and one that seems to make less sense than it used to. If there were some sort of auto-penalty, you wouldn't get this kind of stuff indexed.

    Just mho.

    -Michael
     
    mvandemar, Jul 11, 2006 IP
  7. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #27
    Michael

I don't disagree. Please note that I'm careful to term it "identical". It's no secret that "duplicated" content gets indexed, but "identical" content gets de-indexed all the time. The example I used with Google is a case of identical content, and if any one of us were to do the same thing... POOF!... at least one of the sites/folders would be gone for spamming. There are folks serving identical content on a .com and a .co.uk who experience this first hand. What Google is doing is no different.

    Side note about duplicate content...

    http://www.google.com.my/support/bin/answer.py?answer=6805&query=duplicate+content&topic=0&type=f

    Now we both know, what they say and what they do are two different things... at least part of the time. :D

    Dave
     
    CrankyDave, Jul 11, 2006 IP
  8. mvandemar

    mvandemar Notable Member

    Messages:
    2,409
    Likes Received:
    307
    Best Answers:
    0
    Trophy Points:
    230
    #28
Right, I know what you meant, but I'm saying that they do index identical content all the time. The examples I gave were of identical content. Look at Craigslist: tons and tons of pages with next to no content. Look at any site that has had 403 errors indexed; honestly, it doesn't get much closer than that. Any site that has both the www and non-www versions indexed in Google has identical content indexed.

    It started as a rumor, and apparently got widespread enough that someone at Google support repeated it.

I mean, honestly, it's bad SEO to try to rank two sites with identical content, and technically not something that's needed... but it's just not going to have the effect that people keep saying it will. The fact that it's easier to get unique content indexed has led to a widespread obsession with how close is too close.

    Here, check this article out, it's very concise on the subject.

    -Michael
     
    mvandemar, Jul 11, 2006 IP
  9. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #29
    Thanx Michael. Actually, had already seen it. ;)

What folks keep forgetting is that republished articles are never identical content page-wise. Is the article duped? Certainly, but every page is different: different navigation, etc. And you're right, the game is how much content actually needs to be unique.

    The 403 pages you linked are not identical. Actually, as a percentage of total content displayed (available), they're 15% unique in comparison to each other.

Craigslist is a problem, and another problem with subdomains. I pointed to them a month or so ago as a deficiency with BD (a recent problem, BTW). But even all their forum pages are not identical, despite the body of the content being identical. And yes, I consider those pages of duplicate content SPAM.

As far as Google serving up identical pages in a folder and a subdomain, AFAIC that's SPAM. Slice it, dice it, serve it up any way you want, it's still the same stuff.

    Dave
     
    CrankyDave, Jul 11, 2006 IP
  10. The Webmaster

    The Webmaster IdeasOfOne

    Messages:
    9,516
    Likes Received:
    718
    Best Answers:
    0
    Trophy Points:
    360
    #30
Of course it is. There is no point having identical pages in two places on the same site.
But maybe Google doesn't know how to use a 301 redirect ;)
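To make the 301 quip concrete: a 301 ("Moved Permanently") tells crawlers that one URL is the permanent home of the content, so only one copy stays in the index. A minimal sketch of the mapping, assuming for illustration that base.google.com is the preferred side (a hypothetical choice, not anything Google has stated):

```python
# Hypothetical redirect rule: fold the folder version (/base/...) into
# the subdomain version, so crawlers keep only one copy indexed.
def redirect_for(path: str):
    """Return (status, location) for a request path, or None if no redirect applies."""
    if path == "/base" or path.startswith("/base/"):
        suffix = path[len("/base"):] or "/"
        return 301, "http://base.google.com" + suffix
    return None
```

A real deployment would express the same rule as a web-server redirect directive; the function just shows the URL mapping being discussed.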
     
    The Webmaster, Jul 11, 2006 IP
  11. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #31
    LMAO!!! Should we email them instructions...

    Dear Google

    In case you were unaware, you seem to be spamming yourself. The proper thing to do would be to use a 301 redirect. I'd be happy to send you detailed instructions if you would find it helpful.

    You're Welcome!

    Dave
     
    CrankyDave, Jul 12, 2006 IP
  12. The Webmaster

    The Webmaster IdeasOfOne

    Messages:
    9,516
    Likes Received:
    718
    Best Answers:
    0
    Trophy Points:
    360
    #32
I bet this email would feature on Matt's blog...
     
    The Webmaster, Jul 12, 2006 IP
  13. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #33
    Where's Google's SPAM report when you need it! :D

    Dave
     
    CrankyDave, Jul 19, 2006 IP
  14. C. Szeler

    C. Szeler Member

    Messages:
    64
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    43
    #34
This topic is pretty useless. They don't exactly penalise anybody for duplicate pages; they just use ONE of the URLs (presumably the most-referenced URL). Google sees my default.asp and the directory root / as the exact same page... same PageRank, same backlinks.

They are not spamming; that's absurd. Spammers are people who send out unsolicited advertisements, not webmasters who LET search engines spider their sites.
     
    C. Szeler, Jul 19, 2006 IP
  15. Homer

    Homer Spirit Walker

    Messages:
    2,396
    Likes Received:
    150
    Best Answers:
    0
    Trophy Points:
    0
    #35
Heh heh, you guys are killing me!!!


    If anyone can offer a reasonable response to this:

    "what is spam"

I am not sure, truly. I have many websites, some new and some 5+ years old. With the new Google I almost think every one of them is spammy ;), even though I don't spam and all the content is handwritten. So why then have they all suffered so severely in the recent updates? Hmmm, must be spam... no, possibly dup content... wait, maybe it's suspicious link bursts, or ummmmm... well, I give up, it's spam, yeah, that's it :confused:.

    WTF is on-page SPAM??


    Cheers ;)

    H
     
    Homer, Jul 19, 2006 IP
  16. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #36
Your issue is canonicalization, which is a totally different thing. Yes, Google has gotten better at recognizing and correcting it. If you have www.yoursite.com AND www.yoursite.com/default.asp AND www.yoursite.com/directory.asp all indexed as your homepage, it is because links on your site point to your homepage under all of those URLs, and the SEs are simply following the links you've put there. You should use a 301 redirect to correct this, because it can hurt your indexing and ranking, and you should make sure you use only one URL for your homepage throughout your site. The preferred one would be www.yoursite.com.
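The canonicalization advice above can be sketched as a small helper (a hypothetical illustration, not any published Google behavior) that collapses the common homepage variants onto one preferred URL, which is what the 301 on the server should enforce:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical helper: collapse common homepage variants (non-www host,
# /default.asp, /index.html, empty path) into one preferred canonical URL.
DEFAULT_DOCS = {"/default.asp", "/index.html", "/index.htm", "/index.php"}

def canonical_homepage(url: str) -> str:
    scheme, host, path, query, frag = urlsplit(url.lower())
    if not host.startswith("www."):
        host = "www." + host          # prefer the www hostname
    if path == "" or path in DEFAULT_DOCS:
        path = "/"                    # prefer the bare root over default documents
    return urlunsplit((scheme, host, path, "", ""))
```

With the 301 in place, every variant the helper maps would answer with a redirect to the single preferred URL, so the engines only ever see one homepage.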

    Here's a link for you...

    Search Engine SPAM

    Dave
     
    CrankyDave, Jul 19, 2006 IP