2 different "site:www.domain.com" problems

Discussion in 'SEO' started by theblackjeep, Jul 6, 2006.

  1. #1
    I have 2 sites for one of my companies. They are different from each other, with no duplicate content issues. I submitted a sitemap 6 months ago. One site (7 years old) shows 14,500 pages indexed, but there are only 2,200 pages in the site. The other site (4 years old) has 9 pages indexed out of 1,800. Of those 9 pages, 8 are supplemental and only the home page is indexed properly. I reported the overindexing to Google about 2 months ago, and within 48 hours they reduced the number of indexed pages to 500. I promote the site with more pages a lot more than the other, and it gets good rankings. The other site is just a paranoid 'just in case' site that only gets promoted when things are slow.

    The 14,000 indexed pages site is considered an authority site and has been a page 1 site for 6.5 years (except for a few weeks here and there during a Google fart).

    I know of the site: problems, which explain the under-indexed site, but I keep getting overindexed on the other one. Does anyone else have this, or any explanation?
     
    theblackjeep, Jul 6, 2006 IP
  2. CrankyDave

    CrankyDave Peon

    Messages:
    280
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #2
    "Bad Data Push"

    More aptly put, BDP = "Big Daddy Problems"

    Dave
     
    CrankyDave, Jul 6, 2006 IP
  3. TechEvangelist

    TechEvangelist Guest

    Messages:
    919
    Likes Received:
    140
    Best Answers:
    0
    Trophy Points:
    133
    #3
    Are you certain that you do not have a session ID problem with the site? I've seen 100 page sites generate over 1000 URLs due to a session ID showing up in the URL.

    A session ID should never show up in a URL. It creates multiple URLs representing a page, which in turn creates a duplicate content penalty and the pages get tossed into supplemental results.
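
    To illustrate how a session ID multiplies URLs, here is a minimal Python sketch. The parameter names in SESSION_PARAMS and the example.com URLs are assumptions made up for the example, not taken from any site in this thread. It strips session parameters from a list of crawled URLs and compares the number of distinct URLs with the number of distinct pages.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed session parameter names; adjust for the platform actually in use.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_id(url):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def count_pages(crawled_urls):
    """Return (number of distinct URLs, number of distinct pages)."""
    distinct_urls = set(crawled_urls)
    distinct_pages = {strip_session_id(url) for url in distinct_urls}
    return len(distinct_urls), len(distinct_pages)


# Hypothetical crawl: two real pages, reached under different session IDs.
crawled = [
    "http://www.example.com/widgets.php?id=7&PHPSESSID=abc123",
    "http://www.example.com/widgets.php?id=7&PHPSESSID=def456",
    "http://www.example.com/widgets.php?id=7",
    "http://www.example.com/about.php?PHPSESSID=abc123",
]
print(count_pages(crawled))  # (4, 2)
```

    Running something like this over a server log or crawl export is a quick way to check whether the URL count is being inflated. Session IDs embedded in the path rather than the query string would need separate handling.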

    If that's not the problem, then it is likely a Big Daddy issue.
     
    TechEvangelist, Jul 6, 2006 IP
  4. theblackjeep

    theblackjeep Peon

    Messages:
    54
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #4
    No session IDs.

    The BDP has been tied to spam sites. There is nothing spammy about my site. I have gone over everything, and there is no similarity between my site and the site with 5 billion pages indexed. The site is PR5 with 42 BL (Google's numbers). Very careful, relevant linking.

    Definitely a BD issue, but I can't figure out why it keeps happening to this site.
     
    theblackjeep, Jul 6, 2006 IP
  5. mad4

    mad4 Peon

    Messages:
    6,986
    Likes Received:
    493
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Do your sites have unique titles, content, meta tags, and h1 tags for each page?
     
    mad4, Jul 6, 2006 IP
  6. theblackjeep

    theblackjeep Peon

    #6
    Yes. Every page is unique, with proper use of code. I do not submit my site just anywhere, either. The ranking for this site is great, so I suppose I should not care so much. But the site: query problem associated with BD is deindexing, and the BDP example that I have seen does not seem to apply.
     
    theblackjeep, Jul 6, 2006 IP
  7. CrankyDave

    CrankyDave Peon

    #7
    I referenced a "bad data push" since it pointed to the "bad data" being the number of pages shown to be in the index. IMO it has nothing to do with spam.

    If you think about it for a moment, the term was in no way used to describe the spam, just the number of pages being shown to exist. It was a "generic" term used to describe the error (# of pages) being shown by the site: operator. Everyone automatically took the phrase to mean the spam being indexed.

    If I were to hazard a W.A.G.:

    The number being shown is a compilation of all of Google's indexes, including Base, Froogle and the supplemental index. If those are not filtered properly, a site could easily show 3 times the number of pages just based upon content that Google has stored multiple times for the same URL.

    The supplemental index has stored duplicate content for the same URL for some time now. This is why pages appear to be moving from the RI to the SI, when in all actuality the RI isn't going live and the duplicate data for that URL in the SI is being shown. This is going to affect older domains, since there's been more time to obtain data for both the RI and SI. There are also URL-only results, where a page has been found but not yet crawled for the first time, which Google still has stored as well.
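
    A toy Python sketch of that guess, purely for illustration: the index names follow the post above, but the URL lists are made up and nothing here reflects how Google actually counts. It only shows how summing entries across several stores without filtering gives a larger number than counting unique URLs.

```python
# Hypothetical per-index URL lists; the data is invented for this example.
indexes = {
    "regular": ["/", "/products", "/contact"],
    "supplemental": ["/", "/products", "/old-page"],
    "url_only": ["/contact", "/new-page"],
}

# Unfiltered count: every stored entry is counted, even repeats of one URL.
unfiltered_total = sum(len(urls) for urls in indexes.values())

# Filtered count: each URL is counted once, whichever index holds it.
unique_urls = set().union(*indexes.values())

print(unfiltered_total)   # 8
print(len(unique_urls))   # 5
```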

    Dave
     
    CrankyDave, Jul 6, 2006 IP
  8. theblackjeep

    theblackjeep Peon

    #8
    I get the RI and SI movement. Out of the 14,000 pages, Google will only show around 1,000 if you take the results all the way to the end. I do have 2 older domains with indexing problems the other way: one has half its pages indexed, and the other has 1 supplemental page in the index. But those are Google f****d up problems, and they are common to many site owners. Having 1 site that has gotten overindexed a couple of times now would seem to indicate a problem on my end. Otherwise it would have pages removed from the index like everyone else.
     
    theblackjeep, Jul 6, 2006 IP
  9. CrankyDave

    CrankyDave Peon

    #9
    Not necessarily. Google is having problems with sites it has not had problems with before, and should not be having problems with now. The apparent overindexing is definitely a Google problem.

    I have several sites. One of them was consistently in the SI, but recently half of its pages returned to the RI without any changes on my part. The others have been fine.

    Dave
     
    CrankyDave, Jul 6, 2006 IP
  10. theblackjeep

    theblackjeep Peon

    #10
    If this is a Google problem, and I will assume that it is, will the extreme quantity of pages trigger a duplicate filter? If I have approx 2,000 pages and 14,000 in the site: query, there would have to be some duplicate pages somewhere. Of those 14,000 pages, almost all are in the SI, so would the duplicate filter affect pages in the RI only, or am I going to get filtered out?
     
    theblackjeep, Jul 6, 2006 IP
  11. CrankyDave

    CrankyDave Peon

    #11
    The duplicate filter is going to "remove" pages from the index completely, not just move them. I wouldn't worry too much about that. I seriously doubt they use total pages as a threshold. By comparison, 2000 pages would not likely be considered an "extreme" number of pages.

    Dave
     
    CrankyDave, Jul 6, 2006 IP
  12. theblackjeep

    theblackjeep Peon

    #12
    Dave, by extreme I meant having my site indexed 7 times over. Since the file structure has not changed in 6 years, I basically have 7 copies of my site indexed. Since the 4th level pages are probably not indexed as much, who knows how many versions of my main page they are counting. Hate to sound paranoid here, but is it possible I have a hundred copies of my main page in the SI? The site ranks well, so maybe I should just be amused. Think I should just leave it?
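
    For reference, the "7 times over" figure comes from the counts in the first post. A quick Python check using only those numbers (the function name is just for this example):

```python
def index_ratio(indexed, actual):
    """Ratio of pages reported by the site: query to pages that exist."""
    return indexed / actual


over = index_ratio(14_500, 2_200)   # the overindexed 7 year old site
under = index_ratio(9, 1_800)       # the underindexed 4 year old site

print(f"{over:.1f}x")   # 6.6x
print(f"{under:.1%}")   # 0.5%
```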

     
    theblackjeep, Jul 6, 2006 IP
  13. CrankyDave

    CrankyDave Peon

    #13
    Anytime you flirt with "identical" content, you run the risk. You already know that. ;)

    Given the current *supposition* that the deeper your navigation levels go, the less likely BD is to index the page, I'd tend to smile, and be amused. Rankings are what count.

    Dave
     
    CrankyDave, Jul 6, 2006 IP