Google's index limit? 32 bit?

Discussion in 'Google' started by Sholva, Sep 23, 2004.

  1. #1
    About a year ago I remember reading one person's theory that Google is going to run into a problem when it reaches 2^32 pages in its main web index. That is 4,294,967,296 web pages, and Google currently states on its homepage that it has 4,285,199,774 web pages.
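
    A quick sanity check of that arithmetic, as a minimal Python sketch. The variable names are illustrative, the only inputs are the two figures quoted above, and nothing here says anything about how Google actually stores document IDs.

```python
# Figures quoted in the post above; the names are illustrative only.
UINT32_CAPACITY = 2 ** 32        # distinct values of an unsigned 32-bit ID
REPORTED_PAGES = 4_285_199_774   # page count shown on Google's homepage

# How many more IDs would fit before a 32-bit counter ran out.
headroom = UINT32_CAPACITY - REPORTED_PAGES

print(f"32-bit capacity: {UINT32_CAPACITY:,}")
print(f"Reported pages:  {REPORTED_PAGES:,}")
print(f"Headroom:        {headroom:,} ({headroom / UINT32_CAPACITY:.2%} of the ID space)")
```

    So the reported count sits a little under 10 million pages below the 32-bit ceiling, which is why the theory looks plausible on the surface.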

    I've been a big Google fan since about 2000, but I've noticed it seems to be losing its edge on certain searches (I'm pretty picky and sometimes look for very specific things). It's still great, don't get me wrong, but I've noticed a lot of so-called "dancing" lately, with sites disappearing from and reappearing in the SERPs. What concerns me is completely new pages being added and then disappearing, while Yahoo/Overture have added the same new pages lightning fast and kept them there.

    So has anyone got any ideas or theories about the 32-bit limit? I'd assume Google, being a bunch of smart cookies, should easily be able to overcome a theoretical problem like that.

    Another possible explanation is that Google is trying to stop "spammy" results from creeping too far into its index. What are your thoughts?
     
    Sholva, Sep 23, 2004
  2. dkalweit

    #2
    I'd be very surprised if Google hadn't addressed the 32-bit problem a long time ago. They could have moved their 'primary key' field to 64-bit, or a GUID, or maybe they don't even need a 'primary key' in their database at all-- the URL of the page itself is, by definition, unique and therefore could serve as a 'primary key'...
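
    To make the "wider key" idea concrete, here is a minimal Python sketch that derives a 64-bit key from a URL by truncating a SHA-256 digest. This is purely an assumption for illustration: `doc_id_64` and the toy `index` dict are hypothetical names, and nothing in this thread says Google does anything like this.

```python
import hashlib


def doc_id_64(url: str) -> int:
    """Hypothetical helper: map a URL to an unsigned 64-bit integer key."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Keep the first 8 bytes (64 bits) of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big")


# Toy in-memory "index": 64-bit document ID -> URL.
index = {}
for url in ("http://example.com/", "http://example.com/about"):
    index[doc_id_64(url)] = url

for key, url in index.items():
    print(f"{key:#018x}  {url}")
```

    Unlike the URL itself, a truncated hash can in principle collide, so a real system using this approach would still need to compare the stored URL on lookup.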


    --
    Derek
     
    dkalweit, Sep 23, 2004
  3. Sholva

    #3
    I also think Google is quite capable of handling such a problem (if it even exists), although that doesn't necessarily answer the question of why they haven't announced any impressive new numbers for their index.

    I remember reading an article at the start of the year in which a Google spokesperson said they hoped to have 10 billion pages by the end of the year.

    Though I suppose Google would pick quality over gross quantity.
     
    Sholva, Sep 23, 2004
  4. xml

    #4
    xml, Sep 23, 2004
  5. dkalweit

    #5
    And why, I wonder, is 'how' a filtered word but not 'the'?

    Conspiracy theory here: Maybe the "the" search is smoke and mirrors on Google's part to make it LOOK like they handle more than 4 billion web pages... Maybe someone should go through and count each page to make sure. Volunteers? :)


    --
    Derek
     
    dkalweit, Sep 23, 2004
  6. Sholva

    #6
    Yep, if you pay me 5 cents per page.

    Of course, I'm sure you're aware that the figures quoted in the SERPs are only estimates, not exact figures. :)
     
    Sholva, Sep 23, 2004
  7. fluke

    #7
    I'm sure I saw someone post somewhere that the "©2004 Google - Searching 4,285,199,774 web pages" line has been the same for at least the last year (apart from the 2004 bit ;)), which is odd considering they are indexing new pages all the time.
     
    fluke, Sep 23, 2004
  8. Jan

    #8
    Jan, Sep 23, 2004
  9. nadlay

    #9
    Try this link for a discussion on the theoretical limit of the Google index and how Google could address it.

    Google Index ID
     
    nadlay, Sep 23, 2004
  10. Sholva

    #10
    Thanks for the links, everyone. The "Is Google Broken" link looks very familiar to me, but the date would indicate otherwise.
     
    Sholva, Sep 23, 2004
  11. Sholva

    #11
    I think the example with the ID for cached documents is good enough. It seems like there's enough ID space for 2^72 documents, which is huge... 4,722,366,482,869,645,213,696 is how many dollars you wish you had ;)
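
    For scale, a small Python sketch comparing that figure with the 32-bit ceiling from the start of the thread. The 72-bit width is taken from the post above as given; the variable names are illustrative.

```python
ID_BITS = 72                 # ID width mentioned in the post above
capacity_72 = 2 ** ID_BITS   # number of distinct 72-bit IDs
capacity_32 = 2 ** 32        # number of distinct 32-bit IDs

# Ratio of the two ID spaces, which is exactly 2^(72 - 32) = 2^40.
ratio = capacity_72 // capacity_32

print(f"2^72 = {capacity_72:,}")
print(f"That is 2^{ID_BITS - 32} = {ratio:,} times the 32-bit ID space")
```

    In other words, a 72-bit ID space is about a trillion times larger than a 32-bit one.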
     
    Sholva, Sep 23, 2004
  12. DarrenC

    #12
    My experience is that both of these search engines are notoriously slow at indexing new pages and new websites.
     
    DarrenC, Sep 23, 2004
  13. mortgage-pro-seo

    #13
    I have a small site of around 200 pages. I've been noticing that some of my pages are dropping out of the index. Also, some of my pages list only the URL when I do a site:mysite.com search, with no description or title showing in the Google index. Here is a good article on that topic:


    http:***//www.w3reports.com/index.php?itemid=549 remove the *** from the url
     
    mortgage-pro-seo, Sep 25, 2004
  14. Mel

    #14
    I don't seem to see anyone mentioning that Google now have not one, but two indexes since the addition of their supplemental index.

    IMO that solves the 32-bit address problem rather easily, if in fact it ever existed.
     
    Mel, Sep 26, 2004