About a year ago I remember reading one person's theory that Google is going to run into a problem when it reaches 2^32 pages in its main web index. That is 4,294,967,296 web pages, and Google currently states on its homepage that it has 4,285,199,774 web pages. I've been a big Google fan since about 2000, but I've noticed it seems to be losing its edge on certain searches (I'm pretty picky and sometimes looking for very specific things). It's still great, don't get me wrong, but I've noticed a lot of so-called "dancing" lately: sites disappearing from and reappearing in the SERPs. Completely new pages being added and then disappearing is what concerns me, while Yahoo/Overture have added the same new pages lightning fast and they're staying there. So has anyone got any ideas or theories about the 32-bit limit? I'd assume Google, being a bunch of smart cookies, should easily be able to overcome a theoretical problem like that. Another theory is that Google is trying to keep "spammy" results from creeping too far into its index. What are your thoughts?
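Just to put numbers on the theory, here's a quick back-of-the-envelope sketch (nothing Google-specific, just unsigned 32-bit arithmetic and the two figures quoted above) showing how close the homepage count sits to the ceiling:

```python
# Back-of-the-envelope check of the 32-bit theory.
# The numbers are the ones quoted above; nothing here is Google internals.
max_32bit = 2 ** 32                 # 4,294,967,296 -- most IDs a 32-bit doc ID can address
homepage_count = 4_285_199_774      # figure shown on Google's homepage

headroom = max_32bit - homepage_count
print(f"32-bit ceiling : {max_32bit:,}")
print(f"Homepage count : {homepage_count:,}")
print(f"Headroom left  : {headroom:,} pages")  # roughly 9.8 million
```

That's only about 9.8 million pages of headroom, which is why the theory sounds plausible at first glance.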
I'd be very surprised if Google hadn't addressed the 32-bit problem a long time ago. They could have moved their 'primary key' field to 64-bit, or a GUID, or maybe they don't even need a 'primary key' in their database at all -- the URL of the page itself is, by definition, unique and therefore could serve as a 'primary key'... -- Derek
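Purely to illustrate the three options Derek lists (not how Google actually stores anything -- the function names and key formats here are made up), a rough sketch of what each scheme could look like:

```python
# Illustrative only: three ways a crawler's document store could key pages,
# mirroring the options above. Names and structures are hypothetical.
import hashlib
import uuid
from itertools import count

# Option 1: a plain 64-bit auto-incrementing ID (2**64 ~ 1.8e19 documents).
_next_id = count(1)
def next_doc_id_64() -> int:
    return next(_next_id)  # would be a BIGINT / int64 column in a database

# Option 2: a GUID/UUID -- 128 bits, no central counter needed.
def doc_guid() -> str:
    return str(uuid.uuid4())

# Option 3: use the URL itself (or a fixed-width hash of it) as the key.
def url_key(url: str) -> str:
    return hashlib.sha256(url.encode("utf-8")).hexdigest()

print(next_doc_id_64(), doc_guid(), url_key("http://www.google.com/"))
```

The URL-as-key option trades a fixed-width integer for variable-length strings, which is presumably why a fixed-width hash of the URL would be substituted in practice.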
I also think Google is quite capable of handling such a problem (if it even exists), although that doesn't necessarily answer the question of why they haven't broken through any impressive numbers with their index. I remember reading an article at the start of the year in which a Google spokesperson said they hoped to have 10 billion pages indexed by the end of the year. Then again, I suppose Google would pick quality over sheer quantity.
It appears to me as if they HAVE sorted it: http://www.google.com/search?q=the returns 5,800,000,000 results for me, which is 1,505,032,704 more than 4,294,967,296.
And why, I wonder, is 'how' a filtered word but not 'the'? Conspiracy theory here: Maybe the "the" search is smoke and mirrors on Google's part to make it LOOK like they handle more than 4 billion web pages... Maybe someone should go through and count each page to make sure. Volunteers? -- Derek
Yep, if you pay me 5 cents per page. Of course, I'm sure you're aware the figures quoted in the SERPs are only estimates, not exact figures.
I'm sure I saw someone post somewhere that the "©2004 Google - Searching 4,285,199,774 web pages" line has been the same for at least the last year (apart from the 2004 bit), which is odd considering they are indexing new pages all the time.
Try this link for a discussion of the theoretical limit of the Google index and how Google could address it: Google Index ID
Thanks for the links, everyone. The "Is Google Broken" link looks very familiar to me, but the date would indicate otherwise.
I think the example with the ID for cached documents is good enough. It seems like there's enough ID space for 2^72 documents, which is huge... 4,722,366,482,869,645,213,696 is how many dollars you wish you had.
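For anyone who wants to sanity-check that figure, a 72-bit ID really does give that many values:

```python
# Sanity check on the 2^72 figure quoted above.
id_bits = 72
id_space = 2 ** id_bits
print(f"{id_space:,}")           # 4,722,366,482,869,645,213,696
print(id_space > 5_800_000_000)  # True -- comfortably larger than the 5.8 billion result count
```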
My experience is that both of these SEs are notoriously slow at indexing new pages and new websites.
I have a small site of around 200 pages. I've been noticing that some of my pages are dropping out of the index. Also, some of my pages only list the URL when I do a site:mysite.com search; no description or title tags show in the Google index. Here is a good article on that topic: http:***//www.w3reports.com/index.php?itemid=549 (remove the *** from the URL).
I don't see anyone mentioning that Google now has not one but two indexes since the addition of its supplemental index. IMO that solves the 32-bit address problem rather easily, if in fact it ever existed.
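If that is the mechanism (pure speculation on my part, since Google hasn't said how the supplemental index is keyed), the arithmetic does work out: two separate 32-bit ID spaces behave like one 33-bit space, which is enough room for the 5.8 billion figure quoted earlier.

```python
# Speculative arithmetic only: if the main and supplemental indexes each kept
# their own 32-bit doc IDs, the combined capacity is effectively one extra bit.
per_index = 2 ** 32
combined = 2 * per_index          # same as 2 ** 33
print(f"Per index : {per_index:,}")  # 4,294,967,296
print(f"Combined  : {combined:,}")   # 8,589,934,592 -- room for the 5.8 billion figure
```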