I've been saying for several months now that Google is broken. I couldn't tell you how, but many of the updates and results just could not be explained otherwise. Now here is a guy who may have the definitive answer. Read the article: Is Google Broken
Wow, that's a good read. So Google could have a real issue, in that they may have run out of IDs to assign to each web page. It's a silly mistake really, surprisingly similar to the Millennium Bug problems that everyone was screaming about back in '99. I guess we'll just have to wait and see what happens.
The number of pages they show as being in the index has always been stale. It was never updated daily (or even quarterly, for that matter). There may be technical reasons (that I can't think of) why it's not updated more often, or it may come down to it simply not being all that important to the average user. But a simple search for the word "the" shows 5.68 billion pages. And while most pages probably *do* contain the word "the", there are probably at least another billion that don't (non-English pages, for example).
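Out of curiosity, I put rough numbers on why that 5.68 billion figure matters for the 32-bit theory floated in the article. A quick back-of-the-envelope sketch in Python (the result count is the one quoted above; the rest is just arithmetic):

```python
# Back-of-the-envelope: can a 32-bit docid field hold the reported index?
MAX_32BIT_IDS = 2 ** 32          # 4,294,967,296 distinct unsigned 32-bit IDs
RESULTS_FOR_THE = 5_680_000_000  # result count for "the", quoted above

print(f"32-bit ID space:          {MAX_32BIT_IDS:,}")
print(f'Reported hits for "the":  {RESULTS_FOR_THE:,}')
print(f"Pages past the limit:     {RESULTS_FOR_THE - MAX_32BIT_IDS:,}")
```

If both numbers were literally true, the index would already be about 1.4 billion pages past what a single 32-bit ID could address, which is exactly why the stale, unexplained counts feed the speculation.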
I read about this well over a year ago, but Google at the time tried to play it down, saying they had moved up to a base-5 encoding or similar, which gave them infinitely more slots for pages. I agree, though, that at the moment Google is struggling. I have posted many times that I think it is a resource problem that is causing all these partially indexed pages. There is absolutely no rhyme nor reason to what G is doing, no pattern to it at all, which, for a mathematically driven set of algorithms, sets alarm bells ringing in my mind.
Not a new viewpoint, but no less flawed: Doubtful -- change the word "added" to "reported" and I'd be more inclined to accept it. Again, doubtful -- change that word "added" again... Don't you find it just a little hard to swallow that, if all this is true, the guys who invented Google weren't aware of it a long time ago and haven't had months and years to anticipate and correct it? I'm reminded of Mark Twain's oft-quoted comment: "The rumors of my death have been greatly exaggerated."
A while back (less than a year, for sure) there was a CNN/Newsweek article that mentioned that Google increased their index by some huge number.
If you've ever thought about programming your own search engine, it's not hard to believe that Google could have problems. There must be thousands and thousands of "easy" paths to go wrong. The way Google ranks will make a huge growth in the number of pages hard to handle. Add the anti-spam mechanisms they must implement to prevent people from spamming the index in millions of ways. I'm impressed it worked without problems for that long. Let's hope they can dig out of this before they get stuck in new dirt; they must dig really fast to not get stuck again. Why on earth would they risk their reputation in discussions like this? There must be more intelligent ways to fool us. (Or maybe not.)
Well, I don't believe the DB is broken or running out of IDs. I see a 12-character alphanumeric code tagged onto my documents (q=cache:3FKWikKk7p8J) that, even using standard letters only, allows for a roughly 17-digit numeric equivalent. But I do think G has a couple of minor problems, like losing direction and being chased and watched by everyone. Robert Wilensky said: "We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true." But G is certainly supporting a lot of monkeys...
M
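Following up on M's point, here's a quick sketch of how many IDs a fixed-width code can hold, depending on the alphabet. The 12-character length comes from the cache code above; the alphabet sizes are my guesses at what Google might use:

```python
# Capacity of a 12-character ID under different (assumed) alphabets.
CODE_LENGTH = 12

for name, symbols in [("decimal digits", 10),
                      ("letters only", 26),
                      ("base64-style", 64)]:
    capacity = symbols ** CODE_LENGTH
    print(f"{name:>14}: {symbols}^{CODE_LENGTH} = {capacity:.2e} possible IDs")

# Even letters-only (~9.5e16, a 17-digit number) dwarfs the 32-bit space
# of ~4.3e9, which is M's point: the cache IDs alone leave plenty of room.
```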
When things become inconsistent, you start to have doubts. When the search engine was built, I am sure they wouldn't have anticipated the number of websites they have indexed now, but I am confident they will have known about their own limitations. How many of us have added pages to our websites simply to aid SEO? Would the rapid growth of pages within websites have been considered at that stage? Darren
The 32-bit limitation does not sound unlikely at all. In fact, in IT these limits are the rule rather than the exception. Does anyone remember the 640K memory barrier on the IBM PC? Many people now have a thousand times more memory than that. And more recently, many hardware providers have had to migrate from 32- to 64-bit memory addressing schemes in their operating systems. Such migrations are not simple, and some manufacturers chose to go for in-between solutions before the real migration. Maybe the problem is even more complex for Google, and maybe they are also wrestling with intermediary solutions. But I'm sure Google does not want us - and the competition - to know everything: don't show actual backlinks, PR, number of pages...
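For anyone who hasn't bumped into one of these limits, here's a minimal sketch of what hitting a 32-bit ceiling actually looks like. Python integers don't overflow, so I mask to 32 bits to mimic a fixed-width counter; the docid field itself is hypothetical:

```python
MASK_32 = 0xFFFFFFFF  # keep only the low 32 bits, like a uint32 would

def next_docid(current: int) -> int:
    """Increment an ID the way a 32-bit counter would (wraps at 2**32)."""
    return (current + 1) & MASK_32

docid = 2 ** 32 - 1            # the very last assignable 32-bit ID
print(hex(docid))              # 0xffffffff
print(hex(next_docid(docid)))  # 0x0 -- silently wraps, so new pages would
                               # start colliding with the oldest ones
```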
Question #1: How many were deleted at this time? Could this just be spin? Could this be part of why pages are 'banned' from time to time? Question #2: How possible is it that, yes, Google is faced with a limit that makes it incomplete, and went public recently to make a ton of money before being 'found out'?
PageRank has not been updated for a long time now (at least the toolbar PR)... could this be the reason for the delay, or is there something else stopping G from updating it for us?
Perhaps the huge number of dynamic pages added to the internet (possibly infinite) might have been the cause of their problems. A few years ago a webmaster knew HTML and a bit of JavaScript. Most now know HTML and at least enough PHP or ASP to create one simple script which can access a database and serve up thousands of pages.
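To illustrate the point: one small script plus a database is all it takes to answer for thousands of distinct URLs. A minimal sketch in Python (everything here, table name, port, sample rows, is invented for illustration):

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# One tiny database, seeded with 10,000 rows -- i.e. 10,000 crawlable pages.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
db.executemany("INSERT INTO articles VALUES (?, ?)",
               [(i, f"Article number {i}") for i in range(10_000)])

class ArticleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # /article?id=N -- every distinct N is a distinct page to a crawler
        qs = parse_qs(urlparse(self.path).query)
        try:
            art_id = int(qs.get("id", ["-1"])[0])
        except ValueError:
            art_id = -1
        row = db.execute("SELECT body FROM articles WHERE id = ?",
                         (art_id,)).fetchone()
        self.send_response(200 if row else 404)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write((row[0] if row else "not found").encode())

HTTPServer(("localhost", 8000), ArticleHandler).serve_forever()
```

A few lines of glue, ten thousand URLs; scale the table and the URL space grows with it, which is exactly the explosion described above.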
My guess is it's simply not a priority for Google and they've had better things to do with their time lately. Let's face it: the only people who care about that green bar are people like you and me who own websites, and we shouldn't care about that green bar either, but by now we're addicts. Google's "clients" aren't webmasters: they are (1) normal people doing searches (webmasters clearly aren't "normal") and (2) advertisers, and neither of these groups gives a hoot what the PR on my web page is.
It's not as simple as a 32-bit number. You have to realize that Google has to keep all that information in memory somewhere, and their scheme for mapping indexed pages into memory may have hit unexpected limitations.

One thing I notice is that if I change my keywords slightly, I tend to still get a lot of the same pages. I know most people just search for "Paris Hilton" or whatever, but I often find myself searching for something more specific, like information on "linux usb device setup <exact error text>", and even when I get more specific I still get the same crappy pages. I almost always end up posting on a knowledgeable Linux forum where someone can direct me to the proper page. What this illustrates to me is that Google was unable to surface the more specific, less-used pages with the information I needed. What the article tried to demonstrate is that Google is more concerned with the popular "Paris Hilton" searches than with the less-used pages people like myself look for.

I would imagine this would be a problem for any search engine builder, but to pretend that Google resides on Mount Olympus is ridiculous. Seriously, the people at Google are regular humans and therefore suffer from the same realistic limitations as the rest of us. For those of you who don't use Firefox, there is already a plug-in to block Google ads. Just imagine if IE did something similar.
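To put a rough number on that memory point: even a few bytes of bookkeeping per indexed page adds up brutally at this scale. A quick sketch, with the per-entry sizes being pure assumptions on my part:

```python
PAGES = 4_300_000_000    # roughly the 32-bit ceiling discussed in this thread
SIZES = [("bare 32-bit IDs", 4),          # one docid per page
         ("ID + assumed metadata", 64)]   # rank data, checksums, offsets...

for label, bytes_per_page in SIZES:
    gib = PAGES * bytes_per_page / 2 ** 30
    print(f"{label}: {gib:,.0f} GiB to hold one entry per page")

# ~16 GiB for the bare IDs, ~256 GiB with metadata -- all of which has to be
# sharded across many machines, so "unexpected limitations" are easy to hit.
```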
That hasn't been my experience with Google at all. However, I would note that when I'm searching for something like your example, "linux usb device setup <exact error text>", I use Google Groups rather than the primary Google search engine -- that usually returns the information I need.
"When the search engine was built I am sure they wouldn't have thought about having the number of websites they have indexed, but I am confident they will have known about there own limitations." ~that's probably the issue. Sometimes even people with great foresight can rush into things without considering everything.
The last few messages involve user searches including quotes. This is not far from what I see in my server logs; many Google referrers use quotes intensively. Do you see as many quoted search terms as I do? Is it mass self-education, or is it that the main index became so full of SEO-inflated pages that it is useless? I personally get far more relevant pages back when I quote search terms together. Thinking about it, I quote more than 90% of my searches. I guess I rank really badly for most of my keywords, and this is why most of my Google visitors use "" in their searches. Tell me this is true!