Hi, I was just wondering if anybody knows how many pages a site like Google can have in its index. At the moment I think they have over 16 billion pages indexed, but as the web keeps growing at a rapid pace and Google tries to get most sites indexed, I was wondering at what stage they will no longer be able to add any new sites. Do you think they will be able to fit more than 900+ billion pages in their index?
Well, they make a LOT of money, so I think they can afford whatever it takes to reach that limit. Skinny
I didn't really mean the money side of things. I meant speed. Wouldn't querying a 100 billion page index be much slower than querying a 10 billion page index?
OK! So the more data they have, the more computers they join together to make a larger computer with greater CPU processing speed. Is that what they do?
Yeah, basically they know how many transactions one CPU can handle before it runs into difficulties processing, so they add another CPU until they can handle all the requests. Some of the computers may have quad processors or more ... but basically yes, they keep adding more and more computers until they are happy with the results.
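To put rough numbers on the "keep adding CPUs" idea, here is a minimal sketch of the capacity arithmetic. The function name, the traffic figures, and the 25% headroom are all made up for illustration; nothing here reflects how Google actually sizes its clusters.

```python
import math


def cpus_needed(peak_queries_per_sec: float,
                queries_per_cpu_per_sec: float,
                headroom: float = 0.25) -> int:
    """Estimate how many CPUs are needed to serve a peak load.

    headroom is spare capacity kept in reserve (0.25 = 25% extra).
    All inputs are hypothetical; measure your own system to get real ones.
    """
    if queries_per_cpu_per_sec <= 0:
        raise ValueError("queries_per_cpu_per_sec must be positive")
    required = peak_queries_per_sec * (1 + headroom) / queries_per_cpu_per_sec
    # You cannot buy a fraction of a CPU, so round up.
    return math.ceil(required)


if __name__ == "__main__":
    # Example: 10,000 queries/sec at peak, each CPU handles 50 queries/sec.
    print(cpus_needed(10_000, 50))
```

This only covers request volume. It says nothing about how the index itself is split across machines, which is a separate design question.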
They may be the one company that is keeping "Intel" and "AMD" in business right now. If I were a sales representative for one of those companies, I'd sure like to have the Google account right now!!
Yeah, imagine getting a commission as a sales rep for every sale generated by Google's custom. It must be quite a large amount. Anyway, I guess I will have to plan on buying some more computers in the future so that I can increase my CPU capacity too. At the moment, some of the things I want to do are being held back because I can only afford one server. Things will change soon though.
Just do a search for * * and it will currently give you the number quoted by qlink1, and they have 15,600,000 of their pages indexed!! They will certainly need more storage to keep up with all the web growth ... forums etc. I have my own conspiracy theory that they are running out of space and that is why they have been dropping pages from the index!!!
What database does Google use? Is it Oracle? If it is, is Oracle unlimited, unlike MySQL, where you can only have a maximum of 2-4GB of information in a table? I need a database that can hold more than 2-4GB of information per table. I need it to hold around 100GB or more in the future. Also, does anybody know how much Oracle would cost me if I purchased it? What do you guys think the cheapest price would be?
Try MySQL 5.0 - you can have many, many multiples of TBs in a table - DEPENDING on the operating system. So it is more important that you use the right OS. With Linux 2.4 and up, you should be okay up to 4TB. http://dev.mysql.com/doc/refman/5.0/en/table-size.html
I remember about ten years ago when AltaVista said they had 4TB of space for data collection. My guess is that number could be multiplied by a factor of at least 100 or maybe 200 over the last decade when it comes to what Google has collected. Also, my guess is that Google is using something more "in house developed" than what most of us use for database delivery.
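For anyone wondering where figures like 4GB come from, here is a small sketch of the arithmetic. It assumes a MyISAM-style table whose row pointer addresses individual bytes, so an n-byte pointer can reach 2^(8n) bytes; as I recall from the MySQL manual, older versions defaulted to 4-byte pointers and 5.0 moved to 6-byte ones, but check the linked page for your version. The function names are just for illustration, and the 4TB operating system limit is the Linux figure quoted above.

```python
def max_table_bytes(pointer_size_bytes: int) -> int:
    """Largest table an n-byte row pointer can address, one byte per address."""
    return 2 ** (8 * pointer_size_bytes)


def effective_limit(pointer_size_bytes: int, os_file_limit_bytes: int) -> int:
    """The table can be no bigger than the smaller of the two limits."""
    return min(max_table_bytes(pointer_size_bytes), os_file_limit_bytes)


def human(n: int) -> str:
    """Format a byte count using binary units (1 KB = 1024 B here)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"


if __name__ == "__main__":
    linux_limit = 4 * 1024 ** 4  # the 4TB file size limit quoted in the post
    for size in (4, 6):
        print(size, "byte pointer:", human(max_table_bytes(size)),
              "| with OS limit:", human(effective_limit(size, linux_limit)))
```

On these assumptions, a 100GB table is well beyond a 4-byte pointer but comfortably within both the 6-byte pointer and the 4TB file limit.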
Hi, if you want directory submission, please contact me. I have a list of 600 directories. Thanks. Email: dhananjay0005@gmail.com
Thanks for the offer dhananjay4win, but I don't really see what your post has to do with this topic. You are being off topic here.
If I'm not mistaken, the developer license would cost you about 15K for Oracle (but if you're going to use it for search engine development, be sure that it would cost you MUCH more). Besides, I can only agree with markhutch that Google has a completely customized version of something that used to be Oracle/MySQL/Postgres/etc. (of course, I don't know what their system is based on). And one more thing: the speed problem is not mainly related to the number of CPUs they use (though I bet they use A LOT of CPUs), but to the algorithms they use and the way they organize their data.
I know it is not just the CPU that speeds it up, but I guess they must have a lot of them, as that is a very important part of the speed. The way they organize their data and use indexes is also very important.
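To illustrate the point about data organization, here is a toy inverted index, the textbook structure for keyword search. It is only a sketch with made-up documents and function names, not a description of what Google runs. The idea is that a query only touches the lists of pages for the words it contains, so the work depends on how common those words are rather than on the total number of pages stored.

```python
from collections import defaultdict


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Map each word to the set of document ids that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str) -> set[int]:
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start from the rarest word so the intersections stay small.
    postings = sorted((index.get(w, set()) for w in words), key=len)
    result = set(postings[0])
    for p in postings[1:]:
        result &= p
    return result


if __name__ == "__main__":
    docs = {
        1: "google index size",
        2: "mysql table size",
        3: "google crawls and updates its index",
    }
    index = build_index(docs)
    print(search(index, "google index"))
```

Adding a million more documents that never mention "google" or "index" would not change how much work that last query does, which is why the layout of the data matters as much as the CPU count.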