What I wanted to do was estimate how much it would cost to build a search engine of Google's size ... and I stopped when I calculated 450,000 servers * $1,000 each, which is about half a billion dollars. For now I don't have that cash, so I'll calculate how much it would take to build a slower but still working search engine with an index of Google's size:

22.3 bln pages, 30 KB each page
4 * 155 Mbps lines - $74,800/month
2 Java programmers - $6,000 a month
1 TB of HD per server
643 servers for all sites - $643,000 one-time cost
200 W each server at $0.09/kWh - $13,694.35 monthly for electricity

So how much time and money:

3 months of programming = $18,000
Let's say the programmers will do the crawler virtually, then it's:
3 months of crawling: (3*$74,800)+(3*$13,695) = $265,485
cost of hardware - $643,000
$265,485 + $643,000 + $18,000 = $926,485

So adding renting costs and other stuff, it might be about $1,000,000. And that's achievable. Please give feedback.
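For anyone who wants to replay the arithmetic above, here is a rough Python sketch. The inputs are the figures from the post; the function names, the decimal units (1 TB = 1e9 KB), the 730-hour month, and the idea that the lines run fully saturated are my own assumptions, not something the post states.

```python
# Inputs copied from the post; everything else is derived.
PAGES = 22.3e9
PAGE_KB = 30
SERVERS = 643                   # server count used in the post
SERVER_COST = 1_000             # dollars, one-time
LINES = 4
LINE_MBPS = 155
BANDWIDTH_PER_MONTH = 74_800    # dollars
ELECTRICITY_PER_MONTH = 13_695  # dollars, the post's rounded figure
PROGRAMMERS_PER_MONTH = 6_000   # dollars, both programmers together
MONTHS = 3


def raw_storage_tb(pages=PAGES, page_kb=PAGE_KB):
    """Raw page storage in decimal terabytes (assumes 1 TB = 1e9 KB)."""
    return pages * page_kb / 1e9


def crawl_days(pages=PAGES, page_kb=PAGE_KB, lines=LINES, line_mbps=LINE_MBPS):
    """Days to download every page once, assuming fully saturated lines."""
    total_bits = pages * page_kb * 1000 * 8
    bits_per_second = lines * line_mbps * 1e6
    return total_bits / bits_per_second / 86_400


def electricity_per_month(servers=SERVERS, watts=200, price_per_kwh=0.09,
                          hours_per_month=730):
    """Monthly power bill from the listed inputs (730 h/month is an assumption)."""
    return servers * watts / 1000 * hours_per_month * price_per_kwh


def total_cost():
    """Reproduces the post's total using the post's own monthly figures."""
    programming = MONTHS * PROGRAMMERS_PER_MONTH
    crawling = MONTHS * (BANDWIDTH_PER_MONTH + ELECTRICITY_PER_MONTH)
    hardware = SERVERS * SERVER_COST
    return programming + crawling + hardware


if __name__ == "__main__":
    print(f"raw storage:  {raw_storage_tb():.0f} TB")
    print(f"crawl time:   {crawl_days():.0f} days")
    print(f"electricity:  ${electricity_per_month():,.0f}/month")
    print(f"total:        ${total_cost():,}")
```

Under these assumptions the total matches the post's $926,485, and the crawl comes out at roughly 100 days, so "3 months" only works if the lines stay busy the whole time. Two things do not line up exactly: decimal units give about 669 TB rather than 643, and the listed wattage and price give an electricity bill in the $8,400 to $8,500 range rather than $13,694.35, so the post may be using different unit conversions or inputs that are not listed.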
You're missing a lot. Fair enough, you have indexed all pages at an average of 30 KB each; now how about making it usable? You need a few more servers to hold tree structures so you can actually search your data. Also, you'll probably have to build your own DBMS, as MySQL and PostgreSQL won't cut it with that much data. And your own network file system... And your own crawler... And your own algorithms... Pierce
Here's the thing... there were dozens of search engines in the past with all of these massive financial and technical (hw, sw, manpower) resources, etc. The reason Google rose to the top is its algorithms. Google was still in diapers when AltaVista and Yahoo were making big names for themselves. Now it's Altawho, and Yahoo in 2nd of course.
Don't think only of Google. We can change the algorithms and invite people to index their sites again. Are you up for making a new search engine? HERE I AM.
For the crawler, okay, you have built it and costed it out. But you have to crawl 100,000 pages a second or so before you can keep up with Google's index freshness, so you need at least 5 dedicated servers for that alone, each pulling 20k pages per second.

For the DBMS and network file system, people in the know charge $120,000 a year and don't get their hands dirty with code; they come up with ideas. Then you're talking about 4 or 5 C/C++ coders each for the NFS and the DBMS. You'll need expertise in file storage and distributed systems, database designers, and network experts (who know spanning trees, redundancy, failover).

Also, you have not budgeted the cost of switches to connect 643 servers together. Cisco Catalyst switches are about $13,000/switch for 48 ports, so you also need about 25 of them. Cabinets to hold the racks and switches. Cooling systems. That's just what I can pull off the top of my head. I'm sure there's a lot more required. Pierce
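To put numbers on the crawl rate and the switch count, here is a small Python sketch. The 100,000 pages/second, 30 KB page size, 643 servers, 48 ports, and $13,000 price come from the thread; the function names and the "every port holds a server" lower bound are illustrative assumptions.

```python
import math


def crawl_bandwidth_gbps(pages_per_second, page_kb):
    """Sustained download bandwidth needed, in Gbit/s (1 KB = 1000 bytes)."""
    return pages_per_second * page_kb * 1000 * 8 / 1e9


def full_recrawl_days(total_pages, pages_per_second):
    """Days to revisit every page once at a steady crawl rate."""
    return total_pages / pages_per_second / 86_400


def min_switches(servers, ports_per_switch):
    """Lower bound: every port holds a server, no uplinks or spares."""
    return math.ceil(servers / ports_per_switch)


if __name__ == "__main__":
    print(crawl_bandwidth_gbps(100_000, 30))            # Gbit/s at 100k pages/s
    print(round(full_recrawl_days(22.3e9, 100_000), 2)) # days per full pass
    print(min_switches(643, 48))                        # bare-minimum switches
    print(25 * 13_000)                                  # cost of 25 switches
```

At 100,000 pages a second the crawler needs about 24 Gbit/s, far more than the 4 * 155 Mbps lines in the original budget, and a full pass over 22.3 billion pages takes under three days. The bare minimum for 643 servers is 14 switches; the figure of 25 presumably leaves room for uplinks and redundancy, which would put switching at about $325,000.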
Is this a real question, or is this hypothetical for a school project? Your indexes will take up AT LEAST triple the space of all your cached pages, so triple the storage. Plus, you'll need redundancy for your storage ... so triple the storage again. Now you're up to ~5,000 servers. Assuming you lease the servers, space, etc., the best you could expect is $1,500,000/month in data center costs, all inclusive.

Another thought is your programming time. To create a search engine that someone would want to use, expect 10X the programming effort you've outlined. Plus you'll need additional programmers to monetize your site (e.g. something like AdSense). Finally, you'll need some way to market it. You don't want to burn through $1,500,000 per month without revenue. Anita
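The storage multipliers are easy to check with a short Python sketch. It reads "triple, then triple again" as two factors of 3 applied to the original 643 servers, which is one interpretation of the post; the function names are made up for illustration.

```python
def servers_needed(base_servers, index_factor=3, replication_factor=3):
    """Scale the raw-page server count for index space and redundancy."""
    return base_servers * index_factor * replication_factor


def runway_months(budget, monthly_burn):
    """How long a fixed budget lasts at a constant monthly cost."""
    return budget / monthly_burn


if __name__ == "__main__":
    print(servers_needed(643))                            # servers after both factors
    print(round(runway_months(1_000_000, 1_500_000), 2))  # months of data center cost
```

With both factors applied the count comes out a bit above the ~5,000 quoted, and the original $1,000,000 budget would cover less than one month of the $1,500,000 data center bill.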
I have a few points. First, your numbers are way off. Google makes billions per month and its operating cost is several million dollars per day. If you truly want to take on Google, take that $1,000,000, add another $4 or $5 million to it, and create a startup that builds an algo that blows Google's out of the water, then get bought by Yahoo or MSN.
"2 java programers- $6,000 a month" You can higher java programmers for $3000/month each? Even if you could get some offshore developers, they are not going to have the expertise to develop a dynamic search engine (if they did they would be making a lot more than $36k/year) You will need an IT staff to manage all your 5000 servers, even with 3 full time IT people they would be completely overwhelmed trying to manage 1000 servers. Next you need developers to work on your algorith and caching and DB management - figure a minium team of 15 developers - your cheif architect and team leads will probably be looking to make 6 figures and you will need at the bare minimum 3 senior tech people. SQL people arent cheap, you will also need some jr people to do interface programming, web standards, etc. etc. Plan on selling ads on your site? Sales reps, customer service reps, tech support. Bare minimum of say 20 total people, salaries will cost you close to 100k/month - now with all that you are going to need accountants, HR people, oh and somewhere to put them - you could probably cram 20 poeople into about 4000 sqft of office space if you made them all sit in cubicles (this is sure to piss off your senior developers, so plan on getting them their own offices) better plan on more like 6000 square feet and that will still be tight, but in my neighborhood that goes for about $7000/month for class b office space - you can get class c office space cheaper - probably $5000 a month - a phone system, 20 extensions and a pbx = $8000 installed, then if you can get a nice deal on service (we get 5 lines and a T1 for $500/month) for what your doing expect you need at least 10 lines and 2 T1s so $1000/month. I can go on if you want but I think you might be getting the picture. Of course apple was started ina garage, and two guys in college made google (with the resources of the Stanford computer lab of course)...