Actually, I am researching how search engines really work. If you built a search engine, what would your algo or flowchart be? Please help me. I want to collect all search-engine-related docs and lots of other things. Please help me.
Your own algo is something you have to decide for yourself. For instance, you might count the number of links to a site from every site that you've crawled, then do a keyword analysis on those links, and then do a keyword analysis on the actual site you're examining. You might also take into account the age of the site, the IP of the site (for deciding country-specific targeting) and whether the site *should* be updated regularly or not. Then you'd assign a weight (or value) to each of the above scores, and tot up all the weighted scores. Hey presto! You've got yourself an algo. You might then store that score for that particular keyword and that site in a database. You could have another algo for actually computing how all those scores add up for a search term. For instance, in deciding the results for a one-word search term, e.g. 'Maths', you might pick the highest-ranking site from your algo above for the keyword 'maths'. For a two-term search, e.g. 'Maths history', you'd need to compute what weight to assign to each of the terms 'maths' and 'history' and then search your database based on that. In reality, doing all of the above could take thousands of lines of code. You might have one function to deduce the 'score' (or in Google terminology the PR) of the site/page, but getting all the relevant data is the hard part.
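To make the weighting idea above a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the signal names, the weight values, the per-term query weights and the example sites. It is not how Google or any real engine does it, and it assumes each signal has already been normalised to a value between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical weights for each signal; the values are invented for illustration.
WEIGHTS = {
    "inbound_links": 0.4,   # links to the site from crawled sites
    "anchor_keyword": 0.3,  # keyword analysis of those links
    "page_keyword": 0.2,    # keyword analysis of the site itself
    "age": 0.1,             # age of the site
}


@dataclass
class SiteSignals:
    """Signals for one site and one keyword, each assumed to be in 0..1."""
    inbound_links: float
    anchor_keyword: float
    page_keyword: float
    age: float


def keyword_score(signals, weights=WEIGHTS):
    """Weight each signal and tot up the results."""
    return sum(weights[name] * getattr(signals, name) for name in weights)


def build_index(crawl):
    """Turn {keyword: {site: SiteSignals}} into {keyword: {site: score}}."""
    return {
        keyword: {site: keyword_score(sig) for site, sig in sites.items()}
        for keyword, sites in crawl.items()
    }


def search(index, query, term_weights=None):
    """Rank sites for a query by summing weighted per-term scores."""
    terms = query.lower().split()
    if term_weights is None:
        term_weights = {term: 1.0 for term in terms}
    totals = {}
    for term in terms:
        for site, score in index.get(term, {}).items():
            weighted = term_weights.get(term, 1.0) * score
            totals[site] = totals.get(site, 0.0) + weighted
    # Highest total first
    return sorted(totals, key=lambda site: totals[site], reverse=True)


# Made-up crawl data for two placeholder sites.
crawl = {
    "maths": {
        "maths.example": SiteSignals(0.9, 0.8, 0.9, 0.5),
        "history.example": SiteSignals(0.2, 0.1, 0.1, 0.9),
    },
    "history": {
        "maths.example": SiteSignals(0.3, 0.2, 0.1, 0.5),
        "history.example": SiteSignals(0.8, 0.9, 0.9, 0.9),
    },
}

index = build_index(crawl)
print(search(index, "maths"))
print(search(index, "Maths history", {"maths": 0.7, "history": 0.3}))
```

The dictionary returned by `build_index` stands in for the database mentioned above. A real engine would need a crawler and a proper store behind it, which is where most of those thousands of lines of code would go.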
What type of server would I use if I build a search engine? Or, put another way, what type of servers do Google, Yahoo and Bing use?
What type of server? How long is a piece of string? Perhaps you could first ask yourself how much you want to index, how much data you want to keep and how many searches will be performed per day.
Smaller home-grown search engines are typically implemented on Unix systems, since these come out of the box with so many robust programming tools as part of the operating system. But larger search engines like Google, Bing, Yahoo, Ask, etc. likely all use their own proprietary operating systems/file systems. I know for a fact that Ask does. They have their own OS/file system because of the sheer amount of data and the speed at which they need to access it. Commercial operating systems just can't handle that volume of data and meet the I/O speed requirements of large search engines.
I want to research everything I would need if I were to start a big search engine like Google, Yahoo and Bing. What type of resources do I need: manpower, servers, anything else? Please list them out here.
That is going to be a huge flowchart. It would be better if you asked step by step, and told us the process you have applied so far.
If I want to build a search engine like Google, Yahoo and Bing, what type of book do I have to read? I mean, how can I code this, and what are the requirements? Is there any book available on the market?
Yes. It's called 'Stop repeating yourself on forums and start doing some of your own research'. The tag line, if memory serves me right, is 'Post back what you've figured out so far rather than repeated requests'. You're simply continually asking, asking, asking. Why not post what you've done so far? Forums are not a one-way system. If you really want to build/research a search engine like Google and cannot figure out what books to read, it's a lost cause. Stop now before you do yourself permanent harm.