I have wondered this for a while: does anyone know a good script/application that allows you to make your own powerful custom search engine?
Go to SourceForge and do a search there. I played around with phpDig for a bit; there's an English version of the site somewhere, but this is all I could quickly find: http://phpdig.de/
There are two classes of search engines: the tiny ones that don't index much, and the 'power' ones that will crawl millions of pages. For smaller stuff, htdig as noted above is one. There's another older one whose name escapes me - aspsearch? Something like that. It was done by some Russians and is apparently languishing and unsupported, but I've spoken to folks that like it. These products will not index huge numbers of websites, so they're more suitable for a niche search engine IMO. If you're looking to index up to a few hundred thousand pages, these are the best products. More than that and the software won't work. For a full-scale search engine, the only OSS product I know of is Nutch. I use it in a few places and it works well. It's nowhere near as easy as PHP scripts to set up and run, but it will crawl and index on a large scale. Certainly it will do 50 million pages at least. If you take the second route, your two biggest issues will be servers (you won't run this on shared hosting) and crawling the web. I normally throttle my crawls back, but I can and do open up a 40 Mbps connection and fill it, wide open, for days at a time when I'm crawling. You could get by on a 10 Mbps connection, but you'd better have a good host who'll let you download enormous volumes of data. Once the data's downloaded, it takes a lot of horsepower to index all of it. I've got a heavy-duty server that does that job, and indexing uses every bit of it.
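To put rough numbers on the bandwidth point above, here is a back-of-the-envelope sketch. It assumes the connection speeds are in megabits per second, and the 20 KB average page size is purely a made-up figure for illustration, so treat the page counts as order-of-magnitude only. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def crawl_volume_gb(link_mbps, days, utilization=1.0):
    """Gigabytes (10^9 bytes) downloaded over a link of `link_mbps` megabits/s."""
    bytes_per_second = link_mbps * 1_000_000 / 8 * utilization
    return bytes_per_second * SECONDS_PER_DAY * days / 1e9

def pages_fetched(volume_gb, avg_page_kb=20):
    """Rough page count for a download volume; avg_page_kb is an assumed figure."""
    return int(volume_gb * 1e9 / (avg_page_kb * 1000))

for mbps in (10, 40):
    gb = crawl_volume_gb(mbps, days=1)
    print(f"{mbps} Mbps, saturated for one day: {gb:.0f} GB, "
          f"about {pages_fetched(gb):,} pages at 20 KB each")
```

Even the smaller link pulls down on the order of a hundred gigabytes a day when saturated, which is why the host's attitude to transfer volume matters as much as the raw connection speed.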
Lucene is the search engine included in Nutch. Nutch adds the crawler and a few other things that take Lucene from straight search to what most would consider a full search engine.
I've seen the Russian search code also. It looked great, if a bit daunting... shame I lost the link though...
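To illustrate what "straight search" means here, a toy inverted index might look like the sketch below. This is not Lucene's API or its actual data structures, just a minimal illustration of the indexing-and-query half; the class and method names are made up. Note that it assumes the documents have already been fetched, which is the part a crawler would supply.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TinyIndex:
    """Toy inverted index: maps each term to the set of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result

index = TinyIndex()
index.add("page-a", "Nutch crawls the web and feeds pages to the indexer")
index.add("page-b", "Lucene indexes text and answers queries")
print(index.search("lucene queries"))
```

Everything that gets text into `add` in the first place (fetching pages, following links, respecting robots.txt, throttling) sits outside this sketch, and that is the crawler side.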
If you are not looking for a full-blown search engine, you should be able to find a few scripts out there, especially if you only have a few pages. A while ago I decided to put a search engine on my site but was disappointed, because I expected to find a script that gives suggestions and not just a plain text search. For example, let's say somebody enters "choclate"; it should suggest "chocolate". The best option I found was the "Google search API", but the problem was that it only returns results from pages Google has already indexed.
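For a small site, the "did you mean" part can be approximated with Python's standard difflib module. This is only a sketch: the word list is a made-up stand-in for a vocabulary you would build from your own pages, and the 0.8 cutoff is an arbitrary choice, not a recommended value.

```python
import difflib

# Hypothetical vocabulary; in practice, collect the words that appear in your pages.
VOCABULARY = ["chocolate", "cheese", "coffee", "cookie", "chicken"]

def suggest(query, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest known word to `query`, or None if no suggestion applies."""
    if query in vocabulary:
        return None  # already a known word, nothing to suggest
    matches = difflib.get_close_matches(query, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(suggest("choclate"))  # prints "chocolate"
```

This compares the query against every known word, so it is fine for a site with a few thousand distinct terms but would need something smarter for a large index.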
How much will it cost to have this search engine and grow it big like Google? I want to have one for Africa. kingsley
If you want to build an empire like Google, you should build from scratch, because sooner or later you'll stumble upon bugs that need an immediate fix. If it's a 3rd-party component you're using, even if it's open source, you'll have a hard time fixing it or getting the vendors to fix it. If you fix it yourself, you'll have to learn the framework; if you ask the vendors to fix it, you'll have to wait for an unspecified, unpromised timeframe. I'm not saying 3rd-party engines are bad, just that they're not for mass-crawling operations, especially for commercial use. If you just want to build small search engines, they are probably fine. OK, building an SE from scratch requires capital and skill. If you cannot code the engine yourself, you will have to spend at least $2,000 to hire freelancers to build the engine, and many thousands more for maintenance, not to mention thousands a month for testing and hosting. In brief, unless you have a very good reason to run an SE, don't. If you just want search functionality added to your website, search for "google search box" or "google site search integration".