I want to start my own search engine like google.com, but I don't want to use the Google API or any other API for that matter. I want it to crawl the web for pages itself, and I'm willing to invest quite a bit of money in this project. Thanks.
http://forums.digitalpoint.com/showthread.php?t=1047265 - My post related to this (check your private messages).
I just finished writing a multi-threaded forum crawler with a searchable database (see here http://forums.digitalpoint.com/showthread.php?t=1070936). I'd be interested in this project as long as you're willing to pay over $1,200. It's no easy task to make a crawler that runs fast, has low memory usage, and won't give you errors in the long run. My crawler has extensive error checking and very low memory usage. I've run it for the past week without an error or a crash. I can do the same for you, as long as I'd be coding it for Linux or OS X.
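For anyone wondering what "multi-threaded with error checking" looks like in practice, here is a minimal Python sketch. It is not the crawler from the post above, just an illustration of the idea. The names `crawl`, `extract_links`, and the injected `fetch` function are all made up for this example, and the page cap stands in for real memory management.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    parser = LinkParser()
    parser.feed(html)
    # Resolve relative links and drop #fragments so duplicates collapse.
    return [urldefrag(urljoin(base_url, link)).url for link in parser.links]


def crawl(seeds, fetch, max_pages=100, workers=8):
    """Breadth-first crawl. `fetch(url)` (hypothetical) returns HTML or raises."""
    seen = set(seeds)
    frontier = list(seeds)
    pages, errors = {}, {}

    def safe_fetch(url):
        # One bad page must not take down the whole crawl.
        try:
            return url, fetch(url), None
        except Exception as exc:
            return url, None, exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            # Never fetch more pages than the cap still allows.
            batch = frontier[: max_pages - len(pages)]
            frontier = frontier[len(batch):]
            for url, html, exc in pool.map(safe_fetch, batch):
                if exc is not None:
                    errors[url] = repr(exc)
                    continue
                pages[url] = html
                for link in extract_links(url, html):
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages, errors
```

A real `fetch` could wrap `urllib.request.urlopen` with a timeout. A crawler meant for the open web would also need robots.txt handling, per-host rate limiting, and on-disk storage instead of the in-memory `pages` dict, none of which is shown here.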
I saw a tutorial somewhere on how to make a simple search engine using CGI. PM me if you want me to look for it again.
Hey guys, let's all get together and have a little discussion on AIM or something, since MSN seems to be down at the moment.
OK, I can give you a meta crawler if you're interested. You can save the index data anywhere, e.g. to your local Access database, to your website's Access database, or to your site's MSSQL database. If interested, please PM me. I'm using it for my own small SE.
Writing the crawler portion of a project like this is probably the easiest part. You will still need to be able to provide results for user searches, and that will be the toughest part. I suggest you look at Nutch, as a previous poster suggested. It is a Java API built upon Lucene. Lucene has been ported to PHP and is part of the newish Zend Framework.
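To make the "providing results" part concrete, here is a toy Python sketch of the basic idea behind serving searches: an inverted index with a simple TF-IDF score. This is not how Nutch or Lucene is implemented, and `TinyIndex` and `tokenize` are made-up names for this example. A real engine adds stemming, phrase queries, link-based ranking, and an index that lives on disk.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term count}
        self.doc_lengths = {}              # doc_id -> number of terms

    def add(self, doc_id, text):
        terms = tokenize(text)
        self.doc_lengths[doc_id] = len(terms)
        for term, count in Counter(terms).items():
            self.postings[term][doc_id] = count

    def search(self, query, limit=10):
        scores = defaultdict(float)
        total_docs = len(self.doc_lengths)
        for term in set(tokenize(query)):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Terms that appear in fewer documents weigh more.
            idf = math.log(1 + total_docs / len(docs))
            for doc_id, count in docs.items():
                tf = count / self.doc_lengths[doc_id]
                scores[doc_id] += tf * idf
        # Highest score first; ties broken by doc id for stable output.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
        return [doc_id for doc_id, _ in ranked[:limit]]


index = TinyIndex()
index.add("a", "web crawler written in python")
index.add("b", "python python search engine")
print(index.search("python"))
```

Even this toy version shows why the search side is harder than the crawl side: the index has to be kept up to date as pages change, and ranking quality is what users actually judge.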
Yup, the crawler is the easy bit. As I mentioned in another thread, I have created my own "google-like" search engine. I have previously checked out most of the complete open source "search systems" and not found any I like. So instead I picked good systems for the different parts and merged them into a working system that is perfect for my demands. Since I work as a programmer, that is part of the fun. A good start is to figure out exactly what the search engine should do; then you can figure out the software demands. "Google-like thing" is a bit too wide IMO.
A very good search engine script, based on sphider.eu, is spider-plus. I am thinking of installing it on an adult domain to index a couple of thousand adult blogs. https://sourceforge.net/project/showfiles.php?group_id=214642 Peter
Don't bother. First, the amount of money needed to crawl will bankrupt you. Second, why would people choose you over Google?