Hello, what is actually the best commercial or non-commercial web crawler/spider to start a Google-like site with? Thanks for your help.
The very first thing I think you should invest in is some good database servers with redundancy, automated caching, and a backup plan. Once you have all of that covered, I would look into an open source engine to start out with. You can find several at "the search engine list dot com". I personally like Sphinx, but you can do lots of ***** stuff with YaCy on board.