I'm in the process of building a community directory. Are there any open source search engines that will help weed out the bad sites, the ones that use link farms, cloaking, or other unethical ways of getting better rankings on the normal search engines?
IMHO, if you are building a directory, the whole point of it is human judgement: you get to place sites that may not rank well in the search engines. You shouldn't let a third party control your editorial decisions. Plus, if you have lots of links, you are bound to link to some sites that violate search engine guidelines. So long as your ratio of such links is in line with that of other good sites on the web, I don't think it hurts you to have a few links to sites that the search engines consider spam.
Take a look at rollyo.com. It allows you to create a custom search pool of your favourite sites. In my case I could specify some busy forums (e.g. DP), and users could search for info knowing the answers come from credible sources. Blog: Rollyo delivers a new web search
I also use Rollyo on my blog. Why build a search engine when you can have Rollyo index the pages you want? Rollyo can also search the entire web and is Yahoo-powered.
I just checked out Rollyo and it's quite nice. I would prefer to use Google, but their horrific-looking search box won't look so hot. Is there a way to monetize search with Rollyo?
Hrm, well, without trying to promote our engine: there are some engines that offer an XML feed of their results. As for surch, we are coming out with a version that lets you show our results on your page as if it were your own engine, with your logos etc., and you can also run our PPC and earn from it. Also, it's impossible to manipulate our results, as they combine the results of 15 engines and are then sorted by relevance. But again, there are others that do this (the XML feed stuff); however, most of the "good" ones will require that you have seriously substantial traffic before they will consider you. You can PM me and I can explain more.
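For anyone wondering how results from several engines can be combined into one list, here is a minimal sketch. It uses reciprocal rank fusion, a generic technique, and is not a description of how surch actually ranks. The engine lists, the example URLs, and the constant `k=60` are all assumptions for illustration.

```python
from collections import defaultdict

def fuse_rankings(rankings, k=60):
    """Merge several ranked URL lists with reciprocal rank fusion (RRF).

    Each engine contributes 1 / (k + position) for every URL it returns,
    so a URL that ranks well in many engines ends up near the top.
    """
    scores = defaultdict(float)
    for ranked_urls in rankings:
        for position, url in enumerate(ranked_urls, start=1):
            scores[url] += 1.0 / (k + position)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical result lists from three engines for the same query.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "a.example", "d.example"]
engine_c = ["b.example", "c.example", "a.example"]

merged = fuse_rankings([engine_a, engine_b, engine_c])
print(merged)
```

A site has to do well across several engines to rank high in the merged list, which is presumably why combined results are harder to game than a single engine's.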
Surchin, is it possible to arrange things so that my web site first lists results from my own directory and then results from your search engine? Also, have you developed any script that enables geographic/local search? I like the look of your search engine, especially the big thumbnails beside the search results. Good job!
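Listing your own directory hits first and an external engine's hits after them can be done on your side, whatever feed you end up using. A rough sketch, assuming you already have both result sets as plain lists of URLs (how you fetch the engine results is not shown, and the function names here are made up for illustration):

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def blend(directory_hits, engine_hits, limit=10):
    """Return (source, url) pairs: directory hits first, then engine hits.

    Engine hits that duplicate a directory listing are skipped.
    """
    seen = set()
    blended = []
    for source, hits in (("directory", directory_hits), ("engine", engine_hits)):
        for url in hits:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                blended.append((source, url))
    return blended[:limit]

# Hypothetical data: one directory listing, two external results.
page = blend(
    ["http://www.example.org/"],
    ["example.org", "http://other.example/page"],
)
print(page)
```

Keeping the source label on each result lets the page template style directory listings differently from the external ones.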
This interests me, but does it return "all of the web" or just my sites? I'm looking to generate results from my site, store, forums and wiki. With subdomains and different platforms, my only options are Google Site Search or Rollyo. Site Search is working quite well, but I want something that looks better and less generic.
If you are trying to build your own search engine, i.e. create the index internally, a popular open source search library is Apache Lucene. It offers a bunch of nice features, such as stemming and field boosts.
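To give a feel for what stemming and field boosts mean, here is a toy index in plain Python. This is not Lucene's API, just an illustration of the two ideas: the suffix-stripping `stem` function is deliberately crude, and the boost values and sample documents are made up.

```python
import re
from collections import defaultdict

def stem(word):
    """Very crude suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split on non-alphanumerics, and stem each word."""
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

# Assumed weights: a match in the title counts three times a body match.
FIELD_BOOSTS = {"title": 3.0, "body": 1.0}

def build_index(docs):
    """Map each stemmed term to {doc_id: accumulated boosted weight}."""
    index = defaultdict(lambda: defaultdict(float))
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for term in tokenize(text):
                index[term][doc_id] += FIELD_BOOSTS.get(field, 1.0)
    return index

def search(index, query):
    """Sum the boosted weights of every query term; best score first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    return sorted(scores, key=lambda d: (-scores[d], d))

docs = {
    "forum": {"title": "Hosting questions", "body": "Threads about search engines"},
    "wiki": {"title": "Search engine basics", "body": "How crawling and indexing work"},
}
index = build_index(docs)
results = search(index, "searching engines")
print(results)
```

Because of stemming, the query "searching engines" matches both "search engine" and "search engines"; because of the title boost, the wiki page, which has the terms in its title, comes out ahead of the forum page.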