I have created a new crawler that crawls selected websites and tries to list the best news, ranked by my own algorithm over a few hundred factors. Check it here: Latest News. Currently I am indexing the sorted data using WordPress, and the site is powered by dozens of servers (real-time data is too much load to handle otherwise).
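
For anyone curious what ranking news on many factors might look like, here is a minimal sketch. It is not the actual algorithm behind the site. It assumes each crawled article already carries numeric factor values between 0 and 1, and it combines them with a plain weighted sum. The factor names (`freshness`, `source_reputation`, `share_count`), the weights, and the `Article` class are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical factor weights; a real system would have a few hundred of these.
WEIGHTS: Dict[str, float] = {
    "freshness": 0.5,
    "source_reputation": 0.3,
    "share_count": 0.2,
}


@dataclass
class Article:
    """A crawled article with normalized factor values (assumed to be in 0..1)."""
    title: str
    factors: Dict[str, float] = field(default_factory=dict)


def score(article: Article, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the article's factors; missing factors count as 0."""
    return sum(w * article.factors.get(name, 0.0) for name, w in weights.items())


def rank(articles: List[Article], weights: Dict[str, float] = WEIGHTS) -> List[Article]:
    """Return the articles sorted from highest to lowest score."""
    return sorted(articles, key=lambda a: score(a, weights), reverse=True)


if __name__ == "__main__":
    a = Article("A", {"freshness": 0.9, "source_reputation": 0.5, "share_count": 0.1})
    b = Article("B", {"freshness": 0.2, "source_reputation": 0.9, "share_count": 0.3})
    for art in rank([b, a]):
        print(art.title, round(score(art), 2))
```

A weighted sum is just the simplest choice to show the idea; the sorted list is what would then get pushed into WordPress for indexing.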