I am looking to build a news/content harvester. It would check a list of sites for updates at a fixed interval (e.g. every 10 minutes) and save anything new, such as a new headline, in a database on my server. What would you develop a news harvester in? Which programming language is most suitable: Java? PHP? Ruby on Rails? It would be great to hear people's views on this. Thanks.
Are you looking to scan the sites' pages and detect changes, or to harvest RSS feeds? How you do it depends on which of those you're doing. RSS is XML, so parsing it is trivial in PHP.
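To give a feel for the RSS route, here is a minimal sketch in Python rather than PHP, using only the standard library. It assumes an RSS 2.0 feed with `<item>` elements containing `<title>` and `<link>`. The sample feed, the `headlines` table, and the function names `parse_items` and `save_new_items` are all made up for illustration, and the link is assumed to be a good enough unique key for spotting new entries.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real harvester would download this text instead.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return a list of (link, title) tuples for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip items without a link, since the link is our key
            items.append((link, title))
    return items


def save_new_items(conn, items):
    """Insert unseen items into the database and return only the new ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headlines (link TEXT PRIMARY KEY, title TEXT)"
    )
    new_items = []
    for link, title in items:
        cur = conn.execute(
            "INSERT OR IGNORE INTO headlines (link, title) VALUES (?, ?)",
            (link, title),
        )
        if cur.rowcount == 1:  # 0 means the link was already stored
            new_items.append((link, title))
    conn.commit()
    return new_items


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path for a persistent store
    for link, title in save_new_items(conn, parse_items(SAMPLE_RSS)):
        print(title, link)
```

In a real setup you would fetch each feed's text first (for example with `urllib.request.urlopen`) and run the script on a schedule, such as a cron job every 10 minutes. Running it again on the same feed stores nothing new, because already-seen links are ignored.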