There are numerous sources of free data on the web. The problem is that what you don't pay for, you have no control over. In the past few days alone:

Amazon USA, for no perceptible reason, changed the format of the web pages it uses to give "reader reviews" of products. That required substantial rewriting of the page of the "Freebie" package that presents such reviews for the books it lists (meaning that if you're using that package, you need to be at v. 1.01 or you will see few or no reader reviews). Many Associates were very unhappy.

The International Monetary Fund, with no notice on its web site, changed the format of its downloadable currency-exchange-rates data file (in fact, its pages still refer to the old format), so that anyone using the "Rates" package now needs to be at v. 0.41 or will be presenting seriously stale rate data. (On the other hand, the benefit is that the rates are now updated in real time, rather than once daily.)

The weather XML service at addresses.com showed that it can tank for hours at a time. Maybe I should have foreseen that, but in any event I had to add a graceful exit for when there's no data at all (that's v. 0.32 of "Weather").

I know, I know: at the price, how can one complain? Well, with Amazon, at least, we are supposed to be "business partners"--but Amazon's answer would be "we don't support page scraping" (to which the Associates' reply has been, "Well, then give us that in XML, as you long ago promised you would give us anything that can be had over the web pages"). And so it goes . . . .
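For anyone wondering what a "graceful exit" for a dead feed might look like, here is a minimal Python sketch. It is not the code from the "Weather" package; the `<temperature>` element, the `render_weather` and `parse_temperature` names, and the `fetch` callable are all hypothetical, chosen only to show the idea of falling back to a placeholder message when the service is down or returns nothing usable.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# Shown to the visitor whenever no usable data is available.
FALLBACK_TEXT = "Weather data is temporarily unavailable."


def parse_temperature(xml_text: str) -> Optional[str]:
    """Return the text of the first <temperature> element, or None.

    The element name is an assumption for illustration; a real feed
    will have its own layout.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Empty or malformed response: treat it as "no data".
        return None
    node = root.find(".//temperature")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


def render_weather(fetch: Callable[[], str]) -> str:
    """Return a display string, degrading gracefully on any failure.

    `fetch` is any zero-argument callable that returns the raw XML,
    for example a thin wrapper around urllib.request.urlopen.
    """
    try:
        xml_text = fetch()
    except OSError:
        # urllib's URLError and socket timeouts are OSError subclasses.
        return FALLBACK_TEXT
    temperature = parse_temperature(xml_text)
    if temperature is None:
        return FALLBACK_TEXT
    return f"Current temperature: {temperature}"
```

Passing the fetcher in as a callable keeps the fallback logic testable without touching the network, and the same check on the parsed result would also catch a silent format change, since a missing element is treated the same way as a dead service.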
I have to agree. Despite all the things that Amazon offers through its API, I still find there are queries I want to run that I can't run through the API. So I've created screenscraping mechanisms as well. Perhaps someone should develop an Amazon API addendum that performs these queries through screenscraping for as long as Amazon is not going to provide them.
Just so. Just for starters, the only way you can deal with Amazon Canada and Amazon France is by way of screen-scraping, as they are not yet on AWS (and, the way it looks, won't be for a while, as Amazon has too many fires to keep putting out to spare time for much else). Many months ago now (maybe over a year), I posted a list on Amazon's forum of ten really important data items that could not be obtained from AWS, with the list derived from simple consideration of the things that a potential buyer would likely want to know about a product before actually buying. To my pleasant surprise, a great number of other Associates heartily endorsed the list. To this hour, not one of those ten is available via AWS.
I have put (at least) 20 times more effort into the Amazon Associates program than I have into Google AdSense. I have received 20 times more cash from Google than I have from Amazon. Is the Amazon Associates program really one where we should be investing our time?