Please tell me the difference between web scraping using the Google API and regular scraping. Which one is more efficient?
Although this post is old, here is an answer for those who don't know the difference:
Scraping:
- get many results
- parse manually with some script
- get blocked by Google in about 10 days or sooner, which is pretty bad if you need it for a website
Google API:
- get up to 64 results, 8 per page
- up to 1000 requests per day per IP
- structured response, easier to code
Generally, if you're doing something useful and white hat, the API is the way to go.
nosf009: Thanks for the reply! Have you used the Google API lately? I mean, are you sure you can get 64 results?
I'm using the API on a few of my websites, assuming we're talking about the web search API, which is what I suppose this thread was about. You can get up to 64 results, yes, 8 per page.
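To make the "64 results, 8 per page" limit concrete, here is a minimal Python sketch of the paging arithmetic. It makes no network calls; the constants are just the numbers quoted in this thread, and `page_offsets` is a hypothetical helper name, not part of any Google library.

```python
# Sketch only: the limits below are the ones quoted in this thread,
# not values checked against current Google documentation.
RESULTS_PER_PAGE = 8   # results returned per request
MAX_RESULTS = 64       # most results available for a single query


def page_offsets(wanted, per_page=RESULTS_PER_PAGE, cap=MAX_RESULTS):
    """Return the start offset of every page needed to cover `wanted` results."""
    wanted = max(0, min(wanted, cap))  # never ask past the cap
    return list(range(0, wanted, per_page))


print(page_offsets(64))   # [0, 8, 16, 24, 32, 40, 48, 56]
print(page_offsets(20))   # [0, 8, 16]
print(len(page_offsets(500)))  # 8, because the request is capped at 64 results
```

So a full 64-result query costs 8 requests out of the daily allowance, one per page.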
Well, in my view, if you really want to scrape the web, avoid the API. If you want to scrape, say, 10k URLs within 24 hours, you can't do it with the API, and if you scrape regularly, Google might ban your IP. So the solution is a proxy! Use a proxy and start scraping. Note: too much of anything is not good, so stay within the limit!
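A rough back-of-the-envelope check of the 10k-URLs-in-24-hours claim, as a Python sketch. It uses the limits quoted earlier in this thread (8 results per request, 64 results per query, 1000 requests per day per IP) and assumes the best case where every request returns a full page of unique URLs. The function names are made up for illustration.

```python
import math

# Limits as quoted in this thread; treat them as assumptions, not verified facts.
RESULTS_PER_REQUEST = 8
MAX_RESULTS_PER_QUERY = 64
REQUESTS_PER_DAY = 1000
SECONDS_PER_DAY = 24 * 60 * 60


def requests_needed(urls_wanted):
    """Best-case number of requests: every request yields a full page."""
    return math.ceil(urls_wanted / RESULTS_PER_REQUEST)


def queries_needed(urls_wanted):
    """Minimum number of distinct search queries, given the per-query cap."""
    return math.ceil(urls_wanted / MAX_RESULTS_PER_QUERY)


def fits_in_daily_quota(urls_wanted):
    """True if the best-case request count stays within one day's allowance."""
    return requests_needed(urls_wanted) <= REQUESTS_PER_DAY


def seconds_between_requests(requests_per_day=REQUESTS_PER_DAY):
    """Spacing that spreads the daily allowance evenly over 24 hours."""
    return SECONDS_PER_DAY / requests_per_day


print(requests_needed(10_000))       # 1250
print(queries_needed(10_000))        # 157
print(fits_in_daily_quota(10_000))   # False
print(REQUESTS_PER_DAY * RESULTS_PER_REQUEST)  # 8000 URLs per day at best
print(seconds_between_requests())    # 86.4
```

Under these numbers one IP tops out at 8000 URLs a day, so 10k does not fit, and spreading the allowance evenly means roughly one request every 86 seconds.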
nosf009: I've looked at Google's API pages and only found the AJAX search API, which is not really good for scraping: http://code.google.com/apis/ajaxsearch/web.html Could you give me more details about the API you're using? Or, even better, could you send me some example code, or even your own code if you don't mind? Thank you very much!
tony.blue, only the AJAX API is available now. Google's proper Search API was discontinued several years ago. I guess they didn't want any competition. nokimchen is right: proxy and scrape, though Google is pretty sharp at spotting proxy IP addresses.
I don't understand it. nosf009 says scraping with the Google API is white hat, but then why do we have to use proxies?
Use proxy IPs if you go the regular scraping route, but I definitely suggest going the cool way and using their API. By making a popular app, you might even get more leeway on how much API usage you are allowed.