It's quite simple to explain what I want (it's never simple, I know). I want a small tool that grabs certain words and links from specific websites (15-30 similar sites). Those words/links are very easy to locate and grab. The tool needs to be able to:
1. Support proxies, because most of those websites will ban a single IP if it starts opening too many pages.
2. Delete/ignore duplicate words/links.
3. Export the content in .txt format.
4. Keep track of which URLs on each website have already been grabbed, so that it doesn't grab the same words/links again.
That's it for now. I'd like a rough estimate of what this would cost, so I can decide whether to add more functions. If I want to add more functions in the future (after we're done), of course I'll pay extra for those. Thanks
I am already doing this (1-4) with my scraping engine script. You can see how it works at my site http://proxyfreak.org/proxy_site_list: it scrapes pages and inserts the words into a database so that the URLs can display a dynamic page of scraped content. PM me the budget you are willing to pay for such a scrape engine.
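For anyone reading along, here is a rough sketch of how requirements 1-4 could fit together in Python using only the standard library. It is not the scraping engine mentioned above, just an illustration. Everything named here (`Grabber`, `fetch`, `extract_links`, the `grabbed_state.json` file, the proxy addresses) is made up for the example, and the regex only catches absolute http/https links in `href` attributes, so a real site would likely need its own extraction rule.

```python
import itertools
import json
import re
import urllib.request
from pathlib import Path

# Assumption: the wanted items are absolute links inside href attributes.
LINK_RE = re.compile(r'href=["\'](https?://[^"\']+)["\']', re.IGNORECASE)


def extract_links(html):
    """Return every absolute link found in the page, in page order."""
    return LINK_RE.findall(html)


def fetch(url, proxy=None, timeout=15):
    """Download a page, optionally through a proxy (requirement 1)."""
    handlers = []
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


class Grabber:
    def __init__(self, state_file="grabbed_state.json", proxies=None, fetcher=fetch):
        self.state_file = Path(state_file)
        self.fetcher = fetcher
        # Rotate through the proxy list; fall back to a direct connection.
        self._proxies = itertools.cycle(proxies or [None])
        self.done_urls = set()   # pages already grabbed (requirement 4)
        self.items = []          # unique items, in the order first seen
        if self.state_file.exists():
            state = json.loads(self.state_file.read_text(encoding="utf-8"))
            self.done_urls = set(state["done_urls"])
            self.items = state["items"]
        self._seen = set(self.items)

    def grab(self, url):
        """Grab one page; return how many new items it contributed."""
        if url in self.done_urls:
            return 0  # already done, do not fetch again
        html = self.fetcher(url, next(self._proxies))
        added = 0
        for item in extract_links(html):
            if item not in self._seen:  # skip duplicates (requirement 2)
                self._seen.add(item)
                self.items.append(item)
                added += 1
        self.done_urls.add(url)
        return added

    def save(self):
        """Persist progress so a later run can skip finished pages."""
        state = {"done_urls": sorted(self.done_urls), "items": self.items}
        self.state_file.write_text(json.dumps(state), encoding="utf-8")

    def export_txt(self, path):
        """Write one item per line to a .txt file (requirement 3)."""
        Path(path).write_text("\n".join(self.items) + "\n", encoding="utf-8")
```

Usage would be something like `g = Grabber(proxies=["http://10.0.0.1:8080"])`, then `g.grab(url)` for each page, then `g.save()` and `g.export_txt("links.txt")`. The state is kept in a JSON file only to keep the sketch short; with 15-30 sites a small database is probably the better choice.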