I've been thinking of writing a little ebook about screen scraping using PHP. Just wondering if there would be any interest in this topic.
I believe there would be interest in screen scraping, but perhaps not so much with PHP. I've played about with it before, and the problem I had was that a lot of hosts restrict certain things, such as the DOM extension or memory usage. If you have a way of doing this that avoids large memory usage, I'd be interested in learning!
I've never had any problems with screen scraping using PHP. I am on a shared server and I have scraped sites with more than 1,000,000 pages. I never use DOM; I always use preg_match_all and explode. I might give private tuition if the interest is there.
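For anyone following along, here is a minimal sketch of the preg_match_all plus explode approach mentioned above. The markup, the "item" class, and the "|" delimiter are made up for illustration; a real scraper would fetch the page first and would need a pattern fitted to the actual site.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<ul><li class="item">Widget|9.99</li><li class="item">Gadget|19.50</li></ul>';

// Capture the text inside every <li class="item"> element.
// The lazy (.*?) stops at the first closing </li>; the s flag lets "." span newlines.
preg_match_all('/<li class="item">(.*?)<\/li>/s', $html, $matches);

$rows = array();
foreach ($matches[1] as $cell) {
    // explode() splits the captured text on the "|" delimiter into two parts.
    list($name, $price) = explode('|', $cell, 2);
    $rows[] = array('name' => trim($name), 'price' => (float) $price);
}

print_r($rows);
```

This keeps memory use low because nothing is built beyond the match arrays, but the pattern is tied to the exact markup and will likely break if the site changes its HTML.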
If you are relying heavily on regex for screen scraping, you are using the wrong tool for the job and could probably benefit from reading someone else's tutorial on screen scraping. Use DOM to parse the data, and only use regex to clean up or match specifically...
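As a rough illustration of that split (DOM for structure, regex only for cleanup), something like the sketch below could work. It assumes the DOM extension is enabled on the host, and the sample markup and XPath expression are invented for the example.

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<html><body><ul>'
      . '<li class="item"><a href="/a">  Widget   (new) </a></li>'
      . '<li class="item"><a href="/b">Gadget</a></li>'
      . '</ul></body></html>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is often malformed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Let the parser find the elements; no regex is involved in locating them.
$xpath = new DOMXPath($doc);
$links = array();
foreach ($xpath->query('//li[@class="item"]/a') as $a) {
    // Regex is used only to tidy the extracted text: collapse whitespace runs.
    $text = trim(preg_replace('/\s+/', ' ', $a->textContent));
    $links[] = array('href' => $a->getAttribute('href'), 'text' => $text);
}

print_r($links);
```

The XPath query keeps working if attributes are reordered or whitespace changes inside the tags, which is where a regex-only approach tends to fail.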