I am currently developing a mailing list, and doing it by hand is very time consuming. I was wondering if there is any script available, preferably in PHP, that will search the web pages of one site in particular for words that contain the @ symbol and then write them to a text file. It's a pretty simple design and would save me a ton of time. Thanks in advance.
If you tweak the following script, it will write the results to a text file instead of the screen and be able to run from a command line. I found it at: http://www.zubrag.com/scripts/email-extractor.php To do something that searches an entire site, you would also need to extract the URLs of the site's pages and search each of them. That takes more work, and it requires some thought to avoid requesting the same page more than once.
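On the "same page more than once" part, here is a rough sketch of the bookkeeping only, in Python rather than PHP. It is not taken from the linked script, and the names `normalize`, `crawl_site`, and `get_links` are made up for illustration. It assumes you supply `get_links` yourself, a function that takes a URL and returns the href strings found on that page; the sketch does no fetching or parsing of its own. The idea is to turn every link into an absolute URL, drop the #fragment, and keep a set of URLs already queued so each page is visited once.

```python
from collections import deque
from urllib.parse import urljoin, urldefrag, urlparse


def normalize(base_url, href):
    """Resolve href against base_url and drop any #fragment."""
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    return absolute


def crawl_site(start_url, get_links):
    """Visit each page on start_url's host once; return URLs in visit order.

    get_links is a hypothetical callable (an assumption of this sketch):
    it takes a URL and returns a list of href strings found on that page.
    """
    host = urlparse(start_url).netloc
    start = normalize(start_url, "")
    seen = {start}           # every URL that has ever been queued
    queue = deque([start])   # URLs waiting to be visited (breadth-first)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            link = normalize(url, href)
            # Stay on the same host and skip anything already queued.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Two links that differ only by a fragment, or that are written once as a relative path and once as a full URL, end up as the same entry in `seen`, so the page is requested only once. A real crawler would need more than this (trailing slashes, query strings, redirects, robots.txt), which is the "some thought" part.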
Thanks for the feedback, actually started using MailCrawl and it is amazing to say the least. Completely automated, having it running at 5000 emails/hour and it gets rid of the dupes/invalids automatically.
So, what kind of project are you working on? I cannot imagine why you would want to harvest 5000 emails an hour from other people's websites unless you are trying to get into the bulk email business.
Yeah, this sounds a lot like unsolicited mail, which puts you in some potential trouble. Harvesting emails from a site isn't exactly double opt-in.
Well, I was just looking at building a list of potential buyers of a particular product or products. I am able to target specific websites that will generate a list of 'like-minded' individuals, and I would send them an email informing them of the new product. These individuals need or want the product I am selling, so they would mostly be interested. I wouldn't be phishing or sending emails for any dubious purpose, and it would be a 'useful unsolicited email'. I have sent several emails already and had nothing but positive feedback, so I don't know if there is anything wrong with these marketing tactics.
There are also unintended consequences from this marketing tactic. Your email will most likely end up being targeted by anti-spam programs, and your domain may end up in some of the more important anti-spam URL blocking lists, resulting in long-term difficulty sending any email messages that include that domain name in the body and/or header of the message. I receive over 40,000 emails per day from people, some of whom have the same attitude towards generating spam as you have. They all meet the same fate.