Hello. Please, if you can help me, give me some clue... There are 2 sites: 1- The first has no RSS or XML files at all. 2- The second has an RSS file, but there is no text tag inside the RSS, only a short description. I need a tool to grab content from both sites (description, text, media, date, etc.). I want to set up a cron job that updates my tables from those sites every day without my help. Please help me...
I know what you mean. You want to steal their content and then post it onto a Blogger/WordPress account with your AdSense on it? A ready-made tool for that isn't out there. But if you get a coder to code it for you, there will be one. I think it should cost you around 200-500 USD to get a custom-coded 'crawler' script.
There is no one-size-fits-all solution for site scraping. I've coded a dozen (give or take), and every site requires different finesse to get it right. And if they change layouts, you have to recode it, sometimes from scratch (worst case). Luckily, with CSS it's getting easier (love those DIV tags). -Jason
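To illustrate the point about DIV tags, here is a minimal sketch of pulling the text out of one named DIV using only the Python standard library. The class names, the `SITE_RULES` table, and the `site-a.example` / `site-b.example` hosts are made up for the example; each real site would need its own rule, and the rule breaks when the site changes its layout.

```python
from html.parser import HTMLParser


class DivTextParser(HTMLParser):
    """Collect the text inside the first <div> whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # 0 = outside the target div, >0 = nesting level inside it
        self.done = False   # set once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            # A nested div inside the target: track it so we find the right closing tag.
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_div_text(html, target_class):
    """Return the whitespace-normalized text of the target div ('' if not found)."""
    parser = DivTextParser(target_class)
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical per-site rules: one content class per site.
SITE_RULES = {"site-a.example": "article-body", "site-b.example": "post"}

sample = (
    '<div class="header">Menu</div>'
    '<div class="article-body main"><p>First paragraph.</p>'
    '<div class="inner">Nested text.</div></div>'
    '<div class="footer">Footer</div>'
)
text = extract_div_text(sample, SITE_RULES["site-a.example"])
print(text)
```

Keeping the per-site rule in a small table like this means a layout change usually costs one edited line rather than a rewrite, although that is not guaranteed.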
If you can get away with just meta tags, it would be 100 times easier to program. So many people have malformed HTML holding their site up, which makes regex and substring routines grab tags you don't want (yay!)
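As a rough sketch of the meta tag route, the standard library's `html.parser` can collect meta tags without any regex, and it does not choke on unclosed tags further down the page. The sample HTML and the `extract_meta` helper are invented for the example; which meta tags a real site actually provides is something you would have to check.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name=... content=...> and <meta property=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Plain meta tags use "name"; Open Graph style tags use "property".
        key = attr_map.get("name") or attr_map.get("property")
        content = attr_map.get("content")
        if key and content is not None:
            self.meta[key.lower()] = content


def extract_meta(html):
    """Return a dict of meta tag keys (lowercased) to their content values."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# Made-up page with deliberately sloppy markup in the body.
sample = """<html><head>
<meta name="description" content="Short summary of the article">
<meta property="og:image" content="http://example.com/pic.jpg">
</head><body><p>Unclosed paragraph<div>stray</body></html>"""

meta = extract_meta(sample)
print(meta)
```

This only gets you what the site chose to put in its meta tags, so it is usually a description and maybe an image, not the full article text.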
warzone - Yes, something like that. But I am going to publish with an active link to those sites. But what about automated blogger websites? How do they build automated blogs? DopeDomains - Can you help me with the code? There are 2 websites for now. I am tired of copy-pasting content from those sites. I want to automate that process. Can I PM you?