It's basically a tool that strips an article out of a given site. It removes links, banners, ads, popups, the need to click through multiple pages to read one story, etc. The idea is to let you read an article or news story without having to deal with all the bullshit that comes with it. I'm sure you have all seen URL shortener scripts: they take a long URL and make it short. I'm going for something similar here, only with news stories. A user puts the URL of a news story in the box and hits the submit button. The script then needs to fetch the story, pull out the text and any related article images, and re-post it on my domain under a new URL like "mydomain.com/story12345" or whatever. This may not make much sense, so if you have any questions please ask. Thanks a lot!
Nice idea, but you will end up with a lot of text that makes no sense (menu, footer, header, ...). Same for images: if you grab all the images from a specific site and just post them ...
RSS does what you describe, but in a slightly different way: it pulls the details and keeps feeding them into your site. I believe it's almost the same as what you're asking for.
Try this, or something along these lines: $f = fopen("http://www.news.com", "r"); $contents = stream_get_contents($f); fclose($f); echo strip_tags($contents); This will strip all the tags from the content; however, you will still get the JavaScript, because strip_tags() removes the <script> tags but not the code between them. All you then need to do is run substr() on $contents (right after downloading it from the remote source) and keep only the part between <body> and </body>.
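To make the idea above a bit more concrete, here is a minimal sketch of the "keep the body, drop the scripts, strip the tags" step. The function name extract_body_text() is made up for illustration, and it uses regular expressions instead of substr() to find the body, which is a swapped-in choice rather than what the post describes. It works on an HTML string, so the download step stays separate.

```php
<?php
// Hypothetical helper (not from the thread): reduce a page to the plain text of its <body>.
function extract_body_text(string $html): string
{
    // Keep only what sits between <body ...> and </body>, if both are present.
    if (preg_match('~<body\b[^>]*>(.*?)</body>~is', $html, $m)) {
        $html = $m[1];
    }
    // Remove <script> and <style> elements together with their contents,
    // because strip_tags() alone leaves the code inside them behind.
    $html = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', '', $html) ?? $html;
    // Drop the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('~\s+~', ' ', $text) ?? $text);
}

// Usage (assumes allow_url_fopen is enabled on the server):
// echo extract_body_text(file_get_contents('http://www.news.com'));
```

This still returns menus, footers and link text along with the story, so it is only a first pass.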
This is quite possible, although it will need specific processing code for each news site. Simply stripping the tags won't do a whole lot, as you are still going to be left with a bunch of links, navigation bars, etc. I've sent you a PM outlining what could be done. RSS feeds would also work to some extent, but they usually only show the first paragraph or so of a story.
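One way the per-site processing could look, as a rough sketch only: keep a table that maps each host name to an XPath expression for the element holding the story, then read the paragraphs out of that element with DOMDocument and DOMXPath. The SITE_RULES table, the host www.example-news.com, the "article-body" id and the function name extract_article() are all invented for the example; every real site would need its own rule. Note that loadHTML() guesses the encoding when the page does not declare one, so non-ASCII pages may need extra care.

```php
<?php
// Hypothetical per-site rules: host name => XPath for the article container.
const SITE_RULES = [
    'www.example-news.com' => '//div[@id="article-body"]',
];

// Returns the article paragraphs joined by blank lines, or null if nothing matched.
function extract_article(string $html, string $host, array $rules = SITE_RULES): ?string
{
    if ($html === '' || !isset($rules[$host])) {
        return null; // empty page, or no rule written for this site yet
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($rules[$host]);
    if ($nodes === false || $nodes->length === 0) {
        return null; // the rule did not match this page
    }
    // Collect only the <p> elements inside the container; links, menus
    // and anything else outside of paragraphs are ignored.
    $paragraphs = [];
    foreach ($xpath->query('.//p', $nodes->item(0)) as $p) {
        $text = trim($p->textContent);
        if ($text !== '') {
            $paragraphs[] = $text;
        }
    }
    return implode("\n\n", $paragraphs);
}
```

Supporting a new site then means adding one line to the rules table instead of writing new code.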
Not bad actually, considering you did that without even being hired. I mean, it doesn't work perfectly, but it's certainly on the right track.
Yeah, it's only about 4 lines of code. There is still some stuff to remove, and the image issue to fix (links in the source usually look like "../images/img-url" and should be "domain/images/img-url"). The last part is easy: just insert the result into the db and add a mod_rewrite rule to get domain/article/id.
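For the image issue, a sketch of how a relative src could be turned into a full URL, assuming you know the URL of the page it came from. absolutize_url() is a made-up name, and this only covers the common cases (absolute, protocol-relative, root-relative, and "../" style paths), not every corner of URL resolution.

```php
<?php
// Hypothetical helper: turn an image src found in a page into an absolute URL.
function absolutize_url(string $src, string $pageUrl): string
{
    // Already absolute (http:, https:, data:, ...): nothing to do.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $src)) {
        return $src;
    }
    $base = parse_url($pageUrl);
    $scheme = $base['scheme'] ?? 'http';
    $root = $scheme . '://' . ($base['host'] ?? '');
    if (isset($base['port'])) {
        $root .= ':' . $base['port'];
    }
    // Protocol-relative: //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }
    // Root-relative paths start at the site root,
    // everything else starts at the directory of the page.
    if (strpos($src, '/') === 0) {
        $path = $src;
    } else {
        $dir = preg_replace('~/[^/]*$~', '/', $base['path'] ?? '/');
        $path = $dir . $src;
    }
    // Resolve "." and ".." segments.
    $out = [];
    foreach (explode('/', $path) as $seg) {
        if ($seg === '..') {
            array_pop($out);
        } elseif ($seg !== '.' && $seg !== '') {
            $out[] = $seg;
        }
    }
    return $root . '/' . implode('/', $out);
}
```

Running every img src through something like this before saving the story should keep the images from breaking once the article is served from your own domain.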