My project scenario is this: I am downloading the contents of a web page, so my output will be the source code of the specified page. My next step is to retrieve only the text from that source, so I thought of using strip_tags(), which removes all the tags and leaves only the text. But I have a doubt here. To strip the tags, we normally put the HTML in a variable and then pass that variable to the function. I want to pass the output of the download straight to strip_tags() and get only the extracted text as the output in my window. How should I proceed? Also, I have Java code to download the source of a web page, but I need PHP code to do the same. Can you please help me? Thanks.
You can take a look at http://gr2.php.net/fopen (about the fopen command) and http://www.php.net/manual/en/features.remote-files.php. Take care.
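To give an idea of what the fopen route looks like, here is a rough sketch. The function name read_stream is made up for this example, and opening an http:// URL this way assumes allow_url_fopen is enabled in php.ini.

```php
<?php
// Hypothetical helper: read everything from a URL (or local path) through a stream handle.
function read_stream(string $url): string
{
    // '@' suppresses the warning so the failure can be reported as an exception instead.
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not open $url");
    }

    $contents = '';
    // Remote streams arrive in packets, so keep reading until end of stream.
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break;
        }
        $contents .= $chunk;
    }
    fclose($handle);

    return $contents;
}

// Usage (placeholder URL):
// echo strip_tags(read_stream('http://www.example.com/'));
```

The return value is an ordinary string, so it can go straight into strip_tags() as in the commented usage line.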
Alternatively, to directly get the contents in one go, look at http://gr2.php.net/manual/en/function.file-get-contents.php
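As a minimal sketch of the "one go" approach: file_get_contents() returns the page as a string, and that string can be handed directly to strip_tags(). The helper names html_to_text and page_text and the URL are placeholders for this example, and fetching a URL this way assumes allow_url_fopen is enabled. Since strip_tags() leaves the contents of script and style blocks in place, the sketch removes those first.

```php
<?php
// Hypothetical helper: reduce an HTML string to its visible text.
function html_to_text(string $html): string
{
    // strip_tags() would keep the code inside <script> and <style>, so drop those blocks first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html) ?? $html;
    // Remove the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

// Hypothetical helper: download a page and return only its text.
function page_text(string $url): string
{
    $html = @file_get_contents($url);
    if ($html === false) {
        throw new RuntimeException("Could not download $url");
    }
    return html_to_text($html);
}

// Usage (placeholder URL):
// echo page_text('http://www.example.com/');
```

If no cleanup is needed, the whole thing shrinks to echo strip_tags(file_get_contents($url)); with no intermediate variable.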
I saw this great script to download pages and get the text: http://ubuntuforums.org/showpost.php?p=4782850&postcount=880 It is in Perl, but all you really need is the "wget" and "lynx --dump" combo ;-) See PHP's system command.
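A rough sketch of calling lynx from PHP, assuming lynx is installed on the server and shell_exec() is not disabled in php.ini. The function names are made up for this example, and -nolist (which leaves out the numbered link list at the end of the dump) is an optional extra.

```php
<?php
// Hypothetical helper: build the lynx command line for a URL.
function lynx_dump_command(string $url): string
{
    // escapeshellarg() quotes the URL so it cannot inject extra shell commands.
    return 'lynx -dump -nolist ' . escapeshellarg($url);
}

// Hypothetical helper: let lynx download the page and render it as plain text.
function page_text_via_lynx(string $url): string
{
    $output = shell_exec(lynx_dump_command($url));
    // shell_exec() does not return a string when the command fails or prints nothing.
    if (!is_string($output)) {
        throw new RuntimeException("lynx produced no output for $url");
    }
    return $output;
}

// Usage (placeholder URL):
// echo page_text_via_lynx('http://www.example.com/');
```

Here lynx does both the download and the text extraction, so wget and strip_tags() are not needed for this route.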