Can you be more specific? Any HTML output by PHP can be viewed in nearly any browser with 'view source', or fetched with cURL in PHP. Unless you meant something else...
I can get the HTML source with "file_get_contents" in my PHP script, but if the page has JavaScript that hides some information, like email, phone number, location... then it outputs something like   and not the real thing. And if I turn off JavaScript, I don't get anything. I googled around and found a Mozilla add-on to "view generated source" of a web page, and it works: it gives the real address. Also javascript:'<xmp>'%20+%20window.document.body.outerHTML+%20'</xmp>' works in IE, but I would like to do this in my PHP script. Thanks
As far as I know, this is not possible. JavaScript runs on the client, not on the server. What you could do is reverse engineer the JavaScript code which encodes these hashes and decode the hashes with a PHP script. But this can be a tricky job. I'm not trying to poke my nose into other people's business, but information like email/phone # is encoded for a reason (SPAM!). Remember that in some cases, scraping is illegal.
Yes, but I have noticed that I can see this with Chrome Inspector, and I would like to do the same thing with PHP.
I guess Chrome first runs the Javascript before it displays in the inspector. Just check the URL with something like cURL or the Lynx browser, which display websites in - real - plain text.
To do that you need to fetch the URL via cURL through the listener URL. If fopen on remote URLs is allowed (allow_url_fopen), that would also work.
Just as in Safari and Firefox, JavaScript gets run when the document is ready. PHP, whether with cURL or file_get_contents (which some servers don't allow for remote URLs), cannot execute JavaScript, so it only sees the initial markup sent by the server. This is of course why it's important for SEO, especially if you use a lot of Ajax.
Yes, I'm trying with cURL and file_get_contents, but I still get changed text. Here is an example: http://yellow.local.ch/en/d/Stabio/...psFf8HGW7eDCw?what=luigi&where=Ticino+(Canton) As a human I can see the email, but with normal "view page source" it becomes ; . With Firefox add-ons or Chrome Inspector I can see <a href=mailto:xxx@xxxxxx.com>xxxxx</a>. So any suggestions on how I can do something like this with PHP, or how I can connect my PHP script to Chrome Inspector and then scrape the real content from there? Thanks
So basically you want to view the HTML code of a web page? If so, you can use either file_get_contents() or cURL to fetch the page, then use htmlentities() to escape it so the browser displays it as HTML source.
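A rough sketch of that idea, in case it helps. The helper name showAsSource() is made up for illustration, and a literal string stands in for the fetched page so the snippet runs without a network connection; in practice you would pass in the result of file_get_contents() or cURL. Keep in mind this only shows what the server sent, not what JavaScript builds afterwards.

```php
<?php
// Hypothetical helper (not from the thread): escape fetched markup so a
// browser shows it as source text instead of rendering it.
function showAsSource(string $html): string
{
    // htmlentities() turns <, >, & and quotes into entities.
    return '<pre>' . htmlentities($html, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Stand-in for: $html = file_get_contents($url);
$html = '<a href="mailto:">mail me</a>';

echo showAsSource($html), PHP_EOL;
```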
So basically I want to grab the email address. It doesn't work with htmlentities(). In the example above, when you visit the web page you can see that the email is farmacia_pestoni@bluewin.ch, but when I use cURL or file_get_contents(), instead of the email address I get  . Also, I tried turning off JavaScript, but then I don't get anything.
Er, you won't experience the page as a browser with JavaScript does; you will only get the server-side generated source. Obviously, further modifications to the rendered HTML can be applied via JavaScript. For instance with the email, it looks like it may be escape() / unescape()'d in JavaScript in order to obfuscate it from bots. My own "mail me" links look like this in the source code: <a href="mailto:" class="mailLink" data-user="christoff" data-domain="gmail.com" data-subject="link exchange">mail me</a> I then concatenate the parts through JavaScript so the user can click it w/o a problem. Something similar to that is being used here (http://fragged.org/masking-your-email-address-from-links-to-prevent-data-capture-by-bots_522.html). <span class="obfuscml" title="hc.niweulb@inotsep_aicamraf">hc.niweulb@inotsep_aicamraf</span> -> looks like the email written in reverse - hc. -> becomes .ch - so farmacia_pestoni @ bluewin ch - you CAN write a parser for this in PHP - look for span class="obfuscml" and grab the title="", then split it in parts, reverse them and concatenate them (or simply reverse the whole string).
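Something along these lines could work as a starting point. It is only a sketch: the function name extractObfuscatedEmails() is made up, the regex assumes the class attribute comes before title inside the span (as in the snippet above), and it reverses the whole title with strrev() rather than splitting it into parts. The sample markup is the span quoted above, so nothing is fetched.

```php
<?php
// Hypothetical helper (not from the thread): pull reversed addresses out of
// <span class="obfuscml" title="..."> elements and turn them back around.
function extractObfuscatedEmails(string $html): array
{
    $emails = [];

    // Assumption: class="...obfuscml..." appears before title="..." in the tag.
    $pattern = '/<span\b[^>]*\bclass="[^"]*\bobfuscml\b[^"]*"[^>]*\btitle="([^"]*)"/i';

    if (preg_match_all($pattern, $html, $matches)) {
        foreach ($matches[1] as $reversed) {
            // The title holds the address backwards; strrev() restores it.
            $emails[] = strrev($reversed);
        }
    }

    return $emails;
}

// Stand-in for the markup returned by cURL or file_get_contents().
$sample = '<span class="obfuscml" title="hc.niweulb@inotsep_aicamraf">hc.niweulb@inotsep_aicamraf</span>';

print_r(extractObfuscatedEmails($sample));
```

For anything beyond a quick script, a real HTML parser such as DOMDocument would be more robust than a regex, since attribute order and quoting can vary.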
It was right in front of my eyes, but I couldn't see it. Yes, the letters are in reversed order. It's always simple, but you have to figure it out. Thank you man, now I can do it without any problems. Thanks a lot. Regards