Hi all. I have this code, and it works perfectly when I want to extract all the URLs from a page.
<?php
$url = 'http://site.net';
$m = file_get_contents($url);
preg_match_all('/url="(.*?)"/', $m, $match);
print implode("<br>", $match[1]);
?>
PHP:
Now I want to extract only a specific div or class from a page, like this:
<div id="links"> <li>...</li> .... <li>...</li> </div>
HTML:
I need the script to output just:
<li>...</li> .... <li>...</li>
HTML:
I searched Google for how to do that, but after 6 hours I decided to quit and ask here. Can anyone help me? I don't know how to modify that 3rd row (the preg_match_all call).
Thanks danx10. This really works, but it outputs the content of the first div. I need to get the content of a specific div.
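$specific_div = 'links';
// Match <div id="links"> or <div class="links"> and capture what is inside it.
// \s+ requires whitespace after "div"; preg_quote() also escapes the # delimiter.
preg_match_all('#<div\s+(?:id|class)\s*=\s*"'.preg_quote($specific_div, '#').'">(.+?)</div>#is', $m, $match);
PHP: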
Seriously, look into the PHP Simple HTML DOM Parser library. I used to use regular expressions for stuff like that, and it was crazy. http://simplehtmldom.sourceforge.net/manual.htm
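For anyone who wants to try the parser route without installing anything, here is a rough sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than the Simple HTML DOM library linked above. The helper name inner_html_by_id and the sample markup are made up for illustration, and it assumes the id you search for contains no double quote.

```php
<?php
// Hypothetical helper (not from the thread): returns the inner HTML of the
// first element whose id equals $id, whatever its tag, or null if none exists.
function inner_html_by_id(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $id has no double quote, since it is pasted into the XPath string.
    $node = $xpath->query('//*[@id="' . $id . '"]')->item(0);
    if ($node === null) {
        return null;
    }

    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child); // serialise each child, leaving out the wrapper tag
    }
    return $inner;
}

// Made-up sample markup standing in for file_get_contents($url).
$m = '<html><body><div id="menu"><li>skip</li></div>'
   . '<ul id="links"><li>one</li><li>two</li></ul></body></html>';

print inner_html_by_id($m, 'links');
```

Because the XPath uses `*` instead of a tag name, it does not matter whether the element is a div, ul or ol. Unlike a lazy regex, a parser also copes with nested tags of the same name.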
Thanks danx10. You're magic. I don't know how the f..k you did that, because I don't understand anything in that expression. Cheers! P.S.: Can you make it work even if the code is like this? P.P.S.: And could you make it find something regardless of the tag? I mean, if I want to find something but I don't know whether it's a div, ul or ol, only the id name.
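$specific_div = 'links';
// Group 1 = tag name, group 2 = quote character, group 3 = inner HTML.
// [^>]*? keeps the attribute search inside the opening tag.
preg_match_all('#<([a-z][a-z0-9]*)\b[^>]*?\s(?:id|class)\s*=\s*("|\')'.preg_quote($specific_div, '#').'\2[^>]*>(.+?)</\1>#is', $m, $match);
PHP:
Due to the changes within the expression, your $match array will change, so you'd need to adjust the $match[1] in your print line accordingly: what you're after is the 3rd set of matches, $match[3]. Also, it's untested.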
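Here is a small sketch of how a tag-agnostic pattern like this might be exercised and where the result ends up in $match. The sample markup is made up, and the pattern shown restricts the attribute search to the opening tag with [^>]*?, which is my assumption about the intended behaviour rather than something tested against a real page.

```php
<?php
// Made-up sample markup standing in for the fetched page ($m in the thread).
$m = '<div id="menu"><li>skip</li></div>'
   . "<ul class='links'><li>one</li><li>two</li></ul>";

$specific_div = 'links';

// Group 1: tag name, group 2: opening quote, group 3: inner HTML.
// \2 requires the same quote to close the value, \1 the same tag to close the element.
$pattern = '#<([a-z][a-z0-9]*)\b[^>]*?\s(?:id|class)\s*=\s*("|\')'
         . preg_quote($specific_div, '#')
         . '\2[^>]*>(.+?)</\1>#is';

preg_match_all($pattern, $m, $match);

// The inner HTML is now the 3rd set of matches, not $match[1].
print implode("<br>", $match[3]);
```

Two limits worth knowing: the lazy (.+?) stops at the first closing tag of the same name, so an element that contains another element of the same tag gets cut short, and a value such as class="links extra" is not matched because the whole attribute value has to equal the name.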