Guys, I need your help. I've been trying to figure this out, but nothing works, and I am not very good with regex. I want to grab the content (only the "text to grab" and the date part) of the 2nd <div class="section"> div and put it in a PHP array.

PHP:
$html = '<div class="pad">
    <div class="section">
        <div class="section_title">Title text</div>
    </div>
    <div>
        <div><small>Info.</small></div>
    </div>
    <div class="section">
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
    </div>
</div>';

I am not using PHP 5, so DOM is not an option. Thanks for your help.
Not sure how others would do it, but I would do something like this if the structure is the same every time. Otherwise it would require a dynamic parsing algorithm, which takes longer to write and more time to test.

PHP:
$html = '<div class="pad">
    <div class="section">
        <div class="section_title">Title text</div>
    </div>
    <div>
        <div><small>Info.</small></div>
    </div>
    <div class="section">
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
        <div><a href="#">link</a> text to grab <small>12/2/2009</small></div>
    </div>
</div>';

// Skip the first <div class="section"> and cut the string at the second one.
// strpos() is used instead of stripos() because stripos() requires PHP 5.
$marker = '<div class="section">';
$div1 = strpos($html, $marker);
$html = substr($html, strpos($html, $marker, $div1 + 1));

// Output array
$out = array();

// Strip away <div class="section"> and explode the string into an array, 1 element per row
$parts = explode("\n", substr($html, strpos($html, '>') + 1));

// For each row
foreach ($parts as $value) {
    // Drop the indentation, then skip blank lines and the closing </div> tags
    $value = trim($value);
    if ($value != '' && $value != '</div>') {
        $out[] = $value;
    }
}

// Print out array
print_r($out);

The rest of it you will have to do on your own.
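For "the rest", one possible way to split each row into its text and date is sketched below. It assumes every row has the same shape as in the question (a link, then the text, then the date inside <small>), and the $rows array is a made-up stand-in for the $out array produced above. It only uses functions available before PHP 5.

```php
<?php
// Hypothetical sample rows, shaped like the entries of $out above
$rows = array(
    '<div><a href="#">link</a> text to grab <small>12/2/2009</small></div>',
    '<div><a href="#">link</a> other text <small>13/2/2009</small></div>'
);

$items = array();
foreach ($rows as $row) {
    // Group 1: everything between </a> and <small>; group 2: the content of <small>
    if (preg_match('~</a>(.*?)<small>(.*?)</small>~s', $row, $m)) {
        $items[] = array(
            'text' => trim(strip_tags($m[1])),
            'date' => trim($m[2])
        );
    }
}

print_r($items);
```

Rows that do not fit the assumed shape are silently skipped, so comparing count($items) with count($rows) is a quick way to spot markup that differs from the sample.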
I'm sure the solution provided by n3r0x worked fine for you. But since you asked about a regular expression, here is one:

PHP:
if (preg_match_all('~<div><a[^>]*?>[\s\S]*?</a>\s*([\s\S]+?)<small>([\d\s/]+)</small></div>~i', $html, $matches)) {
    echo '<pre>';
    print_r($matches[1]); // the text after each link
    print_r($matches[2]); // the date inside each <small>
    echo '</pre>';
}

On your sample content, this code will produce the following result:

Code (markup):
Array
(
    [0] => text to grab
    [1] => text to grab
    [2] => text to grab
    [3] => text to grab
    [4] => text to grab
)
Array
(
    [0] => 12/2/2009
    [1] => 12/2/2009
    [2] => 12/2/2009
    [3] => 12/2/2009
    [4] => 12/2/2009
)
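The pattern above matches any row of that shape, wherever it sits on the page. If only the 2nd <div class="section"> should be searched, a rough sketch could look like the one below. It assumes the section marker is written exactly as in the sample, uses a shortened two-row version of the sample HTML, and the variable names ($marker, $updates) are only illustrative. The pattern is also loosened with \s* so that line breaks and indentation between tags do not break the match.

```php
<?php
// Shortened, hypothetical version of the sample HTML from the question
$html = '<div class="pad">
    <div class="section">
        <div class="section_title">Title text</div>
    </div>
    <div class="section">
        <div><a href="#">link</a> first update <small>12/2/2009</small></div>
        <div><a href="#">link</a> second update <small>13/2/2009</small></div>
    </div>
</div>';

// Locate the second occurrence of the section marker
$marker = '<div class="section">';
$first  = strpos($html, $marker);
$second = ($first === false) ? false : strpos($html, $marker, $first + strlen($marker));

$updates = array();
if ($second !== false) {
    // Only search from the second section onwards
    $section = substr($html, $second);
    $pattern = '~<div>\s*<a[^>]*>.*?</a>\s*(.*?)\s*<small>\s*([\d/]+)\s*</small>\s*</div>~is';
    // PREG_SET_ORDER groups each match as array(full match, text, date)
    if (preg_match_all($pattern, $section, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $updates[] = array('text' => $m[1], 'date' => $m[2]);
        }
    }
}

print_r($updates);
```

Keeping text and date together in one entry per row avoids having to line up two separate arrays afterwards. Note that everything after the second marker is searched, so a third section with the same row shape would be picked up as well.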
Thank you n3r0x and Sergey Popov for your quick replies. The code works perfectly. However, I am not sure why it doesn't work when I apply it to a full HTML source. What I am trying to do is grab the recent status updates from a saved profile page. The recent statuses are in the 2nd <div class="section">. I would really appreciate your help with this. Could you please check http://ourweb.limewebs.com/ and let me know what is needed? Thanks again.