I need some help here. I have a string with some HTML tags inside, like:
    visit this <b>GREAT</b> site at <a href="http://www.greatsite.com/greatstuff/somepage.php"> http://www.greatsite.com/greatstuff/somepage.php</a>!!!
Now I need the full URL out of this string, in this case:
    http://www.greatsite.com/greatstuff/somepage.php
Anyone good at regex? I spent the last 2 hours looking for something like this but couldn't find anything that works. This is what I have so far. "Regex Coach" says it's correct, but I get an "Unknown modifier '/'" error:
    function geturl($string) {
        preg_match("/https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?/", $string, $matches);
        return $matches[0];
    }
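The forward slashes inside your regex need to be escaped. Because / is your delimiter, the regex interpreter thinks the pattern ends at the first unescaped one and treats the next character as a modifier, hence the "Unknown modifier '/'" error. Changing :// to :\/\/ fixes that first error, but the later slashes (including the one inside the character class) need the same treatment. Alternatively, use a delimiter that does not occur in the pattern, such as #.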
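For illustration, here is a minimal sketch of the asker's function with `#` as the delimiter, so no slash needs escaping. Two things are my assumptions rather than part of the original post: the nested quantifier `([-\w\.]+)+` is simplified to `[-\w.]+` (it matches the same host names with less backtracking), and the function returns `null` when nothing matches instead of reading an undefined `$matches[0]`. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Sketch: the pattern from the question, delimited with '#' instead of '/'
// so the slashes in the URL no longer terminate the pattern early.
function geturl(string $string): ?string
{
    $pattern = '#https?://[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?#';
    if (preg_match($pattern, $string, $matches)) {
        return $matches[0]; // full text of the first URL found
    }
    return null; // assumption: signal "no URL" with null
}

$text = 'visit this <b>GREAT</b> site at <a href="http://www.greatsite.com/greatstuff/somepage.php"> http://www.greatsite.com/greatstuff/somepage.php</a>!!!';
echo geturl($text), "\n";
// prints: http://www.greatsite.com/greatstuff/somepage.php
```

Note that this matches the first thing that looks like a URL anywhere in the string, not specifically the `href` value.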
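Try:
    preg_match_all("/<a[^>]*href=(\'|\")([^\'\"]*)[^>]*>/i", $file_contents, $link_array, PREG_OFFSET_CAPTURE);
Here $file_contents is the HTML string. The URLs end up in $link_array[2]; because of PREG_OFFSET_CAPTURE, each entry is an array holding the matched text and its offset in the string.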
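A short usage sketch for the call above, run against the sample string from the question. The `$urls` variable and the use of `array_column` to drop the offsets are my additions for illustration, not part of the original suggestion.

```php
<?php
$file_contents = 'visit this <b>GREAT</b> site at <a href="http://www.greatsite.com/greatstuff/somepage.php"> http://www.greatsite.com/greatstuff/somepage.php</a>!!!';

// Group 1 is the opening quote, group 2 is the href value.
preg_match_all(
    '/<a[^>]*href=(\'|")([^\'"]*)[^>]*>/i',
    $file_contents,
    $link_array,
    PREG_OFFSET_CAPTURE
);

// With PREG_OFFSET_CAPTURE every entry is [text, offset];
// keep only the text (index 0) of each group-2 capture.
$urls = array_column($link_array[2], 0);
print_r($urls);
```

If the offsets are not needed, leaving out `PREG_OFFSET_CAPTURE` makes `$link_array[2]` a plain list of URL strings.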
It's even simpler if all your links have the structure you presented:
    $pattern = '%<a\s*href="([^"]*)%';
    $yourString = 'visit this <b>GREAT</b> site at <a href="http://www.greatsite.com/greatstuff/somepage.php"> http://www.greatsite.com/greatstuff/somepage.php</a>!!!';
    preg_match_all($pattern, $yourString, $matches, PREG_PATTERN_ORDER);
    print_r($matches[1]);
Regards, Adrian