Hi, I am having some trouble with a regex. I want to get all the URLs in the anchor tags of a web page, but with the example below I get script tags as well. All I want back is http://thisissomesite.com, not the HTML tags. Could anyone advise on how to fix this? It is driving me nuts.

preg_match_all("/(<([\w]+)[^>]*>)(.*)(<\/\\2>)/", $result, $output, PREG_SET_ORDER);
Here is working code. It matches only <a> tags, captures the href value in group 1 and the link text in group 2:

preg_match_all('/<\s*a[^<>]*?href=[\'"]?([^\s<>\'"]*)[\'"]?[^<>]*>(.*?)<\/a>/si', $html, $match, PREG_SET_ORDER);
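
To show how the result array might be read, here is a minimal sketch built around the pattern above. The sample markup in $html and the $urls variable are made up for illustration; in the original question the HTML would come from the fetched page.

```php
<?php
// Hypothetical sample page (an assumption for this sketch).
$html = '<script src="http://example.com/app.js"></script>'
      . '<p><a class="x" href="http://thisissomesite.com">Some site</a></p>'
      . '<a href=\'/relative/path\'>Relative</a>';

// Same pattern as in the answer: only tags starting with "<a" are matched.
$pattern = '/<\s*a[^<>]*?href=[\'"]?([^\s<>\'"]*)[\'"]?[^<>]*>(.*?)<\/a>/si';

$urls = [];
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // With PREG_SET_ORDER each $m is one match:
        // $m[0] is the whole tag, $m[1] the href value, $m[2] the link text.
        $urls[] = $m[1];
    }
}

print_r($urls);
```

With this sample input, $urls should end up holding http://thisissomesite.com and /relative/path, and the script tag is skipped. Note that the pattern would also match other tags whose name begins with "a" and that carry an href (for example <area>), so for messy real-world pages an HTML parser such as PHP's DOMDocument may be a more robust choice than a regex.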