$string = "<a href= \"http://abc/\" >xjtu</a>";
echo $string.'<br/>';
$pattern = "/ href=.*(?!http:\/{2}).*/i";
echo preg_match($pattern, $string);

I am trying to identify all relative links in an HTML page, i.e. links that do not start with "http://". Why is this regular expression not working? The test page outputs '1', but I expect '0', because $pattern should match only URLs that do NOT start with 'http://', so it should not match a URL like 'http://abc/'. Could anyone help, please? Thanks.
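A possible explanation, sketched below: the first `.*` in the pattern is greedy and may consume any amount of text, so the engine is free to test the lookahead at a position (for example the very end of the string) where "http://" does not follow, and the match succeeds. Anchoring the lookahead directly after the opening quote avoids that. The `$anchored` pattern is an assumption of this sketch, not something from the original post, and it presumes that attribute values are quoted.

```php
<?php
$string = "<a href= \"http://abc/\" >xjtu</a>";

// Original pattern: ".*" can swallow the whole URL, so the negative
// lookahead is checked where "http://" no longer follows.
$original = "/ href=.*(?!http:\/{2}).*/i";
echo preg_match($original, $string), "\n"; // prints 1

// Sketch of a fix: put the lookahead immediately after the opening quote,
// so it is always tested at the first character of the URL.
$anchored = '~\shref\s*=\s*["\'](?!http://)~i';
echo preg_match($anchored, $string), "\n";                        // prints 0
echo preg_match($anchored, '<a href="/wtf.html">wtf</a>'), "\n";  // prints 1
```

With the lookahead pinned to the start of the attribute value, the absolute link is rejected and the relative one is accepted.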
Simply use an if() clause. Try this: <?php if (preg_match('@href\s*=\s*["\']http://@i', $string)) { $match = 0; } else { $match = 1; } ?> Here $match is 0 when the href starts with 'http://' and 1 otherwise. I guess that'll work.
Thanks for the reply, it works! Though I was hoping to modify only the URLs in 'href' and 'src' attributes, using preg_replace() to add something before the original URLs.
I am using preg_replace() to convert all relative URLs in 'href' and 'src' attributes on a web page into absolute URLs, that is, to add 'http://www.site.com' immediately before every relative URL, turning <a href="/wtf.html">wtf</a> into <a href="http://www.site.com/wtf.html">wtf</a>. The same goes for images, stylesheets and so forth. I'm now STUCK on how to identify a relative URL, i.e. one that does not start with 'http://'. The QUESTION is: how do I write the regex pattern to identify it, so that I can use the pattern in preg_replace($pattern, $replacement, $string)? I am rather new to PHP and regular expressions, so any help would be appreciated. Thanks!
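One way this could be approached is sketched below, under some assumptions: attribute values are quoted, every relative URL is resolved against the site root (not the current directory), and 'https://' is treated as absolute as well. The function name `absolutize_urls` and its parameters are hypothetical, made up for this example.

```php
<?php
// Hypothetical helper: prefix $base to every href/src value that does not
// already start with http:// or https://.
function absolutize_urls(string $html, string $base): string
{
    // Group 1: whitespace, attribute name, "=", and the opening quote.
    // (?!https?://) rejects values that are already absolute.
    // (/?) consumes a leading slash so it is not doubled after $base.
    $pattern = '~(\s(?:href|src)\s*=\s*["\'])(?!https?://)(/?)~i';

    // ${1} re-inserts the attribute prefix, followed by the base URL and "/".
    $replacement = '${1}' . rtrim($base, '/') . '/';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, $replacement, $html) ?? $html;
}

$base = 'http://www.site.com';
echo absolutize_urls('<a href="/wtf.html">wtf</a>', $base), "\n";
// <a href="http://www.site.com/wtf.html">wtf</a>
echo absolutize_urls('<img src="img/a.png">', $base), "\n";
// <img src="http://www.site.com/img/a.png">
echo absolutize_urls('<a href= "http://abc/" >xjtu</a>', $base), "\n";
// unchanged: <a href= "http://abc/" >xjtu</a>
```

This sketch would also prefix values such as 'mailto:', '#anchor' or protocol-relative '//host/...' URLs, so the lookahead would need extending if those occur on the page. For anything beyond simple pages, an HTML parser such as PHP's DOMDocument is usually more robust than a regex.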