Is there a simple way to extract the anchor text of links using regular expressions in PHP? I am basically looking for this pattern: "target='_new'>Anti-War Protest In Texas</a>". From a bunch of text chunks, I want to extract the "Anti-War Protest In Texas" part. I could split the string a few times to get it, but is there a quicker way with regex? I was looking at php.net, and there is a bunch of regex stuff that returns true or false, but that is not what I want.
Found this; not sure if it is exactly what you are looking for: preg_match("/<a.>(.)<\/a>/", $matchText, $temp);
Try: |(<a[\s]+[^>]+>)([^</a>])(</a>)|i
The \s is important; otherwise it will match <abbr> and any other XHTML tag starting with "a". I didn't test it, but it should work.
preg_match("|(<a[\s]+[^>]+>)([^</a>])(</a>)|i", $link, $matches); Then $matches[1] will contain your anchor.
It's not even going to match <a href=\"test\">abc</a>. This expression will work on all one-line anchors: <a(?:[ \t]+[^>]*)?>([^<]+)<\/a> J.D.
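For anyone who wants to try the expression above, here is a minimal sketch using preg_match_all, which returns the captured text instead of just true or false. The sample string and the variable names ($text, $pattern, $anchors) are made up for illustration, and "~" is used as the delimiter so the slash in </a> does not need escaping.

```php
<?php
// Hypothetical sample text containing two one-line anchors.
$text = "<a href='/news/1' target='_new'>Anti-War Protest In Texas</a> and <a href=\"test\">abc</a>";

// The expression from the post above, wrapped in '~' delimiters, case-insensitive.
$pattern = '~<a(?:[ \t]+[^>]*)?>([^<]+)</a>~i';

// preg_match_all fills $matches; index 1 holds the first capturing group
// (the anchor text) for every match found in $text.
preg_match_all($pattern, $text, $matches);
$anchors = $matches[1];

print_r($anchors);
```

Note that ([^<]+) stops at the first "<", so an anchor whose text contains nested tags such as <b> would not be matched by this sketch.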
One char missing, sorry... preg_match("|(<a[\s]+[^>]+>)([^</a>]+)(</a>)|i", "<a href=\"test\">foo</a>", $m); print_r($m); And the anchor is $m[2], not $m[1].
There's more than a char missing in this. Go ahead and give the anchor I quoted a try (the one with abc). You clearly don't understand what square brackets or parentheses are for. J.D.
bah-- I did, just replaced "abc" with "foo"!#!@#$
|(<a[\s]+[^>]+>)([^</a>]+)(</a>)|i
is
separator (match1) (match2) (match3) separator case_insensitive
- first parenthesis: match <a followed by any number of blanks (\s matches blanks and tabs), followed by any character but >
- second parenthesis: match anything but </a> -- the anchor
- third parenthesis: match </a>
Second parenthesis matches the anchor, which is $matches[2]. I believe I do understand how regex works...
No. Square brackets mean "any of the listed characters", or "none of the listed characters" if used with ^. So this [^</a>]+ says "one or more of any character except <, /, a or >". On top of that, why would you put parentheses around everything? What's the point of capturing </a>? J.D.
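The point about square brackets can be checked directly. This is only a sketch, with made-up variable names ($classPattern, $fooMatched, $abcMatched), running the pattern from the earlier post against the two test anchors mentioned in the thread.

```php
<?php
// The pattern from the thread: [^</a>]+ is a negated character class,
// not "anything up to the closing tag".
$classPattern = '|(<a[\s]+[^>]+>)([^</a>]+)(</a>)|i';

// "foo" contains none of the excluded characters, so the pattern matches.
$fooMatched = preg_match($classPattern, '<a href="test">foo</a>', $m1);

// "abc" starts with "a", which the class excludes, so the pattern fails.
$abcMatched = preg_match($classPattern, '<a href="test">abc</a>', $m2);

var_dump($fooMatched, $abcMatched);
```

Here preg_match returns 1 for the "foo" anchor (with the text in $m1[2]) and 0 for the "abc" anchor, which is why replacing "abc" with "foo" hid the problem.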