little help with preg_match please

JEET Notable Member

#1

Hi,
I'm trying to extract the URL from an anchor tag using preg_match_all, but the pattern seems to ignore the part that is supposed to match after the URL.
Here are the two calls:
preg_match_all( '/(href)(\s)*(=)(\"|\'|\s)*(.*?)(\"|\'|\s)*(>)/is', '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>', $s );
preg_match_all( '/(href)(\s)*(=)(\"|\'|\s)*(.*?)(\"|\'|\s)*(>)/is', '<a href="index.php?w=title_quote" title=some > localhost2 </a>', $s );
In both cases the captured value is the URL followed by the rest of the tag:
" title=some
I just want the URL.

Could someone please rectify the mistake?
Thanks

JEET, Sep 29, 2011

Alex Roxon Active Member

#2

A few pointers. Every pair of parentheses () is a capturing group, and each group gets its own entry in $s. You have parentheses all over the place, which is why $s comes back with so many sub-arrays. Next, the quotes: neither " nor ' has any special meaning in a regex, so the \" is harmless but unnecessary, and the \' is needed only because the PHP string itself is enclosed in single quotes (it reaches the pattern as a plain '). The actual problem is that your pattern allows only quotes and whitespace between the URL and the closing >, so the lazy (.*?) has to expand until just before the >, swallowing " title=some along the way. Finally, the pattern can be simplified. Inside a character class you list the characters directly, without |. The following should work, with the URL in $s[1]:
preg_match_all( '/href\s*=["\'\s](.*?)["\'\s][^>]*>/is', '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>', $s );
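To make the point about capturing groups concrete, here is a small sketch that runs the original pattern next to a trimmed-down one and compares what ends up in the result array. The variable names ($html, $original, $trimmed) and the trimmed pattern are only illustrative, and the trimmed pattern assumes the href value is quoted.

```php
<?php
$html = '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>';

// Original pattern: seven capturing groups, so the result has eight entries
// (index 0 for the full match plus one per group).
preg_match_all('/(href)(\s)*(=)(\"|\'|\s)*(.*?)(\"|\'|\s)*(>)/is', $html, $original);

// Trimmed pattern: (?:...) groups do not capture, so only the URL group is stored.
// Assumption: the attribute value is wrapped in single or double quotes.
preg_match_all('/href\s*=\s*(?:"|\')(.*?)(?:"|\')/is', $html, $trimmed);

echo count($original), "\n";  // number of entries with the original pattern
echo $original[5][0], "\n";   // the over-long capture from group 5
echo count($trimmed), "\n";   // number of entries with the trimmed pattern
echo $trimmed[1][0], "\n";    // just the URL
```

Using (?:...) wherever a group is needed only for alternation keeps the result array small and makes the index of the URL easy to remember.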

Alex Roxon, Sep 30, 2011

Technoslab Peon

#3

I generally use PHP Simple HTML DOM Parser for such purposes.

http://simplehtmldom.sourceforge.net

Good luck.

Technoslab, Oct 1, 2011

Foxtr0t Peon

#4

Yes, it's generally a much better idea to access such things through the DOM, for example using Zend_Dom_Query:

http://framework.zend.com/manual/en/zend.dom.query.html

A pattern for capturing a double-quoted href attribute would be something like this: /href="([^"]+)"/
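As a rough illustration of the DOM approach, here is a minimal sketch. Note that it uses PHP's built-in DOMDocument class rather than Zend_Dom_Query or Simple HTML DOM, so that it runs without any extra library (it does assume the dom extension is enabled). The variable names are just for the example.

```php
<?php
$html = '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>';

// Collect libxml warnings internally instead of printing them;
// real-world markup is often not well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk every <a> element and read its href attribute, if present.
$urls = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $urls[] = $anchor->getAttribute('href');
    }
}

print_r($urls);
```

The parser deals with quoting, attribute order, and whitespace by itself, which is what makes this route more robust than a regex.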

Foxtr0t, Oct 3, 2011

gvre Member

#5

Try the following code:

$str = '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>';
// Capture everything between the double quotes of href="..."
$pattern = '#href="([^"]+)"#si';
if (preg_match_all($pattern, $str, $m))
{
    // $m[0] holds the full matches, $m[1] the captured URLs
    $matches = $m[1];
    print_r($matches);
}
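The pattern above only handles double-quoted values, while the original question also tried to allow single quotes. One possible variant, sketched here as an assumption rather than as part of the code above, captures the opening quote and uses a backreference so the value has to end with the same quote character. The URL then sits in group 2 instead of group 1. The $samples array is made up for the demonstration.

```php
<?php
$samples = [
    '<a href="http://localhost/?w=title_quote" title=some > localhost2 </a>',
    "<a href='index.php?w=title_quote' title=some > localhost2 </a>",
];

// Group 1 captures the opening quote (single or double);
// \1 requires the same character to close the value.
$pattern = '#href\s*=\s*(["\'])(.*?)\1#si';

$urls = [];
foreach ($samples as $html) {
    if (preg_match_all($pattern, $html, $m)) {
        // Group 2 holds the text between the matching quotes.
        $urls = array_merge($urls, $m[2]);
    }
}

print_r($urls);
```

Unquoted attribute values are still not covered, which is one more reason to prefer a DOM parser for anything beyond simple cases.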

gvre, Oct 4, 2011

Javed iqbal Well-Known Member

#6

Have you solved it? Otherwise, contact me.

Javed iqbal, Oct 5, 2011
