preg_match help

Discussion in 'PHP' started by amorph, Sep 19, 2007.

  1. #1
    I have this PHP code that's supposed to extract the URL if it finds a match in the anchor text:

    function get_url ( $string )
    {
    	if ( preg_match ( '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))([^"\']+)[^>]*>(audi)<\/a>/si', $string, $link ) )
    	{
    		return $link [ 1 ];
    	}
    	else {
    		return FALSE;
    	}
    }
    PHP:
    Well, it works on the following example:

    get_url('<a href="asdasd">audi</a>')
    PHP:
    but not on this one:

    get_url('<a href="asdasd">audi something here</a>')
    PHP:
    anyone good at regex?
     
    amorph, Sep 19, 2007 IP
  2. MakeADifference

    MakeADifference Peon

    Messages:
    476
    Likes Received:
    8
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Yes, because you have "audi" hardcoded as the entire anchor text.

    Why don't you replace
    if ( preg_match ( '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))([^"\']+)[^>]*>(audi)<\/a>/si', $string, $link ) )

    with
    if ( preg_match ( '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))([^"\']+)[^>]*>(.*?)<\/a>/si', $string, $link ) )

    It should work.
     
    MakeADifference, Sep 20, 2007 IP
  3. amorph

    amorph Peon

    Messages:
    200
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #3
    To clarify my first post: the match is "audi", so I need a function that will extract all the URLs that have "audi" somewhere in the anchor text.
     
    amorph, Sep 20, 2007 IP
  4. MakeADifference

    #4
    Oh, I see. Try this:

    if ( preg_match ( '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))([^"\']+)[^>]*>(.*?)audi(.*?)<\/a>/si', $string, $link ) )
     
    MakeADifference, Sep 20, 2007 IP
  5. amorph

    #5
    Works perfectly, thank you man!
     
    amorph, Sep 20, 2007 IP
  6. krt

    krt Well-Known Member

    Messages:
    829
    Likes Received:
    38
    Best Answers:
    0
    Trophy Points:
    120
    #6
    MakeADifference's solution incorrectly matches "audio", "auditor", "Saudi Arabia" and probably many others.
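
To make the false positives concrete, here is a minimal sketch that runs the pattern from post #4 (with the `(?:#|javascript\s*:)` lookahead closed as in the other posts) against a few anchor texts. The helper name `get_url_substring` and the sample links are made up for illustration.

```php
<?php
// Hypothetical helper: post #4's pattern, which accepts "audi" anywhere in the anchor text.
function get_url_substring($string)
{
    $pattern = '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))([^"\']+)[^>]*>(.*?)audi(.*?)<\/a>/si';
    return preg_match($pattern, $string, $link) ? $link[1] : false;
}

// Made-up anchor texts; only the first one is actually about "audi".
$samples = array('audi', 'audio', 'auditor', 'Saudi Arabia');
foreach ($samples as $text) {
    $url = get_url_substring('<a href="http://example.com/">' . $text . '</a>');
    echo $text, ' => ', var_export($url, true), "\n";
}
```

Every sample returns the URL, because "audi" is matched as a plain substring with nothing checking what comes before or after it.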

    I would use:
    /<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))[^"\']+[^>]*>(audi|.+?\Waudi\W.+?)<\/a>/si
     
    krt, Sep 20, 2007 IP
  7. amorph

    #7
    Well, yours does not work, and I don't know why.
     
    amorph, Sep 20, 2007 IP
  8. krt

    #8
    Sorry, use this instead:
    /<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))[^"\']+[^>]*>(audi|.*?\Waudi|.*?\Waudi\W.*?|audi\W.*?)<\/a>/si

    Or a slightly simplified version:
    /<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))[^"\']+[^>]*>(.*?\Waudi|audi)(\W.*?<\/a>|<\/a>)/si

    I tested to make sure this time.

    If, heaven forbid, it doesn't work, then can you please tell me what string you tried to match it against?
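
A quick way to check a pattern like the simplified one is to run it over a handful of anchor texts, including some that should be rejected. The sketch below does that; the `$cases` list and the `example.com` URL are invented test data, not strings from this thread.

```php
<?php
// The simplified pattern from this post, unchanged.
$pattern = '/<a[^>]+href\s*=\s*["\'](?!(?:#|javascript\s*:))[^"\']+[^>]*>(.*?\Waudi|audi)(\W.*?<\/a>|<\/a>)/si';

// Invented anchor texts mapped to whether they should count as a match.
$cases = array(
    'audi'                => true,
    'audi something here' => true,
    'my new audi'         => true,
    'audio'               => false,
    'Saudi Arabia'        => false,
);

$results = array();
foreach ($cases as $text => $expected) {
    $html = '<a href="http://example.com/">' . $text . '</a>';
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$text] = (preg_match($pattern, $html) === 1);
    echo str_pad($text, 22), ($results[$text] === $expected ? 'ok' : 'UNEXPECTED'), "\n";
}
```

Adding the exact string that fails to `$cases` makes it much easier to see which alternative of the pattern is at fault.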
     
    krt, Sep 20, 2007 IP
  9. amorph

    #9
    Thank you krt.

    It works now, but with one small problem: it doesn't extract just the link, it returns the whole tag.

    Instead of: http://www.google.com

    it gets: <a href="http://www.google.com">audi</a>

    Hope I'm being clear with what I need and not confusing you.
    Thank you.
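
For reference, `preg_match` stores the entire matched text at index 0 of its third argument and the parenthesised groups from index 1 onward. Reading index 0, or using a pattern that no longer wraps the URL in its own group, is one plausible way to end up with the whole tag. A minimal sketch with a deliberately simplified pattern (not the one from this thread):

```php
<?php
// Simplified pattern for illustration only:
// group 1 is the href value, group 2 is the anchor text.
$html = '<a href="http://www.google.com">audi</a>';
preg_match('/<a\s+href="([^"]+)">([^<]*)<\/a>/i', $html, $link);

echo $link[0], "\n"; // whole match: the complete <a ...>audi</a> tag
echo $link[1], "\n"; // first group: http://www.google.com
echo $link[2], "\n"; // second group: audi
```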
     
    amorph, Sep 20, 2007 IP
  10. amorph

    #10
    Any updates @krt :D
     
    amorph, Sep 21, 2007 IP
  11. krt

    #11
    function get_url($string)
    {
        if (preg_match('/<a[^>]+href\s*=\s*(["\'])((?!(?:#|javascript\s*:))[^"\']+)\\1[^>]*>(.*?\Waudi|audi)(\W.*?<\/a>|<\/a>)/si', $string, $link))
        {
            return $link[2];
        }
        else {
            return false;
        }
    }
    PHP:
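
Since the original request was for every URL whose anchor text contains "audi", here is a hedged sketch of a `preg_match_all` variant. It is not krt's code: the function name `get_urls` is hypothetical, `\baudi\b` is swapped in for the `\W` alternations, and `[^<]*` assumes the anchor text contains no nested tags. That last assumption also keeps a single match from starting at one link and ending inside a later one, which `.*?` with the `s` flag can do.

```php
<?php
// Hypothetical variant of get_url() that collects every matching href.
// Assumes anchor text is plain text (no nested tags such as <b> or <img>).
function get_urls($string)
{
    // Group 1 is the opening quote, group 2 the href value, as in the function above.
    $pattern = '/<a[^>]+href\s*=\s*(["\'])((?!(?:#|javascript\s*:))[^"\']+)\\1[^>]*>[^<]*\baudi\b[^<]*<\/a>/i';
    preg_match_all($pattern, $string, $links);
    return $links[2]; // list of href values; an empty array if nothing matched
}

// Made-up sample markup.
$html = '<a href="/bmw">bmw</a> <a href="/a4">Audi A4</a> '
      . '<a href="/radio">audio</a> <a href="/used">used audi</a>';
print_r(get_urls($html));
```

With this sample the function returns `/a4` and `/used`, skipping both the unrelated link and the "audio" one. For markup with nested tags or unusual quoting, an HTML parser is likely a safer choice than a regular expression.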
     
    krt, Sep 21, 2007 IP
  12. amorph

    #12
    Wow...perfect. Thank you very much.
     
    amorph, Sep 21, 2007 IP