Extracting hrefs from a page that contain certain text

Discussion in 'PHP' started by mancade, Nov 26, 2007.

  1. #1
    Hi,

    I would like to extract from a string the value of all <A HREF> nodes that contain the text ".MBR". Can anyone suggest how to do this?

    Example
    --------

    String is:

    $str = '<A HREF=/location/$file/filename.MBR> <font size="1" face="Arial> <A HREF=/location/$file/filename.htm><A HREF=/location/$file/filename2.MBR>';

    Desired result is:

    "/location/$file/filename.MBR" and "/location/$file/filename2.MBR"

    I think that preg_match_all and preg_replace are probably the way to do this, but I don't know what the syntax for the patterns should be. Is there a good resource that describes the pattern syntax?

    Thanks in advance,

    Ade
     
    mancade, Nov 26, 2007
  2. nico_swd

    #2
    
    preg_match_all('~\s+href=["\']?(.+\.MBR)["\'>]~si', $text, $matches);
    
    print_r($matches[1]);
    
    PHP:
    Untested but should work.
     
    nico_swd, Nov 26, 2007
  3. mancade

    #3
    Thanks for your response nico_swd.

    I tried your suggestion and the result was:

    Array
    (
    [0] => /location/$file/filename.MBR> <font size="1" face="Arial> <A HREF=/location/$file/filename.htm><A HREF=/location/$file/filename2.MBR
    )

    Any idea how I can get the following result in an array?

    "/location/$file/filename.MBR" and "/location/$file/filename2.MBR"
     
    mancade, Nov 26, 2007
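
The single long match reported in post #3 is what a greedy `.+` produces: with the `s` modifier it can cross `>` and newlines, so it runs to the last ".MBR" in the string. A minimal sketch of the difference, assuming the sample string is single-quoted so that `$file` stays literal (as the desired result suggests); the variable names `$greedy` and `$perTag` are only illustrative:

```php
<?php
// Sample string from post #1, single-quoted so $file is not interpolated.
$str = '<A HREF=/location/$file/filename.MBR> <font size="1" face="Arial> <A HREF=/location/$file/filename.htm><A HREF=/location/$file/filename2.MBR>';

// Pattern from post #2: greedy .+ with the s modifier spans several tags.
preg_match_all('~\s+href=["\']?(.+\.MBR)["\'>]~si', $str, $greedy);

// Same idea, but the character class cannot cross a quote, ">" or whitespace,
// so every match has to stay inside a single attribute value.
preg_match_all('~\s+href=["\']?([^"\'>\s]+\.MBR)["\'>]~i', $str, $perTag);

print_r($greedy[1]); // one element covering everything up to filename2.MBR
print_r($perTag[1]); // two elements, one per .MBR link
```

With the sample above, the second call should return `/location/$file/filename.MBR` and `/location/$file/filename2.MBR` as separate array elements.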
  4. Barti1987

    #4
    
    $str = '<A HREF=/location/$file/filename.MBR> <font size="1" face="Arial> <A HREF=/location/$file/filename.htm><A HREF=/location/$file/filename2.MBR>';
    
    //Match all links
    preg_match_all('/<a href=(.*)>/Uis',$str,$results);
    
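    /*
    or only match MBR; [^>]* is used instead of .* so that a match
    cannot run past the closing > of the .htm link into the next tag
    preg_match_all('/<a href=([^>]*\.MBR)>/Uis',$str,$results);
    */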
    
    print_r($results[1]);
    
    PHP:
    Peace,
     
    Barti1987, Nov 26, 2007
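
The patterns in this thread can be folded into a small reusable function. The following is only a sketch: the function name `extract_hrefs_by_extension` and its parameters are hypothetical, it assumes PHP 7 or later for the type declarations, and it treats the markup as plain text rather than parsing it as HTML, so unusual markup may still slip through.

```php
<?php
/**
 * Hypothetical helper: return the href values of <a> tags that end with
 * the given extension (case-insensitive). Handles quoted and unquoted values.
 */
function extract_hrefs_by_extension(string $html, string $extension): array
{
    // Escape regex metacharacters in the extension, e.g. "." becomes "\.".
    $ext = preg_quote($extension, '~');

    // [^>]*? keeps the search for "href" inside the current tag;
    // [^"\'>\s]+ keeps the captured value inside one attribute;
    // the lookahead requires the value to end right after the extension.
    $pattern = '~<a\s[^>]*?href\s*=\s*["\']?([^"\'>\s]+' . $ext . ')(?=["\'>\s])~i';

    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

// Sample string from the thread (single-quoted, so $file is literal).
$str = '<A HREF=/location/$file/filename.MBR> <font size="1" face="Arial> <A HREF=/location/$file/filename.htm><A HREF=/location/$file/filename2.MBR>';
$unquoted = extract_hrefs_by_extension($str, '.MBR');

// A made-up example with quoted attribute values.
$quotedHtml = '<a class="x" href="/a/b.mbr">one</a> <a href=\'/c/d.MBR\'>two</a> <a href="/e/f.htm">three</a>';
$quoted = extract_hrefs_by_extension($quotedHtml, '.MBR');

print_r($unquoted);
print_r($quoted);
```

Passing the extension through `preg_quote` means the caller can supply ".MBR" as plain text without worrying about the dot being read as "any character".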