Preg_match_all() help

Discussion in 'PHP' started by kallell, Oct 27, 2008.

  1. #1
    Hey guys, I'm trying to use preg_match_all() on the following example string:

    loggedin " href="http://www.politico.com/news/stories/1008/14819.html" >This is the title of the article.</a>&#32



    Here's what I've been trying to use:



    preg_match_all(
    ' ',
    $html,
    $posts, // will contain the blog posts
    PREG_SET_ORDER // formats data into an array of posts
    );

    I want to extract the link and the title from the string above. What would the pattern look like? I've spent a few hours on this and am stumped; any help would be appreciated.

    Thanks,
    Kal
     
    kallell, Oct 27, 2008 IP
  2. SeanBlue

    #2
    You would want something like:

    preg_match_all('/<a\s+href="(.+?)"\s*>(.+?)<\/a>/',$html,$posts,PREG_SET_ORDER);

    Alternatively, this one also allows other attributes before and after the href:

    preg_match_all('/<a\s[^>]*href="([^"]+)"[^>]*>(.+?)<\/a>/',$html,$posts,PREG_SET_ORDER);

    The = and < symbols don't need escaping in a PCRE pattern (only the / delimiter does), so that should at least point you in the right direction.
     
    SeanBlue, Oct 27, 2008 IP
  3. SeanBlue

    #3
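    Almost forgot: the third argument ($posts above) is the array that receives the matches, so there is no separate $matches argument. Naming it $matches:

    preg_match_all('/<a\s[^>]*href="([^"]+)"[^>]*>(.+?)<\/a>/',$html,$matches,PREG_SET_ORDER);

    You'll then use them like this:

    foreach ($matches as $m) {
        echo 'Link: '.$m[1];      // first capture group
        echo 'Link text: '.$m[2]; // second capture group
    }

    With PREG_SET_ORDER each element of $matches is one match, where index 0 is the full match and 1 and 2 are the capture groups. With other flags the layout differs; you can inspect it with var_dump($matches);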
     
    SeanBlue, Oct 27, 2008 IP
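    The two result layouts mentioned in this post can be compared side by side. This is only a sketch: the two-link input string and the variable names $sets and $groups are made up for illustration, and the pattern assumes double-quoted href values.

```php
<?php
// Hypothetical sample input: two links on one line.
$html = '<a href="http://example.com/a">First</a> <a href="http://example.com/b">Second</a>';

// Assumes double-quoted href values; [^"]+ stops at the closing quote.
$pattern = '/<a\s[^>]*href="([^"]+)"[^>]*>(.+?)<\/a>/';

// PREG_SET_ORDER: one array per match; [0] full match, [1] URL, [2] text.
preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);
foreach ($sets as $m) {
    echo 'Link: ' . $m[1] . ', text: ' . $m[2] . "\n";
}

// PREG_PATTERN_ORDER (the default): one array per capture group.
preg_match_all($pattern, $html, $groups, PREG_PATTERN_ORDER);
// $groups[1] holds every URL, $groups[2] every link text.
```

    PREG_SET_ORDER tends to be the easier layout when each link and its title are handled together, which is what the original question asks for.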
  4. ads2help

    #4
    String to test:
    Pattern:
    /<a(.*?)href=["'](.*?)["'](.*?)>(.*?)<\/a>/
    Test it using the Regex Tester in my sig.

    BackReference:
    $1 = Attribute, $2 = URL , $3 = Attribute, $4 = Anchor text
     
    ads2help, Oct 27, 2008 IP
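    The four backreferences listed above can be checked in PHP with a short script. This is a sketch only: the anchor string is a made-up example, and the pattern is written with lazy quantifiers and a ["'] character class, since a | inside square brackets is a literal character rather than an "or".

```php
<?php
// Hypothetical anchor with attributes on both sides of the href.
$html = '<a class="title" href="http://example.com/story.html" target="_blank">Example title</a>';

// In a single-quoted PHP string only ' needs a backslash.
$pattern = '/<a(.*?)href=["\'](.*?)["\'](.*?)>(.*?)<\/a>/';

if (preg_match($pattern, $html, $m)) {
    echo 'Attributes before href: ' . trim($m[1]) . "\n";
    echo 'URL: ' . $m[2] . "\n";
    echo 'Attributes after href: ' . trim($m[3]) . "\n";
    echo 'Anchor text: ' . $m[4] . "\n";
}

// Inside a character class the pipe is literal, so this class matches "|" too.
$pipeIsLiteral = preg_match('/^["|\']$/', '|') === 1;
```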
  5. kallell

    #5
    That's essentially what I was looking for. Kudos on that regex checker, BTW.



    Curious though (this may seem like a dumb question): how would I write that pattern as a string in the preg_match_all() call, considering that the quotes in ["|'] would end the PHP string early?

    Also, I need the "loggedin" class to be part of the pattern. There are many instances of a basic <a href=" throughout the page, so I need the match to be unique to links like this one:

    <a id="title_t3_79p50" onmousedown="setClick(this, 'title')" class="title loggedin " href="http://www.politico.com/news/stories/1008/14819.html" >Jury: Sen. Ted Stevens guilty on seven counts.</a>
     
    kallell, Oct 27, 2008 IP
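    Both questions in this post can be sketched in one script. Treat it as an illustration under stated assumptions, not a definitive answer: it assumes the class attribute appears before the href (as it does in the anchor quoted above), the first "other" link is invented for contrast, and ~ is chosen as the delimiter purely so that the / in </a> needs no escaping.

```php
<?php
// Nowdoc: the HTML can contain both quote types without escaping.
// The first line is a hypothetical plain link that should NOT match.
$html = <<<'HTML'
<a href="http://example.com/other">Some other link</a>
<a id="title_t3_79p50" onmousedown="setClick(this, 'title')" class="title loggedin " href="http://www.politico.com/news/stories/1008/14819.html" >Jury: Sen. Ted Stevens guilty on seven counts.</a>
HTML;

// Single-quoted PHP string: " can appear as-is, ' is written as \'.
// Requires a class attribute containing the word "loggedin" before the href.
$pattern = '~<a\s[^>]*class="[^"]*\bloggedin\b[^"]*"[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)</a>~i';

preg_match_all($pattern, $html, $posts, PREG_SET_ORDER);
foreach ($posts as $post) {
    echo 'Link: ' . $post[1] . "\n";
    echo 'Title: ' . $post[2] . "\n";
}
```

    Because [^>]* cannot cross a closing >, the class check and the href are forced to sit inside the same opening tag, which is what keeps the plain <a href=" links out of the results.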