help with regular expression

Discussion in 'PHP' started by bvspoguk, May 23, 2007.

  1. #1
    Hi,
    I am having some trouble with a regex. I want to get all the URLs in the anchor tags of a web page, but with the example below I get script tags as well. All I want back is http://thisissomesite.com, not the HTML tags.
    Could anyone advise on how to fix this? It is driving me nuts.

    preg_match_all("/(<([\w]+)[^>]*>)(.*)(<\/\\2>)/", $result, $output, PREG_SET_ORDER);
     
    bvspoguk, May 23, 2007 IP
  2. wmtips

    #2
    Here is working code:
    preg_match_all('/<\s*a[^<>]*?href=[\'"]?([^\s<>\'"]*)[\'"]?[^<>]*>(.*?)<\/a>/si', $html, $match, PREG_SET_ORDER);
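    A minimal sketch of how the result could be used, assuming $html holds the page source. The sample markup and the $urls array are made up for illustration and are not from the original question. Because the pattern requires the tag name to start with "a" and to carry an href attribute, the script tag is skipped. With PREG_SET_ORDER, each entry of $match is one anchor: index 0 is the whole tag, index 1 the URL, and index 2 the link text.

    ```php
    <?php
    // Hypothetical sample page; in real use $html would be the fetched document.
    $html = '<script type="text/javascript">var x = 1;</script>'
          . '<p>Visit <a href="http://thisissomesite.com">this site</a> or '
          . "<A class='ext' HREF='http://example.com/page'>another</A>.</p>";

    // Same pattern as above: s lets . span newlines, i ignores case.
    $pattern = '/<\s*a[^<>]*?href=[\'"]?([^\s<>\'"]*)[\'"]?[^<>]*>(.*?)<\/a>/si';

    $urls = array();
    if (preg_match_all($pattern, $html, $match, PREG_SET_ORDER)) {
        foreach ($match as $m) {
            // $m[1] is the first capture group, i.e. the href value.
            $urls[] = $m[1];
        }
    }

    print_r($urls);
    ```

    Collecting only $m[1] gives the bare URLs without any of the surrounding HTML. Note that a regex like this is only a rough match for real-world markup, so unusual attribute layouts may still slip through.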
     
    wmtips, May 23, 2007 IP
  3. bvspoguk

    #3
    Thanks, going to try it this evening.
     
    bvspoguk, May 24, 2007 IP