[RegEx] Extracting src attribute from embed tag

Discussion in 'PHP' started by eamiro, Jul 23, 2009.

  1. #1
    I want to extract the src attribute from an <embed> tag. What regular expression should I use?

    I've tried:

    preg_match("/<embed\s[^>]*src=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/embed>/i", $url, $m)
    Code (markup):

     
    eamiro, Jul 23, 2009
  2. kblessinggr

    #2
    It's not a regular expression, but it does the same job without blowing a gasket over preg_match.

    
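    $m = null;
    $fpos = stripos($url, "src=");
    if($fpos !== false)
    {
       $nstr  = substr($url, $fpos+4);   // everything after src=
       $quote = substr($nstr, 0, 1);     // opening quote character, " or '
       if($quote === "\"" || $quote === "'")
       {
          $end = strpos($nstr, $quote, 1);   // matching closing quote
          if($end !== false) { $m = substr($nstr, 1, $end-1); }   // text between the quotes
       }
    }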
    
    
    PHP:
    Basically it finds src=, cuts off everything up to and including it, finds the quote that closes the value, and cuts off everything from that quote onward. What remains is the src URL.

    :D
     
    kblessinggr, Jul 23, 2009
  3. Chemo

    #3
    Alternately, you can use DOM.

    
    <?php
        $doc = new DOMDocument();
        $doc->loadHTML('<html><body><p>Test</p><embed src="html.mov" width="50" height="100" /></body></html>');
    
        $embeds = $doc->getElementsByTagName('embed');
    
        foreach( $embeds as $embed ){
            echo $embed->getAttribute('src');
        }
    ?>
    
    PHP:
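    If the page might contain several embeds, or markup sloppy enough to make loadHTML() print warnings, the same DOM approach can be wrapped in a small function. This is only a sketch: embed_sources() is a made-up name, the sample markup is invented, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: collect the src of every <embed> in an HTML string.
function embed_sources(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('embed') as $embed) {
        // Skip embeds that carry no src attribute at all.
        if ($embed->hasAttribute('src')) {
            $sources[] = $embed->getAttribute('src');
        }
    }
    return $sources;
}

// Invented sample markup with two embeds.
print_r(embed_sources('<p>Test</p><embed src="a.mov"><embed src="b.swf" width="50">'));
```

    Returning an array keeps the caller from having to care whether the page had zero, one, or many embeds.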
     
    Chemo, Jul 23, 2009
  4. kblessinggr

    #4
    Yep, that can work quite well for the purpose, but not everyone has the "DOM/XML" module installed for PHP.

    That said, the DOM extension is enabled by default as of PHP 5, so most installs should have it unless it was deliberately left out of the build.

    To check, save a PHP script containing <?php phpinfo(); ?> and run it. If you've got it, the output will include a "dom" table.
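    If you'd rather not read through the whole phpinfo() page, the same check can be done in code. A minimal sketch, assuming you only care whether DOMDocument is usable; $hasDom is just an illustrative variable name.

```php
<?php
// extension_loaded() reports whether the named extension is available at runtime;
// class_exists() double-checks that DOMDocument itself can be instantiated.
$hasDom = extension_loaded('dom') && class_exists('DOMDocument');

echo $hasDom ? "DOM is available\n" : "DOM is missing, fall back to string functions or regex\n";
```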
     
    kblessinggr, Jul 24, 2009
  5. matthewrobertbell

    #5
    preg_match('/<embed.+?src="(.+?)".+?<\/embed>/i', $url, $result); would be a little cleaner. The URL ends up in $result[1].

    hi kb ;)
     
    matthewrobertbell, Jul 24, 2009
  6. kblessinggr

    #6
    Hey :D

    My regex is quite a bit rusty, and I would have suggested that if I had remembered the '?'.

    For those unfamiliar: <embed.+? means the match begins with <embed, followed by one or more of any character before src=. The + is what requires at least one character; the ? after it makes the match lazy, so it takes as few characters as possible and stops at the first src= instead of running on to the last one. The same goes for (.+?), which captures the src value up to the first closing quote, and for the .+? before </embed>. Since we're not validating the URL and are simply trying to retrieve it, the (.+?) matt mentioned makes more sense.
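    A quick way to see what the ? changes is to run the same pattern with and without it. This is just an illustration: the sample markup and the variable names $greedy and $lazy are made up for the demo.

```php
<?php
// Invented sample tag with two quoted attributes.
$html = '<embed src="html.mov" type="video/quicktime"></embed>';

// Greedy: (.+) grabs as much as it can, so it runs to the LAST double quote.
preg_match('/src="(.+)"/', $html, $greedy);

// Lazy: (.+?) grabs as little as it can, so it stops at the FIRST closing quote.
preg_match('/src="(.+?)"/', $html, $lazy);

echo $greedy[1], "\n"; // html.mov" type="video/quicktime
echo $lazy[1], "\n";   // html.mov
```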
     
    kblessinggr, Jul 24, 2009