Get data from between HTML tags

Discussion in 'PHP' started by RxDx, Jul 27, 2010.

  1. #1
    Hello, I am using:

    $pattern = "#<caption([^>]*[^/])>#i";
    
    preg_match_all($pattern ,$reply,$match); // Tested pattern
    //echo $match[0];
    
    print_r($match);
    PHP:
    I use this to get everything between the caption tags. Altogether there are 2 caption tags and therefore 2 strings. However, I get an empty array as a result:

    Array
    (
        [0] => Array
            (
            )
    
        [1] => Array
            (
            )
    
    )
    
    PHP:
    What am I doing wrong?
     
    RxDx, Jul 27, 2010 IP
  2. Nick66

    #2
    Try this pattern; it should work.

    $pattern = "#<caption\b[^>]*>(.*?)</caption>#i";
    Code (markup):
     
    Nick66, Jul 27, 2010 IP
  3. RxDx

    #3
    I still get an empty array.
     
    RxDx, Jul 27, 2010 IP
  4. ze0xify

    #4
    In PHP, you define a regex pattern by putting a forward slash at the beginning and end.

    $pattern = '/<caption\b[^>]*>([\s\S]*)</caption>/i';
    preg_match_all($pattern, $reply, $matches);
    print_r($matches);
    Code (markup):
     
    ze0xify, Jul 27, 2010 IP
  5. RxDx

    #5
    Hm, I get this:

    <b>Warning</b>:  preg_match_all() [<a href='function.preg-match-all'>function.preg-match-all</a>]: Unknown modifier 'c' in <b>C:\wamp2\www\AutoUpdater\test2.php</b> on line <b>33</b><br />
    
    Code (markup):
    Sorry for pasting the raw source of the warning; I can't copy it from the browser because of some effects there.

    Line 33 is:
    preg_match_all($pattern, $reply, $matches);
    
    PHP:
     
    RxDx, Jul 27, 2010 IP
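
The warning in post #5 is consistent with an unescaped delimiter: in the pattern from post #4, the `/` inside `</caption>` closes the pattern early, so the rest is read as modifiers and `c` is rejected. A minimal sketch of the failure and the escaped version follows; `$sample` is made-up input, not the poster's data.

```php
<?php
// Hypothetical sample input; the real $reply from the thread is not available.
$sample = '<caption class="v">Latest Version 1.2.3</caption>';

// Pattern from post #4: the "/" in "</caption>" closes the pattern early,
// so "caption>/i" is parsed as modifiers and "c" is rejected.
$broken = '/<caption\b[^>]*>([\s\S]*)</caption>/i';

// Same pattern with the inner slash escaped.
$fixed = '/<caption\b[^>]*>([\s\S]*)<\/caption>/i';

// "@" only silences the warning for this demonstration.
$brokenResult = @preg_match_all($broken, $sample, $brokenMatches);
$fixedResult = preg_match_all($fixed, $sample, $fixedMatches);

var_dump($brokenResult);   // false: the pattern failed to compile
print_r($fixedMatches[1]); // the text between the tags
```

Checking the return value for `false` is a quick way to tell "pattern is invalid" apart from "pattern is valid but found nothing", which returns 0.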
  6. Nick66

    #6
    $pattern = "#<caption\b[^>]*>(.*?)</caption>#i";
    I tested this pattern. It certainly works, but keep this in mind:
    it will not properly match tags nested inside themselves, as in <caption>text1<caption>text2</caption>text3</caption>.
    It would be helpful if you could post the $reply here (or at least part of it).
     
    Nick66, Jul 28, 2010 IP
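
Two behaviours of the pattern in post #6 can be checked in isolation: the nested-tag case Nick66 describes, and the fact that `.` does not match a newline unless the `s` modifier is added. The sketch below uses made-up input strings; neither is taken from the poster's data.

```php
<?php
// Nested tags (example from post #6): the lazy group stops at the first
// closing tag, so the capture contains the inner opening tag.
$nested = '<caption>text1<caption>text2</caption>text3</caption>';
preg_match_all('#<caption\b[^>]*>(.*?)</caption>#i', $nested, $nestedMatch);
print_r($nestedMatch[1]);

// Hypothetical input where the caption text spans two lines.
$multiline = "<caption>\nLatest Version 1.2.3</caption>";

// Without "s", "." cannot cross the newline, so nothing matches.
$countWithout = preg_match_all('#<caption\b[^>]*>(.*?)</caption>#i', $multiline, $a);

// With "s", "." also matches newlines.
$countWith = preg_match_all('#<caption\b[^>]*>(.*?)</caption>#is', $multiline, $b);

var_dump($countWithout, $countWith);
print_r(array_map('trim', $b[1]));
```

If the real page puts line breaks inside the caption, adding `s` may be enough to turn an empty result into a match.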
  7. danx10

    #7
    Wrong. Forward slashes are probably the most common delimiters, but you don't necessarily need to use them; I prefer tildes (~). You can use any character that is not alphanumeric, a backslash, or whitespace. Refer to the documentation:

    http://www.php.net/manual/en/regexp.reference.delimiters.php
     
    danx10, Jul 28, 2010 IP
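
To illustrate the point about delimiters, here is a small sketch that wraps the same expression in three different delimiters and compares the results. The input string is hypothetical.

```php
<?php
// Hypothetical sample input for illustration only.
$sample = '<caption>One</caption><caption id="c2">Two</caption>';

// The same expression wrapped in three different delimiters.
$patterns = [
    '/<caption\b[^>]*>(.*?)<\/caption>/i', // slash: inner "/" must be escaped
    '#<caption\b[^>]*>(.*?)</caption>#i',  // hash: no escaping needed here
    '~<caption\b[^>]*>(.*?)</caption>~i',  // tilde: no escaping needed here
];

$results = [];
foreach ($patterns as $pattern) {
    preg_match_all($pattern, $sample, $m);
    $results[] = $m[1];
}
print_r($results);
```

For HTML, a delimiter other than `/` saves escaping every closing tag.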
  8. RxDx

    #8
    Nick,

    your pattern works, but I get:

    Array
    (
        [0] => Array
            (
                [0] => <caption\b[^>]*>(.*?)</caption>
    
            )
    
        [1] => Array
            (
                [0] => ]*>(.*?)
            )
    
    )
    
    PHP:
    and not the actual content between the two tags...

    For certain reasons I cannot post the actual source code of the back-end site here...
     
    RxDx, Jul 28, 2010 IP
  9. ThePHPMaster

    #9
    The best approach to regular expressions is to look for the simplest way to do it:

    
    preg_match_all('/<caption.*>(.*)<\/caption>/Uism',$reply,$match);
    print_r($match[1]);
    
    PHP:
     
    ThePHPMaster, Jul 28, 2010 IP
  10. RxDx

    #10
    Array
    (
    )
    
    PHP:
    This is what I get, an empty array...
     
    RxDx, Jul 28, 2010 IP
  11. ThePHPMaster

    #11
    Do you have sample data?

    I tried it with this data and it's working:

    
    $reply = '<caption name="test">Hi Caption 1</caption> ljhsldjkf ljkhl
    o3i;j;
    <caption>Test Caption 2</caption>';
    
    preg_match_all('/<caption.*>(.*)<\/caption>/Uism',$reply,$match);
    print_r($match[1]);
    
    PHP:
     
    ThePHPMaster, Jul 28, 2010 IP
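
For readers unfamiliar with the `Uism` modifiers in post #11, the sketch below runs that pattern on the same sample data next to a version with the laziness written out explicitly. The second pattern is an assumed alternative, not something proposed in the thread.

```php
<?php
// Sample data from post #11.
$reply = '<caption name="test">Hi Caption 1</caption> ljhsldjkf ljkhl
o3i;j;
<caption>Test Caption 2</caption>';

// Original: U flips every quantifier to lazy; s lets "." match newlines;
// i ignores case; m only affects ^ and $, which this pattern does not use.
preg_match_all('/<caption.*>(.*)<\/caption>/Uism', $reply, $withU);

// Assumed alternative without U: lazy quantifier written out, and [^>]*
// so the attribute part cannot run past the end of the opening tag.
preg_match_all('/<caption\b[^>]*>(.*?)<\/caption>/is', $reply, $explicit);

print_r($withU[1]);
print_r($explicit[1]);
```

On this input both patterns capture the two caption texts. The explicit form is easier to read for anyone who does not remember that `U` inverts greediness.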
  12. RxDx

    #12
    Well, I tried it on simple data and it works fine as well. But my $reply is really messy, with lots of HTML tags and so on. There are 2 caption tags with the content "Latest Version X.X.X". It gives me the number of tags but not the content inside them...
     
    RxDx, Jul 28, 2010 IP
  13. ThePHPMaster

    #13
    Can you provide some sample data?

    If it is confidential, you can send it via PM.
     
    ThePHPMaster, Jul 28, 2010 IP
  14. RxDx

    #14
    I have sent it to you. Many thanks for the help.
     
    RxDx, Jul 28, 2010 IP
  15. ThePHPMaster

    #15
    This should work:

    
    preg_match_all('/caption&gt;(Latest Version.*)&lt;<span/Uism',$reply,$match);
    
    // If you just want the version number
    preg_match_all('/caption&gt;Latest Version (.*)&lt;<span/Uism',$reply,$match);
    
    PHP:
     
    ThePHPMaster, Jul 28, 2010 IP
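
The `&gt;` and `&lt;` in the patterns from post #15 suggest that the markup in the privately shared data is HTML-entity-encoded, which would explain why patterns looking for a literal `<caption>` find nothing. The data itself is not shown in the thread, so the sketch below is only an assumption: it decodes a made-up entity-encoded string first and then applies an ordinary tag pattern.

```php
<?php
// Hypothetical entity-encoded input; the real data was sent by PM.
$encoded = '&lt;caption&gt;Latest Version 1.2.3&lt;/caption&gt;';

// A literal "<caption" does not occur in the encoded text.
$before = preg_match_all('#<caption\b[^>]*>(.*?)</caption>#is', $encoded, $unused);

// Decode the entities, then match as usual.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
preg_match_all('#<caption\b[^>]*>(.*?)</caption>#is', $decoded, $captions);

// Pull just the version number out of the caption text.
preg_match('/Latest Version\s+([\d.]+)/', $captions[1][0], $version);

var_dump($before);
print_r($captions[1]);
echo $version[1], "\n";
```

Decoding first keeps the pattern independent of how the page happens to be escaped.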
  16. RxDx

    #16
    Hello, this gives me:
    Array
    (
        [0] => Array
            (
                [0] => caption&gt;Latest Version (.*)&lt;<span
            )
    
        [1] => Array
            (
                [0] => (.*)
            )
    
    )
    
    PHP:
    Could the problem be that I am fetching the page with cURL and then running preg_match? Here is the code of the cURL page:

    $reply = curl_exec($ch);
    curl_close($ch);
    // Get page content
    echo $reply;
    
    // Search for captions to receive versions of products
    
    
    
    $html = file_get_contents('test2.php'); // test 2 is a current file
    $pattern = "/<body[^>]*>(.*?)<\/body>/";
    preg_match_all('/caption&gt;Latest Version (.*)&lt;<span/Uism',$html,$match);
    print_r($match);
    PHP:
     
    RxDx, Jul 28, 2010 IP
  17. ThePHPMaster

    #17
    You are not using CURL but file_get_contents.

    I tested it with the file you provided and it works.

    Just make sure that $html actually contains the HTML page you want.
     
    ThePHPMaster, Jul 28, 2010 IP
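
One way to act on the advice to check what is actually being searched is to inspect the subject before matching. The sketch below is an assumption about how that could look: `fetch_body` and `describe_subject` are hypothetical helper names, and the thread does not show whether `CURLOPT_RETURNTRANSFER` was set in the original script. Without that option, `curl_exec()` prints the page and returns `true` instead of the body.

```php
<?php
// Hypothetical helper: fetch a URL and return the body as a string.
// Without CURLOPT_RETURNTRANSFER, curl_exec() prints the body and returns
// true, so there would be nothing for preg_match_all() to search.
function fetch_body(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    return is_string($body) ? $body : '';
}

// Hypothetical helper: quick sanity check on the text about to be searched.
function describe_subject($subject): string
{
    if (!is_string($subject)) {
        return 'not a string: ' . gettype($subject);
    }
    return sprintf(
        '%d bytes, contains "<caption": %s',
        strlen($subject),
        stripos($subject, '<caption') !== false ? 'yes' : 'no'
    );
}

echo describe_subject(true), "\n";
echo describe_subject('<caption>Latest Version 1.2.3</caption>'), "\n";
echo describe_subject('&lt;caption&gt;encoded&lt;/caption&gt;'), "\n";
```

Calling `describe_subject($reply)` right before `preg_match_all()` would show whether the variable holds the page at all, and whether it contains a literal `<caption`.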
  18. RxDx

    #18
    I am using cURL at the beginning. I use it to log in to a private member area, send some JSON to the server, and get HTML content in return (this emulates a user logging in and opening the download page).
    Then I want to search that HTML for the data between the tags, so I call file_get_contents on the current file (which holds all the PHP, the cURL calls, the JSON to be sent, etc.). It also contains the HTML (I sent you the source code).
    So $html actually contains what I want, but the match comes back as empty arrays.
    I tried replacing $html with $reply (which is basically the same, because $reply is the server's response with the HTML inside), but I still get an empty array.
     
    RxDx, Jul 28, 2010 IP
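
A possible explanation for the outputs in posts #8 and #16: `file_get_contents('test2.php')` on a local path returns the PHP source of the script, not the page it produces, so the pattern can end up matching its own string literal in that source. The sketch below reproduces this with a one-line stand-in for the file; `$source` is hypothetical.

```php
<?php
// Stand-in for the contents of test2.php: PHP source that contains the
// pattern as a string literal (hypothetical, reduced to one line).
$source = '$pattern = "#<caption\b[^>]*>(.*?)</caption>#i";';

// Run the same pattern against that source text.
preg_match_all('#<caption\b[^>]*>(.*?)</caption>#i', $source, $match);
print_r($match);
```

The full match is the pattern text itself and the captured group is `]*>(.*?)`, the same values reported in post #8. Searching the variable that holds the server response, rather than the script file, avoids this.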
  19. Thorlax402

    #19
    If you want to do it without preg_match then you can try this:

    
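    function get_tag_info($string, $tag) {
        $start = '<'.$tag.'>';
        $end = '</'.$tag.'>';
        // Locate the opening tag; return "" if it is missing.
        $ini = strpos($string, $start);
        if ($ini === false)
            return "";
        $ini += strlen($start);
        // Locate the closing tag; return "" if it is missing.
        $stop = strpos($string, $end, $ini);
        if ($stop === false)
            return "";
        return substr($string, $ini, $stop - $ini);
    }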
    
    PHP:
    Just run get_tag_info($html, 'html');
     
    Thorlax402, Jul 29, 2010 IP
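
The function in post #19 returns only the first occurrence and expects an opening tag without attributes. A sketch of one possible extension follows; `get_all_tag_info` is a hypothetical name, and the approach is still plain string searching, not an HTML parser.

```php
<?php
// Hypothetical variant of get_tag_info(): returns the contents of every
// occurrence of $tag and tolerates attributes in the opening tag.
// Limitations: it would also accept a longer tag name with the same prefix,
// and a ">" inside an attribute value would confuse it.
function get_all_tag_info(string $string, string $tag): array
{
    $results = [];
    $open = '<' . $tag;
    $close = '</' . $tag . '>';
    $offset = 0;

    while (($start = stripos($string, $open, $offset)) !== false) {
        // Find the ">" that ends the opening tag.
        $contentStart = strpos($string, '>', $start);
        if ($contentStart === false) {
            break;
        }
        $contentStart++;

        $end = stripos($string, $close, $contentStart);
        if ($end === false) {
            break; // unterminated tag: stop rather than return garbage
        }

        $results[] = substr($string, $contentStart, $end - $contentStart);
        $offset = $end + strlen($close);
    }

    return $results;
}

// Sample data from post #11.
$reply = '<caption name="test">Hi Caption 1</caption> ljhsldjkf ljkhl
o3i;j;
<caption>Test Caption 2</caption>';

print_r(get_all_tag_info($reply, 'caption'));
```

On the sample data from post #11 this returns both caption texts, including the one whose opening tag carries a `name` attribute.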
  20. RxDx

    #20
    Thanks, that works great!
     
    RxDx, Jul 29, 2010 IP