Grab certain line/place of html file?

Discussion in 'PHP' started by Malky, Apr 13, 2009.

  1. #1
    Is there a way to grab an HTML file from another server, then extract 3 specific words from the file?

    These words are static.

    I know the first part can be done via cURL, which I have installed, but how would I go about it?

    Thanks.

    Edit: As an example, I want the line I'm after to end up in a variable.

    E.g. I want the script to grab line 42, columns 32-40 from http://news.bbc.co.uk/1/hi/world/middle_east/7996962.stm and put width=974 into $grabbed.
     
    Malky, Apr 13, 2009 IP
  2. Dennis M.

    #2
    Hmm. What are you looking to do? Do you just want the content of the page, or its source code?

    You can use something like this:

    
    <?php
    /**
     * Simple cURL Script by Dennis M.
     *
     */
    
    // Begin cURL session
    $ch = curl_init();
    
    // Define options
    curl_setopt($ch, CURLOPT_URL, "LINKTO/PAGE.PHP"); // The URL to fetch
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // Return the page as a string instead of printing it directly
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // Follow any HTTP redirects
    
    // Execute
    $data = curl_exec($ch);
    
    // curl_exec() returns false on failure, so stop here if the request failed
    if ($data === false) {
        $error = curl_error($ch);
        curl_close($ch);
        die("cURL error: " . $error);
    }
    
    /**
     * From here you can work with the data you've received
     * from the server. If you want to match certain words
     * every time, you can use preg_match and a regular
     * expression. If you simply wish to print the page, use
     * the print function. If there's something else, let me
     * know on the forum! :)
     *
     */
    
    // Let's just see the page
    print $data;
    
    // Cleanup..
    curl_close($ch);
    
    ?>
    Regards,
    Dennis M.
     
    Dennis M., Apr 13, 2009 IP
  3. Malky

    #3
    Thanks for the reply Dennis.

    I want to grab a certain line from the page's source code, which I have no idea how to do.

    The actual line number is static, but a part of the line is dynamic to the URL and that is what I want to grab. However, if grabbing 3-4 lines of the code is much easier, I could just do that.
     
    Malky, Apr 13, 2009 IP
  4. antigravity

    #4
    Is part of the line you want to grab static? As in, can you be sure the same text and/or tags appear around the dynamic text you want?

    If so, use a regular expression to get what you want into a variable.

    For example... using Dennis's code above and assuming your dynamic text was always surrounded like so...

    The answer is [dynamic text here]!

    You could say something like...

    if (preg_match('/The answer is (.+?)!/', $data, $matches)) {
        echo $matches[1]; // $matches[0] is the whole match; $matches[1] is the captured dynamic text
    }
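To see which element of $matches holds what, here is a minimal self-contained sketch. The sample string is made up for illustration and simply stands in for the $data returned by cURL; the variable names $whole and $dynamic are likewise only for the example.

```php
<?php
// Hypothetical sample standing in for the page source held in $data.
$data = "<p>Intro</p>\n<p>The answer is forty-two!</p>\n";

// (.+?) is a lazy capture group: it stops at the first "!" that follows.
if (preg_match('/The answer is (.+?)!/', $data, $matches)) {
    $whole   = $matches[0]; // the full matched text, surrounding words included
    $dynamic = $matches[1]; // only the text captured by the first group
    echo $dynamic, "\n";
} else {
    echo "Pattern not found\n";
}
```

The captured text is in $matches[1]; $matches[0] always contains the entire match, including the static words around it.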
     
    antigravity, Apr 13, 2009 IP
  5. SmallPotatoes

    #5
    To grab line 42, you just need this (no need to mess around with curl):
    $bbc = file("http://news.bbc.co.uk/1/hi/world/middle_east/7996962.stm");
    $line = $bbc[41];
    But if what you want is that 'width' value, I'd do it this way:
    $bbc = file_get_contents("http://news.bbc.co.uk/1/hi/world/middle_east/7996962.stm");
    if (preg_match('`<meta name="viewport" content="(width=\d+)"/>`', $bbc, $matches))
    {
       $width_str = $matches[1];
       echo "Got it: {$width_str}";
    }
    It's more robust than expecting the tag to always be on exactly the same line; at any time the BBC may add or remove headers or tags that push it up or down in the source.
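As a rough illustration of that difference, the sketch below runs both approaches against a small inline HTML sample rather than the live BBC page. The sample markup, the helper name grab_columns, and the 1-based line and column convention are all assumptions made for the example; the real page will differ.

```php
<?php
// Hypothetical stand-in for the downloaded page source.
$html = "<html>\n<head>\n<meta name=\"viewport\" content=\"width=974\"/>\n</head>\n</html>\n";

/**
 * Return columns $from..$to (1-based, inclusive) of line $lineNo (1-based),
 * or null if that line does not exist.
 */
function grab_columns(string $source, int $lineNo, int $from, int $to): ?string
{
    $lines = explode("\n", $source);
    if (!isset($lines[$lineNo - 1])) {
        return null;
    }
    // The cast keeps the result a string on PHP versions where substr() can return false.
    return (string) substr($lines[$lineNo - 1], $from - 1, $to - $from + 1);
}

// Position-based: in this sample the tag sits on line 3, columns 32-40.
$byPosition = grab_columns($html, 3, 32, 40);

// Simulate the site adding one line above the tag.
$shifted = "<!-- new header -->\n" . $html;
$byPositionShifted = grab_columns($shifted, 3, 32, 40); // now reads from the wrong line

// Pattern-based: does not care which line the tag is on.
$pattern = '`<meta name="viewport" content="(width=\d+)"/>`';
$byPattern = preg_match($pattern, $shifted, $m) ? $m[1] : null;

echo "position: ", var_export($byPosition, true), "\n";
echo "position after shift: ", var_export($byPositionShifted, true), "\n";
echo "pattern after shift: ", var_export($byPattern, true), "\n";
```

The fixed line and column lookup only works while the layout stays exactly the same; the pattern keeps finding the value after the extra line is added.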
     
    SmallPotatoes, Apr 13, 2009 IP
  6. Malky

    #6
    Thanks guys, perfect!
     
    Malky, Apr 14, 2009 IP