screen scraping

Discussion in 'PHP' started by tomcromp, Mar 28, 2010.

  1. #1
    Hello, I'm looking to grab the live footy links from http://www.footballstreaming.info/streams/todays-links/. I need everything between <!-- <tokoeditarea> --> and <!-- </tokoeditarea> -->. This is what I have so far:
    $url = "http://www.footballstreaming.info/streams/todays-links/";
    
    $raw = file_get_contents($url);
    
    $newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");
    
    $content = str_replace($newlines, "", html_entity_decode($raw));
    
    $start = strpos($content,'<!-- <tokoeditarea> -->');
    
    $end = strpos($content,'<!-- </tokoeditarea> -->',$start) + 8;
    
    Is that right so far? What should I do next?

    cheers
    tom
     
    tomcromp, Mar 28, 2010
  2. danx10

    #2
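    <?php
    
    // Fetch the page; file_get_contents() returns false on failure.
    $html = file_get_contents("http://www.footballstreaming.info/streams/todays-links/");
    
    // Capture everything between the two marker comments.
    // The s modifier lets . match newlines; .*? stops at the first closing marker.
    if ($html !== false && preg_match('~<!-- <tokoeditarea> -->(.*?)<!-- </tokoeditarea> -->~s', $html, $a)) {
        echo $a[1];
    }
    
    ?>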
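
The strpos() approach from the first post can be finished without a regex. The following is only a minimal sketch: extractBetween() is a hypothetical helper name, and the $sample string is made up to stand in for the fetched page so the example runs without a network request.

```php
<?php

/**
 * Return the text between $startMarker and $endMarker,
 * or null if either marker is missing.
 * (Hypothetical helper, not part of PHP or of the original posts.)
 */
function extractBetween(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    // Move past the opening marker so it is not included in the result.
    $start += strlen($startMarker);

    // Look for the closing marker only after the opening one.
    $end = strpos($html, $endMarker, $start);
    if ($end === false) {
        return null;
    }

    return substr($html, $start, $end - $start);
}

// Made-up sample standing in for the downloaded page.
$sample = '<html><!-- <tokoeditarea> --><a href="#">Match 1</a><!-- </tokoeditarea> --></html>';

echo extractBetween($sample, '<!-- <tokoeditarea> -->', '<!-- </tokoeditarea> -->');
```

With the sample above this prints <a href="#">Match 1</a>. For the real page, pass the result of file_get_contents() instead of $sample, and check for null before echoing, since the markers may be absent if the site changes its layout.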
     
    danx10, Mar 28, 2010
  3. tomcromp

    #3
    Will this update when the page does? x
     
    tomcromp, Mar 28, 2010
  4. danx10

    #4
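    Yes, as long as the content that updates on that page sits between <!-- <tokoeditarea> --> and <!-- </tokoeditarea> -->. The script fetches the page again every time it runs, so it always outputs whatever is currently between those markers.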
     
    danx10, Mar 29, 2010