Scan remote URLs with PHP - Best Way

Discussion in 'PHP' started by blackburn2413, Mar 23, 2013.

  1. #1
    Here is my situation. I need to scan a URL structure on a remote site. It looks something like this:
    http://www.testurl.com/index.php?event=123456

    That URL then redirects the user to a vanity URL that includes the name of the event the page represents, e.g. http://www.testurl.com/2014-test-event/

    I need to scan a range of about 10,000 event numbers and pull the vanity URL for each one, and I'm wondering if I am making this more difficult than it should be. I then check each pulled URL for a keyword. I currently have cURL doing it, and the HTTP 301 redirect response returns the vanity URL, so that is one way I am retrieving it. The script is unreliable at this point, though, and it just keeps loading until the entire script is done, so I can't see any progress and I don't know if it is still working.

    Here's the code I have so far:

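    <?php

    $start   = 1000;  // starting event ID to search
    $count   = 10000; // number of event IDs to check
    $keyword = "QUERY DATA GOES HERE";

    for ($number = $start; $number < $start + $count; $number++) {
        $url = 'http://www.testurl.com/index.php?event_id=' . $number;
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_HEADER, true);         // include response headers in the result
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't let one slow request stall the scan
        $a = curl_exec($ch);

        // On a 301/302 response, cURL reports the Location target as the redirect URL.
        $vanity = ($a !== false) ? curl_getinfo($ch, CURLINFO_REDIRECT_URL) : false;
        curl_close($ch);

        // Echo the vanity URL only when it contains the keyword.
        if ($vanity && stripos($vanity, $keyword) !== false) {
            echo htmlspecialchars($vanity) . '<br>';
        }
        echo $number . '<br>';
    }

    ?>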
    So basically the end goal is to check each URL in incremental order by event ID. Then once I pull the URL, check it for a certain keyword (QUERY DATA GOES HERE). If there is a match, then the script will echo that URL.

    That's the goal here, as well as finding a way to have the echo work in real time and not just blast me with text once the script is done executing...
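
    For the redirect-and-keyword step described above, here is a minimal offline sketch. It is not the original poster's code: the helper names extract_redirect_target and url_has_keyword are hypothetical, the sample headers are made up, and it assumes the raw header block has the shape cURL returns when CURLOPT_HEADER is enabled.

```php
<?php
// Hypothetical helper: pull the Location header out of a raw HTTP header block.
function extract_redirect_target(string $rawHeaders): ?string
{
    // Header names are case-insensitive; match a single "Location:" line.
    if (preg_match('/^Location:\s*(\S+)\s*$/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null; // no redirect in this response
}

// Hypothetical helper: case-insensitive keyword check on the vanity URL.
function url_has_keyword(?string $url, string $keyword): bool
{
    return $url !== null && stripos($url, $keyword) !== false;
}

// Sample response headers, invented for illustration only.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: http://www.testurl.com/2014-test-event/\r\n"
        . "Content-Length: 0\r\n\r\n";

$target = extract_redirect_target($sample);
if (url_has_keyword($target, 'test-event')) {
    echo $target, "\n";
}
```

    Keeping the parsing and the keyword check in small functions makes it possible to test them without hitting the remote site 10,000 times.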

    Any ideas?
     
    blackburn2413, Mar 23, 2013 IP
  2. #2
    You can use PHP's output control functions to send data to the browser while the script is still running: call ob_flush() and flush() after each echo.
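
    A minimal sketch of that idea, assuming nothing between PHP and the browser adds buffering of its own (gzip compression or a proxy can still hold output back). The helper name emit_progress and the short ID range are made up for illustration.

```php
<?php
// Hypothetical helper: print one progress line and push it to the client right away.
function emit_progress(string $line): void
{
    echo $line, "<br>\n";
    // Flush PHP's own output buffer if one is active...
    if (ob_get_level() > 0) {
        ob_flush();
    }
    // ...then ask PHP to flush the system/web server write buffer.
    flush();
}

// Illustrative loop; in the real script this would wrap each cURL request.
for ($id = 1000; $id < 1005; $id++) {
    emit_progress("Checked event $id");
}
```

    Calling the helper once per event ID should make each line appear as soon as that request finishes, rather than all at once at the end.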
     
    jestep, Mar 25, 2013 IP
  3. #3
    I knew there was something; I just couldn't put my finger on it. Thanks, jestep.
     
    blackburn2413, Mar 25, 2013 IP
  4. #4
    You could even use the 'Simple HTML DOM' parser library.
     
    Vick.Kumar, Mar 27, 2013 IP