[Need Help] PHP Scraping

Discussion in 'PHP' started by LeetPCUser, Jul 10, 2009.

  1. LeetPCUser

    LeetPCUser Peon

    Messages:
    711
    Likes Received:
    14
    Best Answers:
    0
    Trophy Points:
    0
    #21
    Thanks! The other information I would like to get includes the date, the type, and the team ID.
     
    LeetPCUser, Jul 14, 2009 IP
  2. wd_2k6

    wd_2k6 Peon

    Messages:
    1,740
    Likes Received:
    54
    Best Answers:
    0
    Trophy Points:
    0
    #22
    No probs, check this example:
    (BTW, it might take a while to load: I changed the start date to the beginning of the year, so it brings up a lot of records! But it seems to be working well.)

    
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled Document</title>
    <style type="text/css" media="screen">
    table { margin: 0 auto; border: 3px solid #000; }
    table th { border: 2px solid #000; padding: 10px; font-size: 20px; }
    table td { border: 2px solid #000; padding: 10px; }
    </style>
    </head>
    
    <body>
    <?php
    //function to get the string between two markers (returns "" if either marker is missing)
    function get_string_between($string, $start, $end){
            $ini = strpos($string, $start);
            if ($ini === false) return "";
            $ini += strlen($start);
            $stop = strpos($string, $end, $ini);
            if ($stop === false) return "";
            return substr($string, $ini, $stop - $ini);
    }
    
    //assign the URL of the feed and fetch it
    $file = 'http://web.minorleaguebaseball.com/lookup/json/named.transaction_all.bam?league_id=112&start_date=20090112';
    $contents = file_get_contents($file);
    
    //Remove start of file
    $pos = strpos($contents, "[");
    $pos += 1;
    $contents = substr_replace($contents, "",0, $pos);
    
    //Remove end of file
    $pos = strpos($contents, "]");
    $contents = substr_replace($contents, "", $pos);
    
    //Add Line-Breaks
    $contents = str_ireplace(",",",<br />",$contents);
    
    //Count amount of players for use in loop
    $players = substr_count($contents,"}");
    
    //Create array of players (initialise the list and the counter first)
    $player = array();
    $x = 0;
    while($x < $players){
    $player[] = get_string_between($contents, "{", "}");
    $pos = strpos($contents, "},");
    $pos += 1;
    $contents = substr_replace($contents, "",0, $pos);
    $x++;
    }
    
    
    //create sub array of data
    foreach ($player as $p){
    	$names[] = get_string_between($p, "\"player\": \"", "\",");
    	$teams[] = get_string_between($p, "\"team\": \"", "\",");
    	$notes[] = get_string_between($p, "\"note\": \"", "\"");
    	$types[] = get_string_between($p, "\"type\": \"", "\",");
    	$teamIDs[] = get_string_between($p, "\"team_id\": \"", "\",");
    	$dates[] = get_string_between($p, "\"trans_date\": \"", "T00:00:00");
    }
    
    //show table of data contained in sub arrays
    echo "<table><tr><th>Name</th><th>Type</th><th>Team ID</th><th>Team</th><th>Notes</th><th>Date</th></tr>";
    for ($x=0; $x < sizeof($player); $x++){
    	echo "<tr><td>". $names[$x]." </td>";
    	echo "<td>". $types[$x]."</td>"; 
    	echo "<td>". $teamIDs[$x]."</td>"; 
    	echo "<td>". $teams[$x]."</td>"; 
    	echo "<td>". $notes[$x]."</td>";
    	echo "<td>". $dates[$x]."</td></tr>"; 
    }
    echo "</table>";
    
    // optional loop to show players array
    /*foreach ($player as $p){
    	echo "<h1>Player</h1>";
    	echo $p;
    }*/
    ?>
    </body>
    </html>
    
    PHP:
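Since the feed is JSON, the string slicing above could probably be swapped for `json_decode`. This is only a rough sketch, assuming PHP 5.2+ with the JSON extension. The two function names (`find_first_list`, `parse_transactions`) are made up for illustration, and because I have not checked the exact nesting of the feed, the helper just looks for the first list of records, much like the "[" / "]" trimming does. The field names are the ones used in the script above.

```php
<?php
// Hypothetical helper: walk the decoded JSON and return the first list of records,
// mirroring the original approach of taking whatever sits between the first "[" and "]".
function find_first_list($data) {
    if (!is_array($data)) {
        return null;
    }
    // An array with an element at index 0 that is itself an array is treated as the record list.
    if (isset($data[0]) && is_array($data[0])) {
        return $data;
    }
    foreach ($data as $value) {
        $found = find_first_list($value);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Hypothetical helper: decode the feed text and keep only the fields shown in the table.
function parse_transactions($json) {
    $decoded = json_decode($json, true); // true => associative arrays instead of objects
    $rows = find_first_list($decoded);
    if ($rows === null) {
        return array(); // invalid JSON or no record list found
    }
    $fields = array('player', 'type', 'team_id', 'team', 'note');
    $out = array();
    foreach ($rows as $row) {
        $record = array();
        foreach ($fields as $f) {
            $record[$f] = isset($row[$f]) ? $row[$f] : '';
        }
        // Same idea as cutting the string at "T00:00:00" in the original script.
        $record['date'] = isset($row['trans_date'])
            ? str_replace('T00:00:00', '', $row['trans_date'])
            : '';
        $out[] = $record;
    }
    return $out;
}
```

You would call it as `parse_transactions(file_get_contents($file))` and loop over the result to build the table rows. Wrapping each value in `htmlspecialchars()` before echoing it is a good idea, since the notes come from an outside site.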
     
    wd_2k6, Jul 14, 2009 IP
  3. LeetPCUser

    LeetPCUser Peon

    Messages:
    711
    Likes Received:
    14
    Best Answers:
    0
    Trophy Points:
    0
    #23
    AMAZING! Thank you so much for your help. This is exactly what I was looking for. As for the speed, I am going to populate the database one day at a time :).

    Thanks.
     
    LeetPCUser, Jul 14, 2009 IP
  4. tusherdcc

    tusherdcc Peon

    Messages:
    16
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #24
    This information is very useful to me. Let me try it for a while. Thanks.
     
    tusherdcc, Jan 13, 2010 IP