I am looking for help scraping this page, http://web.minorleaguebaseball.com/milb/stats/stats.jsp?t=l_trn&lid=117&sid=l117. I need to be able to take the information and print it out on a separate page on my server. Any help with the code would be very helpful. I have been trying to do this for a few weeks and am still stumped. Thank you in advance.
```php
$file = "http://web.minorleaguebaseball.com/milb/stats/stats.jsp?t=l_trn&lid=117&sid=l117";
$contents = file_get_contents($file);
echo $contents;
```

This should work.
I want to be able to manipulate the data. http://breakpointdesigns.com/test2.php The information is being generated by JavaScript, so this does not work. Any more suggestions? I seriously am stumped. Someone once suggested JSON, but I am not familiar with that.
Sorry, but that's because the links to the JavaScript and stylesheets are relative. You would have to do something like this for each of them, right before you echo the text:

```php
$content = str_ireplace("/javascript.js", "http://example.com/javascript.js", $content);
```

I will PM you the full working thing with all of the replacements so you can test it.
It is calling writeData();. How can I generate the contents of that? Any examples would be great. I am not a whiz at PHP. I do not want to copy the entire page, I just want to grab the data and print it on a white page. Please help.
I do not have my own server. I am trying to figure out how to get the data on this shared server and be able to CRON it later. I am sure somebody has to know how to do this. It is really frustrating.
Can anyone figure this out? I have realized it is using JS, but I can't determine where those files are.
Hmm, I've had a look but I can't see where this function is coming from. This is how far I got:

```php
<?php
// Location of the document
$file = "http://web.minorleaguebaseball.com/milb/stats/stats.jsp?t=l_trn&lid=117&sid=l117";

// Get all of the document contents
$contents = file_get_contents($file);

// Point stylesheet links at the original host
$contents = str_ireplace("<link href=\"", "<link href=\"http://web.minorleaguebaseball.com", $contents);

// Same for every src attribute
$contents = str_ireplace("src=\"", "src=\"http://web.minorleaguebaseball.com", $contents);

// Same for inline CSS background images
$contents = str_ireplace("style=\"background: url(", "style=\"background: url(http://web.minorleaguebaseball.com", $contents);
$contents = str_ireplace("style=\"background-image: url(", "style=\"background-image: url(http://web.minorleaguebaseball.com", $contents);

// Display the page
echo $contents;
?>
```

Basically everything is displaying apart from the data. There must be a link I've missed somewhere that is defined differently from the ones I've already replaced. The problem is that the data doesn't seem to be viewable in the source code. Well, sometimes it is and sometimes it isn't (very odd): if I hover over the info and choose View Selection Source in Firebug it seems to display it, but if I just click View Source it doesn't show :s. I guess you need to be able to grab the JS data somehow with PHP and find the script where this writeData() function is defined!
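As an alternative to rewriting each attribute by hand, one option is to inject a `<base>` element so the browser resolves relative links against the original host. This is only a sketch: `add_base_href` is a made-up helper name and the sample markup is invented for the demo. It only addresses relative links; it does not by itself produce the data that writeData() prints.

```php
<?php
// Hypothetical helper: insert a <base> element right after the opening <head>
// tag so the browser resolves relative URLs against $baseUrl.
function add_base_href(string $html, string $baseUrl): string
{
    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';

    // Match only the first <head ...> tag, case-insensitively, and append the tag.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function ($m) use ($tag) {
            return $m[0] . $tag;
        },
        $html,
        1,
        $count
    );

    // If there is no <head> (or the regex failed), return the markup unchanged.
    return ($result !== null && $count > 0) ? $result : $html;
}

// Invented sample markup standing in for the fetched page.
$sample = '<html><head><link href="/css/site.css"></head><body></body></html>';
echo add_base_href($sample, 'http://web.minorleaguebaseball.com/');
```

With the real page you would pass the result of `file_get_contents($file)` instead of `$sample`.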
I have found the source at http://web.minorleaguebaseball.com/...ion_all.bam?league_id=112&start_date=20090712 Can anyone help me with the code to decrypt JSON?
What exactly do you mean by decrypt? You could chop the data up, for example:

```php
<?php
$file = 'http://web.minorleaguebaseball.com/lookup/json/named.transaction_all.bam?league_id=112&start_date=20090712';
$contents = file_get_contents($file);

// Highlight each player key and break the text up so it is easier to read
$contents = str_ireplace("\"player\":", "<b>\"PLAYER\":</b>", $contents);
$contents = str_ireplace("[", "[<br /><br /><B>STARTLIST</B><BR />", $contents);
$contents = str_ireplace("}", "}<br /><br /><B>NEWPLAYER</B><BR />", $contents);
$contents = str_ireplace(",", ",<br />", $contents);

echo $contents;
?>
```
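If "decrypt" just means turning the JSON text into something PHP can work with, the built-in `json_decode` does that. The sketch below is hedged: the exact nesting of the feed is not shown in this thread, so `find_rows` is a hypothetical helper that hunts for the first list of records, and `$sampleJson` is invented data (the wrapper keys and the `team` key are assumptions).

```php
<?php
// Hypothetical helper: walk decoded JSON and return the first list whose
// items are all arrays (i.e. a list of records). Returns [] if none is found.
function find_rows($data): array
{
    if (!is_array($data)) {
        return [];
    }

    // A "list" here means a non-empty array with keys 0..n-1.
    $isList = $data !== [] && array_keys($data) === range(0, count($data) - 1);
    if ($isList && count(array_filter($data, 'is_array')) === count($data)) {
        return $data;
    }

    // Otherwise search nested values depth-first.
    foreach ($data as $value) {
        $rows = find_rows($value);
        if ($rows !== []) {
            return $rows;
        }
    }
    return [];
}

// Invented sample; the real feed's wrapper keys are an assumption.
$sampleJson = '{"wrapper":{"row":[{"player":"A. Example","team":"Sample Team"},'
            . '{"player":"B. Example","team":"Other Team"}]}}';

$decoded = json_decode($sampleJson, true); // true => associative arrays
if ($decoded === null) {
    die('Invalid JSON: ' . json_last_error_msg());
}

foreach (find_rows($decoded) as $row) {
    echo $row['player'], ' (', $row['team'], ")\n";
}
```

For the live feed you would replace `$sampleJson` with the string returned by `file_get_contents($file)` and use whatever keys the records actually contain.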
You can also do it with cURL and read the data with regex (you'd need two regular expressions: the first to separate out the data table and the second to read the rows) ;-)
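A rough sketch of that cURL-plus-regex idea, for anyone who wants to try it. Everything here is illustrative: `fetch_url` and `table_rows` are made-up names, the sample markup is invented, and it assumes the data actually arrives as an HTML table in the response. Regexes on HTML are fragile, so treat this as a starting point only. A third small pattern is used to split each row into cells.

```php
<?php
// Fetch a URL with cURL; returns the body, or null on failure.
// (Not called in this demo, so the snippet runs without network access.)
function fetch_url(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// First regex isolates the first <table>; the second reads its rows;
// a third splits each row into cell texts.
function table_rows(string $html): array
{
    if (!preg_match('/<table\b[^>]*>(.*?)<\/table>/is', $html, $table)) {
        return [];
    }

    preg_match_all('/<tr\b[^>]*>(.*?)<\/tr>/is', $table[1], $trs);

    $rows = [];
    foreach ($trs[1] as $tr) {
        preg_match_all('/<t[dh]\b[^>]*>(.*?)<\/t[dh]>/is', $tr, $cells);
        $rows[] = array_map(function ($cell) {
            return trim(strip_tags($cell)); // drop inner markup such as <b>
        }, $cells[1]);
    }
    return $rows;
}

// Invented markup; with a live page you would pass fetch_url($url) instead.
$sample = '<table><tr><th>Name</th><th>Team</th></tr>'
        . '<tr><td>A. Example</td><td><b>Sample Team</b></td></tr></table>';
print_r(table_rows($sample));
```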
I'm not familiar with cURL or regex, but you could also split the data up with some string functions, for example:

```php
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head>
<body>
<?php
// Return the text between the first $start and the next $end, or "" if $start is missing
function get_string_between($string, $start, $end){
    $string = " ".$string; // pad so a match at position 0 is not mistaken for "not found"
    $ini = strpos($string, $start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}

$file = 'http://web.minorleaguebaseball.com/lookup/json/named.transaction_all.bam?league_id=112&start_date=20090712';
$contents = file_get_contents($file);

// Remove start of file (everything before the first "[", then before the first "{")
$pos = strpos($contents, "[");
$contents = substr_replace($contents, "", 0, $pos);
$pos = strpos($contents, "{");
$contents = substr_replace($contents, "", 0, $pos);

// Remove end of file (everything from the first "]" onwards)
$pos = strpos($contents, "]");
$contents = substr_replace($contents, "", $pos);

// Add line breaks
$contents = str_ireplace(",", ",<br />", $contents);

// Count the players for use in the loop
$players = substr_count($contents, "}");

// Create array of players
$player = array();
$x = 0;
while ($x < $players) {
    $player[] = get_string_between($contents, "{", "}");
    // Drop everything up to the start of the next player record
    $pos = strpos($contents, "},");
    $contents = substr_replace($contents, "", 0, $pos);
    $pos2 = strpos($contents, "{");
    $contents = substr_replace($contents, "", 0, $pos2);
    $x++;
}

foreach ($player as $p) {
    echo "<h1>Player</h1>";
    echo $p;
}
?>
</body>
</html>
```

Obviously the other methods suggested are probably better, but I have no experience with them.
I have gotten to the point where I need a regular expression. I need to do this: blah blah blah !#(@%O#&%)@*(# [ Remove everything, all characters and letters, before the first [. What regex can I use for that?
I don't know about regex, sorry, but I did this in my file with the following (assuming the whole file is kept in a variable called $contents):

```php
// Remove start of file
// Find position of first [ character
$pos = strpos($contents, "[");
// Add 1 to this position so the [ character itself is removed as well
$pos += 1;
// Now remove everything from the start position (0) up to and including this first [ character
$contents = substr_replace($contents, "", 0, $pos);
```
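For the regex version of the same idea, something along these lines should do it. This is a minimal sketch; `strip_before_first_bracket` is just a name made up for the example. Unlike the strpos version above, it keeps the `[` itself.

```php
<?php
// Drop every character before the first "[" (the "[" itself is kept).
// If the text contains no "[", it is returned unchanged.
function strip_before_first_bracket(string $text): string
{
    if (strpos($text, '[') === false) {
        return $text;
    }
    // ^[^\[]* matches the longest run of non-"[" characters from the start,
    // including newlines, so everything up to the first "[" is removed.
    return preg_replace('/^[^\[]*/', '', $text) ?? $text;
}

echo strip_before_first_bracket('blah blah blah !#(@%O#&%)@*(# [ {"player": "x"} ]');
```

To remove the bracket as well, the pattern `'/^[^\[]*\[/'` matches the leading text plus the first `[`.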
Here, save this as a new PHP file and check it out in your browser:

```php
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<style type="text/css" media="screen">
table { margin: 0 auto; border: 3px solid #000; }
table th { border: 2px solid #000; padding: 10px; font-size: 20px; }
table td { border: 2px solid #000; padding: 10px; }
</style>
</head>
<body>
<?php
// Function to get the string between two places
function get_string_between($string, $start, $end){
    $string = " ".$string; // pad so a match at position 0 is not mistaken for "not found"
    $ini = strpos($string, $start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}

// Assign our file
$file = 'http://web.minorleaguebaseball.com/lookup/json/named.transaction_all.bam?league_id=112&start_date=20090712';
$contents = file_get_contents($file);

// Remove start of file (everything up to and including the first "[")
$pos = strpos($contents, "[");
$pos += 1;
$contents = substr_replace($contents, "", 0, $pos);

// Remove end of file (everything from the first "]" onwards)
$pos = strpos($contents, "]");
$contents = substr_replace($contents, "", $pos);

// Add line breaks
$contents = str_ireplace(",", ",<br />", $contents);

// Count the players for use in the loop
$players = substr_count($contents, "}");

// Create array of players
$player = array();
$x = 0;
while ($x < $players) {
    $player[] = get_string_between($contents, "{", "}");
    // Drop everything up to the start of the next player record
    $pos = strpos($contents, "},");
    $contents = substr_replace($contents, "", 0, $pos);
    $pos2 = strpos($contents, "{");
    $contents = substr_replace($contents, "", 0, $pos2);
    $x++;
}

// Create sub-arrays of data
$name = array();
$team = array();
$notes = array();
foreach ($player as $p) {
    $name[] = get_string_between($p, "\"player\": \"", "\",");
    $team[] = get_string_between($p, "\"team\": \"", "\",");
    $notes[] = get_string_between($p, "\"note\": \"", "\"");
}

// Show table of data contained in the sub-arrays
echo "<table><tr><th>Name</th><th>Team</th><th>Note</th></tr>";
for ($x = 0; $x < sizeof($player); $x++) {
    echo "<tr><td>". $name[$x]." </td>";
    echo "<td>". $team[$x]."</td>";
    echo "<td>". $notes[$x]."</td></tr>";
}
echo "</table>";

// Optional loop to show the players array
/*foreach ($player as $p){
    echo "<h1>Player</h1>";
    echo $p;
}*/
?>
</body>
</html>
```

I didn't transfer all of the columns (I just did name, team and note), but you can see the gist of things here. It stores the data tidily in arrays, so it could easily be inserted into a database.