Parsing Information?

Discussion in 'PHP' started by LeetPCUser, Feb 7, 2007.

  1. #1
    How can I parse information from one site and display it on another? For instance, I want to create a site of my own that takes the rankings of NCAA basketball teams and displays them.

    I want to parse this: http://sports.yahoo.com/ncaab/polls?poll=1

    Any help would be greatly appreciated.
     
    LeetPCUser, Feb 7, 2007 IP
  2. rays

    #2
    The best option is to use an API, if one is available (I am not sure whether it is).

    If you want to stick to parsing the page, the following snippet will collect the HTML output of a given URL:

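    <?php
    $url = "http://www.example.com/?query=10";
    // Open the URL as a read-only stream; @ suppresses the warning if it fails.
    $handle = @fopen($url, "r");
    if ($handle) {
       // Read and print the response one line (at most 4095 bytes) at a time.
       while (!feof($handle)) {
           $buffer = fgets($handle, 4096);
           echo $buffer;
       }
       fclose($handle);
    }
    ?>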
     
    rays, Feb 7, 2007 IP
  3. jestep

    #3
    Ideally, you can find the information you need in the form of an XML feed. With a feed, you can write a script that parses it and generates content for your website from it.
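
    As a rough sketch of the feed approach, the snippet below parses a small XML string with SimpleXML. The feed layout (rankings, team, rank, name, record) is made up for illustration; a real feed will use its own element names, and "Team B" is placeholder data.

```php
<?php
// Hypothetical feed: these element names are assumptions, not a real Yahoo feed.
$sampleFeed = <<<XML
<rankings>
  <team><rank>1</rank><name>Florida</name><record>21-2</record></team>
  <team><rank>2</rank><name>Team B</name><record>20-3</record></team>
</rankings>
XML;

// Turn the feed into an array keyed by rank; returns an empty array on bad XML.
function parse_feed(string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return [];
    }

    $teams = [];
    foreach ($doc->team as $team) {
        $teams[(int) $team->rank] = [
            'team'   => (string) $team->name,
            'record' => (string) $team->record,
        ];
    }
    return $teams;
}

$teams = parse_feed($sampleFeed);
```

    With a real feed you would pass in the downloaded XML instead of $sampleFeed and adjust the element names to match.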

    If you don't have access to an appropriate feed, you will have to scrape the content off the website, then format and display it on your own site. Functions like file_get_contents() can fetch an entire web page for you, and you can use regular expressions or other string functions to strip out the data you don't need. I prefer file_get_contents() to fopen().
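
    One possible way to do the stripping without regular expressions is PHP's DOM extension. This is only a sketch on sample markup: the ysprow class names and the cell order are assumptions about the page, and "Team B" with its numbers is placeholder data.

```php
<?php
// Sample markup standing in for a downloaded page (layout is an assumption).
$sampleHtml = <<<HTML
<table>
  <tr class="ysprow1"><td>1</td><td>Florida (72)</td><td>(21-2)</td><td>1800</td></tr>
  <tr class="ysprow2"><td>2</td><td>Team B</td><td>(20-3)</td><td>1700</td></tr>
</table>
HTML;

// Return the text of every cell in each matching table row.
function scrape_rows(string $html): array
{
    libxml_use_internal_errors(true); // real pages are rarely valid HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $rows = [];
    foreach ($xpath->query('//tr[starts-with(@class, "ysprow")]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// In real use: $html = file_get_contents($url); and check for === false first.
$rows = scrape_rows($sampleHtml);
```

    The trade-off is that an XPath query tends to survive small markup changes better than a regular expression tied to exact attributes, but either approach breaks when the page layout changes.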

    In either case, be aware of any copyright on the data you collect. It may be illegal to display that data to other users on your website, and the source site can tell from its log files that you are taking it.
     
    jestep, Feb 7, 2007 IP
  4. krakjoe

    #4
    
    <?php
    // Fetch the poll page and return one entry per ranked team, keyed by rank.
    function parse_scores()
    {
      $data = @file_get_contents("http://sports.yahoo.com/ncaab/polls?poll=1");
      if ($data === false) {
        return array(); // download failed
      }

      // Each team sits in a <tr> whose class is ysprow1 or ysprow2.
      $regex = "/<tr class=\"ysprow[12]\" align=\"center\" valign=\"top\">(.*?)<\/tr>/si";

      preg_match_all($regex, $data, $matches);

      // Strip the markup and break each row into lines of text.
      $clean = array();
      foreach ($matches[0] as $row) {
        $clean[] = explode("\n", strip_tags($row));
      }

      // The line offsets below depend on the page's current markup.
      $return = array();
      foreach ($clean as $cells) {
        $rank  = preg_replace("/[^0-9]/", "", $cells[1]);
        $games = explode("game:", trim($cells[4]));
        $return[$rank]["rank"]   = $rank; // same value as the array key
        $return[$rank]["team"]   = trim($cells[3]);
        $return[$rank]["lgame"]  = str_replace("Next", "", trim($games[1]));
        $return[$rank]["ngame"]  = trim($games[2]);
        $return[$rank]["record"] = trim($cells[6]);
        $return[$rank]["pts"]    = trim($cells[7]);
      }
      return $return;
    }
    echo "<pre>";
    print_r(parse_scores());
    ?>
    
    That works... I got bored though, sorry.
     
    krakjoe, Feb 7, 2007 IP
  5. LeetPCUser

    #5
    Wow that last one was great.

    http://bryansreviews.com/ranks.php

    How can I clean up the results and just print the values in a for loop? For instance, a loop that prints:

    1
    Florida (72)
    vs Tennessee, W 94-78 (2/3)
    at Georgia (2/7)
    (21-2)
    1800

    Also, Joe, can you explain it to me so I can do this for other things, like football, later on?
     
    LeetPCUser, Feb 7, 2007 IP
  6. krakjoe

    #6
    
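    <?php
    // Assumes parse_scores() from the earlier post is defined or included above.
    foreach (parse_scores() as $rank => $data) {
      ?>
      Team : <?= htmlspecialchars($data['team']) ?><br />
      Rank : <?= htmlspecialchars($rank) ?><br />
      Record : <?= htmlspecialchars($data['record']) ?><br />
      Last Game : <?= htmlspecialchars($data['lgame']) ?><br />
      Next Game : <?= htmlspecialchars($data['ngame']) ?><br />
      Points : <?= htmlspecialchars($data['pts']) ?><br />
      <?php
    }
    ?>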
    
    If you're going to use that script, I suggest storing the results in a database of your own and running the script via cron every 10, 30, or 60 minutes. Otherwise every page view has to fetch and parse the remote page, and pages will take forever to load. It'll be much more efficient that way.
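
    A minimal sketch of the storing half, assuming a MySQL table named rankings already exists with rank as its primary key. The column names are taken from the list later in this thread; the function names and the cron path are made up for illustration.

```php
<?php
// Flatten parse_scores()-style output into parameter sets for a prepared statement.
function rankings_to_rows(array $rankings): array
{
    $rows = [];
    foreach ($rankings as $rank => $data) {
        $rows[] = [
            ':rank'   => (int) $rank,
            ':team'   => $data['team'],
            ':last'   => $data['lgame'],
            ':next'   => $data['ngame'],
            ':record' => $data['record'],
            ':points' => (int) $data['pts'],
        ];
    }
    return $rows;
}

// Write each row, overwriting the existing row with the same rank.
// REPLACE INTO is a MySQL extension and relies on the assumed primary key.
function save_rankings(PDO $pdo, array $rows): void
{
    $stmt = $pdo->prepare(
        'REPLACE INTO rankings (`rank`, `team`, `last`, `next`, `record`, `points`)
         VALUES (:rank, :team, :last, :next, :record, :points)'
    );
    foreach ($rows as $row) {
        $stmt->execute($row);
    }
}

// Hypothetical crontab entry to run an updater script every 30 minutes:
//   */30 * * * * php /path/to/update_rankings.php
```

    The updater script would call save_rankings($pdo, rankings_to_rows(parse_scores())), and the public page would then read from the table instead of hitting the remote site.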

    EDIT: By the way, delete echo "<pre>"; and print_r(parse_scores()); from the bottom of the script.
     
    krakjoe, Feb 7, 2007 IP
  7. LeetPCUser

    #7
    Alright. How would I write this information to my database and check for updates, say, every 30 minutes?
     
    LeetPCUser, Feb 7, 2007 IP
  8. krakjoe

    #8
    kebab wait .....
     
    krakjoe, Feb 7, 2007 IP
  9. LeetPCUser

    #9
    kebab? What does that mean?
     
    LeetPCUser, Feb 7, 2007 IP
  10. LeetPCUser

    #10
    The database is called ncaabasketball.

    I want seven fields in a table called rankings:

    rank
    team
    last
    next
    record
    points
    img (probably a blob; I have to figure this out later)

    What would be the MySQL statement to create that, and how would I update it every 30 minutes?
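
    For reference, one possible shape for that table, run from PHP through PDO. The column types and sizes are guesses, and the connection details are placeholders. A CREATE TABLE statement does not schedule anything by itself, so the 30-minute refresh would still come from something like cron.

```php
<?php
// Hypothetical schema for the seven fields listed above; types are assumptions.
$createRankingsSql = <<<'SQL'
CREATE TABLE IF NOT EXISTS rankings (
    `rank`   TINYINT UNSIGNED NOT NULL PRIMARY KEY,
    `team`   VARCHAR(100) NOT NULL,
    `last`   VARCHAR(255) NULL,
    `next`   VARCHAR(255) NULL,
    `record` VARCHAR(20)  NULL,
    `points` INT UNSIGNED NULL,
    `img`    BLOB NULL
)
SQL;

// Run the DDL on an open connection; exec() is enough for statements without results.
function create_rankings_table(PDO $pdo, string $sql): void
{
    $pdo->exec($sql);
}

// Placeholder credentials, shown only as a usage sketch:
//   $pdo = new PDO('mysql:host=localhost;dbname=ncaabasketball', 'user', 'password');
//   create_rankings_table($pdo, $createRankingsSql);
```

    The backticks are there because some of these names, rank in particular, are reserved words in newer MySQL versions.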
     
    LeetPCUser, Feb 7, 2007 IP