get_data IMDB movie database

Discussion in 'PHP' started by MyVodaFone, May 7, 2010.

  1. #1
    I have seen a few posts here on DP about how to get movie data from the IMDB website.

    No laughing, but here's what I came up with. Would anyone like to clean it up?

    
    <?php
    
    $url = 'http://www.imdb.com/title/tt0499549/'; // ideally this would come from a form in an admin panel
    
    $imdb = get_data($url);
    
    $movie_pic = get_match('/<a name="poster".+title=".+">(.+)<\/a>/',$imdb);
    $dvd_image = preg_replace('#<img(.*)src="#','',$movie_pic);
    $image = preg_replace('#" />#','',$dvd_image);
    $genres= strip_tags(get_match('/<h5[^>]*>Genre:<\/h5>(.*)<\/div>/isU',$imdb));
    $genre = preg_replace('#See more(.*)#','',$genres);
    $name = get_match('/<title>(.*)<\/title>/isU',$imdb);
    $director = strip_tags(get_match('/<h5[^>]*>Director:<\/h5>(.*)<\/div>/isU',$imdb));
    $about = strip_tags(get_match('/<h5[^>]*>Plot:<\/h5>(.*)<\/div>/isU',$imdb));
    $plot = preg_replace('#\s*Full (?:summary|synopsis).*#s','',$about); // strip the trailing "Full summary" / "Full synopsis" links, whichever are present
    $release_dates = strip_tags(get_match('/<h5[^>]*>Release Date:<\/h5>(.*)<\/div>/isU',$imdb));
    $release_date = preg_replace('#See more(.*)#','',$release_dates);
    $mpaa = get_match('/<a href="\/mpaa">MPAA<\/a>:<\/h5>(.*)<\/div>/isU',$imdb);
    $rating = preg_replace('#<div[^>]*>#','',$mpaa);
    $run_time = get_match('/Runtime:<\/h5>(.*)<\/div>/isU',$imdb);
    $runtime = preg_replace('#<div[^>]*>#','',$run_time);
    
    
    echo "$image<br />";
    echo "$name<br />";
    echo "$genre<br />";
    echo "$director<br />";
    echo "$plot<br />";
    echo "$release_date<br />";
    echo "$rating<br />";
    echo "$runtime<br />";
    
    // Once you have these above variables, I guess you could INSERT them into a database and echo elsewhere... 
    
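    function get_match($regex,$content)
    {
    	// Return the first capture group, or an empty string when the pattern does not match.
    	if (preg_match($regex,$content,$matches) && isset($matches[1])) {
    		return $matches[1];
    	}
    	return '';
    }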
    
    function get_data($url)
    {
    	$ch = curl_init();
    	$timeout = 5;
    	curl_setopt($ch,CURLOPT_URL,$url);
    	curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    	curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout);
    	$data = curl_exec($ch);
    	curl_close($ch);
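    	// curl_exec() returns false on failure; return an empty string so the regexes still receive a string.
    	return $data === false ? '' : $data;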
    }
    ?>
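
The closing comment in the code above mentions inserting the variables into a database. A minimal sketch of that step with PDO and a prepared statement follows. The `movies` table, its columns, the `save_movie()` helper, and the in-memory SQLite connection are all assumptions made for illustration; swap in your own DSN and schema. The sample values are placeholders, not scraped data.

```php
<?php
// Assumption: an in-memory SQLite database stands in for your real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema holding a few of the scraped fields.
$pdo->exec('CREATE TABLE movies (
    imdb_id  TEXT PRIMARY KEY,
    name     TEXT,
    genre    TEXT,
    director TEXT,
    plot     TEXT
)');

// Hypothetical helper: stores one movie using named placeholders,
// so quotes in titles or plots cannot break the query.
function save_movie(PDO $pdo, array $movie): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO movies (imdb_id, name, genre, director, plot)
         VALUES (:imdb_id, :name, :genre, :director, :plot)'
    );
    $stmt->execute($movie);
}

// Placeholder values; in the script above these would be $name, $genre, and so on.
save_movie($pdo, [
    'imdb_id'  => 'tt0000000',
    'name'     => "Example Movie's Title (2010)",
    'genre'    => 'Action | Adventure',
    'director' => 'Jane Doe',
    'plot'     => 'A placeholder plot.',
]);

// Read the row back, as another page might do before echoing it.
$stmt = $pdo->prepare('SELECT name, director FROM movies WHERE imdb_id = :imdb_id');
$stmt->execute(['imdb_id' => 'tt0000000']);
$row = $stmt->fetch(PDO::FETCH_ASSOC);

echo htmlspecialchars($row['name']) . ' / ' . htmlspecialchars($row['director']) . PHP_EOL;
```

Escaping with `htmlspecialchars()` on output matters here because the stored text originally came from someone else's page.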
    
    MyVodaFone, May 7, 2010
  2. danx10

    #2
    I've seen that code before, on SourceForge or elsewhere.

    Looks OK to me, but consider using DOM instead of regular expressions; you may find it easier to work with.
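
For anyone wondering what the DOM approach might look like, here is a rough sketch using PHP's built-in `DOMDocument` and `DOMXPath`. The sample markup is made up and only imitates the `<h5>Label:</h5>` blocks that the regexes in post #1 look for; the real page may be structured differently. `load_html()` and `field_after_label()` are hypothetical helper names, not part of any library.

```php
<?php
// Hypothetical markup imitating the <h5>Label:</h5> layout targeted in post #1.
$html = <<<HTML
<html><head><title>Example Movie (2010)</title></head>
<body>
<div class="info"><h5>Director:</h5><div class="info-content"><a href="/name/x/">Jane Doe</a></div></div>
<div class="info"><h5>Genre:</h5><div class="info-content"><a href="#">Action</a> | <a href="#">Adventure</a></div></div>
</body></html>
HTML;

// Parse an HTML string and return an XPath object for querying it.
function load_html(string $html): DOMXPath
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return new DOMXPath($doc);
}

// Return the text of the element that directly follows <h5>$label</h5>,
// or an empty string when the label is not on the page.
function field_after_label(DOMXPath $xpath, string $label): string
{
    $query = sprintf('//h5[normalize-space(.)="%s"]/following-sibling::*[1]', $label);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }
    // textContent drops the tags, so strip_tags() is not needed; collapse whitespace.
    return trim(preg_replace('/\s+/', ' ', $nodes->item(0)->textContent));
}

$xpath    = load_html($html);
$name     = $xpath->evaluate('string(//title)');
$director = field_after_label($xpath, 'Director:');
$genre    = field_after_label($xpath, 'Genre:');
$rating   = field_after_label($xpath, 'MPAA:'); // not in the sample, so this is ''

echo $name . PHP_EOL;
echo $director . PHP_EOL;
echo $genre . PHP_EOL;
```

With the real page you would pass the string returned by `get_data($url)` to `load_html()` instead of the sample. One query per label replaces the separate `get_match()`, `strip_tags()`, and `preg_replace()` calls.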
     
    danx10, May 7, 2010
  3. lorkan

    #3
    Yes, as danx10 says, try using DOM. It will speed things up for you and be easier to understand technically.
    Read more at www.w3schools.com/htmldom/default.asp
    or
    www.w3schools.com/jsref/default.asp
     
    lorkan, May 7, 2010