Click On Tyler Hall

Originally from Nashville, I'm an engineer working for Yahoo! in Sunnyvale and paying my way with PHP and Cocoa.

Scraping IMDB With PHP

Tuesday, May 20 2008

For an upcoming project, I need to pull in metadata about movies and TV shows: genres, plot summaries, actors, etc. The de facto source is, of course, IMDB. Unfortunately, they’re behind the times and don’t offer an API to access their data. (At least not one that I’ve ever found.)

So, here’s a quick PHP class that takes a movie title (doesn’t have to be exact) or a filename (!) and scrapes IMDB for the relevant info.
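
To give a feel for the filename case, here is a rough sketch of how a filename might be boiled down to a searchable title before hitting IMDB. This is not the code from the class itself: the function name guessTitleFromFilename and the short list of release tags are assumptions made up for illustration, and a title that itself contains a year would be cut too early.

```php
<?php
// Hypothetical helper, not part of the MediaInfo class: reduce a media
// filename to something usable as a title search string.
function guessTitleFromFilename($filename)
{
    // Drop any directory part and the file extension.
    $name = pathinfo($filename, PATHINFO_FILENAME);

    // Treat dots and underscores as word separators.
    $name = str_replace(array('.', '_'), ' ', $name);

    // Cut the name at the first four-digit year or release tag.
    // The tag list is an assumption and is far from complete.
    $name = preg_replace('/[\s\[\(]+((19|20)\d{2}|dvdrip|xvid|720p|1080p)\b.*$/i', '', $name);

    // Collapse repeated whitespace.
    return trim(preg_replace('/\s+/', ' ', $name));
}

echo guessTitleFromFilename('American.Beauty.1999.DVDRip.XviD.avi'), "\n";
// prints "American Beauty"
```

Since the title lookup doesn't need an exact match, a loose cleanup like this would probably be enough to get a sensible search string.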

Using the scraper is simple.

$m = new MediaInfo();
$info = $m->getMovieInfo('American Beauty');
print_r($info);

will output:

Array
(
    [kind] => movie
    [id] => tt0169547
    [title] => American Beauty
    [rating] => 8.6
    [director] => Sam Mendes
    [release_date] => 1 October 1999
    [plot] => Lester Burnham, a depressed suburban father in a mid-life crisis, decides to turn his hectic life around after developing an infatuation for his daughter's attractive friend.
    [genres] => Array
        (
            [0] => Drama
        )
    [cast] => Array
        (
            [Kevin Spacey] => Lester Burnham
            [Annette Bening] => Carolyn Burnham
            [Thora Birch] => Jane Burnham
            [Wes Bentley] => Ricky Fitts
            [Mena Suvari] => Angela Hayes
            [Chris Cooper] => Col. Frank Fitts, USMC
            [Peter Gallagher] => Buddy Kane
            [Allison Janney] => Barbara Fitts
            [Scott Bakula] => Jim Olmeyer
        )
)
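
Note that the cast array is keyed by actor name, with the character as the value. As a small sketch of working with the result, the snippet below formats it as a plain-text summary. The sample array is copied from the output above and trimmed; summarizeMovie is a hypothetical helper, not part of the class, and storing the rating as a string is an assumption, since print_r doesn't show the type.

```php
<?php
// Sample data copied from the print_r() output above, trimmed to a few fields.
// Whether the class returns the rating as a string or a float is an assumption.
$info = array(
    'title'  => 'American Beauty',
    'rating' => '8.6',
    'genres' => array('Drama'),
    'cast'   => array(
        'Kevin Spacey'   => 'Lester Burnham',
        'Annette Bening' => 'Carolyn Burnham',
    ),
);

// Hypothetical helper: build a short plain-text summary of one movie.
function summarizeMovie($info)
{
    $lines = array();
    $lines[] = sprintf('%s (%s) [%s]', $info['title'], $info['rating'], implode(', ', $info['genres']));

    // The cast array maps actor name => character name.
    foreach ($info['cast'] as $actor => $character) {
        $lines[] = "  $actor as $character";
    }

    return implode("\n", $lines);
}

echo summarizeMovie($info), "\n";
```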

At the moment, the class only returns data for movies. For TV shows I’m planning on pulling data directly from the database I’ve created for Schmooze.TV (which, in turn, scrapes its info from TVRage).

You can download the source from my Google Code project. As always, this code is released under the MIT License. Comments and suggestions are always welcome.


Reader Comments


  1. Brett Says:

    This is amazing. Are you going to add it to Simple PHP Framework?

  2. Tyler Says:

    Since it's not a core feature that most web applications would use, I won't add it to the Framework, but it should always be available from my personal Google Code repository. I'll continue to update it there as I make improvements.

  3. chrismeller Says:

    Nice class. I'd love to see the TVRage one. :)

  4. Tyler Says:

    So would I ;-) I'm still working on it at the moment.

  5. Patrik Says:

    Very nice!
    Thanks a lot, I'm using it to keep track of my movie library.

    /P

