Getting the main image URL from Wikipedia

Discussion in 'PHP' started by acr8tiv, Sep 11, 2009.

  1. #1
    I would like to get the URL of the main image on Wikipedia pages so I can automatically download them using PHP.

    For example, the image of A. A. Milne: http://en.wikipedia.org/wiki/A._A._Milne

    Can anybody recommend the best way to do this, or any open-source scripts?

    Any ideas?
     
    acr8tiv, Sep 11, 2009
  2. GreenStar

    #2
    I would be interested in this too; I hope someone comes up with something. I searched all over Google, Yahoo, and MSN and just found a bunch of garbage.
     
    GreenStar, Sep 12, 2009
  3. SubZtep

    #3
    It's not easy to get the head image, because Wikipedia doesn't give that picture an ID. I think the first picture on the page is always the head picture, though. First, I load the page source into the $page variable. You should really use cURL for this; my way is not the best. Once you have the HTML source, you can get the element with DOM. Of course, check everything along the way. At the end, the image source is in $imgSrc.

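    // Fetch the page HTML (cURL is more robust; file_get_contents is the quick way).
    $page = file_get_contents('http://en.wikipedia.org/wiki/A._A._Milne');
    $dom = new DOMDocument();
    // The page is HTML rather than strict XML, so use loadHTML(); @ hides markup warnings.
    @$dom->loadHTML($page);
    $images = $dom->getElementsByTagName('img');
    // Make sure at least one <img> was found before reading its src.
    $imgSrc = $images->length > 0 ? $images->item(0)->getAttribute('src') : null;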
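
    The same steps (fetch with cURL, parse with DOM, check everything) can be sketched roughly as below. This is only a sketch: the function names fetchPage and firstImageSrc, the timeout, and the User-Agent string are made up for illustration, and treating the first <img> as the main image is the guess from this post, not something Wikipedia guarantees.

```php
<?php
// Hypothetical helper: download a page with cURL; returns the HTML or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 10,    // assumed timeout in seconds
        // Placeholder User-Agent; replace with something that identifies your script.
        CURLOPT_USERAGENT      => 'MainImageFetcher/0.1 (example script)',
    ]);
    $html = curl_exec($ch);
    return is_string($html) && $html !== '' ? $html : null;
}

// Hypothetical helper: return the src of the first <img> in the HTML, or null if none.
function firstImageSrc(string $html): ?string
{
    if ($html === '') {
        return null;  // loadHTML() rejects an empty string
    }
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);  // collect parse warnings quietly
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }
    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            // Protocol-relative URLs (//host/path) need a scheme before downloading.
            return strpos($src, '//') === 0 ? 'https:' . $src : $src;
        }
    }
    return null;
}
```

    To try it, pass the result of fetchPage('http://en.wikipedia.org/wiki/A._A._Milne') to firstImageSrc() after checking it is not null. If the first image turns out to be an icon rather than the portrait, the loop is the place to add a stricter filter.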
     
    SubZtep, Sep 12, 2009