How to efficiently scrape Google?

Discussion in 'PHP' started by jnelson563, Sep 30, 2009.

  1. #1
    Can anybody steer me in the right direction on how to efficiently scrape rankings from Google?

    I created a program that uses cURL to get the results of a search query like google.com/search?q=$keyword&num=$iterator

    where the iterator increments by 1 until it gets to twenty. So if it finds your website after 5 iterations, it stores the ranking in a database. It works great, but it seems crazy to query Google so many times just to find a rank, and I don't want to incur a ban.

    Is it possible to just query Google one time, match the pattern preg_match("/<h3 class=r><a href=\"(.*)\">(.*)<\/a><\/h3>/"), which is the style attached to the URLs, and somehow figure out what rank it is, or what position in the array it is?

    I am fairly new to this, so any info would be great.
     
    jnelson563, Sep 30, 2009
  2. #2
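    cURL can handle cookies, so use them.

    Before you query Google, change the Search Settings to the maximum number of listings you want. That preference is saved in a cookie, so as long as cURL sends the cookie back, a single request returns that many results.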
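
    A minimal sketch of the cookie handling, assuming the curl extension is loaded. The helper names buildCurlOptions and fetchPage and the cookie file path are made up for illustration; CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE are the standard cURL options for writing and reading a cookie file.

    ```php
    <?php
    // Hypothetical helper: builds cURL options that keep cookies between requests.
    function buildCurlOptions(string $url, string $cookieFile): array
    {
        return [
            CURLOPT_URL            => $url,
            CURLOPT_RETURNTRANSFER => true,        // return the page instead of printing it
            CURLOPT_FOLLOWLOCATION => true,        // follow redirects
            CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is released
            CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from here and sent with the request
        ];
    }

    // Hypothetical helper: fetches one page; returns the HTML string, or false on failure.
    function fetchPage(string $url, string $cookieFile)
    {
        $ch = curl_init();
        curl_setopt_array($ch, buildCurlOptions($url, $cookieFile));
        return curl_exec($ch);
    }

    // Usage (not run here; $keyword and the cookie path are placeholders):
    // $html = fetchPage('http://www.google.com/search?q=' . urlencode($keyword), '/tmp/cookies.txt');
    ```

    Pointing both options at the same file means whatever cookie a first request receives is sent back on later requests.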

    Second, yes, you can find out which position an item has in an array. Try print_r($array) and you will see how.
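
    A rough sketch of the array idea, assuming the page still uses the <h3 class=r> markup quoted in the question. The function name findRank is hypothetical, and the pattern is a slightly loosened version of the one in the question (non-greedy groups, optional extra attributes on the link). preg_match_all collects every match in page order, so the array index plus one is the rank.

    ```php
    <?php
    // Hypothetical helper: returns the 1-based rank of the first result whose URL
    // contains $domain, or null when the domain is not on the page.
    function findRank(string $html, string $domain)
    {
        // Assumed markup, taken from the question: <h3 class=r><a href="URL">TITLE</a></h3>
        $pattern = '/<h3 class=r><a href="(.*?)"[^>]*>(.*?)<\/a><\/h3>/s';

        // $matches[1] holds the URLs and $matches[2] the titles, both in page order.
        preg_match_all($pattern, $html, $matches);

        foreach ($matches[1] as $index => $url) {
            if (stripos($url, $domain) !== false) {
                return $index + 1; // array indexes start at 0, ranks at 1
            }
        }
        return null;
    }
    ```

    Calling print_r($matches) after preg_match_all shows the same structure, with each URL listed next to its index.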

    Good Luck
     
    stephan2307, Sep 30, 2009