file_get_contents from Google's search results

Discussion in 'PHP' started by moe374, Jan 24, 2009.

  1. #1
    I'm fairly new to PHP. I was trying to write a PHP script to automate a task I perform regularly, which involves getting some information from a Google search query. For example, the query looks like this:

    http://www.google.co.uk/search?hl=en&q=my+search+term&btnG=Google+Search&meta=

    and I would replace the "my+search+term" above with whatever I needed (this was done in the PHP script).

    So, after figuring out how to code the PHP, I finally got this program to work yesterday. I was using
    file_get_contents($url, 0);
    Today I was working on it a bit more when, all of a sudden, I started getting this error:

    Warning: file_get_contents(): HTTP request failed! HTTP/1.0 503 Service Unavailable in /home/content/t/r/o/mySite/html/phpWork/phpWork.php on line 90

    Warning: file_get_contents(http://www.google.com/search?hl=en&q=candy+canes&btnG=Search): failed to open stream: Success in /home/content/t/r/o/tropical6/mySite/phpWork/phpWork.php on line 90
    Is it possible that Google is not allowing me to access their search results through my PHP script? I am confused because it was working perfectly, and now it doesn't work. Very frustrating! Why would it work a little while ago and not any more? Any suggestions, tips, or ideas would be greatly appreciated. Thanks in advance.
     
    moe374, Jan 24, 2009 IP
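
The query URL described in the first post can be assembled without hand-encoding the search term. This is only a sketch: `buildSearchUrl` is a hypothetical helper name, and the parameter list simply mirrors the example URL quoted above.

```php
<?php
// Hypothetical helper: builds the search URL from the first post.
// http_build_query() URL-encodes each value, turning spaces into "+".
function buildSearchUrl(string $term): string
{
    $params = [
        'hl'   => 'en',
        'q'    => $term,
        'btnG' => 'Google Search',
        'meta' => '',
    ];

    // Pass '&' explicitly so the result does not depend on php.ini settings.
    return 'http://www.google.co.uk/search?' . http_build_query($params, '', '&');
}
```

Letting `http_build_query` do the encoding also handles terms that contain characters such as `&`, which a plain string replacement would not.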
  2. prasunsen

    #2
    prasunsen, Jan 24, 2009 IP
  3. moe374

    #3
    It was working though, perfectly!
    I tried curl. When I use curl, the response tells me clearly that "it looks like an automated script that might be a virus or spyware so...", so curl doesn't work either!

    I don't understand, because it was working just fine with file_get_contents.
     
    moe374, Jan 24, 2009 IP
  4. proxywhereabouts

    #4
    Why not code something that adds a referrer and a delay to your script?
    That way, Google will treat your query as if it came from a human and not from an automated script.
     
    proxywhereabouts, Jan 24, 2009 IP
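
The referrer-and-delay suggestion in post #4 could look roughly like this with `file_get_contents` and a stream context. It is a sketch under assumptions: `buildContextOptions` and `fetchWithDelay` are hypothetical helper names, the header values are placeholders, and nothing here guarantees that Google will accept the requests.

```php
<?php
// Hypothetical helper: stream-context options carrying a Referer header
// and a User-Agent string. Both values are supplied by the caller.
function buildContextOptions(string $referer, string $userAgent): array
{
    return [
        'http' => [
            'method'     => 'GET',
            'header'     => "Referer: {$referer}\r\n",
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds
        ],
    ];
}

// Hypothetical helper: fetches each URL in turn and pauses between requests.
// Each entry in the result is the page body, or false if the request failed.
function fetchWithDelay(array $urls, array $contextOptions, int $delaySeconds): array
{
    $context = stream_context_create($contextOptions);
    $pages = [];
    $first = true;

    foreach ($urls as $url) {
        if (!$first) {
            sleep($delaySeconds); // wait before every request after the first
        }
        $first = false;

        // @ hides the PHP warning; the false return value still signals failure.
        $pages[$url] = @file_get_contents($url, false, $context);
    }

    return $pages;
}
```

A call such as `fetchWithDelay($urls, buildContextOptions('http://www.google.co.uk/', 'Mozilla/5.0 (compatible; ExampleBot)'), 5)` would space the requests five seconds apart; the delay value is a guess, not a documented threshold.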
  5. joebert

    #5
    Google will temporarily block requests from sources making numerous requests in a short period of time, or making the same request numerous times in a row.

    For testing, save a copy of a result page on your server and point file_get_contents at that instead.
     
    joebert, Jan 24, 2009 IP
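
One way to follow the advice in post #5 is to read a saved copy when it exists and only go to the network when it does not. This is a minimal sketch; `loadResultPage` and the cache file path are hypothetical.

```php
<?php
// Hypothetical helper: returns the saved copy of a result page if present.
// Otherwise it fetches the URL once and stores the body for later test runs.
// Returns the page body as a string, or false if the fetch failed.
function loadResultPage(string $url, string $cacheFile)
{
    if (is_file($cacheFile)) {
        // file_get_contents() reads local files as well as URLs.
        return file_get_contents($cacheFile);
    }

    $html = @file_get_contents($url);
    if ($html !== false) {
        file_put_contents($cacheFile, $html);
    }

    return $html;
}
```

While developing the parsing code, every run after the first then works against the local file, so repeated test runs send no further requests to Google.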
  6. harrisunderwork

    #6
    You need to add a user agent with curl_setopt (CURLOPT_USERAGENT) to make it look like a request coming from a web browser. A plain curl request won't work.
     
    harrisunderwork, Jan 25, 2009 IP
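
Setting the user agent as post #6 describes might be sketched as follows, assuming the curl extension is installed. `buildCurlOptions` and `fetchWithCurl` are hypothetical helper names, and the user-agent string is whatever the caller chooses.

```php
<?php
// Hypothetical helper: curl options that return the body as a string
// and send the given User-Agent header.
function buildCurlOptions(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_TIMEOUT        => 10, // seconds
    ];
}

// Hypothetical helper: returns the page body on HTTP 200, false otherwise.
function fetchWithCurl(string $url, string $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, buildCurlOptions($userAgent));

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat anything other than a 200 response (for example the 503 above) as a failure.
    return ($body !== false && $status === 200) ? $body : false;
}
```

Checking the status code explicitly makes a block such as the 503 in the first post show up as a clear failure instead of an unexpected HTML page.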
  7. ade92uk
  8. Kaizoku

    #8
    They will block your IP. Use the Google API instead.
     
    Kaizoku, Mar 7, 2009 IP
  9. harrisunderwork

    #9
    file_get_contents only handles redirects through its stream context settings, while curl lets you control them directly with CURLOPT_FOLLOWLOCATION. That's why curl is the best option.
     
    harrisunderwork, Mar 8, 2009 IP
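
For comparison, redirect handling can be set explicitly on both sides. As far as I know, PHP's HTTP stream wrapper accepts `follow_location` and `max_redirects` context options, and curl has `CURLOPT_FOLLOWLOCATION` and `CURLOPT_MAXREDIRS`. The two helper names below are hypothetical, and the curl part assumes the curl extension is installed.

```php
<?php
// Hypothetical helper: stream-context options for file_get_contents()
// that follow redirects up to the given limit.
function redirectContextOptions(int $maxRedirects): array
{
    return [
        'http' => [
            'follow_location' => 1,
            'max_redirects'   => $maxRedirects,
        ],
    ];
}

// Hypothetical helper: the equivalent settings for curl_setopt_array().
function redirectCurlOptions(int $maxRedirects): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => $maxRedirects,
    ];
}
```

The first array would be passed to `stream_context_create()` and the resulting context given as the third argument of `file_get_contents()`; the second would be applied to a curl handle with `curl_setopt_array()`.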