Why is cURL so inconsistent and how to fix it?

Discussion in 'PHP' started by gbh, Dec 13, 2008.

  1. #1
    An example is:
    
    <?php
    
    	$website = 'http://digg.com/';
    	$ch = curl_init();
    	$timeout = 5; // seconds allowed for the connection phase
    	curl_setopt($ch, CURLOPT_URL, $website);
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    	curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    	$uWebsite = curl_exec($ch); // false on failure
    	curl_close($ch);
    	echo $uWebsite;
    
    ?>
    $uWebsite comes back empty!

    But if you change $website to, for instance, http://www.example.com/, it works. Why does it not work for some websites? :confused:
     
    gbh, Dec 13, 2008 IP
  2. DeViAnThans3

    DeViAnThans3 Peon

    #2
    The timeout you specified might be too short. At least, it is shorter than the system timeout :)
    Try raising the timeout.

    Also, sites may block certain (unknown) user agents. This may well be the case with Digg, as they try to block auto-submitting and voting bots. You can try getting around that limitation by changing your user agent with the CURLOPT_USERAGENT setting. However, I'm not sure whether that will actually work.
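
To see which of the two causes applies, it helps to ask cURL why the request failed instead of just echoing the result. Here is a minimal sketch, assuming a hypothetical helper named fetch_page() and a made-up default user agent string; CURLOPT_TIMEOUT and CURLOPT_FOLLOWLOCATION are extras that the original snippet did not set.

```php
<?php

// Sketch only: fetch_page() and the default user agent are illustrative
// assumptions, not something from the original snippet.
function fetch_page(
    string $url,
    int $connectTimeout = 15,
    string $userAgent = 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)'
): array {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);             // return the body as a string
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout);  // limit for the connect phase
    curl_setopt($ch, CURLOPT_TIMEOUT, $connectTimeout * 2);     // limit for the whole transfer
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);             // follow redirects
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);            // some sites reject an empty user agent

    $body = curl_exec($ch); // false on failure

    // Collect the body, the HTTP status and the error text in one place.
    // The handle is released when $ch goes out of scope.
    return [
        'body'   => $body === false ? null : $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'error'  => $body === false ? curl_error($ch) : null,
    ];
}
```

Calling fetch_page('http://digg.com/') and printing the 'error' and 'status' entries should show whether the request timed out or whether the server answered with an error status or an empty body.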
     
    DeViAnThans3, Dec 13, 2008 IP
  3. gbh

    gbh Member

    #3
    Oh wow you clever something!

    You seem to have gotten it spot on! It's working now using this:
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4');
    What user agent do you suggest I use? I am just currently using my own.

    Thanks for your quick and accurate reply! I love that this forum has loads of smart people like you willing to help :)
     
    gbh, Dec 13, 2008 IP
  4. ColorWP.com

    ColorWP.com Notable Member

    #4
    ColorWP.com, Dec 13, 2008 IP
  5. gbh

    gbh Member

    #5
    Oh, why do you suggest I make them random? Does it prevent sites from blocking mine?
     
    gbh, Dec 13, 2008 IP
  6. matthewrobertbell

    matthewrobertbell Peon

    #6
    No site would block Firefox's user agent.
    No sensible site, anyway.
     
    matthewrobertbell, Dec 13, 2008 IP
  7. brownskinman

    brownskinman Peon

    #7
    You may also set / spoof the referer:

    <?php
    // Assumes $ch is the cURL handle created earlier with curl_init().
    $ref = 'http://digg.com/';
    curl_setopt($ch, CURLOPT_REFERER, $ref);
    ?>
     
    brownskinman, Dec 13, 2008 IP
  8. ColorWP.com

    ColorWP.com Notable Member

    #8
    Well, Digg won't block the default Firefox user agent, but it wouldn't hurt to rotate them randomly, would it?
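
Random rotation could look roughly like the sketch below. The helper name random_user_agent() is hypothetical, and apart from the Firefox string posted earlier in this thread, the user agent strings in the pool are only illustrative.

```php
<?php

// Illustrative pool: only the Firefox entry comes from this thread.
$userAgents = [
    'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)',
    'Opera/9.62 (Windows NT 6.0; U; en) Presto/2.1.1',
];

// Hypothetical helper: return one entry of a non-empty pool at random.
function random_user_agent(array $agents): string
{
    return $agents[array_rand($agents)];
}
```

The result would then be passed to curl_setopt($ch, CURLOPT_USERAGENT, random_user_agent($userAgents)) before each curl_exec() call.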
     
    ColorWP.com, Dec 13, 2008 IP
  9. chopsticks

    chopsticks Active Member

    #9
    ^^^

    My thought is that they might get suspicious if the user agent changes a lot across a bunch of requests, unless you're suggesting a change every few hours.
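
One way to change the user agent only every few hours is to derive it from the clock rather than picking a new one per request. This is a rough sketch, assuming a hypothetical helper user_agent_for_time() and a three-hour window chosen purely for illustration.

```php
<?php

// Hypothetical helper: every request inside the same time window gets the
// same user agent, and the next window moves on to the next entry.
function user_agent_for_time(array $agents, int $timestamp, int $windowSeconds = 10800): string
{
    $agents = array_values($agents);              // make sure keys are 0..n-1
    $window = intdiv($timestamp, $windowSeconds); // which window we are in
    return $agents[$window % count($agents)];     // cycle through the pool
}
```

Calling user_agent_for_time($pool, time()) with a pool of strings gives a value that stays stable for three hours, so consecutive requests look like they come from the same browser.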
     
    chopsticks, Dec 13, 2008 IP
  10. gbh

    gbh Member

    #10
    Oh awesome, thank you guys! It's not only for digg.com, it's for any random site.

    But I think I will rotate some random user agents :)
     
    gbh, Dec 13, 2008 IP