CURL Redirect not working for alibaba.com, Help!

Discussion in 'PHP' started by x11joex11, Dec 26, 2007.

  1. #1
    First of all, thanks for looking =).

    Here is a function I wrote.

    function getResultFromURL($URL)
    {
    	$ch = curl_init($URL);
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    	curl_setopt($ch, CURLOPT_HEADER, false); // no need to get header information
    	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    
    	$result = curl_exec($ch);
    	curl_close($ch);
    	
    	return $result; // the page source as a string, or false on failure
    }
    PHP:
    This works for other URLs and returns the page correctly. However, for http://www.alibaba.com it doesn't work: I get a cPanel page instead. How can I make cURL go to the real site that is supposed to be there? I'm confused, as I thought CURLOPT_FOLLOWLOCATION was supposed to do this for me. [I'm not in safe mode, and I've already tried the max-redirects option (CURLOPT_MAXREDIRS), no luck =( ]
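One way to narrow this down is to look at what cURL reports about the transfer rather than only at the returned body. The sketch below is an illustration, not part of the original function: `fetchWithDiagnostics` and `describeTransfer` are hypothetical names, and the redirect limit of 10 is an arbitrary assumption. It records the final status code, the effective URL after redirects, and the number of redirects followed.

```php
<?php
// Hypothetical helper (not from the original post): fetch a URL and keep
// the transfer details that show where the request actually ended up.
function fetchWithDiagnostics($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); // assumed limit, adjust as needed

    $body = curl_exec($ch); // string on success, false on failure

    // Collect the diagnostics while the handle is still in scope.
    return array(
        'body'           => $body,
        'error'          => curl_error($ch), // empty string when no error occurred
        'http_code'      => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'effective_url'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'redirect_count' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),
    );
}

// Pure helper: turn the report into a single line suitable for logging.
function describeTransfer(array $report)
{
    if ($report['body'] === false) {
        return 'failed: ' . $report['error'];
    }
    return sprintf(
        'HTTP %d after %d redirect(s), final URL %s',
        $report['http_code'],
        $report['redirect_count'],
        $report['effective_url']
    );
}
```

Calling `echo describeTransfer(fetchWithDiagnostics('http://www.alibaba.com'));` would show whether any redirect was followed at all, which helps tell a redirect problem apart from the server simply returning a different page for this client.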

    Willing to pay for assistance ;).

    Best,
    - Joe~
     
    x11joex11, Dec 26, 2007 IP
  2. x11joex11

    #2
    Hmm, no one has an idea? Even a clue would help me =).
     
    x11joex11, Dec 26, 2007 IP
  3. x11joex11

    #3
    x11joex11, Dec 26, 2007 IP
  4. x11joex11

    #4
    Hey, good news: I figured out the answer on my own >.<. I don't get why this works, but perhaps that site has some kind of security check, so I changed the request headers and user agent to make the request identify itself differently.

    function getResultFromURL($url)
    {
    	$curl = curl_init();
    
    	// Browser-like request headers; the Accept value is built in two parts.
    	$header = array();
    	$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
    	$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    	$header[] = "Cache-Control: max-age=0";
    	$header[] = "Connection: keep-alive";
    	$header[] = "Keep-Alive: 300";
    	$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    	$header[] = "Accept-Language: en-us,en;q=0.5";
    	$header[] = "Pragma: "; // browsers keep this blank.
    	
    	curl_setopt($curl, CURLOPT_URL, $url);
    	curl_setopt($curl, CURLOPT_USERAGENT, 'Googlebot/2.1 (+http://www.google.com/bot.html)');
    	curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
    	curl_setopt($curl, CURLOPT_REFERER, 'http://www.google.com');
    	curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
    	curl_setopt($curl, CURLOPT_AUTOREFERER, true);
    	curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    	curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    	
    	$html = curl_exec($curl); // execute the curl command
    	curl_close($curl); // close the connection
    	
    	return $html; // and finally, return $html
    }
    PHP:
    Found this at php.net and it worked! =) I'm not sure why, but I guess there is something sneaky going on with that site.
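Since the working function changes the user agent, the referer, and several headers at once, it is not clear which change the site reacts to. A minimal sketch for testing one change at a time follows; `formatHeaders` and `fetchWith` are hypothetical helpers, not something from php.net or the original post, and the 10-second timeout simply mirrors the function above.

```php
<?php
// Hypothetical helper: turn name => value pairs into the "Name: value"
// strings that CURLOPT_HTTPHEADER expects.
function formatHeaders(array $headers)
{
    $lines = array();
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical helper: fetch $url, optionally with a custom user agent
// and extra request headers, so each change can be tried in isolation.
function fetchWith($url, $userAgent = null, array $headers = array())
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    if ($userAgent !== null) {
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    }
    if ($headers) {
        curl_setopt($ch, CURLOPT_HTTPHEADER, formatHeaders($headers));
    }
    return curl_exec($ch); // string on success, false on failure
}
```

Comparing `fetchWith($url)` with `fetchWith($url, 'Googlebot/2.1 (+http://www.google.com/bot.html)')`, and then with a call that only adds headers such as `array('Accept-Language' => 'en-us,en;q=0.5')`, should indicate which part of the request makes the difference.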

    -joe
     
    x11joex11, Dec 26, 2007 IP