Testing remote sites for redirection with PHP

Discussion in 'PHP' started by liamvictor, Jul 5, 2007.

  1. #1
    I've a whole bunch of sites / URLs in a database that I want to parse through and find any bad links to drop, and also any that are now being redirected.

    So, if I request an article at abc.com/article and it gets a 301 (or whatever) to xyz.com, I want to know where it's being redirected to.

    fopen() will error on a 404 and silently follow other redirects, but I can't think how to tell whether I am being redirected and, if so, where to.

    So, any clues as to how to do this?
     
    liamvictor, Jul 5, 2007
  2. nico_swd

    nico_swd Prominent Member

    #2
    
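    // Follows Location headers until a URL that no longer redirects is reached.
    // $max_hops stops the recursion if two URLs redirect to each other.
    function get_final_location($url, $max_hops = 10)
    {
    	$headers = @get_headers($url);
    	
    	foreach ((array)$headers AS $header)
    	{
    		if ($max_hops > 0 AND preg_match('/^Location\s*:\s*(https?:[^;\s]+)/i', $header, $redirect))
    		{
    			return get_final_location($redirect[1], $max_hops - 1);
    		}
    	}
    	
    	return $url;
    }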
    
    This gets the final URL, even if the page is redirected multiple times.

    Usage example:
    
    echo get_final_location('http://www.hotmail.com');
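
One gap worth noting: the pattern in get_final_location() only matches Location values that start with http: or https:, so a relative redirect such as /new-path is skipped. Below is a minimal sketch of resolving such a value against the URL that was requested. resolve_location() is a hypothetical helper, not part of the code above, and the example.* hosts are placeholders. It assumes a well-formed base URL and covers only the common cases (absolute, scheme-relative, root-relative, and directory-relative) without normalizing ../ segments.

```php
<?php
// Hypothetical helper (not from the thread): resolve a Location header value
// against the URL that produced it. Dot segments ("../") are left untouched.
function resolve_location($base, $location)
{
    // Already absolute: use it as-is.
    if (preg_match('#^https?://#i', $location)) {
        return $location;
    }

    $parts  = parse_url($base);
    $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';
    $origin = $scheme . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Scheme-relative: //host/path
    if (substr($location, 0, 2) === '//') {
        return $scheme . ':' . $location;
    }

    // Root-relative: /path
    if (substr($location, 0, 1) === '/') {
        return $origin . $location;
    }

    // Otherwise treat it as relative to the directory of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir . $location;
}

echo resolve_location('http://abc.example/news/article', '/moved'), "\n";
```

If something like this were used, the pattern in get_final_location() would also need loosening to capture values that are not absolute URLs before passing them through the helper.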
    
    If you don't have PHP 5, you have to define get_headers() yourself. You can use this, for example:
    
    if (!function_exists('get_headers'))
    {
    	function get_headers($url)
    	{
    		@extract(@parse_url($url));
    		$scheme = (isset($scheme) AND $scheme == 'https') ? 'ssl://' : '';
    		
    		if (!isset($port) OR empty($port))
    		{
    			$port = empty($scheme) ? 80 : 443;
    		}
    	
    		if ($fp = @fsockopen($scheme . $host, $port, $errno, $errstr, 30))
    		{
    			fwrite($fp,
    				"HEAD {$url} HTTP/1.1\r\n" .
    				"Host: {$host}\r\n" .
    				"Connection: close\r\n\r\n"
    			);
    			
    			// Read every header line up to the blank line (or EOF).
    			// A single fgets() call would return only the status line.
    			$headers = array();
    			while (($line = fgets($fp)) !== false AND rtrim($line) != '')
    			{
    				$headers[] = rtrim($line);
    			}
    			fclose($fp);
    			
    			return $headers;
    		}
    		
    		return false;
    	}
    }
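
Coming back to the original question of spotting dead links and seeing where each redirect points: the built-in get_headers() follows redirects by default and returns the headers of every response in one flat array, so the chain can be read from a single call. The sketch below is one way to summarize that array hop by hop. summarize_hops() and the sample data are made up for illustration, and the fallback function above does not follow redirects itself, so with it the array would hold a single response.

```php
<?php
// Hypothetical helper (not from the thread): turn the flat header array
// from get_headers() into one entry per response in the redirect chain.
function summarize_hops(array $headers)
{
    $hops = array();

    foreach ($headers as $header) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#i', $header, $m)) {
            // A status line starts a new response.
            $hops[] = array('status' => (int) $m[1], 'location' => null);
        } elseif ($hops && preg_match('/^Location\s*:\s*(\S+)/i', $header, $m)) {
            // Attach the redirect target to the response it belongs to.
            $hops[count($hops) - 1]['location'] = $m[1];
        }
    }

    return $hops;
}

// Sample data shaped like get_headers() output; the URLs are placeholders.
$sample = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://xyz.example/article',
    'Content-Type: text/html',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
);

foreach (summarize_hops($sample) as $i => $hop) {
    echo $i, ': ', $hop['status'], ' ', $hop['location'], "\n";
}
```

A last hop with status 404 (or an empty result, when get_headers() returned false) would mark a link to drop, while any hop with a non-null location shows where that URL now points.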
    
    
     
    nico_swd, Jul 6, 2007
  3. liamvictor

    liamvictor Peon

    #3
    Fantastic help, thank you.

    I'd never seen the get_headers function before, and searching php.net didn't return anything useful.
     
    liamvictor, Jul 6, 2007