I have a whole bunch of sites / URLs in a database that I want to parse through, to find any bad links to drop and also any that are now being redirected. So if I request an article at abc.com/article and it gets a 301 (or whatever) to xyz.com, I want to know where it's being redirected to. fopen() will error on a 404 and silently follow redirects, but I can't think how to tell whether I am being redirected and, if so, where to. Any clues as to how to do this?
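This gets the final URL, even if the page is redirected multiple times:

    function get_final_location($url, $max_redirects = 10)
    {
        // Give up after a fixed number of hops so a redirect loop cannot recurse forever.
        if ($max_redirects <= 0) {
            return $url;
        }
        $headers = @get_headers($url);
        foreach ((array)$headers AS $header) {
            // Anchor at line start so "Content-Location:" is not mistaken for "Location:".
            if (preg_match('/^Location\s*:\s*(https?:[^;\s\n\r]+)/i', $header, $redirect)) {
                return get_final_location($redirect[1], $max_redirects - 1);
            }
        }
        return $url;
    }

Usage example:

    echo get_final_location('http://www.hotmail.com');

Note that it only follows absolute http/https Location values; a relative Location is ignored and the current URL is returned.

If you don't have PHP 5, you have to define get_headers() yourself. You can use this, for example:

    if (!function_exists('get_headers')) {
        function get_headers($url)
        {
            $parts = @parse_url($url);
            if (empty($parts['host'])) {
                return false;
            }
            $ssl  = (isset($parts['scheme']) AND $parts['scheme'] == 'https');
            $port = !empty($parts['port']) ? $parts['port'] : ($ssl ? 443 : 80);
            // Request the path and query only, not the full URL.
            $path = (isset($parts['path']) ? $parts['path'] : '/')
                  . (isset($parts['query']) ? '?' . $parts['query'] : '');
            if ($fp = @fsockopen(($ssl ? 'ssl://' : '') . $parts['host'], $port, $errno, $errstr, 30)) {
                fwrite($fp,
                    "HEAD {$path} HTTP/1.1\r\n" .
                    "Host: {$parts['host']}\r\n" .
                    "Connection: close\r\n\r\n"
                );
                // Read the whole response, not just the status line.
                $response = '';
                while (!feof($fp)) {
                    $response .= fgets($fp, 1024);
                }
                fclose($fp);
                return explode("\r\n", trim($response));
            }
            return false;
        }
    }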
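Since the question also asks about spotting dead links, here is a rough sketch of one way to get both answers from a single get_headers() call. It assumes get_headers() returns the header lines of every response in the redirect chain, in order, so the last status line seen is the final one. The name summarize_headers() is made up for this example and is not a built-in PHP function.

```php
<?php
// Hypothetical helper (not a PHP built-in): summarize the header lines
// returned by get_headers() for one URL.
// Returns array('status' => int|null, 'redirects' => array of Location values).
function summarize_headers($headers)
{
    $result = array('status' => null, 'redirects' => array());
    foreach ((array)$headers as $header) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#i', $header, $m)) {
            // Each response in the chain has its own status line;
            // keep overwriting so the last one wins.
            $result['status'] = (int)$m[1];
        } elseif (preg_match('/^Location\s*:\s*(\S+)/i', $header, $m)) {
            // Record every redirect target in the order it was seen.
            $result['redirects'][] = $m[1];
        }
    }
    return $result;
}

// Possible usage (needs network access, so it is left commented out):
// $info = summarize_headers(@get_headers('http://abc.com/article'));
// if ($info['status'] === null || $info['status'] >= 400) { /* bad link, drop it */ }
// elseif ($info['redirects']) { /* redirected; the last entry is the newest target */ }
```

A status of null means no response was parsed at all (for example, get_headers() returned false), which you would probably want to treat as a bad link too.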
Fantastic help, thank you. I'd never seen the get_headers function before, and searching php.net didn't return anything useful.