Hi, we are in the process of building a customised site crawler, and we have been quite successful so far. But I have a question for the expert coders: is it possible to fetch the last-modified date of a page from anywhere, and if so, how is it done?
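Assuming you have PHP 5:

    function filemtime_remote($url, $timestamp = false)
    {
        // get_headers() returns false when the request fails
        $headers = get_headers($url);
        if ($headers === false) {
            return false;
        }
        foreach ($headers as $header) {
            // Return the Last-Modified value, as a Unix timestamp if requested
            if (preg_match('/^Last-Modified:\s*(.+)/i', trim($header), $date)) {
                return $timestamp ? strtotime($date[1]) : $date[1];
            }
        }
        return false;
    }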
Thank you, Nico. It works fine with HTML pages and images, but not with PHP pages. Is there anything else I could try?
Not sure. Maybe PHP pages send no "Last-Modified" header because they are generated dynamically, so in effect their last modification is whenever they were last requested. Not sure if there's a way to get it directly... EDIT: What you could do is save an md5 hash of the page source in a database and compare it with the previously saved hash on each crawl. If it differs, you know more or less when the page was last modified.
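Here is a rough sketch of that hash idea, in case it helps. The function name detect_change() and the $store array are made up for illustration; the array just stands in for whatever database table you would really use, and $now is passed in so that the caller decides which clock to use.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether a page body differs
// from the one seen on the previous crawl.
// $store is a plain array (url => hash + time) standing in for a database table.
function detect_change(array &$store, $url, $body, $now)
{
    $hash = md5($body);

    // Same hash as last time: treat the page as unchanged.
    if (isset($store[$url]) && $store[$url]['hash'] === $hash) {
        return false;
    }

    // First sighting or new content: remember the hash and when we noticed it.
    $store[$url] = array('hash' => $hash, 'changed_at' => $now);
    return true;
}
```

You would call it once per crawl with the fetched source, for example the result of file_get_contents($url), and time() as $now. Note that 'changed_at' is only the time the crawler noticed the change, so its accuracy depends on how often you crawl. Also, pages that print the current time, a session token or rotating ads will hash differently on every fetch, so you may need to strip those parts before hashing.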