Hi, I'm helping a good friend start a new site (probably launching in a year or so). He wants any links on his site that point to external pages to be cached automatically, and when a visitor follows an external link, an "exit bar" shown above every such page should offer a "view cached page" option... Thanks, Dom
What he wants is a spider: a script that scrapes each page for external URLs and fetches the content of those external pages using cURL or simple file handling functions. Save the fetched pages under a naming convention, or use a database table mapping each URL to the location of its cached copy, so they can be retrieved easily when needed.
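Just to illustrate the idea, here's a minimal sketch in PHP (assuming that's the stack, since cURL and file handling functions were mentioned). The host name, cache directory, and function names are all placeholders, not anything from Dom's actual site:

```php
<?php
// Minimal sketch: crawl one page, find external links, and cache each
// external page to disk. $siteHost and $cacheDir are hypothetical values.

$siteHost = 'www.example.com';   // the friend's own domain (placeholder)
$cacheDir = __DIR__ . '/cache';  // where cached copies are stored (placeholder)

if (!is_dir($cacheDir)) {
    mkdir($cacheDir, 0755, true);
}

// Fetch a URL with cURL and return the body, or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 15,
        CURLOPT_USERAGENT      => 'LinkCacheSpider/0.1',
    ]);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body === false ? null : $body;
}

// Pull href values out of the HTML and keep only absolute links
// whose host differs from the site's own host.
function externalLinks(string $html, string $siteHost): array
{
    $links = [];
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from messy real-world HTML
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        $host = parse_url($href, PHP_URL_HOST);
        if ($host !== null && $host !== false && $host !== $siteHost) {
            $links[$href] = true; // de-duplicate via array keys
        }
    }
    return array_keys($links);
}

// Crawl one internal page and cache every external page it links to.
// The cached copy is named by an MD5 hash of the URL, so a "view cached
// page" link can be rebuilt from the URL alone.
$pageHtml = fetchPage('http://' . $siteHost . '/'); // start at the home page
if ($pageHtml !== null) {
    foreach (externalLinks($pageHtml, $siteHost) as $url) {
        $cacheFile = $cacheDir . '/' . md5($url) . '.html';
        if (!file_exists($cacheFile)) {          // skip already-cached pages
            $external = fetchPage($url);
            if ($external !== null) {
                file_put_contents($cacheFile, $external);
            }
        }
    }
}
```

The hash-of-URL naming convention means the exit bar can compute the cached file's name from the external URL alone, with no lookup. The database-table approach trades that simplicity for easier housekeeping, e.g. storing fetch timestamps so stale copies can be re-crawled.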
Thanks! So he needs a spider to crawl the pages on his site, find all the external URLs, and then use cURL to cache those pages? Then name the cached files a certain way (according to the settings). Is there a simpler way to do this? How much do you think an expert would charge?
Well, I could do this for you/him for about $10-$20. Many people like myself already have spider code, and all that needs doing is modifying it for your needs, e.g. providing an interface to access the cached pages.