Hello, I am still a beginner with PHP and would like some help with this issue, please! I want a script that opens an external URL (i.e. an HTML page), extracts all the links from it, and outputs them to the current document. The external page contains relative links, so I want the script to convert them to absolute links, and I also want to give them custom anchor texts. Here's an example: The external URL (page) is "http://external.com/linkpage?pageid=7". This page contains two links: <a href="/photos?ID=5&order=alpha">See more photos here</a> and <a href="/about.php">About</a>. I want my script to visit the URL "http://external.com/linkpage?pageid=7", extract the link from the href of the first <a> (link) tag, convert it to absolute by adding "http://external.com" before it (I can supply this, so the script doesn't need to guess it), and then output this link to my page with an anchor text of my choice. And the same for the second link. So the final output of the script in this example should look like this: <a href="http://external.com/photos?ID=5&order=alpha">I set this anchor text</a> and <a href="http://external.com/about.php">My anchor text</a>. Any help?
Here's part of a class that I wrote some time ago. I modified it for your needs, I think, but I didn't test it after the changes. $url should be the link, e.g. /photos?ID=5&order=alpha. $fullurl should be the current working directory, e.g. http://external.com.
    function construct_absolute_url($url, $fullurl)
    {
        $url = trim($url);
        // Already absolute: return it unchanged.
        if (preg_match('/^https?:\/\//i', $url)) {
            return $url;
        }
        // Scheme and host come from the base URL; the relative link has none.
        if (!preg_match('/^https?:\/\/[^\/]+/i', $fullurl, $mainurl)) {
            return $url;
        }
        // Root-relative link: keep the whole path and only prepend the host.
        if ($url !== '' && $url[0] == '/') {
            return $mainurl[0] . $url;
        }
        // Directories of the base URL, e.g. http://external.com/a/b gives array('a', 'b').
        $dirs = preg_split('/\//', (string) substr($fullurl, strlen($mainurl[0])), -1, PREG_SPLIT_NO_EMPTY);
        // A leading "./" refers to the current directory and can be dropped.
        $url = preg_replace('/^\.\//', '', $url);
        // Each leading "../" moves one directory up.
        while (strpos($url, '../') === 0) {
            array_pop($dirs);
            $url = substr($url, 3);
        }
        return $mainurl[0] . '/' . ($dirs ? implode('/', $dirs) . '/' : '') . $url;
    }
Thanks for replying, nico_swd! But I don't think your script does what I want. It does not fetch the links from an external URL (page), does it? My first post has more details.
You can build on nico_swd's code, but you need to check for an HTML base tag first. If the page sets one, use its href as the base URL; if not, use the input URL as the base URL. Then resolve each relative link against that base URL to get the full URL.
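The base tag check described above could look roughly like the sketch below. The function name find_base_url and the sample markup are made up for illustration; only DOMDocument and its methods come from PHP itself.

```php
<?php
// Hypothetical helper: returns the URL that relative links should be resolved against.
function find_base_url($html, $pageUrl)
{
    $doc = new DOMDocument();
    // Suppress warnings caused by malformed real-world HTML.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('base') as $base) {
        $href = trim($base->getAttribute('href'));
        if ($href !== '') {
            return $href; // The page declares its own base URL.
        }
    }

    return $pageUrl; // No usable <base> tag: fall back to the page's own URL.
}

// Sample pages (assumptions for illustration only).
$withBase = '<html><head><base href="http://cdn.external.com/"></head><body></body></html>';
$withoutBase = '<html><head></head><body><a href="/about.php">About</a></body></html>';

echo find_base_url($withBase, 'http://external.com/linkpage?pageid=7'), "\n";
echo find_base_url($withoutBase, 'http://external.com/linkpage?pageid=7'), "\n";
```

The first call prints the href of the base tag, the second prints the page URL that was passed in.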
OK, I think I've written something. But will it do what I want it to?
    $page = fopen("http://external.com/linkpage?pageid=7", "r");
    $linkStart = explode("href=\"", $page); //Split the page into strings.
    $linkCount = count($linkStart)-1; //The number of strings indicates the number of href='s (links) found except the first string (normally).
    for($i=1; $i<=$linkCount; $i++) {
        ${relLink.$i} = explode("\"", $linkStart); //The first string (element) of each relLink array should contain the relative link.
    }
    for($i=1; $i<=$linkCount; $i++) {
        echo "<a href=\"http://external.com" . ${relLink.$i}[0] . "\">Link #" . $i . "<\/a>";
    }
Can someone please tell me what's wrong with the following code:
    for($i=1; $i<=5; $i++) {
        ${string2.$i} = explode("\"", $string1);
    }
    for($i=1; $i<=5; $i++) {
        echo ${relLink.$i}[0];
    }
I guess the problem is with those variables highlighted in red. What did I do wrong?
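A few things stand out in the snippets above. Inside ${...}, an unquoted name such as string2 or relLink is read as a constant, not as text. The first loop fills string2N variables while the second reads relLinkN. And explode() needs a string as its second argument, so it has to be given one chunk rather than the whole array. A plain array avoids the variable-variable problem entirely. The sketch below uses a hard-coded $page string as a stand-in for the fetched page (with a real URL, file_get_contents() would return such a string, whereas fopen() only returns a handle).

```php
<?php
// Sample input standing in for the fetched page (an assumption for illustration).
$page = '<a href="/photos?ID=5&order=alpha">See more photos here</a>, and <a href="/about.php">About</a>';

$chunks = explode('href="', $page); // $chunks[0] is the text before the first link.
$relLinks = array();

for ($i = 1; $i < count($chunks); $i++) {
    $parts = explode('"', $chunks[$i]); // Split one chunk (a string), not the whole array.
    $relLinks[] = $parts[0];            // Everything up to the closing quote is the link.
}

foreach ($relLinks as $n => $link) {
    echo 'Link #' . ($n + 1) . ': ' . $link . "\n";
}
// Prints:
// Link #1: /photos?ID=5&order=alpha
// Link #2: /about.php
```

Splitting on href=" is fragile (single quotes, extra spaces, or hrefs on other tags will break it), so treat this as a fix for the loop logic rather than a robust link extractor.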
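If you have PHP5, you can use the DOMDocument class:
    $url = 'http://www.google.com';
    echo '<h1>URL: ' . $url . '</h1>';
    $n = new DOMDocument();
    @$n->loadHTMLFile($url); // Suppress warnings from malformed HTML.
    $construct = ''; // Initialise before appending to avoid an undefined variable notice.
    foreach ($n->getElementsByTagName('a') as $o) {
        // Prepend the site URL to each href; this assumes the links are root-relative.
        $construct .= '<li><a href="' . htmlspecialchars($url . $o->getAttribute('href')) . '">meh anchor</a></li>';
    }
    echo '<ol>' . $construct . '</ol>'; // Final
You can combine it with a form to add your own anchor text.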
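To tie the pieces of the thread together, here is a minimal sketch that takes an HTML string, makes each link absolute, and applies custom anchor texts by position. The function name relabel_links, the fallback text "Link #N", and the sample markup are assumptions for illustration. It only handles absolute and root-relative hrefs, and it parses a string so it can be tried without network access; for a live page the string would come from file_get_contents().

```php
<?php
// Hypothetical helper: lists the links found in $html as absolute links with custom anchor texts.
function relabel_links($html, $baseUrl, array $anchorTexts)
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // Suppress warnings from malformed markup.

    $out = array();
    $i = 0;
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href === '') {
            continue; // Skip anchors without a target.
        }
        // Absolute links pass through; everything else is appended to the base URL.
        if (!preg_match('/^https?:\/\//i', $href)) {
            $href = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
        }
        // Use the caller's text for this position, or a numbered fallback.
        $text = isset($anchorTexts[$i]) ? $anchorTexts[$i] : 'Link #' . ($i + 1);
        $out[] = '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($text) . '</a>';
        $i++;
    }
    return $out;
}

// Sample page content taken from the example in the first post.
$html = '<a href="/photos?ID=5&order=alpha">See more photos here</a>, and <a href="/about.php">About</a>';
$links = relabel_links($html, 'http://external.com', array('I set this anchor text', 'My anchor text'));

echo implode(', and ', $links), "\n";
```

Because the output goes through htmlspecialchars(), an ampersand in a query string is written as &amp; in the generated markup, which browsers read back as a plain ampersand.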