Is there a tool where I can dump in a list of websites and check the number of outgoing links on each? I have a list of blog posts to comment on, but I don't even want to bother if they already have over 50 outgoing links. So I want to be able to dump all my URLs in a textbox and retrieve the number of outgoing links that point to other domains. Any ideas? Does something like this exist, or can it be easily made in PHP?
That's perfect, but is there one that does it in bulk? I have hundreds of URLs I want to just dump in.
Hi,

Here is a snippet of code I use for my SEO-related tools. It takes the web page content and the domain as input and returns an array of outbound links:

function get_outbound_links($data, $url = '')
{
    // Strip the scheme and a leading "www." to get the bare domain.
    $pattern = array('/^http:\/\//', '/^www\./');
    $replace = array('', '');
    $uri = preg_replace($pattern, $replace, $url);

    $uri_orig = 'http://' . $uri;
    $uri_www  = 'http://www.' . $uri;

    // "$" is the regex delimiter here; preg_quote() escapes any "$" in the URL.
    $preg_quote_1 = '$^' . preg_quote($uri_orig) . '$is';
    $preg_quote_2 = '$^' . preg_quote($uri_www) . '$is';
    $preg_quote_3 = '$^' . preg_quote('http://') . '$is';

    $links = array();

    // Collect link targets from <a href>, <frame>/<iframe> src, and <area href>.
    preg_match_all("/<a\\s[^>]*href=([\"']?)([^\\s\"'>]+)\\1/is", $data, $matches_anchor, PREG_SET_ORDER);
    preg_match_all("/<i?frame\\s[^>]*src=([\"']?)([^\\s\"'>]+)\\1/is", $data, $matches_frame, PREG_SET_ORDER);
    preg_match_all("/<area\\s[^>]*href=([\"']?)([^\\s\"'>]+)\\1/is", $data, $matches_area, PREG_SET_ORDER);
    $matches = array_merge($matches_anchor, $matches_frame, $matches_area);

    foreach ($matches as $id => $match) {
        // Outbound = absolute http:// link that does not start with this domain.
        // Note: https:// links are not matched as written.
        if (!preg_match($preg_quote_1, $match[2])
            && !preg_match($preg_quote_2, $match[2])
            && preg_match($preg_quote_3, $match[2])) {
            $links[$match[2]] = 1; // keyed by URL, so duplicates collapse
        }
    }

    return array('outbound_links' => $links);
}

You can modify it to accept only a URL, then read the content of that URL and do the processing.
You can just run preg_match_all() on all the link tags and then use count() to count the number of links, which is simpler.
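The preg_match_all() plus count() idea could look roughly like the sketch below. The function name count_outbound_links() and its parameters are made up for illustration; it assumes only absolute http:// or https:// links count as outgoing, and it treats "www.example.com" and "example.com" as the same site. Protocol-relative links (//host/path) are ignored in this sketch.

```php
<?php
// Hypothetical helper: counts distinct links in $html that point away from
// the host of $pageUrl. A regex is a rough tool for HTML, so treat the
// result as an estimate.
function count_outbound_links(string $html, string $pageUrl): int
{
    // Host of the page itself, lowercased and without a leading "www.".
    $ownHost = preg_replace('/^www\./', '', strtolower((string) parse_url($pageUrl, PHP_URL_HOST)));

    // Grab every href value inside an <a ...> tag (quoted or unquoted).
    preg_match_all('/<a\s[^>]*href\s*=\s*(["\']?)([^\s"\'>]+)\1/i', $html, $matches);

    $external = array();
    foreach ($matches[2] as $href) {
        // Relative links stay on the same site, so skip anything not absolute.
        if (!preg_match('#^https?://#i', $href)) {
            continue;
        }
        $host = preg_replace('/^www\./', '', strtolower((string) parse_url($href, PHP_URL_HOST)));
        if ($host !== '' && $host !== $ownHost) {
            $external[$href] = true; // array keys de-duplicate repeated links
        }
    }

    return count($external);
}
```

Counting unique URLs is a choice made here; if every occurrence should count, increment a counter instead of using the URL as an array key.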
Hate to bump an old thread, but I'm looking for exactly the same thing here. The above code seems more suited to listing the URLs than to getting a gross count per URL. Any help with this would be appreciated.
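For a gross count per URL from a pasted list, something along these lines might work. It is only a sketch: bulk_outbound_counts(), normalize_host() and the optional $fetch callback are names invented here, the default download uses file_get_contents() (which needs allow_url_fopen enabled), and only absolute http:// or https:// links to a different host are counted.

```php
<?php
// Hypothetical helper: returns array(url => outbound link count) for a list of
// URLs pasted one per line. A count of null means the page could not be fetched.
// $fetch is an assumed hook so the download step can be swapped (cURL, tests).
function bulk_outbound_counts(string $urlList, ?callable $fetch = null): array
{
    $fetch = $fetch ?? function (string $url) {
        return @file_get_contents($url); // returns false on failure
    };

    $counts = array();
    foreach (preg_split('/\R/', $urlList, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $url = trim($line);
        if ($url === '') {
            continue; // skip whitespace-only lines
        }

        $html = $fetch($url);
        if (!is_string($html)) {
            $counts[$url] = null;
            continue;
        }

        $ownHost = normalize_host($url);

        // Absolute http(s) hrefs inside <a ...> tags, quoted or unquoted.
        preg_match_all('/<a\s[^>]*href\s*=\s*(["\']?)(https?:\/\/[^\s"\'>]+)\1/i', $html, $m);

        $external = array();
        foreach ($m[2] as $href) {
            $host = normalize_host($href);
            if ($host !== '' && $host !== $ownHost) {
                $external[$href] = true; // unique URLs only
            }
        }
        $counts[$url] = count($external);
    }

    return $counts;
}

// Lowercased host without a leading "www."; empty string if it cannot be parsed.
function normalize_host(string $url): string
{
    return preg_replace('/^www\./', '', strtolower((string) parse_url($url, PHP_URL_HOST)));
}
```

With the textbox contents in a variable such as $_POST['urls'] (an assumed field name), the result can be filtered to keep only pages at or under the 50-link threshold, for example with array_filter() and a check like $n !== null && $n <= 50.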