Counting links

Hello,

I would like to count all:
1. Internal links
2. Outbound links

on a page, where:

$url = "mypage.com/about.html";
$pageHTML = file_get_contents('http://' . $url);

For example, http://whois.domaintools.com/digitalpoint.com reports:

Links 7 (Internal: 7, Outbound: 0)

So:

<a href="contact.html">contact us</a> is an internal link.
<a href="http://www.mypage.com/info.html">contact us</a> is an internal link.
<a href="http://mypage.com/hello.html">hello world</a> is an internal link.
<a href="https://secure.mypage.com/hello.html">hello world</a> is an internal link.
<a href="http://www.google.com/">google</a> is an external link.
<a href="https://secure.google.com/">google</a> is an external link.
<a href="javascript:dosomething();">JS</a> is NOT a link.

Any ideas? Thanks!
Try something like this:

<?php
function parseLinks( $url ) {
    $int = array();
    $out = array();
    $content = @file_get_contents( $url );
    if( $content ) {
        // Host of the page being scanned, with a leading "www." (or "ww2." etc.) removed
        $urlParts = @parse_url( $url );
        $curHost = preg_replace("/^ww.\./ims", '', $urlParts['host']);
        // Capture the href value of every <a> tag
        preg_match_all("/<a.*?href\s*=\s*['\"]?([^\s>'\"]+)['\"]?/ims", $content, $matches, PREG_PATTERN_ORDER );
        foreach ( $matches[1] as $link ) {
            $link = trim( $link );
            // Skip same-page anchors
            if( $link[0] == '#' ) {
                continue;
            }
            $urlParts = @parse_url( $link );
            // Skip non-HTTP schemes such as javascript: or mailto:
            if( @$urlParts['scheme'] && stripos( $urlParts['scheme'], 'http' ) === false ) {
                continue;
            }
            if( $urlParts ) {
                // A link is outbound when it names a host that does not contain the page's host
                if ( @$urlParts['host'] && stripos( $urlParts['host'], $curHost ) === false ) {
                    $out[] = $link;
                } else {
                    $int[] = $link;
                }
            }
        }
        return array( 'int' => $int, 'out' => $out );
    }
    return false;
}

$res = parseLinks('http://forums.digitalpoint.com/showthread.php?p=10034063&posted=1#post10034063');
echo "Internal: ".count( $res['int'] )."<br />";
echo "Outbound: ".count( $res['out'] )."<br />";
?>
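The same classification can also be sketched with PHP's DOMDocument parser instead of a regular expression. This is only one possible variant, not the code posted above: the names countLinks() and baseDomain() are made up for illustration, and it assumes a site can be identified by the last two labels of its host name, which is not true for suffixes such as .co.uk.

```php
<?php
// Sketch only: countLinks() and baseDomain() are illustrative names.

function baseDomain(string $host): string
{
    // Naive assumption: the last two host labels identify the site, so
    // www.mypage.com and secure.mypage.com both become mypage.com.
    $labels = explode('.', strtolower($host));
    return implode('.', array_slice($labels, -2));
}

function countLinks(string $html, string $pageHost): array
{
    $counts = ['int' => 0, 'out' => 0];
    if ($html === '') {
        return $counts; // loadHTML() rejects an empty string
    }
    $site = baseDomain($pageHost);

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // no target, or a same-page anchor
        }
        $parts = parse_url($href);
        if ($parts === false) {
            continue; // seriously malformed URL
        }
        if (isset($parts['scheme'])
            && !in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
            continue; // javascript:, mailto: and similar
        }
        if (isset($parts['host']) && baseDomain($parts['host']) !== $site) {
            $counts['out']++;
        } else {
            $counts['int']++; // relative URL or same site
        }
    }
    return $counts;
}

// Usage with the seven links from the question.
$sample = '<a href="contact.html">contact us</a>'
    . '<a href="http://www.mypage.com/info.html">contact us</a>'
    . '<a href="http://mypage.com/hello.html">hello world</a>'
    . '<a href="https://secure.mypage.com/hello.html">hello world</a>'
    . '<a href="http://www.google.com/">google</a>'
    . '<a href="https://secure.google.com/">google</a>'
    . '<a href="javascript:dosomething();">JS</a>';
$res = countLinks($sample, 'mypage.com');
echo "Internal: " . $res['int'] . "<br />";
echo "Outbound: " . $res['out'] . "<br />";
```

For the question's sample links this should report 4 internal and 2 outbound. Comparing whole domains rather than using a substring test also avoids treating a host such as notmypage.com as internal.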
Thank you, it's working. Perfect solution. But I don't want the Google AdSense advertisement links to be counted. Thank you.
caykoylu, Google AdSense links are generated by JavaScript, so they are not in the HTML source code that file_get_contents() returns.
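A small sketch of why this is so: the fetched source only contains the script tag that later builds the ad in the browser, not the ad's links. The markup below is a simplified, made-up stand-in for an ad snippet (not copied from any real page), and countAnchors() is an illustrative helper name.

```php
<?php
// Sketch only: countAnchors() is an illustrative helper, and $source is a
// simplified placeholder for what a page with a script-driven ad might contain.

function countAnchors(string $html): int
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    // Only <a> elements present in the static markup are counted.
    return $doc->getElementsByTagName('a')->length;
}

$source = '<html><body>'
    . '<a href="contact.html">contact us</a>'
    . '<script src="http://ads.example.com/show_ads.js"></script>'
    . '<ins class="ad-slot"></ins>'
    . '</body></html>';

// The ad links would be created by the script at run time in a browser,
// so the only anchor in the source is the contact link.
echo "Links in source: " . countAnchors($source) . "<br />";
```

Because PHP never runs the script, nothing has to be filtered out: the ad links are simply never part of the string being parsed.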