PageRank script for multiple domains not working in a loop

Discussion in 'PHP' started by rishirajsingh, Aug 31, 2007.

  1. #1
    I found a script that looks up the PageRank of any website or URL.
    I modified it to look up the PageRank of multiple websites.
    The script runs for multiple URLs, but it displays the PageRank of only the last URL.
    I guess I am making some silly mistake, but I am not able to find it.
    Please help. I have posted the code below.

    
    <?php
    
    // I am testing script on server http://www.indiamarketinggroup.com/client/pagerank.php
    //PageRank Lookup v1.1 by HM2K (update: 31/01/07)
    
    if ((!isset($_POST['urls'])) && (!isset($_GET['urls'])))
    { echo '<center><form action="" method="post"><textarea name="urls" cols="36" rows="9"></textarea><br><input type="submit" name="Submit" value="enter multiple url on new lines and click here"></form></center>'; }
    if (isset($_POST['urls'])) 
    {  
    	$tempurls=$_POST['urls'];
    	$pieces = explode("\n", $tempurls); // finding all submitted urls
    	$size=sizeof($pieces);		// array size to use in the for loop at the end of program
    	
    //settings - host and user agent
    $googlehost='toolbarqueries.google.com';
    $googleua='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5';
    
    //convert a string to a 32-bit integer
    function StrToNum($Str, $Check, $Magic) {
        $Int32Unit = 4294967296;  // 2^32
    
        $length = strlen($Str);
        for ($i = 0; $i < $length; $i++) {
            $Check *= $Magic; 	
            //If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31), 
            //  the result of converting to integer is undefined
            //  refer to http://www.php.net/manual/en/language.types.integer.php
            if ($Check >= $Int32Unit) {
                $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
                //if the check less than -2^31
                $Check = ($Check < -2147483648) ? ($Check + $Int32Unit) : $Check;
            }
            $Check += ord($Str[$i]); // square brackets: the {} string offset syntax was removed in PHP 8
        }
        return $Check;
    }
    
    //generate a hash for a url
    function HashURL($String) {
        $Check1 = StrToNum($String, 0x1505, 0x21);
        $Check2 = StrToNum($String, 0, 0x1003F);
    
        $Check1 >>= 2; 	
        $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
        $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
        $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);	
    	
        $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) <<2 ) | ($Check2 & 0xF0F );
        $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );
    	
        return ($T1 | $T2);
    }
    
    //generate a checksum for the hash string
    function CheckHash($Hashnum) {
        $CheckByte = 0;
        $Flag = 0;
    
        $HashStr = sprintf('%u', $Hashnum) ;
        $length = strlen($HashStr);
    	
        for ($i = $length - 1;  $i >= 0;  $i --) {
            $Re = $HashStr[$i]; // square brackets: the {} string offset syntax was removed in PHP 8
            if (1 === ($Flag % 2)) {              
                $Re += $Re;     
                $Re = (int)($Re / 10) + ($Re % 10);
            }
            $CheckByte += $Re;
            $Flag ++;	
        }
    
        $CheckByte %= 10;
        if (0 !== $CheckByte) {
            $CheckByte = 10 - $CheckByte;
            if (1 === ($Flag % 2) ) {
                if (1 === ($CheckByte % 2)) {
                    $CheckByte += 9;
                }
                $CheckByte >>= 1;
            }
        }
    
        return '7'.$CheckByte.$HashStr;
    }
    
    //return the pagerank checksum hash
    function getch($url) { return CheckHash(HashURL($url)); }
    
    //return the pagerank figure
    function getpr($url) {
    	global $googlehost,$googleua;
    	$ch = getch($url);
    	$fp = fsockopen($googlehost, 80, $errno, $errstr, 30);
    	if ($fp) {
    	   $out = "GET /search?client=navclient-auto&ch=$ch&features=Rank&q=info:$url HTTP/1.1\r\n";
    	   //echo "<pre>$out</pre>\n"; //debug only
    	   $out .= "User-Agent: $googleua\r\n";
    	   $out .= "Host: $googlehost\r\n";
    	   $out .= "Connection: Close\r\n\r\n";
    	
    	   fwrite($fp, $out);
    	   
    	   //$pagerank = substr(fgets($fp, 128), 4); //debug only
    	   //echo $pagerank; //debug only
    	   while (!feof($fp)) {
    			$data = fgets($fp, 128);
    			//echo $data;
    			$pos = strpos($data, "Rank_");
    			if($pos === false){} else{
    				$pr=substr($data, $pos + 9);
    				$pr=trim($pr);
    				$pr=str_replace("\n",'',$pr);
    				return $pr;
    			}
    	   }
    	   //else { echo "$errstr ($errno)<br />\n"; } //debug only
    	   fclose($fp);
    	}
    }
    
    //generate the graphical pagerank
    function pagerank($url,$width=40,$method='style') {
    	if (!preg_match('/^(http:\/\/)?([^\/]+)/i', $url)) { $url='http://'.$url; }
    	$pr=getpr($url);
    	$pagerank="PageRank: $pr/10";
    
    	//The (old) image method
    	if ($method == 'image') {
    	$prpos=$width*$pr/10;
    	$prneg=$width-$prpos;
    	$html='<img src="http://www.google.com/images/pos.gif" width='.$prpos.' height=4 border=0 alt="'.$pagerank.'"><img src="http://www.google.com/images/neg.gif" width='.$prneg.' height=4 border=0 alt="'.$pagerank.'">';
    	}
    	//The pre-styled method
    	if ($method == 'style') {
    	$prpercent=100*$pr/10;
    	$html='<div style="position: relative; width: '.$width.'px; padding: 0; background: #D9D9D9;"><strong style="width: '.$prpercent.'%; display: block; position: relative; background: #5EAA5E; text-align: center; color: #333; height: 4px; line-height: 4px;"><span></span></strong></div>';
    	}
    	
    	$out='<a href="'.$url.'" title="'.$pagerank.'">'.$html.'</a>';
    	return $out." PageRank = ".$pr." /10<br>";
    }
    
    // for loop for finding pagerank of multiple urls
    		for ($j=0; $j<$size; $j++)
    		{		
    		$url=$pieces[$j];
    		echo $pieces[$j].pagerank($pieces[$j]);
    		}
    
    }
    ?>
    
     
    rishirajsingh, Aug 31, 2007 IP
  2. sea otter

    #2
    Except for the last URL you explode() from the textarea box, each URL you pass to the pagerank function has an extra trailing whitespace character (a carriage return, since the textarea submits lines ending in \r\n and you split only on \n).

    Replace the for loop at the bottom of your code with this one. Tested and works.

    
    // for loop for finding pagerank of multiple urls
    		foreach ($pieces as $url)
    		{
    			$url = trim($url);
    			echo $url, pagerank($url);
    		}
    
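To see why every URL except the last one is affected, here is a minimal sketch. It assumes the browser submits textarea line breaks as \r\n; `splitUrls()` is a hypothetical helper name, not part of the original script. It shows what `explode("\n", ...)` leaves behind and one way to split on any line-break style while dropping empty lines.

```php
<?php
// splitUrls() is a hypothetical helper, not part of the original script.
// It splits textarea input on any line-break style and drops empty lines.
function splitUrls($raw) {
    // \R matches \r\n, \n, or \r as a single line break.
    $lines = preg_split('/\R/', $raw);
    $urls = array();
    foreach ($lines as $line) {
        $line = trim($line); // removes stray spaces and any leftover \r
        if ($line !== '') {
            $urls[] = $line;
        }
    }
    return $urls;
}

// Simulated textarea submission: browsers send line breaks as \r\n.
$raw = "example.com\r\nexample.org\r\nexample.net";

// Splitting on "\n" alone leaves "\r" on every entry except the last.
$naive = explode("\n", $raw);
var_dump($naive[0] === "example.com\r");
var_dump($naive[2] === "example.net");

print_r(splitUrls($raw));
```

The cleaned array could then be passed to the foreach loop in place of `$pieces`; the `trim()` call inside the loop would become redundant but harmless.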
     
    sea otter, Aug 31, 2007 IP
  3. Brandon Sheley

    #3
    The demo didn't work for my forum. It's a 5, but it shows 0/10.
     
    Brandon Sheley, Aug 31, 2007 IP
  4. sea otter

    #4
    There's a known problem with bitshifts in PHP compiled with certain versions of GCC on certain platforms. This results in errors in much of the math used to calculate PR, so you end up with an invalid hash and hence a PR of 0 (or, rather, undefined).

    I ran into this a few times and ended up simply delegating to the CPAN Perl PageRank module, which has never been a problem anywhere.

    This might be the source of your problem.
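
A rough way to check whether integer width is a factor on a given host is sketched below. This is only an assumption about one possible cause, not a confirmed diagnosis of the 0/10 result: `toUint32()` is a hypothetical helper, not part of the original script, and it simply reduces a value to the unsigned 32-bit range on both 32-bit and 64-bit PHP builds so that results can be compared across servers.

```php
<?php
// toUint32() is a hypothetical helper, not part of the original script.
// It reduces a number to the range 0 .. 2^32 - 1.
function toUint32($n) {
    if (PHP_INT_SIZE >= 8) {
        // 64-bit build: integers are wide enough to mask directly.
        return ((int) $n) & 0xFFFFFFFF;
    }
    // 32-bit build: values above 2^31 - 1 are floats, so use float math.
    $r = fmod((float) $n, 4294967296.0);
    return ($r < 0) ? $r + 4294967296.0 : $r;
}

// Integer width of this PHP build: 4 bytes (32-bit) or 8 bytes (64-bit).
echo "PHP_INT_SIZE = ", PHP_INT_SIZE, "\n";

// Values that should agree regardless of platform.
echo toUint32(4294967296 + 5), "\n";
echo toUint32(-1), "\n";
```

If two servers print different values for `PHP_INT_SIZE`, the intermediate hash values from `HashURL()` are worth comparing between them before blaming the network or the remote service.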
     
    sea otter, Aug 31, 2007 IP
  5. gopher292

    #5
    Do you mind if I use this code in my own projects?
     
    gopher292, Sep 1, 2007 IP
  6. rishirajsingh

    #6
    You can use it for personal use, but not to sell it or run it as an online tool.
    PM me if you want to build an online tool with it or use it in any other online project.
     
    rishirajsingh, Sep 6, 2007 IP
  7. kendhin

    #7
    I tried the script, but it didn't work. Is it because of my web hosting? I tested it on three different hosts (Freehostia, 3host.biz, and GeoCities), and they show different results. Freehostia shows the warning "fsockopen(): unable to connect to toolbarqueries.google.com:80 in /home/www/......". 3host.biz shows no warning, but no result either.
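
A warning like the one from fsockopen() above usually means the connection itself failed, which can be tested separately from the PageRank logic. The sketch below is one way to do that; `probeHost()` and `describeSocketFailure()` are hypothetical helper names, not part of the original script, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Both functions are hypothetical helpers, not part of the original script.

// Build a readable message from the values fsockopen() reports on failure.
function describeSocketFailure($host, $port, $errno, $errstr) {
    return "Could not connect to $host:$port ($errno): $errstr";
}

// Try to open a TCP connection; return null on success or a message on failure.
function probeHost($host, $port = 80, $timeout = 10) {
    $errno = 0;
    $errstr = '';
    // @ suppresses the PHP warning so the failure can be reported below.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return describeSocketFailure($host, $port, $errno, $errstr);
    }
    fclose($fp);
    return null;
}

// Usage (needs outbound network access, so it is left commented out):
// $problem = probeHost('toolbarqueries.google.com');
// echo $problem === null ? "Connection OK\n" : $problem . "\n";
```

If the probe fails on one host and succeeds on another, the difference is likely in the hosting setup (for example, blocked outbound connections) rather than in the script.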
     
    kendhin, Dec 6, 2007 IP
  8. lonlygurl

    #8
    lonlygurl, Jul 17, 2008 IP