1. Advertising
    y u no do it?

    Advertising (learn more)

    Advertise virtually anything here, with CPM banner ads, CPM email ads and CPC contextual links. You can target relevant areas of the site and show ads based on geographical location of the user if you wish.

    Starts at just $1 per CPM or $0.10 per CPC.

perl command line pr checker

Discussion in 'Programming' started by Shoemoney, Aug 15, 2005.

  1. #1
    so i wrote this a while ago i find it very useful when I do batch checks on domains for pr (more then 5000 at a time) maybe someone will find it useful. just ./pr.pl (domain)

    
    #!/usr/bin/perl
    use LWP::Simple;
    use URI::Escape;
    use integer;
    
    $rank = getrank("$ARGV[0]");
    print "Pagerank for $ARGV[0]: $rank\n";
    sub zeroFill {
       my $z = hex(80000000);
       $a = $_[0];
       $b = $_[1];
    
       if ($z & $a) {
          $a = ($a >> 1);
          $a &= (~$z);
          $a |= 0x40000000;
          $a = ($a>>($b-1));
       } else {
          $a = ($a>>$b);
       }            
       return $a;
    }
    
    sub mix {
       my $a = $_[0];
       my $b = $_[1];
       my $c = $_[2];
       $a -= $b; $a -= $c; $a ^= (zeroFill($c,13));
       $b -= $c; $b -= $a; $b ^= ($a<<8);
       $c -= $a; $c -= $b; $c ^= (zeroFill($b,13));
       $a -= $b; $a -= $c; $a ^= (zeroFill($c,12));
       $b -= $c; $b -= $a; $b ^= ($a<<16);
       $c -= $a; $c -= $b; $c ^= (zeroFill($b,5));
       $a -= $b; $a -= $c; $a ^= (zeroFill($c,3));   
       $b -= $c; $b -= $a; $b ^= ($a<<10);
       $c -= $a; $c -= $b; $c ^= (zeroFill($b,15));
    
       @_ = ($a,$b,$c); 
       return @_;
    }
    sub GoogleCH {
       my @url = @_;
       my $length = 0;
       my $init = 0xE6359A60;
       
       if($length == 0) {
          $length = scalar @_;
       }
    
       my $a = my $b = 0x9E3779B9;
       my $c = $init;
       my $k = 0;
       my $len = $length;   
       while($len >= 12) {
          $a += ($url[$k+0] +($url[$k+1]<<8) +($url[$k+2]<<16) +($url[$k+3]<<24));
          $b += ($url[$k+4] +($url[$k+5]<<8) +($url[$k+6]<<16) +($url[$k+7]<<24));
          $c += ($url[$k+8] +($url[$k+9]<<8) +($url[$k+10]<<16)+($url[$k+11]<<24));     
          @_ = mix($a,$b,$c);
          $a = $_[0]; $b = $_[1]; $c = $_[2];
          $k += 12;
          $len -= 12;
       }
       
       $c += $length;
    
       while($len <= 11 && $len > 0) {
          if($len == 11) { $c+=($url[$k+10]<<24) }
          if($len == 10) { $c+=($url[$k+9]<<16) }
          if($len == 9) { $c+=($url[$k+8]<<8) }
          if($len == 8) { $b+=($url[$k+7]<<24) }
          if($len == 7) { $b+=($url[$k+6]<<16) }
          if($len == 6) { $b+=($url[$k+5]<<8) }
          if($len == 5) { $b+=($url[$k+4]) }
          if($len == 4) { $a+=($url[$k+3]<<24) }
          if($len == 3) { $a+=($url[$k+2]<<16) }
          if($len == 2) { $a+=($url[$k+1]<<8) }
          if($len == 1) { $a+=($url[$k+0]) }
          $len--;
       }
      
       @_ = mix($a,$b,$c);
       return $_[2];
    }
    
    
    sub strord {
       my @result;
       my $string = $_[0];
    
       my @char = split //, $string;
    
       for(my $i=0; $i<length($string); $i++) {
          $result[$i] = ord($char[$i]);
       }
       return @result;
    }
    
    sub getrank {
       my $url = $_[0];
       $url = 'info:'.$url;
    
       my $ch = GoogleCH(strord($url));
    
       my $file = get("http://www.google.com/search?client=navclient-auto&ch=6$ch&features=Rank&q=$url");
       my @rankarray = split(/:|\n/, $file);
       return $rankarray[4];  
    }
    
    sub getPr {
       my $getUrl = $address_widget->get_text;
       my $settext = $pgrankapp->get_widget('rankvalue')->set_text(getrank($getUrl));
       
    }
    
    Code (markup):

     
    Shoemoney, Aug 15, 2005 IP
    Willy likes this.
  2. daed

    daed Peon

    Messages:
    93
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #2
    I can't get the URL in the file to function, and accessing it says my client is forbidden. I wonder if they caught on and closed it down?
     
    daed, Aug 16, 2005 IP
  3. forkqueue

    forkqueue Guest

    Messages:
    401
    Likes Received:
    21
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Works very nicely for me, thanks Shoemoney!

    Nice clean Perl code too - like I always tell people, when Perl is well written it's perfectly readable. Bad code is unreadable whatever the language.
     
    forkqueue, Aug 16, 2005 IP
  4. daed

    daed Peon

    Messages:
    93
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #4
    How are you getting it to work? I get forbidden from just about any IP I try, with a link to the following paragraph:

    No Automated Querying

    You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that "sending automated queries" includes, among other things:

    * using any software which sends queries to Google to determine how a website or webpage "ranks" on Google for various queries;
    * "meta-searching" Google; and
    * performing "offline" searches on Google.
     
    daed, Aug 16, 2005 IP
  5. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #5
    daed, it works for me. you == sux ;)

    ..._
    ../\)
    ./ / ~.
    ( Y) [root@devbox(~)]: ./pr.pl digitalpoint.com
    Pagerank for digitalpoint.com: 7
     
    Shoemoney, Aug 16, 2005 IP
  6. Willy

    Willy Peon

    Messages:
    281
    Likes Received:
    25
    Best Answers:
    0
    Trophy Points:
    0
    #6
    Whoa, lots of bitshifts going on there ;) ...gonna have to take a closer look at a better time, but looks interesting, thanks! Is there any documentation/sample code you based it on?
     
    Willy, Aug 16, 2005 IP
  7. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #7
    ummm nope use it or dont ;)

    I suck at programing. I am sure its unsecure and unstable. I wrote it in a few mins. I just thought I would share what i use.
     
    Shoemoney, Aug 16, 2005 IP
  8. Willy

    Willy Peon

    Messages:
    281
    Likes Received:
    25
    Best Answers:
    0
    Trophy Points:
    0
    #8
    You misunderstood me :) It just wasn't immediately obvious to me what the GoogleCH sub does, so I was wondering whether that's some special key algorithm that might be published in some Google paper somewhere... So if anything, that's only a reflection on my ability to (mis)understand Perl :D

    Inquiring minds want to know ;)
     
    Willy, Aug 16, 2005 IP
  9. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #9
    no no im saying I really suck.. seriously... I got ideas from the perl cookbook then searched the net to compile that code above. I use it constantly but its possible its unstable or insecure. I just wanted to state that for the record.
     
    Shoemoney, Aug 16, 2005 IP
  10. daed

    daed Peon

    Messages:
    93
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Yeah.. my copy of perl sucks. =[ I can get it to work on other ones. What version of perl are you running?
     
    daed, Aug 17, 2005 IP
  11. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #11
    whatever is debian stable ;)

    root@devbox:/home/shamu/public_html# perl -v

    This is perl, v5.8.4 built for i386-linux-thread-multi
     
    Shoemoney, Aug 17, 2005 IP
  12. script909

    script909 Peon

    Messages:
    33
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #12
    does it recuires google-licence?
     
    script909, Aug 18, 2005 IP
  13. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #13
    no it does not
     
    Shoemoney, Aug 18, 2005 IP
  14. script909

    script909 Peon

    Messages:
    33
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #14
    pretty thing
    respect
     
    script909, Aug 18, 2005 IP
  15. Shoemoney

    Shoemoney $

    Messages:
    4,474
    Likes Received:
    588
    Best Answers:
    0
    Trophy Points:
    295
    #15
    thanks hope you find it useful
     
    Shoemoney, Aug 18, 2005 IP
  16. flash_f

    flash_f Peon

    Messages:
    21
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #16
    wow, itss nice and cool
     
    flash_f, Dec 9, 2005 IP
  17. jbw

    jbw Peon

    Messages:
    343
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    0
    #17
    Old thead it seems, but for the person asking, that bitshiffting and such looks like the Burtle Hash.
     
    jbw, Dec 10, 2005 IP
  18. Will.Spencer

    Will.Spencer NetBuilder

    Messages:
    14,789
    Likes Received:
    1,040
    Best Answers:
    0
    Trophy Points:
    375
    #18
    Yep, Google appears to have copied Bob Jenkins' code.
     
    Will.Spencer, Jan 10, 2006 IP
  19. JRBHosting

    JRBHosting Peon

    Messages:
    121
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #19
    You know, I hate bringing up dead topics, but this one has a legitimate purpose :)

    Just for everybody's information, this code as of this post is NOT working. I presume this is due to Google's checksum, etc. changes.

    If anybody could change this code so that it works...that would be great. If not, its not as if I lost anything :p

    Jason
     
    JRBHosting, Mar 24, 2007 IP
  20. sea otter

    sea otter Peon

    Messages:
    250
    Likes Received:
    23
    Best Answers:
    0
    Trophy Points:
    0
    #20
    The CPAN PageRank module works fine; just tested it (again) two minutes ago.

    The module is WWW::Google::pageRank

    Just install it from within CPAN.

    Here's a quick sample script based on it to check PR (save it to a file called pr.pl, and make sure you chmod a+x it before running):

    
    #!/usr/bin/perl
    
    use locale;
    use WWW::Google::PageRank;
    
    
    my $pr = WWW::Google::PageRank->new();
    
    my $rank = $pr->get(shift);
    
    print "PageRank: " . $rank . "\n";
    
    Code (markup):
    For example:

    
    ./pr.pl http://digitalpoint.com
    
    Code (markup):
    You should of course modify the above to verify/sanitize the input, add missing http:// etc.

    As an aside, there are known problems with the freely available PHP pagerank scripts, related to some esoteric problems with the bitshift operator and certain versions of gcc. I now always use a perl PR checker based on the above CPAN module, which I call from within PHP.

    This setup has been working for years, never a problem.
     
    sea otter, Mar 24, 2007 IP
    Will.Spencer likes this.