Useful code: check indexed pages in Google, MSN and Yahoo

Discussion in 'PHP' started by mkeen, Nov 16, 2005.

  1. #1
    Here is a bit of code I just wrote, if anybody wants to use it. It's probably a bit sloppy and might need a cleanup.

    It scrapes Google's result pages rather than using the API, so it's up to you whether you think that is wrong/bad or not.


    Change the URLs to .com if you prefer; I'm using .co.uk as I'm in the UK.

    <?php
    // Domain to check; "site:" limits results to pages from that domain.
    $domain = "www.google.com";
    $keyword = urlencode("site:$domain");

    // Fetch a results page and return the count captured by $pattern (0 on failure).
    // The patterns below match the result-page markup at the time of writing.
    function indexed_pages($url, $pattern) {
        $page = @file_get_contents($url);
        if ($page === false) {
            return 0;
        }
        $page = str_replace("\n", "", $page);
        if (preg_match($pattern, $page, $matches)) {
            // Drop leftover tags, words, commas and spaces, keeping only the digits.
            return (int) preg_replace('/\D/', '', strip_tags($matches[1]));
        }
        return 0;
    }

    $numbergoogle = indexed_pages(
        "http://www.google.co.uk/search?hl=en&safe=off&q=$keyword&btnG=Search&meta=",
        '/of about <b>(.*?)</');
    $numberyahoo = indexed_pages(
        "http://search.yahoo.com/search?p=$keyword",
        '/of about <strong>(.*?) </');
    $numbermsn = indexed_pages(
        "http://search.msn.co.uk/results.aspx?q=$keyword",
        '/Page 1 of (.*?)results/');

    echo "$domain google: $numbergoogle - Yahoo $numberyahoo - MSN $numbermsn \n";

    You can wrap a while loop around it if you want to check multiple domains.
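
    As a rough sketch of that loop (using foreach rather than while, and Google only to keep it short), something like the following might do. The helper names extract_count and google_counts are made up for this example, and the fetcher is passed in as a parameter so the loop can be tried offline with a stub before pointing it at file_get_contents.

```php
<?php
// Hypothetical helper: pull the first result count out of a page of HTML.
function extract_count($html, $pattern) {
    $html = str_replace("\n", "", $html);
    if (preg_match($pattern, $html, $matches)) {
        // Keep only the digits of whatever the pattern captured.
        return (int) preg_replace('/\D/', '', strip_tags($matches[1]));
    }
    return 0;
}

// Hypothetical helper: look up the Google count for each domain in turn.
// $fetch is any callable that takes a URL and returns HTML (or false on failure),
// for example 'file_get_contents'.
function google_counts(array $domains, $fetch) {
    $counts = array();
    foreach ($domains as $domain) {
        $url = "http://www.google.co.uk/search?hl=en&safe=off&q=" . urlencode("site:$domain");
        $html = call_user_func($fetch, $url);
        $counts[$domain] = ($html === false) ? 0 : extract_count($html, '/of about <b>(.*?)</');
    }
    return $counts;
}

// Offline demonstration: a stub stands in for the real HTTP request.
$stub = function ($url) {
    return "Results 1 - 10 of about <b>1,234</b> for <b>site:example</b>";
};
$domains = array("www.example.com", "www.example.org"); // placeholder domains
foreach (google_counts($domains, $stub) as $domain => $count) {
    echo "$domain google: $count\n";
}
```

    For a live run, pass 'file_get_contents' instead of $stub. Pausing between requests (for instance with sleep()) is probably sensible when checking many domains.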

    Here is what the output looks like when I run it for some of my sites.

    www.*****.biz google: 192 - Yahoo 242 - MSN 11
    www.*****.info google: 48 - Yahoo 292 - MSN 7
    www.*****.co.uk google: 114 - Yahoo 509 - MSN 13
    www.*****.info google: 30 - Yahoo 250 - MSN 79
    www.*****.info google: 28 - Yahoo 384 - MSN 11
    www.*****.com google: 34 - Yahoo 338 - MSN 10
    www.*****.info google: 124 - Yahoo 5 - MSN 11
    www.*****.org google: 28 - Yahoo 347 - MSN 13
    www.*****.net google: 11 - Yahoo 1 - MSN 3


    Hope someone finds it useful.
     
    mkeen, Nov 16, 2005
  2. lucasnet
    #2
    It's very useful, great script, thanks!
     
    lucasnet, May 7, 2007
  3. monchito
    #3
    Very useful and in perfect working order. It could be run from a crontab to check automatically every week.
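
    For weekly runs it probably helps to keep a dated record rather than just echoing the numbers. Below is a minimal sketch of that idea; log_counts and the log file location are assumptions for illustration, not part of the original script. A crontab entry along the lines of "0 6 * * 1 php /path/to/check_index.php" (the path is a placeholder) would run it every Monday at 06:00.

```php
<?php
// Hypothetical helper: append one dated CSV row per run so weekly results can be compared.
// $when is a Unix timestamp; it defaults to the current time.
function log_counts($logfile, $domain, $google, $yahoo, $msn, $when = null) {
    $when = ($when === null) ? time() : $when;
    $row = array(gmdate("Y-m-d", $when), $domain, $google, $yahoo, $msn);
    // FILE_APPEND keeps earlier rows; LOCK_EX guards against overlapping runs.
    $written = file_put_contents($logfile, implode(",", $row) . "\n", FILE_APPEND | LOCK_EX);
    return $written !== false;
}

// Example call with made-up numbers; a real run would pass the scraped counts.
$logfile = sys_get_temp_dir() . "/indexed_pages.csv"; // assumed location
log_counts($logfile, "www.example.com", 192, 242, 11);
```

    Each run adds one line such as date,domain,google,yahoo,msn, which can be opened in a spreadsheet later.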
     
    monchito, Jun 29, 2007