preg_match

Discussion in 'PHP' started by Savo, Dec 19, 2008.

  1. #1
    Hi,

    I have just started learning PHP and am stuck on preg_match. I have been able to use it to scrape bits of web pages, but now I want to scrape all the data between two words and can't work out how.

    I am trying to scrape my Google rank and need to get all the data from $url to clnk',''[0-9]

    I have been able to get them separately, but this always makes me rank 1.

    Thanks for any help

    Sav
     
    Savo, Dec 19, 2008 IP
  2. Danltn

    #2
    Please confirm, are you trying to get your Pagerank, or your SERP?

    If you can give more information (esp. URLs) I can almost definitely help. RegEx is a personal strong point.

    Dan.
     
    Danltn, Dec 19, 2008 IP
  3. dcwilliam1

    #3
    Gah! I hope you're not talking about regular expressions. I have spent hours upon hours trying to figure them out, and I have read a book dedicated specifically to them.
    Even when I walk away thinking I have a full working knowledge, I still end up doing it by hit or miss.
    Good luck. That stuff is not easy, even for some experts.
     
    dcwilliam1, Dec 19, 2008 IP
  4. Danltn

    #4
    If you go through them in a logical way, separating them into parts, they're an awful lot easier than you might think. And I've learnt without a book, or even a solid website.

    Dan
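
    As one illustration of reading a pattern part by part, here is a minimal sketch using PCRE's extended (x) mode, which ignores unescaped whitespace in the pattern and lets # start a comment. The sample string and the variable names are made up for the example and are not taken from anyone's code in this thread.

    ```php
    <?php
    // Hypothetical sample text; only its overall shape matters here.
    $text = "example.com/page ... 'clnk','7','";

    // In x mode, whitespace in the pattern is ignored and # starts a comment,
    // so each part of the expression can sit on its own line.
    $pattern = "/
        example\\.com    # part 1: the literal domain (the dot is escaped)
        .*?              # part 2: as little text as possible
        clnk','          # part 3: the literal marker before the number
        ([0-9]+)         # part 4: capture one or more digits
    /x";

    if (preg_match($pattern, $text, $m)) {
        echo $m[1], "\n"; // prints the captured digits: 7
    }
    ```

    Written this way, each part can be tested on its own before the next one is added.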
     
    Danltn, Dec 19, 2008 IP
  5. Savo

    #5

    I need to be able to get the rank that is listed after the URL.

    I have used:
    
    //this checks to see if the url is listed
    (preg_match($KeywordsEscaped, $isurllisted, $match))
    
    //This will then return the rank
    (preg_match("/$match clnk','[0-9]/", $isurllisted, $match));
    
    PHP:
    The only problem with this is that it pulls the first rank on the page, so I come out as rank 1 for every keyword. I was thinking that if I could get all the data between the URL and the rank, I would have a new string containing only the correct rank, which the second preg_match would then return.

    Thanks again

    Sav

    Here is the source from Google:
    
    www.costacomms.com/+costacomms&hl=en&ct=clnk&cd=1&gl=uk&client=firefox-a" onmousedown="return rwt(this,'','','clnk','1','
    HTML:
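
    To make the "always rank 1" symptom concrete, here is a minimal sketch, assuming the page source looks like the line above repeated once per result. A pattern that does not start at the URL finds the first rank on the page; starting the pattern at the URL and matching lazily up to the next clnk',' picks up the rank that belongs to that URL. The www.example.com entry and the variable names are invented for the illustration.

    ```php
    <?php
    // Two result entries in the shape of the source posted above.
    // The www.example.com entry is invented for this illustration.
    $html = "www.example.com/+example&ct=clnk&cd=1\" onmousedown=\"return rwt(this,'','','clnk','1','"
          . "www.costacomms.com/+costacomms&ct=clnk&cd=2\" onmousedown=\"return rwt(this,'','','clnk','2','";

    $url = 'www.costacomms.com';

    // Not tied to the URL: matches the first rank anywhere in the source.
    preg_match("/clnk','([0-9])/", $html, $first);

    // Tied to the URL: preg_quote() escapes the dots (and any / in the URL),
    // and .*? stops at the first clnk','<digit> that follows the URL.
    $pattern = '/' . preg_quote($url, '/') . ".*?clnk','([0-9])/";
    preg_match($pattern, $html, $m);

    echo "first rank on page = {$first[1]}\n"; // 1
    echo "rank for $url = {$m[1]}\n";          // 2
    ```

    The capture group means the rank can be read straight from $m[1], with no second preg_match over the matched text.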
     
    Savo, Dec 20, 2008 IP
  6. Savo

    #6
    It sucks being a newbie. I had it sorted last night, but because I missed a $ sign I was looking for "/urltocheck(.*?)clnk','[0-9]/" and not the URL.

    Here is the working code. If anyone has any improvements, or can point me to what I should be reading to do it more efficiently, please let me know. I am going to go and look at cURL now instead of file_get_contents.

    
    // Build the pattern: the URL, then as little text as possible up to clnk','<digit>
    $urltocheck = "/$urltocheck(.*?)clnk','[0-9]/";

    // Check that Google is not disallowing us
    if (preg_match("/Your client does not have permission to get URL/", $googlerank)) {
        echo "failed";
    }
    // Check whether the URL is listed in the search results
    elseif (preg_match($urltocheck, $googlerank, $match)) {
        // Pull the first digit out of the matched text
        preg_match("/[0-9]/", $match[0], $match);
        echo "<br><br>rank = $match[0]";
    } else {
        echo "A match was not found.";
    }
    
    PHP:

    This only works for the first 9 results; I have fixed this in my copy.
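
    The fix for ranks above 9 is not shown in the post. One guess at what it could look like, as a sketch only: use [0-9]+ instead of [0-9] and read the number from a capture group rather than from a second preg_match. The function name findRank and the sample strings are hypothetical.

    ```php
    <?php
    // Hypothetical helper: returns the rank as an int, or null if the URL is not found.
    function findRank(string $url, string $source): ?int
    {
        // [0-9]+ matches one or more digits, so 10, 11, ... are read in full.
        // preg_quote() escapes regex metacharacters in the URL and the / delimiter.
        $pattern = '/' . preg_quote($url, '/') . ".*?clnk','([0-9]+)/";

        return preg_match($pattern, $source, $m) === 1 ? (int) $m[1] : null;
    }

    // Made-up source line in the shape of the Google markup quoted earlier.
    $source = "www.example.com/+example&ct=clnk&cd=12\" onmousedown=\"return rwt(this,'','','clnk','12','";

    var_dump(findRank('www.example.com', $source));  // int(12)
    var_dump(findRank('www.missing.test', $source)); // NULL
    ```

    Returning null for a missing URL keeps "not listed" separate from any real rank.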
     
    Savo, Dec 20, 2008 IP