picking out phrases using REGEX

Discussion in 'PHP' started by gfreeman, Jun 8, 2007.

  1. #1
    Hi all,

    I have a really long string called $html which contains a whole web page from which I want to grab some info.

    The format is sort of like this:

    blahblahblah
    <tag>info-1</tag>
    <tag>info-2</tag>
    <tag>info-3</tag>
    tum-te-tum-te-tum
    <tag>info-4</tag>
    <tag>info-5</tag>
    foobarfoobar
    Code (markup):
    I want two arrays: one containing the info-n values between blahblahblah and tum-te-tum, and another containing the info-n values between tum-te-tum and foobar.

    Can a master of regex show me how this is done please?!

    Thanks!!
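
    A minimal sketch of one way to read this request, run against the toy format above rather than a real page. The helper name tags_between and the sample string are made up for illustration, and it assumes PHP 7+ and that each marker string appears in the order shown.

```php
<?php
// Hypothetical helper: return the contents of every <tag>...</tag>
// found between two marker strings in $html.
function tags_between(string $html, string $start, string $end): array
{
    // Quote the markers so any regex metacharacters in them are literal.
    $pattern = '~' . preg_quote($start, '~') . '(.*?)' . preg_quote($end, '~') . '~s';
    if (!preg_match($pattern, $html, $section)) {
        return [];
    }
    // Collect every tag body inside the captured section.
    preg_match_all('~<tag>(.*?)</tag>~s', $section[1], $matches);
    return $matches[1];
}

// Sample input mirroring the layout in the post.
$html = "blahblahblah\n<tag>info-1</tag>\n<tag>info-2</tag>\n<tag>info-3</tag>\n"
      . "tum-te-tum-te-tum\n<tag>info-4</tag>\n<tag>info-5</tag>\nfoobarfoobar\n";

$first  = tags_between($html, 'blahblahblah', 'tum-te-tum');
$second = tags_between($html, 'tum-te-tum-te-tum', 'foobarfoobar');

print_r($first);
print_r($second);
```

    The lazy (.*?) with the s modifier stops at the first occurrence of the end marker, so each array only picks up tags from its own section.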
     
    gfreeman, Jun 8, 2007
  2. krakjoe

    krakjoe Well-Known Member

    Messages:
    1,795
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    135
    #2
    Regex is a pretty specific tool, so please post a link to the actual page you want data from and point out the exact data you want.
     
    krakjoe, Jun 8, 2007
  3. gfreeman

    gfreeman Peon

    Messages:
    40
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    The page is at:
    http://gfreeman.com/files/regex.zip

    In it, about a quarter of the way down, you'll see a line that says "This team has visited the following leagues for international training matches" followed by a list of flag images. I need an array containing the numbers that prefix each of those flag.gif filenames.

    About halfway down, there is another line that says "This team has been visited by teams from the following leagues for international training matches", and I need a second array containing the numbers that prefix those flag.gif filenames.

    If this helps - then a HUGE thanks!
     
    gfreeman, Jun 8, 2007
  4. krakjoe
    #4
    
    <?php
    function retrieve_data( $location )
    {
    	// Start with both keys so the caller always gets two arrays back
    	$returns = array( 'visited' => array(), 'visitedby' => array() );
    	// file_get_contents() returns false on failure, so test its result directly
    	$buffer = @file_get_contents( $location );
    	if( $buffer === false )
    	{
    		printf( "Cannot open %s", $location );
    		return $returns;
    	}
    	// Section 1: leagues this team has visited
    	if( preg_match('~This team has visited the following leagues for international training matches:(.*?)This team has been visited by teams from the following leagues for international training matches:~si', $buffer, $leagues ) )
    	{
    		// Walk the captured section line by line
    		foreach( explode( "\n", $leagues[1] ) as $line )
    		{
    			// Capture the digits in front of "flag.gif"
    			if( preg_match('~SRC="/Common/images/([0-9]+)flag\.gif"~', $line, $num ) )
    			{
    				$returns['visited'][ ] = (int) $num[1];
    			}
    		}
    	}
    	// Section 2: leagues whose teams have visited this team
    	if( preg_match( '~This team has been visited by teams from the following leagues for international training matches(.*?)<H2>International Players</H2><BR>~si', $buffer, $visitedby ) )
    	{
    		foreach( explode( "\n", $visitedby[1] ) as $line )
    		{
    			if( preg_match('~SRC="/Common/images/([0-9]+)flag\.gif"~', $line, $num ) )
    			{
    				$returns['visitedby'][ ] = (int) $num[1];
    			}
    		}
    	}
    	return $returns ;
    }
    echo "<pre>";
    print_r( retrieve_data('test.html') );
    ?>
    
    PHP:
    Your arrays ......
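
    As a hedged alternative to splitting each section into lines, preg_match_all can pull every flag number out of a section in one pass, which also copes with several images on one line. The helper name flag_numbers and the sample markup are assumptions made for illustration; only the /Common/images/<n>flag.gif path shape comes from the code above.

```php
<?php
// Hypothetical helper: every numeric prefix of "<n>flag.gif" in a chunk of HTML.
function flag_numbers(string $chunk): array
{
    // The i modifier accepts SRC= or src=; \. matches a literal dot.
    preg_match_all('~src="/Common/images/([0-9]+)flag\.gif"~i', $chunk, $m);
    // Cast the captured digit strings to integers.
    return array_map('intval', $m[1]);
}

// Made-up sample with two images on the first line and one on the second.
$sample = '<IMG SRC="/Common/images/12flag.gif"><IMG SRC="/Common/images/7flag.gif">'
        . "\n" . '<img src="/Common/images/103flag.gif">';

print_r(flag_numbers($sample));
```

    A chunk here would be one of the captured sections, such as $leagues[1] or $visitedby[1] in the function above.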
     
    krakjoe, Jun 8, 2007
  5. gfreeman
    #5
    1- That's some excellent coding!
    2- It doesn't break my current PHP when I change it slightly and embed it.
    3- Alas, it returns zero lines, but that's probably more to do with my tweaks than anything.
    I'll check further when I get home this weekend, but you've probably cracked this for me!

    Thanks!!!
     
    gfreeman, Jun 8, 2007
  6. gfreeman
    #6
    OK, all working perfectly, thanks.

    That was a great solution!
     
    gfreeman, Jun 13, 2007