preg_match_all class

Discussion in 'PHP' started by gilgalbiblewheel, Jul 29, 2009.

  1. #1
    I need a pattern that finds every tag whose class attribute is equal to g-pb25.
    Something must be wrong with this one, because it doesn't work:
    preg_match_all('|CLASS="g-pb25">|U', $contents_of_page, $classout, PREG_SET_ORDER);
    PHP:

     
    gilgalbiblewheel, Jul 29, 2009 IP
  2. Leron

    Leron Active Member

    Messages:
    38
    Likes Received:
    1
    Best Answers:
    1
    Trophy Points:
    53
    #2
    I'm a bit confused. What exactly are you trying to achieve here? Can you provide the rest of the code that you have?
     
    Leron, Jul 29, 2009 IP
  3. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    Messages:
    435
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    101
    #3
    I need to get the contents of the tag whose class is g-pb25.
     
    gilgalbiblewheel, Jul 29, 2009 IP
  4. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #4
    OK. I have two preg_match_all calls. The one that works has its output commented out with //. The other, for some reason, doesn't work:
    	//CLASS
    echo "<table style=\"border: 1px solid black;\">
    	<tr>
    		<td>";
    	preg_match_all('#class=\"g-pb25\">(.+?)<#Ui', $contents_of_page, $classout, PREG_SET_ORDER);
    	//print_r($classout)."\n";
    	/*
    	foreach ($classout as $val) {
        echo "matched: " . $val[0] . "\n";
        echo "part 1: " . $val[1] . "\n";
        echo "part 2: " . $val[3] . "\n";
        echo "part 3: " . $val[4] . "\n\n";
    	}
    	*/
    echo "		</td>
    	</tr>
    </table>";
    echo "<table style=\"border: 1px solid red;\">
    	<tr>
    		<td>";
    	preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
    	print_r($classgasm)."\n";
    foreach ($classgasm as $val) {
        echo "<span style=\"color: red;\">matched: " . $val[0] . "</span>\n";
        echo "part 1: " . $val[1] . "\n";
        echo "part 2: " . $val[3] . "\n";
        echo "part 3: " . $val[4] . "\n\n";
    	}
    echo "		</td>
    	</tr>
    </table>";
    PHP:
    The foreach doesn't work either.
     
    gilgalbiblewheel, Jul 29, 2009 IP
  5. Leron

    Leron Active Member

    #5
    First, I don't know what your content looks like, but add this to the top of your script:
    
    $contents_of_page = '<span CLASS="g-pb25">this is the content</span> 
    <span CLASS="g-asm">this is the other content</span>';
    
    Code (markup):
    Both of your patterns are syntactically valid, so I think the problem is in where you are getting the contents of $contents_of_page from.

    Let me know how it turns out with the test string above :D
     
    Leron, Jul 29, 2009 IP
  6. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #6
    	$contents_of_page = '<span CLASS="mbg">this is the content</span><span CLASS="g-asm">this is the other content</span>';
    
    echo "<table style=\"border: 1px solid black;\">
    	<tr>
    		<td>";
    
    	preg_match_all('#class=\"mbg\">(.+?)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);
    	print_r($classmbg)."\n\n";
    echo "		</td>
    	</tr>
    </table>";
    echo "<table style=\"border: 1px solid red;\">
    	<tr>
    		<td>";
    	preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
    	print_r($classgasm)."\n";
    
    	echo "		</td>
    	</tr>
    </table>";
    PHP:
    It seems to be ignoring the selected tags. It's returning the contents of every tag.
     
    gilgalbiblewheel, Jul 29, 2009 IP
  7. jamespv85

    jamespv85 Peon

    Messages:
    238
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #7
    This would work too:

    CLASS="g-pb25">(.+)<\/

    The group (.+) should capture what's inside the tag with CLASS="g-pb25", although it will not span multiple lines.
     
    jamespv85, Jul 29, 2009 IP
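
The "will not span multiple lines" remark can be checked with a small sketch. The sample string is hypothetical, not taken from the page being scraped. By default the dot does not match a newline, so the pattern fails when the content wraps; adding the s modifier lets the dot cross line breaks.

```php
<?php
// Hypothetical sample: the tag content is split across two lines.
$html = "<span CLASS=\"g-pb25\">first line\nsecond line</span>";

// Without the s modifier, "." stops at the newline, so the pattern cannot reach "</".
$without = preg_match_all('#CLASS="g-pb25">(.+)</#', $html, $m1);

// With the s modifier, "." also matches newlines.
$with = preg_match_all('#CLASS="g-pb25">(.+)</#s', $html, $m2);

var_dump($without); // number of matches without s
var_dump($with);    // number of matches with s
print_r($m2[1]);    // captured content, newline included
```

Note that the pattern is still case-sensitive here, so it only finds an upper-case CLASS unless the i modifier is added as well.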
  8. Leron

    Leron Active Member

    #8
    Get rid of the U modifier you used. It inverts greediness, so with U your (.+?) becomes greedy. I think you meant the lowercase u modifier for UTF-8 compatibility; if you really want to keep U, remove the ? from inside your group instead.

    So change:
    preg_match_all('#class="mbg">(.+?)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    To:
    preg_match_all('#class="mbg">(.+?)<#ui', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    or this:
    preg_match_all('#class="mbg">(.+)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    and this:
    preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
    
    to this:
    preg_match_all('#class=\"g-asm\">(.+?)<#ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
    
    or this:
    preg_match_all('#class=\"g-asm\">(.+)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
    Code (markup):
    Just a matter of being greedy I guess :D
     
    Leron, Jul 30, 2009 IP
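
A minimal sketch of the greediness flip described above, using a made-up two-span string rather than the real page. With U, a quantifier that looks lazy behaves greedily and the other way round, which would explain why the earlier output seemed to swallow every tag.

```php
<?php
// Hypothetical sample with two spans of the same class.
$html = '<span class="mbg">one</span><span class="mbg">two</span>';

// Lazy quantifier, no U: each match stops at the first "<" after the content.
preg_match_all('#class="mbg">(.+?)<#i', $html, $lazy);

// Same pattern with U: ".+?" is flipped to greedy and runs to the last "<".
preg_match_all('#class="mbg">(.+?)<#Ui', $html, $flipped);

// Greedy quantifier with U: flipped to lazy, so it behaves like the first pattern.
preg_match_all('#class="mbg">(.+)<#Ui', $html, $alsoLazy);

print_r($lazy[1]);     // two short captures
print_r($flipped[1]);  // one long capture spanning both spans
print_r($alsoLazy[1]); // two short captures again
```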
  9. keaglez

    keaglez Peon

    Messages:
    33
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #9
    
    preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    PHP:
    Try this... :)
     
    keaglez, Jul 30, 2009 IP
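
To see what the \1 backreference in that pattern buys, here is a small sketch on a hypothetical string. The first group captures the tag name, and </\1> then insists on a closing tag with the same name, so a nested </span> does not end a match that started with <div.

```php
<?php
// Hypothetical sample: a div and a span share the class; the div contains a nested span.
$html = '<div class="mbg"><span>inner</span> text</div><span class="mbg">two</span>';

preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] is the tag name, $m[2] is everything up to the closing tag of the same name.
    echo $m[1], ' => ', $m[2], "\n";
}
```

This is only a sketch: if a tag contains another tag of the same name (a div inside a div), the lazy match stops at the inner closing tag, so the approach suits flat markup best.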
  10. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #10
    I wonder why it returns nested arrays (one array within another):
    Array
    (
        [0] => Array
            (
                [0] => class="mbg"><a href="http://.../wrath_of_the_lamb_rev616">wrath_of_the_lamb_rev616<
                [1] => </a><a href="http://.../wrath_of_the_lamb_rev616">wrath_of_the_lamb_rev616
            )

        [1] => Array
            (
                [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
                [1] => </a><a href="http://.../80s_toyz/">80s_toyz
            )

        [2] => Array
            (
                [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
                [1] => </a><a href="http://.../80s_toyz/">80s_toyz
            )

        [3] => Array
            (
                [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
                [1] => </a><a href="http://.../80s_toyz/">80s_toyz
            )

    Code (markup):
    From this PHP code:
    	$contents_of_page = file_get_contents($booklink);
    	//$contents_of_page = '<span class="mbg">this is the content</span><span class="g-asm">This is the other content</span>';
    
    echo "<table style=\"border: 1px solid black;\">
    	<tr>
    		<td>";
    
    	preg_match_all('#class=\"mbg\">(.+?)<#i', $contents_of_page, $classmbg, PREG_SET_ORDER);
    	print_r($classmbg)."\n";
    echo "		</td>
    	</tr>
    </table>";
    echo "<table style=\"border: 1px solid red;\">
    	<tr>
    		<td>";
    	preg_match_all('#class=\"g-asm\">(.+?)<#i', $contents_of_page, $classgasm, PREG_SET_ORDER);
    	print_r($classgasm)."\n";
    
    	echo "		</td>
    	</tr>
    </table>";
    PHP:
     
    gilgalbiblewheel, Jul 30, 2009 IP
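
The nesting comes from the PREG_SET_ORDER flag. A short sketch on a hypothetical string shows the two layouts side by side: with PREG_SET_ORDER each match gets its own inner array, while the default PREG_PATTERN_ORDER groups all full matches under index 0 and all first-group captures under index 1.

```php
<?php
// Hypothetical sample with two spans of the same class.
$html = '<span class="mbg">one</span><span class="mbg">two</span>';
$pattern = '#class="mbg">(.+?)<#i';

// PREG_SET_ORDER: one inner array per match; index 0 is the full match, index 1 is group 1.
preg_match_all($pattern, $html, $bySet, PREG_SET_ORDER);

// PREG_PATTERN_ORDER (the default): index 0 lists the full matches, index 1 lists group 1.
preg_match_all($pattern, $html, $byPattern, PREG_PATTERN_ORDER);

print_r($bySet);
print_r($byPattern);
```

Because the pattern has a single group, each inner array under PREG_SET_ORDER only has indexes 0 and 1; entries such as $val[3] or $val[4] in the earlier foreach loops do not exist.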
  11. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #11
    Yours worked:
    	$contents_of_page = file_get_contents($booklink);
    	//$contents_of_page = '<span class="mbg">this is the content</span><span class="g-asm">This is the other content</span>';
    
    echo "<table style=\"border: 1px solid black;\">
    	<tr>
    		<td>";
    preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
    	print_r($classmbg)."\n";
    echo "		</td>
    	</tr>
    </table>";
    PHP:
    This is the result:
    qwer0230 ( 171 )
    [1] => div [2] => qwer0230 ) 
    Code (markup):
    There are <a> tags around these, which I want to strip.
     
    gilgalbiblewheel, Jul 30, 2009 IP
  12. keaglez

    keaglez Peon

    #12
    Hmm... can you give an example of the string you want to match? It's hard to do without actually seeing the string...
     
    keaglez, Jul 31, 2009 IP
  13. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #13
    I only need the inner HTML of the <a> tags. I want to get rid of the <a> tags themselves.
     
    gilgalbiblewheel, Jul 31, 2009 IP
  14. keaglez

    keaglez Peon

    #14
    Since you didn't give an example string, I don't know how I can help further; the result you copied looks pretty odd... Anyway, you can remove any HTML tags with the strip_tags function...

    I tried it with this string and it worked... I hope the string you want to match is similar to this. :)
    
    $contents_of_page = '<span class="mbg"><a href="">content1</a></span><span class="mbg"><a href="">content2</a></span>';
    preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
    foreach ($classmbg as $mbg)
    {
    	echo $mbg[2].": ";
    	// strip tags
    	echo strip_tags($mbg[2])."<br/>";
    }
    
    PHP:
     
    keaglez, Jul 31, 2009 IP
  15. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #15
    OK OK!!!


    Two questions:
    1. How do I grab the [2] => entry?
    2. I need to strip the link.
     
    Last edited: Jul 31, 2009
    gilgalbiblewheel, Jul 31, 2009 IP
  16. keaglez

    keaglez Peon

    #16
    Oh, well, I did a little searching and I think I know what you were trying to accomplish... :)

    Here is the code I just wrote:
    
    preg_match_all('#<(\w+) +class="mbg"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    // grab it
    $newclassmbg = array(); // initialise so print_r below works even when nothing matches
    foreach ($classmbg as $mbg)
    {
    	echo "Matched: ". $mbg[0] ."<br/>";
    	echo "The starting tags: ". $mbg[1] ."<br />";
    	echo "The string we want: ". $mbg[2] ."<br/>";
    	echo "The rest of the string: ". $mbg[3] ."<br/>";
    	// want to grab the [2] only, here...
    	$newclassmbg[] = $mbg[2];
    }
    print_r($newclassmbg);
    
    PHP:
    Hope that works... :)
     
    keaglez, Jul 31, 2009 IP
  17. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #17
    Works. But I want to do the same thing, this time with this:
    /*******************************username**********************************************************************************/
    preg_match_all('#<(\w+) +class="mbg"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
    
    // grab it
    $i = 1;
    foreach ($classmbg as $mbg)
    {
        //echo "Matched: ". $mbg[0] ."<br/>";
        //echo "The starting tags: ". $mbg[1] ."<br />";
        echo "<span style=\"font-weight: bold;\">".$i." Username:</span> ". $mbg[2] ."<br/>";
        //echo "The rest of the string: ". $mbg[3] ."<br/>";
        // want to grab the [2] only, here...
        //$newclassmbg[] = $mbg[2];
    	$i++;
    }
    print_r($newclassmbg);
    /*******************************item name**********************************************************************************/
    preg_match_all('#<(\w+) +class="g-asm"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classgasm, PREG_SET_ORDER);
    
    // grab it
    $i = 1;
    foreach ($classgasm as $gasm)
    {
        //echo "Matched: ". $gasm[0] ."<br/>";
        //echo "The starting tags: ". $gasm[1] ."<br />";
        echo "<span style=\"font-weight: bold;\">".$i." Item name:</span> ". $gasm[2] ."<br/>";
        //echo "The rest of the string: ". $gasm[3] ."<br/>";
        // want to grab the [2] only, here...
        //$newclassmbg[] = $gasm[2];
    	$i++;
    }
    print_r($newclassgasm);
    PHP:
    class="g-asm", but it returns nothing. Actually, the class name is on the <a> tag itself:
    <a class="g-asm"
     
    gilgalbiblewheel, Jul 31, 2009 IP
  18. keaglez

    keaglez Peon

    #18
    It would be simpler, try this one:
    
    preg_match_all('#<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is', $contents_of_page, $classgasm, PREG_SET_ORDER);
    foreach ($classgasm as $gasm)
    {
        echo "Grab: ". $gasm[1] ."<br/>";
    }
    
    PHP:
    Hope that works.. ;)
     
    keaglez, Aug 1, 2009 IP
  19. gilgalbiblewheel

    gilgalbiblewheel Well-Known Member

    #19
    Right on... and how about the href of the <a> tag? How can I strip off everything except the href?

    How can I learn all these regexps?
     
    gilgalbiblewheel, Aug 1, 2009 IP
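
One possible way to pull out the href, sketched by extending the pattern from the previous post with a second group. The sample markup and URL are placeholders, and the pattern assumes double-quoted attributes with class appearing before href; real pages may order attributes differently.

```php
<?php
// Hypothetical sample; the URL is a placeholder, not taken from the scraped page.
$html = '<a class="g-asm" href="http://example.com/item/1">Item one</a>';

// Group 1 captures the href value, group 2 the link text.
// Assumption: class comes before href and both use double quotes.
preg_match_all(
    '#<a[^>]*class="g-asm"[^>]*href="([^"]*)"[^>]*>(.+?)</a>#is',
    $html,
    $links,
    PREG_SET_ORDER
);

foreach ($links as $link) {
    echo "href: ", $link[1], "\n";
    echo "text: ", $link[2], "\n";
}
```

If the attribute order is not predictable, a separate second pattern run on each matched opening tag, or a DOM parser, is likely to be sturdier than one long expression.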
  20. keaglez

    keaglez Peon

    #20
    There are lots of resources for learning regex on Google; try searching... :)

    Here is an explanation of the previous regex:
    
    #<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is
    
    Code (markup):
    The two # characters mark the start and end of the regex. The is at the end are modifiers: i makes the match case-insensitive, and s makes the dot match newlines too, so a match can span several lines. People often put many newlines in their HTML, so this is important.

    <a[^>]*class="g-asm"[^>]*> matches <a first; [^>]* then matches any characters except >, which makes sure we stay inside the opening tag. class="g-asm" is matched literally. The part ends with >, so it matches the entire <a...> opening tag.

    (.+?) groups and matches any characters; the ? tells it to match as few as possible until the next part of the expression is found. Once it is grouped, we can refer to its value later with \1 (or \2, \3, and so on) in the same expression, and it also shows up in the array we pass to preg_match_all. Note the difference between + and *: + matches one or more characters, and * matches zero or more.

    </a> is also matched literally.

    Sorry, I'm not a good English speaker... but I hope you can understand that... :)
     
    keaglez, Aug 2, 2009 IP
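
The difference between + and * explained above matters most when a link is empty. A small sketch with a hypothetical string: (.+?) must consume at least one character, so on an empty link it runs on into the next one, while (.*?) is happy with an empty capture.

```php
<?php
// Hypothetical sample: the first link is empty, the second has text.
$html = '<a class="g-asm" href="#"></a><a class="g-asm" href="#">text</a>';

// "+" needs at least one character, so the first capture spills into the second link.
preg_match_all('#<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is', $html, $plus);

// "*" accepts zero characters, so each link is matched on its own.
preg_match_all('#<a[^>]*class="g-asm"[^>]*>(.*?)</a>#is', $html, $star);

print_r($plus[1]); // one capture that includes the second opening tag
print_r($star[1]); // an empty string, then "text"
```

Which quantifier fits depends on whether empty tags can occur in the page and whether they should count as matches.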