preg_match_all class

gilgalbiblewheel Well-Known Member

#1

I need the regex pattern that finds every tag whose class attribute is equal to g-pb25.
There must be something wrong with this one, because it doesn't work:
preg_match_all('|CLASS="g-pb25">|U', $contents_of_page, $classout, PREG_SET_ORDER);

gilgalbiblewheel, Jul 29, 2009 IP

Leron Active Member

#2

I'm a bit confused. What exactly are you trying to achieve here? Can you provide the rest of the code that you have?

Leron, Jul 29, 2009 IP

gilgalbiblewheel Well-Known Member

#3

Leron said: ↑

I'm a bit confused. What exactly are you trying to achieve here? Can you provide the rest of the code that you have?

I need to get the contents of the tag whose class is g-pb25.

gilgalbiblewheel, Jul 29, 2009 IP

gilgalbiblewheel Well-Known Member

#4

OK. I have two preg_match_all calls. The output of the one that works is commented out with //. The other, for some reason, doesn't work:

	//CLASS
echo "<table style=\"border: 1px solid black;\">
	<tr>
		<td>";
	preg_match_all('#class=\"g-pb25\">(.+?)<#Ui', $contents_of_page, $classout, PREG_SET_ORDER);
	//print_r($classout)."\n";
	/*
	foreach ($classout as $val) {
    echo "matched: " . $val[0] . "\n";
    echo "part 1: " . $val[1] . "\n";
    echo "part 2: " . $val[3] . "\n";
    echo "part 3: " . $val[4] . "\n\n";
	}
	*/
echo "		</td>
	</tr>
</table>";
echo "<table style=\"border: 1px solid red;\">
	<tr>
		<td>";
	preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
	print_r($classgasm)."\n";
foreach ($classgasm as $val) {
    echo "<span style:\"color: red;\">matched: " . $val[0] . "</span>\n";
    echo "part 1: " . $val[1] . "\n";
    echo "part 2: " . $val[3] . "\n";
    echo "part 3: " . $val[4] . "\n\n";
	}
echo "		</td>
	</tr>
</table>";

The foreach doesn't work either.

gilgalbiblewheel, Jul 29, 2009 IP

Leron Active Member

#5

First, I don't know what your content looks like, but add this to the top of your script:
$contents_of_page = '<span CLASS="mbg">this is the content</span>
<span CLASS="g-asm">this is the other content</span>';
Both of your patterns are syntactically correct, so I think the problem is in where you are getting the content for $contents_of_page from.

Let me know how it turns out with the test string above.

Leron, Jul 29, 2009 IP

gilgalbiblewheel Well-Known Member

#6

	$contents_of_page = '<span CLASS="mbg">this is the content</span><span CLASS="g-asm">this is the other content</span>';

echo "<table style=\"border: 1px solid black;\">
	<tr>
		<td>";

	preg_match_all('#class=\"mbg\">(.+?)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);
	print_r($classmbg)."\n\n";
echo "		</td>
	</tr>
</table>";
echo "<table style=\"border: 1px solid red;\">
	<tr>
		<td>";
	preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
	print_r($classgasm)."\n";

	echo "		</td>
	</tr>
</table>";
Output:
Array ( [0] => Array ( [0] => CLASS="mbg">this is the contentthis is the other content< [1] => this is the contentthis is the other content ) )
Array ( [0] => Array ( [0] => CLASS="g-asm">this is the other content< [1] => this is the other content ) )

It seems to be ignoring the selected class. It returns the contents of every tag.

gilgalbiblewheel, Jul 29, 2009 IP

jamespv85 Peon

#7

This would work too:

CLASS="g-pb25">(.+)<\/

The group (.+) should capture what's inside the CLASS="g-pb25" tag, although it will not span multiple lines, because . does not match a newline unless the s modifier is set.
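
To illustrate the multi-line point, here is a minimal sketch. The two-line `$html` string and its `div` wrapper are made up for the example and are not taken from the real page. It uses the lazy `(.+?)` form seen elsewhere in the thread so that the only difference between the two calls is the `s` modifier:

```php
<?php
// Hypothetical sample: the tag content is split across two lines.
$html = "<div CLASS=\"g-pb25\">first line\nsecond line</div>";

// Without "s", "." does not match "\n", so the group cannot reach the closing tag.
$withoutS = preg_match_all('#class="g-pb25">(.+?)</#i', $html, $a, PREG_SET_ORDER);

// With "s" (PCRE_DOTALL), "." also matches newlines.
$withS = preg_match_all('#class="g-pb25">(.+?)</#is', $html, $b, PREG_SET_ORDER);

echo "without s: $withoutS match(es)\n";
echo "with s: $withS match(es)\n";
echo $b[0][1], "\n";
```

If the content of the tag is always on one line, the `s` modifier makes no difference.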

jamespv85, Jul 29, 2009 IP

Leron Active Member

#8

gilgalbiblewheel said: ↑

It seems to be ignoring the selected class. It returns the contents of every tag.

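Get rid of the U modifier. U (PCRE_UNGREEDY) inverts the greediness of quantifiers, so under U your .+? becomes greedy and runs on to the last < in the string. I think you meant to use the lowercase u modifier for UTF-8 compatibility; if you really do want U, just remove the ? from inside your group instead.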
So change:
preg_match_all('#class="mbg">(.+?)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);

To:
preg_match_all('#class="mbg">(.+?)<#ui', $contents_of_page, $classmbg, PREG_SET_ORDER);

or this:
preg_match_all('#class="mbg">(.+)<#Ui', $contents_of_page, $classmbg, PREG_SET_ORDER);

and this:
preg_match_all('#class=\"g-asm\">(.+?)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);

to this:
preg_match_all('#class=\"g-asm\">(.+?)<#ui', $contents_of_page, $classgasm, PREG_SET_ORDER);

or this:
preg_match_all('#class=\"g-asm\">(.+)<#Ui', $contents_of_page, $classgasm, PREG_SET_ORDER);
It's just a matter of greediness, I guess.
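
The effect of U can be checked with a small sketch. It assumes the test string from the commented-out line later in the thread, and the variable names `$withU` and `$withoutU` are only for illustration:

```php
<?php
// Test string taken from the commented-out line later in the thread.
$contents_of_page = '<span class="mbg">this is the content</span>'
                  . '<span class="g-asm">This is the other content</span>';

// "U" flips the default greediness, so under "U" the "?" makes ".+" greedy again.
preg_match_all('#class="mbg">(.+?)<#Ui', $contents_of_page, $withU, PREG_SET_ORDER);

// Without "U", ".+?" is lazy and stops at the first "<".
preg_match_all('#class="mbg">(.+?)<#i', $contents_of_page, $withoutU, PREG_SET_ORDER);

echo $withU[0][1], "\n";     // runs on into the second span
echo $withoutU[0][1], "\n";  // only the text of the first span
```

This would explain why the earlier output appeared to contain the contents of every tag: the group ran to the last < in the string.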

Leron, Jul 30, 2009 IP

keaglez Peon

#9


Try this:

preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);

keaglez, Jul 30, 2009 IP

gilgalbiblewheel Well-Known Member

#10

I wonder why it returns nested arrays (one array within another):

Array
(
    [0] => Array
        (
            [0] => class="mbg"><a href="http://.../wrath_of_the_lamb_rev616">wrath_of_the_lamb_rev616<
            [1] => </a><a href="http://.../wrath_of_the_lamb_rev616">wrath_of_the_lamb_rev616
        )

    [1] => Array
        (
            [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
            [1] => </a><a href="http://.../80s_toyz/">80s_toyz
        )

    [2] => Array
        (
            [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
            [1] => </a><a href="http://.../80s_toyz/">80s_toyz
        )

    [3] => Array
        (
            [0] => class="mbg"></a><a href="http://.../80s_toyz/">80s_toyz<
            [1] => </a><a href="http://.../80s_toyz/">80s_toyz
        )


From this PHP code:

	$contents_of_page = file_get_contents($booklink);
	//$contents_of_page = '<span class="mbg">this is the content</span><span class="g-asm">This is the other content</span>';

echo "<table style=\"border: 1px solid black;\">
	<tr>
		<td>";

	preg_match_all('#class=\"mbg\">(.+?)<#i', $contents_of_page, $classmbg, PREG_SET_ORDER);
	print_r($classmbg)."\n";
echo "		</td>
	</tr>
</table>";
echo "<table style=\"border: 1px solid red;\">
	<tr>
		<td>";
	preg_match_all('#class=\"g-asm\">(.+?)<#i', $contents_of_page, $classgasm, PREG_SET_ORDER);
	print_r($classgasm)."\n";

	echo "		</td>
	</tr>
</table>";

gilgalbiblewheel, Jul 30, 2009 IP

gilgalbiblewheel Well-Known Member

#11

Yours worked:

	$contents_of_page = file_get_contents($booklink);
	//$contents_of_page = '<span class="mbg">this is the content</span><span class="g-asm">This is the other content</span>';

echo "<table style=\"border: 1px solid black;\">
	<tr>
		<td>";
preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
	print_r($classmbg)."\n";
echo "		</td>
	</tr>
</table>";

This is the result:

qwer0230 ( 171 )
[1] => div [2] => qwer0230 )

There are <a> tags around these, which I want to strip.

gilgalbiblewheel, Jul 30, 2009 IP

keaglez Peon

#12

Hmm... can you give an example of the string you want to match? It's hard to do without actually seeing the string.

keaglez, Jul 31, 2009 IP

gilgalbiblewheel Well-Known Member

#13

keaglez said: ↑

Hmm... can you give an example of the string you want to match? It's hard to do without actually seeing the string.

I only need the inner HTML of the <a> tags. I want to eliminate the <a> tags themselves.

gilgalbiblewheel, Jul 31, 2009 IP

keaglez Peon

#14

Since you didn't give an example string, I don't know how to help you further; the result you copied looks pretty weird. That said, you can remove any HTML tags with the strip_tags function.

I tried it on this string and it worked. I hope the string you want to match is similar to this:
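// the <div class="mbg"> wrapper is assumed from the "[1] => div" result above; the pattern needs it to match
$contents_of_page = '<div class="mbg"><a href="">content1</a></div><div class="mbg"><a href="">content2</a></div>';
preg_match_all('#<(\w+) +class="mbg">(.+?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);
foreach ($classmbg as $mbg)
{
	// inner HTML of the matched element, link included
	echo $mbg[2].": ";
	// the same content with the tags stripped
	echo strip_tags($mbg[2])." ";
}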

keaglez, Jul 31, 2009 IP

gilgalbiblewheel Well-Known Member

#15

OK, OK!

Two questions:
1. How do I grab only the [2] => element?
2. How do I strip the link?

Last edited: Jul 31, 2009

gilgalbiblewheel, Jul 31, 2009 IP

keaglez Peon

#16

Well, I did a little searching and I think I know what you are trying to accomplish.

Here is the code I just wrote:


preg_match_all('#<(\w+) +class="mbg"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);

// grab it
$newclassmbg = array();
foreach ($classmbg as $mbg)
{
	echo "Matched: ". $mbg[0] ."<br/>";
	echo "The starting tag: ". $mbg[1] ."<br/>";
	echo "The string we want: ". $mbg[2] ."<br/>";
	echo "The rest of the string: ". $mbg[3] ."<br/>";
	// keep only group [2]
	$newclassmbg[] = $mbg[2];
}
print_r($newclassmbg);

Hope that works...
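
On the earlier question about the array within an array: with PREG_SET_ORDER, preg_match_all returns one inner array per match, with the full match at index 0 and the capture groups after it. As a rough sketch of the alternative, PREG_PATTERN_ORDER (the default flag) groups the results by capture group instead, so group 2 of every match is available without a loop. The sample markup below is made up, since the real page was never posted:

```php
<?php
// Made-up sample markup; the real page structure is an assumption.
$contents_of_page = '<div class="mbg"><a href="http://example.com/a">alice</a> ( 12 )</div>'
                  . '<div class="mbg"><a href="http://example.com/b">bob</a> ( 34 )</div>';

$pattern = '#<(\w+) +class="mbg"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is';

// PREG_SET_ORDER: $sets[n] is the n-th match; $sets[n][2] is its second group.
preg_match_all($pattern, $contents_of_page, $sets, PREG_SET_ORDER);

// PREG_PATTERN_ORDER: $groups[2] already lists group 2 of every match.
preg_match_all($pattern, $contents_of_page, $groups, PREG_PATTERN_ORDER);

print_r($sets[0]);    // full match, tag name, link text, trailing text
print_r($groups[2]);  // link text of every match
```

Which flag is more convenient depends on whether the code works match by match or only needs one group.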

keaglez, Jul 31, 2009 IP

gilgalbiblewheel Well-Known Member

#17

It works. But now I want to do the same thing with this:

/*******************************username**********************************************************************************/
preg_match_all('#<(\w+) +class="mbg"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classmbg, PREG_SET_ORDER);

// grab it
$i = 1;
foreach ($classmbg as $mbg)
{
    //echo "Matched: ". $mbg[0] ."<br/>";
    //echo "The starting tags: ". $mbg[1] ."<br />";
    echo "<span style=\"font-weight: bold;\">".$i." Username:</span> ". $mbg[2] ."<br/>";
    //echo "The rest of the string: ". $mbg[3] ."<br/>";
    // want to grab the [2] only, here...
    //$newclassmbg[] = $mbg[2];
	$i++;
}
print_r($newclassmbg);
/*******************************item name**********************************************************************************/
preg_match_all('#<(\w+) +class="g-asm"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is', $contents_of_page, $classgasm, PREG_SET_ORDER);

// grab it
$i = 1;
foreach ($classgasm as $gasm)
{
    //echo "Matched: ". $gasm[0] ."<br/>";
    //echo "The starting tags: ". $gasm[1] ."<br />";
    echo "<span style=\"font-weight: bold;\">".$i." Item name:</span> ". $gasm[2] ."<br/>";
    //echo "The rest of the string: ". $gasm[3] ."<br/>";
    // want to grab the [2] only, here...
    //$newclassmbg[] = $gasm[2];
	$i++;
}
print_r($newclassgasm);

With class="g-asm" it returns nothing. Actually, that class name is on the <a> tag itself:
<a class="g-asm"

gilgalbiblewheel, Jul 31, 2009 IP

keaglez Peon

#18

gilgalbiblewheel said: ↑

Works . But I wanted to do the same thing but this time with this:

class=g-asm but it's returning void. Actually the class name is in the a tag:
<a class="g-asm"

That makes it simpler. Try this one:
preg_match_all('#<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is', $contents_of_page, $classgasm, PREG_SET_ORDER);
foreach ($classgasm as $gasm)
{
 echo "Grab: ". $gasm[1] ." ";
}
Hope that works..
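
A short sketch of why the earlier wrapper-based pattern returned nothing here while this one matches. The anchor below is made up, including its attribute order and URL:

```php
<?php
// Made-up anchor; attribute order and URL are assumptions for illustration.
$contents_of_page = '<td><a href="http://example.com/item/1" class="g-asm" title="x">Toy robot</a></td>';

// The wrapper-based pattern expects class="g-asm" as the only attribute of a parent element.
$wrapper = preg_match_all('#<(\w+) +class="g-asm"> *<a[^>]*>(.+?)</a>(.*?)</\1>#is',
                          $contents_of_page, $m1, PREG_SET_ORDER);

// [^>]* on both sides lets other attributes sit before or after the class.
$anchor = preg_match_all('#<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is',
                         $contents_of_page, $m2, PREG_SET_ORDER);

echo "wrapper pattern: $wrapper match(es)\n";
echo "anchor pattern: $anchor match(es)\n";
echo $m2[0][1], "\n";
```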

keaglez, Aug 1, 2009 IP

gilgalbiblewheel Well-Known Member

#19

Right on... and what about the href of the <a> tag? How can I strip off everything except the href?

How can I learn all these regexps?

gilgalbiblewheel, Aug 1, 2009 IP

keaglez Peon

#20

There are a lot of resources for learning regex; try searching Google.

Here is an explanation of the previous regex:
#<a[^>]*class="g-asm"[^>]*>(.+?)</a>#is
The two # characters are delimiters that mark the start and end of the regex. The "is" at the end are modifiers: i makes the match case-insensitive, and s treats the input as a single line, so . also matches newlines. This matters because HTML is often written with many line breaks.

<a[^>]*class="g-asm"[^>]*> matches the opening tag. It starts with <a; [^>]* matches any run of characters other than >, which keeps the match inside the tag; class="g-asm" matches literally; and the final > closes the tag. Together this matches the entire <a ...> tag.

(.+?) groups and captures any characters. The ? makes the quantifier lazy, so it matches as few characters as possible before the next part of the expression is found. Once something is grouped, its value is available later as \1 (or \2, \3, and so on) within the same expression, and as an element of the array passed to preg_match_all. Note the difference between + and *: + matches one or more characters, while * matches zero or more.

</a> also matches literally.
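
For the href question, the same building blocks can be reused with one more capture group. This is only a sketch: it assumes the href value is double-quoted, the sample anchors and URLs are made up, and the lookahead `(?=...)` is an addition that was not used earlier in the thread:

```php
<?php
// Made-up anchors; the URLs and attribute order are assumptions.
$contents_of_page = '<a class="g-asm" href="http://example.com/item/1">Toy robot</a>'
                  . '<a href="http://example.com/item/2" class="g-asm">Board game</a>'
                  . '<a href="http://example.com/other">Other link</a>';

// (?=[^>]*class="g-asm") checks, without consuming text, that the class is inside this tag.
// href="([^"]*)" captures the URL as group 1; (.+?) captures the link text as group 2.
$pattern = '#<a(?=[^>]*class="g-asm")[^>]*href="([^"]*)"[^>]*>(.+?)</a>#is';
preg_match_all($pattern, $contents_of_page, $links, PREG_SET_ORDER);

$hrefs = array();
foreach ($links as $link) {
    $hrefs[] = $link[1];
    echo $link[1], " => ", $link[2], "\n";
}
```

The lookahead lets the class attribute come before or after the href, so both orders in the sample are matched and the third link, which has no g-asm class, is skipped.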

Sorry, I'm not a good English speaker, but I hope you can understand that.

keaglez, Aug 2, 2009 IP
