How to extract variables from HTML tables (using PHP)

Discussion in 'PHP' started by goldensea80, Jan 30, 2007.

goldensea80 Well-Known Member

#1

The problem: I read an HTML file in PHP, and its main content is a table with columns as fields and rows as records. How can I extract the values into a variable, say an associative array?
I figure I must strip out all the HTML tags except TR and TD and then use a regular expression, but I haven't found the final solution yet.

goldensea80, Jan 30, 2007 IP
picouli Peon

#2
Yes, you can start with:

$text = strip_tags($html, '<td>');

PHP:

and then regexp with something like this:

$pattern = '/<td>([^<]+)<\/td>/';
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

PHP:

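You should end up with all your matches in $matches. With PREG_SET_ORDER each entry is one match: $matches[$i][0] is the full <td>...</td> string and $matches[$i][1] is the cell text.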
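
The two steps above can be combined into a minimal, runnable sketch. The sample $html string is made up for illustration, and the pattern assumes plain <td> tags without attributes, so treat it as a starting point rather than a complete solution.

```php
<?php
// Hypothetical sample input; in practice this would come from file_get_contents().
$html = '<table><tr><td>Name</td><td>Age</td></tr>'
      . '<tr><td>Ann</td><td><b>31</b></td></tr></table>';

// Keep only the <td> tags; every other tag (table, tr, b, ...) is removed.
$text = strip_tags($html, '<td>');

// Capture the text between <td> and </td>.
$pattern = '/<td>([^<]+)<\/td>/';
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER, each $m is one match: $m[0] is the whole
// "<td>...</td>" string and $m[1] is the captured cell text.
$cells = array();
foreach ($matches as $m) {
    $cells[] = $m[1];
}

print_r($cells);
```

Because the <tr> tags are stripped as well, the result is a flat list of cells; the row boundaries have to be rebuilt afterwards, for example from a known column count.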

http://php.net/strip_tags
http://php.net/preg_match_all

HTH, cheers!
picouli, Jan 30, 2007 IP

rays Active Member


#3

Read the file into a string, then use the explode() function.

Note: the code is only a rough sketch.

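<?php
$str = "contents of file"; // e.g. the result of file_get_contents()

$rows = array();
// Split on the opening <tr>; the first chunk is whatever precedes the first row.
// This only works for bare <tr> and <td> tags without attributes.
foreach (array_slice(explode("<tr>", $str), 1) as $tr) {
    $cells = array();
    // Split each row on the opening <td>, again skipping the leading chunk
    foreach (array_slice(explode("<td>", $tr), 1) as $td) {
        // strip_tags() removes the leftover </td>, </tr> and other closing tags
        $cells[] = trim(strip_tags($td));
    }
    $rows[] = $cells;
}

// $rows now holds the contents of the table as an array of rows
?>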

rays, Jan 30, 2007 IP
goldensea80 Well-Known Member

#4
Thanks guys,
I found picouli's method pretty useful. However, in my case it works with PREG_PATTERN_ORDER (or no flag at all, since that is the default), and the usable array is $matches[2].
Since the <td> tags have attributes, we need to include them in the regexp. We also need to match empty cells, so + is replaced by *.

$pattern = '/<td([^<]*)>([^<]*)<\/td>/';

PHP:
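
As a rough sketch of how this pattern could lead to the associative array asked for in the first post: the sample $html, the assumption that the first row holds the field names, and the fixed column count of 2 are all illustrative and not part of the solution above.

```php
<?php
// Hypothetical sample: the first row holds field names, one cell is empty.
$html = '<table><tr><td class="h">name</td><td class="h">city</td></tr>'
      . '<tr><td>Ann</td><td align="left">Oslo</td></tr>'
      . '<tr><td>Bob</td><td></td></tr></table>';

// Keep only the <td> tags, as in the earlier reply.
$text = strip_tags($html, '<td>');

// Group 1 captures the attributes of the opening tag, group 2 the cell text.
$pattern = '/<td([^<]*)>([^<]*)<\/td>/';
preg_match_all($pattern, $text, $matches); // PREG_PATTERN_ORDER is the default

// $matches[2] is a flat list of all cell texts, in document order.
$cells = $matches[2];

// Assumption: every row has the same number of columns (2 here).
$rows = array_chunk($cells, 2);
$fields = array_shift($rows); // first row = field names

// Pair each remaining row with the field names.
$records = array();
foreach ($rows as $row) {
    $records[] = array_combine($fields, $row);
}

print_r($records);
```

One caveat: a cell whose text contains a literal > would be split incorrectly, because group 1 is allowed to run past the end of the opening tag. Using [^>]* for the attribute group is one way to avoid that.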
goldensea80, Jan 30, 2007 IP
