preg_match? how to read within the td tags

Discussion in 'PHP' started by gilgalbiblewheel, Jul 20, 2008.

  1. #1
    I have a table in html which I need to convert to a database table for MySQL (I use HeidiSQL).
    So there are a few td tags and then a tr tag indicating a new record. How would I insert them into the db table? Do I use preg_match? How would it be written?
    	preg_match_all('|id="(.*)">|U', $contents_of_page, $idout, PREG_SET_ORDER); 
    	preg_match_all("|<[^>]+>(.*)</[^>]+>|U", $contents_of_page, $out, PREG_SET_ORDER);
    PHP:

     
    gilgalbiblewheel, Jul 20, 2008
  2. nico_swd
    #2
    Post your HTML table.
     
    nico_swd, Jul 20, 2008
  3. gilgalbiblewheel
    #3
    Here it is:
    <tbody><tr>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">In the beginning God created the heaven and the earth.</td>
    </tr>
    <tr>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="left">And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.</td>
    </tr>
    </tbody>
    Code (markup):
     
    gilgalbiblewheel, Jul 20, 2008
  4. Mozzart
    #4
    Here is an example. I'm still learning regex, which is why I gave it a shot.
    
    <?php
    
    $contents =<<<PAGE
    <tbody><tr>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">In the beginning God created the heaven and the earth.</td>
    </tr>
    <tr>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="left">And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.</td>
    </tr>
    </tbody>
    PAGE;
    
    $va = preg_match_all('/<td(.*) *>(.*)<\/td>/i', $contents, $idout); 
    
    
    echo sizeof($idout[0]);
    
    echo $idout[0][21];
    
    PHP:
    Do note that you have to handle the empty tds yourself (just use empty()).

    Hope this helps; test it against your real page.
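
    A minimal sketch of the empty-cell check mentioned above, assuming an empty cell should become NULL in the database. The helper name cell_or_null is made up for illustration. One thing to watch: empty('0') is also true in PHP, so if a column can legitimately hold 0, comparing against '' is the safer test.

```php
<?php
// Hypothetical helper: map an empty table cell to NULL, keep everything else.
function cell_or_null(string $cell): ?string
{
    $cell = trim($cell);
    // Strict comparison, because empty('0') would also be true.
    return $cell === '' ? null : $cell;
}

var_dump(cell_or_null(''));   // NULL
var_dump(cell_or_null('0'));  // string(1) "0"
var_dump(empty('0'));         // bool(true)
```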
     
    Mozzart, Jul 20, 2008
  5. gilgalbiblewheel
    #5
    Thanks.
    I would like to understand what this means:
    /<td(.*) *>(.*)<\/td>/i
    PHP:
     
    gilgalbiblewheel, Jul 20, 2008
  6. Mozzart
    #6
    . = any character (except a newline, by default)
    * = zero or more of the preceding item, so that item is optional
    () = a capturing group; whatever matches inside the parentheses is stored as its own entry in the results

    \ = escapes the next character, so \/ matches a literal /

    i = the modifier after the closing / tells the engine to match case-insensitively *i stands for insensitive*
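
    To see what each set of parentheses captures, here is a small sketch that runs the same pattern against a single cell from the posted table, using preg_match instead of preg_match_all so the result is easier to read. The variable names are only for illustration.

```php
<?php
$line = '<td dir="ltr" align="right">1</td>';

preg_match('/<td(.*) *>(.*)<\/td>/i', $line, $m);

echo $m[0], "\n"; // the whole match, tags included
echo $m[1], "\n"; // first ( ): the attributes, with the leading space
echo $m[2], "\n"; // second ( ): the text between <td ...> and </td>
```

    With preg_match_all and its default ordering, the same numbering applies one level up: $idout[0] lists the whole matches, $idout[1] the attributes, and $idout[2] the cell text.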
     
    Mozzart, Jul 20, 2008
  7. gilgalbiblewheel
    #7
    Ok and / ? What is that?
     
    gilgalbiblewheel, Jul 21, 2008
  8. nico_swd
    #8
    nico_swd, Jul 21, 2008
  9. cornetofreak
    #9
    I was going to post that video :) It's got to be the best for learners.
     
    cornetofreak, Jul 21, 2008
  10. gilgalbiblewheel
    #10
    I tested this. It's reading the attribute of the td tag. I need it to read the innerHTML.
     
    gilgalbiblewheel, Jul 22, 2008
  11. Mozzart
    #11
    innerHTML as in reading the insides of the <td> tags?

    I think you have to do a var_dump($idout); to see what was captured.

    It does read the attributes. Use this instead; I just realized I added a capturing group where I shouldn't have.
    
    <?php
    
    $contents =<<<PAGE
    <tbody><tr>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">In the beginning God created the heaven and the earth.</td>
    </tr>
    <tr>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="left">gn</td>
    <td dir="ltr" align="left">Genesis</td>
    
    <td></td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">1</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="right">2</td>
    <td dir="ltr" align="left">And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.</td>
    </tr>
    </tbody>
    PAGE;
    
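    // <td\b[^>]*> skips the attributes and also matches a bare <td>, so empty cells are kept; (.*?) captures the contents.
    $va = preg_match_all('/<td\b[^>]*>(.*?)<\/td>/is', $contents, $idout); 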
    
    
    echo sizeof($idout[0]);
    
    var_dump($idout);
    
    PHP:
    This should give you the td contents without the attributes (look in $idout[1]). Do test it further.
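
    To get from the matches to database records, the cells also need to be grouped by row. Here is a rough sketch, assuming the table looks like the one posted above: match each tr first, then pull the td contents out of that row. The function name table_rows, the shortened sample data, and the table and column names in the closing comment are all made up for illustration.

```php
<?php
// Hypothetical helper: returns one array of cell strings per <tr>.
function table_rows(string $html): array
{
    $rows = [];
    // The s modifier lets . span newlines; .*? stops at the first closing tag.
    preg_match_all('/<tr\b[^>]*>(.*?)<\/tr>/is', $html, $trs);
    foreach ($trs[1] as $tr) {
        // [^>]* also matches a bare <td>, so empty cells keep their position.
        preg_match_all('/<td\b[^>]*>(.*?)<\/td>/is', $tr, $tds);
        // Decode entities such as &amp; and trim stray whitespace.
        $rows[] = array_map(
            fn($cell) => trim(html_entity_decode($cell, ENT_QUOTES, 'UTF-8')),
            $tds[1]
        );
    }
    return $rows;
}

// Shortened sample in the same shape as the posted table.
$html = <<<PAGE
<tbody><tr>
<td dir="ltr" align="right">1</td>
<td dir="ltr" align="left">gn</td>
<td></td>
<td dir="ltr" align="left">In the beginning God created the heaven and the earth.</td>
</tr>
<tr>
<td dir="ltr" align="right">2</td>
<td dir="ltr" align="left">gn</td>
<td></td>
<td dir="ltr" align="left">And the earth was without form, and void.</td>
</tr>
</tbody>
PAGE;

$rows = table_rows($html);
echo count($rows), " rows, ", count($rows[0]), " cells each\n"; // 2 rows, 4 cells each

// Inserting could then be one prepared statement run once per row, e.g. with PDO
// ($pdo is an open connection; table and column names are placeholders):
//   $stmt = $pdo->prepare('INSERT INTO verses (id, book, note, text) VALUES (?, ?, ?, ?)');
//   foreach ($rows as $row) { $stmt->execute($row); }
```

    Going row by row keeps each record's columns together, and a prepared statement avoids having to escape the verse text by hand.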
     
    Mozzart, Jul 22, 2008