regex preg_match_all question

Discussion in 'PHP' started by gilgalbiblewheel, Oct 15, 2009.

  1. #1
    I was working on a method of scraping my own file into another page:
    $contents_of_page = file_get_contents('bible.html');
    
    function getTextBetweenTags($string, $tagname) {
        $pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
        preg_match_all($pattern, $string, $matches);
        return $matches[1];
    }
    
    $str = '<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>';
    $txt = getTextBetweenTags($str, "td");
    print_r($txt);
    PHP:
    $txt contains only the last cell: In the beginning God created the heaven and the earth.
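The single result most likely comes from greedy matching: both `.*` parts of the pattern grab as much text as they can, so the opening `<td ?.*>` swallows everything up to the last cell. Below is a minimal sketch of a lazy variant, assuming the cells are not nested. The names `getTextBetweenTagsLazy` and `$cells` are hypothetical and not part of the original code.

```php
<?php
// Hypothetical lazy variant of getTextBetweenTags() (name is an assumption).
function getTextBetweenTagsLazy($string, $tagname) {
    $tag = preg_quote($tagname, '/');
    // [^>]* stays inside the opening tag; (.*?) stops at the first closing tag.
    $pattern = '/<' . $tag . '\b[^>]*>(.*?)<\/' . $tag . '>/s';
    preg_match_all($pattern, $string, $matches);
    return $matches[1];
}

$str = '<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>';
$cells = getTextBetweenTagsLazy($str, "td");
print_r($cells); // ten entries, one per <td>
```

For anything beyond flat, well-formed rows like this one, an HTML parser such as DOMDocument is usually a safer choice than a regex.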

    What I want is to turn:
    $txt = '<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>';
    PHP:
    into
    $txt = '<table><tr><td class="book">1</td><td class="chapter">1</td><td class="verse">1</td><td class="recordType">gn</td><td class="book_title">Genesis</td><td class="book_spoke">1</td><td class="chapter_spoke">1</td><td class="verse_spoke">1</td><td class="something_else">1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>';
    PHP:
     
    gilgalbiblewheel, Oct 15, 2009
  2. almondj

    #2
    This may be what you're looking for:

    
    <?php
    
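    // $matches is assumed to come from an earlier preg_match_all() call.
    // Iterate by reference so each replacement is written back into $matches.
    foreach ($matches as &$match) {
        $match = preg_replace("/REGEX/", "<table>etc.</table>", $match);
    }
    unset($match); // break the reference left over from the loop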
    
    print_r($matches);
    
    ?>
    
    PHP:
     
    almondj, Oct 15, 2009
  3. gilgalbiblewheel

    #3
    I was working on it and got this so far:
    $string = "<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>";
    $patterns[0] = "/<td/";
    $names = Array("book", "chapter", "verse", "recordType", "book_title", "book_spoke", "chapter_spoke", "verse_spoke", "something_else", "text_data");
    
    for($repNames=0; $repNames<count($names); $repNames++){
    	$replacements[$repNames] = "<td class=\"".$names[$repNames]."\"";
    
    }
    echo preg_replace($patterns, $replacements, $string)."\n";
    
    PHP:
    One thing I don't understand is why all the classes are "book":
    <table>
        <tr>
            <td class="book">1</td>
            <td class="book">1</td>
            <td class="book">1</td>
            <td class="book">gn</td>
            <td class="book">Genesis</td>
            <td class="book">1</td>
            <td class="book">1</td>
            <td class="book">1</td>
            <td class="book">1</td>
            <td class="book">In the beginning God created the heaven and the earth.</td>
        </tr>
    </table>
    
    Code (markup):
    There seems to be something wrong with the for loop: it only uses the first entry in the array.
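The loop itself is probably fine. When preg_replace() gets arrays, it pairs each pattern with the replacement at the same position, and here there is only one pattern, so only the first replacement is ever used and it is applied to every match. A small sketch of that pairing, where the shortened `$subject` and the name `$out` are assumptions for illustration:

```php
<?php
$subject = "<td>a</td><td>b</td>";

// One pattern, two replacements: the pattern at index 0 is paired with
// the replacement at index 0, and the second replacement is never used.
$out = preg_replace(
    array("/<td/"),
    array('<td class="book"', '<td class="chapter"'),
    $subject
);

echo $out, "\n"; // <td class="book">a</td><td class="book">b</td>
```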
     
    gilgalbiblewheel, Oct 15, 2009
  4. gilgalbiblewheel

    #4
    I'll answer my own question:
    $string = "<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>";
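    $pattern = "/<td>/";
    $names = array("book", "chapter", "verse", "recordType", "book_title", "book_spoke", "chapter_spoke", "verse_spoke", "something_else", "text_data");
    $patterns = array();
    $replacements = array();
    
    // Build one pattern/replacement pair per class name; preg_replace() applies the pairs in order.
    foreach ($names as $name) {
        $patterns[] = $pattern;
        $replacements[] = "<td class=\"" . $name . "\">";
    }
    // The limit of 1 makes each pair rewrite only the first <td> that has no class yet.
    echo preg_replace($patterns, $replacements, $string, 1)."\n";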
    PHP:
    echo preg_replace($patterns, $replacements, $string, 1)."\n";
    <table>
    <tr>
    <td class="book">1</td>
    <td class="chapter">1</td>
    <td class="verse">1</td>
    <td class="recordType">gn</td>
    <td class="book_title">Genesis</td>
    <td class="book_spoke">1</td>
    <td class="chapter_spoke">1</td>
    <td class="verse_spoke">1</td>
    <td class="something_else">1</td>
    <td class="text_data">In the beginning God created the heaven and the earth.</td>
    </tr>
    </table>
    
    Code (markup):
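The limit of 1 works here because a cell that has already received a class no longer matches `/<td>/`, so each pattern in the array moves on to the next untouched cell. An alternative sketch uses preg_replace_callback() with a counter, which avoids repeating the pattern. It assumes PHP 5.3 or later for closures, and the names `$i` and `$result` are hypothetical.

```php
<?php
$string = "<table><tr><td>1</td><td>1</td><td>1</td><td>gn</td><td>Genesis</td><td>1</td><td>1</td><td>1</td><td>1</td><td>In the beginning God created the heaven and the earth.</td></tr></table>";
$names = array("book", "chapter", "verse", "recordType", "book_title", "book_spoke", "chapter_spoke", "verse_spoke", "something_else", "text_data");

$i = 0; // index of the next class name to hand out
$result = preg_replace_callback('/<td>/', function ($m) use ($names, &$i) {
    // Leave the tag unchanged if there are more cells than names.
    if (!isset($names[$i])) {
        return $m[0];
    }
    return '<td class="' . $names[$i++] . '">';
}, $string);

echo $result . "\n";
```

With a single row this gives the same markup as the array version. For several rows, the counter would have to be reset at each `<tr>`, which this sketch does not do.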
     
    gilgalbiblewheel, Oct 16, 2009