tricky: parse entire webpage into an array?

Discussion in 'PHP' started by falcondriver, Dec 26, 2007.

  1. #1
    hiya,

    i have an old website here, 20+ pages full of tables with products, descriptions and prices.
    i have to write these values into a database for a redesign of the site. can anyone think of a way to parse a whole page into an array? it would be much easier to look for the table inside an array instead of messing around with strlen() and strpos() to read it.
    the array should look like this, so that i can access each section of the page:
    
    PHP:
    $site = array(
        "html_0" => array(
            "body_0" => array(
                "h1_0"    => "heading 1 here",
                "p_0"     => "first paragraph",
                "p_1"     => "second paragraph",
                "table_0" => array(
                    "tr_0" => array(
                        "td_0" => "item name",
                        "td_1" => "description",
                        "td_2" => "price",
                    ),
                ),
            ),
        ),
    );
    echo "<pre>";
    print_r($site);
    echo "</pre>";

    anyone who can help?
     
    falcondriver, Dec 26, 2007
  2. buldozerceto

    #2
    buldozerceto, Dec 26, 2007
  3. falcondriver

    #3
    the main problem here is that my regex-skills suck...
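
For reading product tables, a regex is not strictly required. Below is a minimal sketch that assumes PHP's DOM extension (DOMDocument and DOMXPath) is available. The sample markup, the three-column order (name, description, price) and the $products variable are assumptions made for illustration, not taken from the actual pages.

```php
<?php
// Sample markup standing in for one of the old product pages (hypothetical).
$html = '<table>'
      . '<tr><td>Widget</td><td>A small widget</td><td>9.99</td></tr>'
      . '<tr><td>Gadget</td><td>A large gadget</td><td>19.50</td></tr>'
      . '</table>';

// Old pages are rarely valid HTML, so collect parser warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$products = array();

// Visit every table row, whether or not the markup uses <tbody>.
foreach ($xpath->query('//table//tr') as $row) {
    $cells = array();
    foreach ($xpath->query('td', $row) as $cell) {
        $cells[] = trim($cell->textContent);
    }
    // Assumed column order: name, description, price.
    if (count($cells) === 3) {
        $products[] = array(
            'name'        => $cells[0],
            'description' => $cells[1],
            'price'       => $cells[2],
        );
    }
}

print_r($products);
```

Each entry of $products could then be written to the database with whatever insert routine the new site uses. Header rows built from th cells are skipped here because only td cells are counted.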
     
    falcondriver, Dec 26, 2007
  4. jronmo

    #4
    Go with XSLT. PHP has a parser that can load the page as XML, but you may need to run the page through Tidy first so the old HTML is well-formed.
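
As a rough illustration of the parse-then-walk idea, here is a sketch that uses PHP's DOM extension rather than XSLT, and relies on the lenient loadHTML() parser instead of a separate Tidy pass. The function name nodeToArray and the sample markup are hypothetical. Keys follow the "tag_N" scheme from the first post.

```php
<?php
// Hypothetical helper: turn a DOM node into a nested array keyed "tag_N".
// Elements that contain other elements become sub-arrays; leaf elements
// become their trimmed text. Text that sits next to child elements
// (mixed content) is dropped in this simplified version.
function nodeToArray(DOMNode $node)
{
    $result = array();
    $counts = array(); // how many times each tag name has been seen so far

    foreach ($node->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // skip text nodes, comments and the doctype
        }
        $tag   = strtolower($child->nodeName);
        $index = isset($counts[$tag]) ? $counts[$tag] : 0;
        $counts[$tag] = $index + 1;

        // Check whether this element has element children of its own.
        $hasElementChildren = false;
        foreach ($child->childNodes as $grandChild) {
            if ($grandChild->nodeType === XML_ELEMENT_NODE) {
                $hasElementChildren = true;
                break;
            }
        }

        $result[$tag . '_' . $index] = $hasElementChildren
            ? nodeToArray($child)
            : trim($child->textContent);
    }
    return $result;
}

// Sample page (hypothetical) mirroring the structure in the question.
$html = '<html><body><h1>heading 1 here</h1>'
      . '<p>first paragraph</p><p>second paragraph</p>'
      . '<table><tr><td>item name</td><td>description</td><td>price</td></tr></table>'
      . '</body></html>';

libxml_use_internal_errors(true); // tolerate sloppy markup
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$site = nodeToArray($doc);
print_r($site);
```

With real pages the nesting will usually be deeper and less regular than in the example, so querying the tables directly with XPath may turn out to be more practical than walking one big array.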
     
    jronmo, Dec 27, 2007