Need help with preg_match_all

Discussion in 'PHP' started by coladeu, May 3, 2012.

  1. #1
    I need help with this: I am trying to extract some data from an HTML page. For example, I need to extract the following data from the code below.

    1 File Name
    2 File Size
    3 File URL (for the URL, I need to prepend my site name to "/1186111/file 1.zip/" so that it looks like this: href='http://mysite.com/1186111/file 1.zip/')

    The code example is below. I am not good with the preg_match functions, so if someone could help me with this, that would be nice.
    Thanks.

    <div align=left>
    <table width="100%">
    <tr height=110><td valign=top>1.</td><td valign=middle align=center><img src="/images/file.jpg" width="70" height="70" border="0"></td>
        <td valign=top><a href='/1186111/file 1.zip/' target='_blank'>File Name 1</a></td>
    </tr>
    <tr><td></td><td colspan=2>Downloads: 722 &nbsp; Size: 2.30 MB</td></tr>
    <tr><td colspan=3><hr size=1></td></tr>
    
    <tr height=110><td valign=top>2.</td><td valign=middle align=center><img src="/images/file.jpg" width="70" height="70" border="0"></td>
        <td valign=top><a href='/2721907/file 2.zip/' target='_blank'>File Name 2</a></td>
    </tr>
    <tr><td></td><td colspan=2>Downloads: 193 &nbsp; Size: 11.61 MB</td></tr>
    <tr><td colspan=3><hr size=1></td></tr>
    
    <tr height=110><td valign=top>3.</td><td valign=middle align=center><img src="/images/file.jpg" width="70" height="70" border="0"></td>
        <td valign=top><a href='/2721887/file 3.zip/' target='_blank'>File Name 3</a></td>
    </tr>
    <tr><td></td><td colspan=2>Downloads: 189 &nbsp; Size: 10.99 MB</td></tr>
    <tr><td colspan=3><hr size=1></td></tr>
    
    <tr height=110><td valign=top>4.</td><td valign=middle align=center><img src="/images/file.jpg" width="70" height="70" border="0"></td>
        <td valign=top><a href='/2721902/file 4.zip/' target='_blank'>File Name4</a></td>
    </tr>
    <tr><td></td><td colspan=2>Downloads: 189 &nbsp; Size: 8.46 MB</td></tr>
    <tr><td colspan=3><hr size=1></td></tr>
    
    
    </table>
    </div>
    
    I tried to do it with the code below, but I don't get any results and I really don't know how to do it right.

    $html = file_get_contents("files/page1.html");
    
    $match1 = preg_match_all("#<div align=left>.*?<table width=\"100%\">(.*?)</table>.*?<a.*?href=\"(.*?)\">(.*?)</a>*.?</div>#is", $html, $matches);
    
    $links = array();
    
    $count = count($matches[1]);
    
    for($i = 0; $i < $count; $i++) {
        $links[] = array($matches[1][$i], $matches[2][$i]);
    }
    
    print_r($links);
    
    coladeu, May 3, 2012 IP
  2. Oli3L

    #2
    
    // $text -> file_get_contents...
    preg_match_all("#<tr height=110>.*?<a href='(.*?)'*>(.*?)<\/a>.*?Size: (.*?) MB*</td></tr>#is", $text, $matches);
    
    unset($matches[0]);
    
    $links = array();
    for($i = 0; $i++ < count($matches);) {
    	// $links = array(file name, file url, file size);
    	$links[] = array($matches[2][$i], $matches[1][$i], $matches[3][$i]);
    }
    
    Hope it helps.
     
    Oli3L, May 3, 2012 IP
  3. coladeu

    #3
    Thanks again for helping me, Oli3L. I used this code and I get this error: Parse error: syntax error, unexpected T_STRING in /home/user5/public_html/test.php on line 3. I don't see what exactly the cause is.
    <?
    $text -> file_get_contents("test.html")
    preg_match_all("#<tr height=110>.*?<a href='(.*?)'*>(.*?)<\/a>.*?Size: (.*?) MB*</td></tr>#is", $text, $matches);


    unset($matches[0]);


    $links = array();
    for($i = 0; $i++ < count($matches);) {
    // $links = array(file name, file url, file size);
    $links[] = array($matches[2][$i], $matches[1][$i], $matches[3][$i]);
    }
    print_r($links);
    ?>
     
    coladeu, May 3, 2012 IP
  4. #4
    FIXED:
    
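    <?php
    $text = file_get_contents("test.html");
    // Groups: 1 = file URL (href), 2 = file name (link text), 3 = file size in MB
    preg_match_all("#<tr height=110>.*?<a href='(.*?)'.*?>(.*?)</a>.*?Size: (.*?) MB</td></tr>#is", $text, $matches);
    
    // Drop the full-match entries; only the capture groups are needed
    unset($matches[0]);
    
    $links = array();
    // Iterate over the matched rows (count of group 1), not over the number of groups
    for ($i = 0; $i < count($matches[1]); $i++) {
        // array(file name, file url, file size)
        $links[] = array($matches[2][$i], $matches[1][$i], $matches[3][$i]);
    }
    print_r($links);
    ?>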
    
     
    Last edited: May 3, 2012
    Oli3L, May 3, 2012 IP
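    The original question also asked for the site name to be prepended to each file URL, which the posts above do not cover. A minimal sketch of one way to do that, assuming the same row markup as the sample page: $baseUrl is a placeholder for your own site, and the inline $text stands in for file_get_contents("test.html"). It uses the PREG_SET_ORDER flag so that each match arrives as one row, which avoids index arithmetic over the capture groups.

    ```php
    <?php
    // Hypothetical base URL; replace with your own site.
    $baseUrl = 'http://mysite.com';

    // Two rows copied from the sample page; in practice use file_get_contents("test.html").
    $text = "<tr height=110><td valign=top>1.</td>\n"
          . "<td valign=top><a href='/1186111/file 1.zip/' target='_blank'>File Name 1</a></td></tr>\n"
          . "<tr><td></td><td colspan=2>Downloads: 722 &nbsp; Size: 2.30 MB</td></tr>\n"
          . "<tr height=110><td valign=top>2.</td>\n"
          . "<td valign=top><a href='/2721907/file 2.zip/' target='_blank'>File Name 2</a></td></tr>\n"
          . "<tr><td></td><td colspan=2>Downloads: 193 &nbsp; Size: 11.61 MB</td></tr>\n";

    $pattern = "#<tr height=110>.*?<a href='(.*?)'.*?>(.*?)</a>.*?Size: (.*?) MB</td></tr>#is";

    // PREG_SET_ORDER groups the results per row: $row[1] = href, $row[2] = name, $row[3] = size.
    preg_match_all($pattern, $text, $rows, PREG_SET_ORDER);

    $links = array();
    foreach ($rows as $row) {
        $links[] = array(
            'name' => trim($row[2]),
            'url'  => $baseUrl . $row[1],   // prepend the site name to the relative path
            'size' => $row[3] . ' MB',
        );
    }

    print_r($links);
    ```

    Each entry in $links is then an array with the keys name, url and size. The paths on the sample page contain spaces, so the URLs may need encoding before they are used in a link.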
  5. coladeu

    #5
    Thanks for your help, Oli3L. It is working. Have a nice day.
     
    coladeu, May 3, 2012 IP
  6. koolumair

    #6
    Hi, I have installed the sitesift directory on my website, but I am getting an error: "Deprecated: Function eregi_replace() is deprecated in /home/umairpk/public_html/directory/include/myfunctions.ini.php on line 31". Can anyone help me fix the error? I have installed the directory script here: "umairpk.com/directory". Thanks.
     
    koolumair, May 3, 2012 IP
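    On the eregi_replace() notice: the function was deprecated in PHP 5.3 and removed in PHP 7.0, and the usual replacement is preg_replace() with the "i" (case-insensitive) modifier. The thread does not show line 31 of myfunctions.ini.php, so the call below is purely hypothetical; it is only a sketch of what such a conversion might look like. The pattern needs delimiters added (here "/"), and any delimiter character inside the pattern has to be escaped.

    ```php
    <?php
    // Hypothetical original call (the real line 31 is not shown in the thread):
    //     $clean = eregi_replace("[^a-z0-9]", "-", $title);
    // Possible preg_replace equivalent: wrap the pattern in delimiters and add the "i" modifier.
    $title = "My Site Title!";
    $clean = preg_replace('/[^a-z0-9]/i', '-', $title);

    echo $clean, "\n";
    ```

    More complex POSIX patterns may need further adjustment when moved to PCRE, so each converted call is worth checking against a few sample inputs.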