Problem with preg_match on div content, need help

Discussion in 'PHP' started by coladeu, Apr 27, 2012.

  1. #1
    Hi there. I am not too good at PHP. What I want to do is make a script that can extract titles with URLs from my content. I searched on the internet and found all kinds of scripts, but I can't figure it out with this one.

    This is the HTML page that I want to extract the names and the URLs from.
    <div style="border:5px solid #fffff;">        
            <div class="row1" id="arhive_html">
                <div id="right_file">
                    <div style="font-size:10px;"><b>Site 1 Name</b></div>
                        <div style="float:right;"><a style="color:red;" target="_blank" rel="nofollow" href="http://mirror1.com/arhive.zip">Download</a></div> 
                </div>
            </div>
            
            <div class="row2" id="arhive_html">
                <div id="right_file">
                    <div style="font-size:10px;"><b>Site 2 Name</b></div>
                        <div style="float:right;"><a style="color:red;" target="_blank" rel="nofollow" href="http://mirror2.com/arhive.zip">Download</a></div> 
                </div>
            </div>
            
            <div class="row3" id="arhive_html">
                <div id="right_file">
                    <div style="font-size:10px;"><b>Site 3 Name</b></div>
                        <div style="float:right;"><a style="color:red;" target="_blank" rel="nofollow" href="http://mirror3.com/arhive.zip">Download</a></div> 
                </div>
            </div>
            
    </div>
    Code (markup):
    And this is my code, which worked before, but now I have changed the markup to divs and I need to convert the script to match this structure.
    <?php
    $ser = urlencode(strip_tags(str_replace("-", " ", $_GET['search'])));
    $p = strip_tags($_GET['page']);
    $count = get_source_count(basename(__FILE__));
    $max = 20;
    
    
    if (($count > $max) or empty($count))
        $count = $max;
    
    
    $data = openpage("content.html");
    preg_match("/<div id=\"arhive_htm\">(.*?)<\/div>/ismU", $data, $res);
    
    
    $i = 0;
    while ($i++ < $count) {
        $arhive = preg_split("/<tr .*>/ismU", $res[1]);
        $arhive = explode("</tr>", $arhive[$i]);
        $arhive = $arhive[0];
        preg_match("/td class=\"data\">.*(http:\/\/(.*).zip);/ism", $arhive, $geturl);
        preg_match("/class=\"s_name\">(.*?)<\/a>/", $arhive, $getname);
        $name = htmlentities(strip_tags($getname[1]));
        $lname = str_replace(" ", "_", $name) . ".zip";
        $url = urldecode($geturl[1]);
        $url2 = enc($url);
        $playtime = NULL;
    
    
        if ($name && $url) {
            include 'content/list.php';
        }
    }
    ?>
    Code (markup):
    I hope someone can help me with this. Waiting for some replies.
     
    coladeu, Apr 27, 2012 IP
  2. Oli3L

    Oli3L Active Member

    Messages:
    207
    Likes Received:
    3
    Best Answers:
    1
    Trophy Points:
    70
    #2
    Are you trying to fetch the link from:
    <a style="color:red;" target="_blank" rel="nofollow" href="http://mirror3.com/arhive.zip">Download</a>

    ?
     
    Oli3L, Apr 27, 2012 IP
  3. coladeu

    coladeu Peon

    Messages:
    17
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Yes, the link and the title name:
    <div style="font-size:10px;"><b>Site 1 Name</b></div> (not necessarily with the style). Is that possible?
     
    coladeu, Apr 28, 2012 IP
  4. Oli3L

    #4
    Hmm, I played with it and this is what I got:
    
    // $matches[1] holds the titles, $matches[2] the URLs, $matches[3] the link text
    $match1 = preg_match_all("#<div id=\"right_file\">.*?<div style=\"font-size:10px;\"><b>(.*?)</b></div>.*?<a.*?href=\"(.*?)\">(.*?)</a>.*?</div>#is", $text, $matches);
    
    $links = array();
    
    $count = count($matches[1]);
    
    for($i = 0; $i < $count; $i++) {
        $links[] = array($matches[1][$i], $matches[2][$i]);
    }
    
    print_r($links);
    
    
    PHP:
    $text is the page contents.
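
Since post #3 says the style attribute is not strictly needed, here is a rough sketch of a looser variant that does not depend on it. It assumes each title is the first <b>...</b> after <div id="right_file"> and the URL is the first href that follows. The sample $text is cut down from the markup in post #1, and the 'name' and 'url' keys are just illustrative names, not something from the original script.

```php
<?php
// Sample markup copied from post #1 (two rows only, for brevity).
$text = '
<div class="row1" id="arhive_html">
    <div id="right_file">
        <div style="font-size:10px;"><b>Site 1 Name</b></div>
            <div style="float:right;"><a style="color:red;" target="_blank" rel="nofollow" href="http://mirror1.com/arhive.zip">Download</a></div>
    </div>
</div>
<div class="row2" id="arhive_html">
    <div id="right_file">
        <div style="font-size:10px;"><b>Site 2 Name</b></div>
            <div style="float:right;"><a style="color:red;" target="_blank" rel="nofollow" href="http://mirror2.com/arhive.zip">Download</a></div>
    </div>
</div>';

// Assumption: the title is the first <b>...</b> inside right_file and the
// URL is the first href="..." after it. No style attribute is required.
$pattern = '#<div id="right_file">.*?<b>(.*?)</b>.*?<a\b[^>]*\bhref="([^"]*)"#is';

// PREG_SET_ORDER groups the captures per match: $m[1] = title, $m[2] = URL.
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$links = array();
foreach ($matches as $m) {
    $links[] = array('name' => trim($m[1]), 'url' => $m[2]);
}

print_r($links);
```

With PREG_SET_ORDER each element of $matches is already one title/URL pair, so the separate counting loop is not needed. Like any regex over HTML, this will break if the page layout changes.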
     
    Oli3L, Apr 28, 2012 IP
  5. coladeu

    #5
    It's working, this result is what I need.
    I have this script for my site, which is made to extract URLs and titles from a specific site and show them on my site with the name and the URL. Now I am trying to convert the script to extract from another site with this method and the same variables.
    The source is this:

    <?php
    
    $data = ("http/www.mp3site.com/page1.html");
    preg_match_all("#<div id=\"right_file\">.*?<div style=\"font-size:10px;\"><b>(.*?)</b></div>.*?<a.*?href=\"(.*?)\">(.*?)</a>*.</div>#is", $data, $result);
    
    
    $count = get_source_count(basename(__FILE__));
    $max = 25;
    
    
    if (($count > $max) or empty($count))
        $count = $max;
    
    
    $i = 0;
    while ($i++ < $count) {
        $song = explode("{", $data);
        $song = explode("},", $song[$i]);
        $song = isset($song[0]) ? $song[0] : NULL;
    
    
        preg_match('<div style=\"font-size:15px;\">', $song, $getname);
        preg_match("/a class=\"button down_button\" .* href=\"(.*?)\" target=\"_blank\" rel=\"nofollow\">/", $song, $geturl);
    
    
        $name = trim($getname[1]);
        $lname = str_replace(" ", "_", str_replace(array('&amp;', '&', '#'), '', $name)) . ".mp3";
        
        $url = strip_tags(urldecode($geturl[1]));
        $url = (substr($url, -3) == "mp3") ? $url : $url . "#.mp3";
        $url2 = enc($url);
        $playtime = "Unknown";
        $source = "Dilandau";
        
        $listen = 'name=' . $lname . '&url=' . $url2;
        $download = 'name=' . $lname . '&url=' . $url2 . "&mode=redirect";
            
        if ($name && $url) {
            include 'source/list.php';
        }
    }
    ?>
    Code (markup):
    I tried to fix it myself with the code you made, but I don't know exactly how to match the variables. If you can help me with this I would appreciate it. Thanks for the support.
     
    coladeu, Apr 28, 2012 IP
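
One possible way to match the variables from post #5 to the pattern from post #4, as a sketch only. It assumes the other site uses the same markup as in post #1, and that $data holds the page contents rather than the URL string (for example loaded with file_get_contents()). extract_entries() is a hypothetical helper made up for this example. enc(), get_source_count() and source/list.php belong to the original site and are left out.

```php
<?php
// Hypothetical helper (not part of the original script): turns the page
// markup into rows shaped like the variables used in the loop from post #5.
function extract_entries($data, $max = 25)
{
    // Pattern from post #4, with the tail written as .*? before </div>.
    $pattern = '#<div id="right_file">.*?<div style="font-size:10px;"><b>(.*?)</b></div>'
             . '.*?<a.*?href="(.*?)">(.*?)</a>.*?</div>#is';
    preg_match_all($pattern, $data, $result, PREG_SET_ORDER);

    $entries = array();
    foreach (array_slice($result, 0, $max) as $match) {
        $name = trim(strip_tags($match[1]));
        $url  = strip_tags(urldecode($match[2]));
        if ($name === '' || $url === '') {
            continue; // skip incomplete rows, like the "if ($name && $url)" check
        }
        $entries[] = array(
            'name'  => $name,
            // Same file-name clean-up as in the loop from post #5.
            'lname' => str_replace(' ', '_', str_replace(array('&amp;', '&', '#'), '', $name)) . '.mp3',
            'url'   => $url,
        );
    }
    return $entries;
}

// Sample markup in the shape shown in post #1; a real script would load
// the page first instead of using a literal string.
$data = '<div id="right_file">
    <div style="font-size:10px;"><b>Site 1 Name</b></div>
    <div style="float:right;"><a style="color:red;" href="http://mirror1.com/arhive.zip">Download</a></div>
</div>';

foreach (extract_entries($data) as $entry) {
    // In the original script this is where enc($url) and
    // include 'source/list.php' would run (both are site-specific).
    echo $entry['name'], ' -> ', $entry['lname'], ' -> ', $entry['url'], "\n";
}
```

The explode("{", ...) calls and the two inner preg_match() calls from post #5 are no longer needed in this version, because each match already carries both the name and the URL.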