Extract Data from Web Page

Discussion in 'PHP' started by kolucoms6, Sep 5, 2010.

  1. #1
    I am looking for a PHP application that will:

    Extract and display data, based on two search criteria, i.e. last name and county (I may create a table of a few last names and counties in MySQL), from the result HTML based on HTML tags (name, address and phone), in a tabular format.

    Repeat the above logic four times, once for each page (pages 1, 2, 3 and 4) of the result HTML.

    Basically, I want to pull data from different white pages sites and display it in a tabular format.

    Is it possible?
     
    kolucoms6, Sep 5, 2010 IP
  2. Rainulf

    Rainulf Active Member

    Messages:
    373
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    85
    #2
    Yes, it's possible. :)

    If you're grabbing data from a MySQL database, it should be something like this:
    
    $sql = new mysqli('localhost', 'user', 'pass', 'db');
    if ($sql->connect_error) {
        die($sql->connect_error);
    }
    // tablename, the 'whatever' column and the row count (25) are placeholders.
    $result = $sql->query("SELECT * FROM tablename LIMIT 0, 25") or die($sql->error);
    
    // draw your table, one row per record
    echo "<table>";
    while ($row = $result->fetch_assoc()) {
        echo "<tr><td>{$row['whatever']}</td></tr>";
    }
    echo "</table>";
    
    $sql->close();
    
    PHP:
     
    Rainulf, Sep 5, 2010 IP
  3. kolucoms6

    kolucoms6 Active Member

    Messages:
    1,198
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    75
    #3
    Here is what I am looking for:

    I have a link which displays some data in HTML format:

    http://www.118.com/people-search.mv...john&Location=london&pageSize=50&pageNumber=1

    The data comes in the format below:

    <div class="searchResult regular">
    <h2>Bird John</h2>
    <div class="address">
    56 Leathwaite Road<br />
    London<br />
    SW11 6RS
    </div>
    <div class="telephoneNumber">
    020 7228 5576
    </div>
    </div>

    I want my PHP page to fetch the above URL and extract/parse the data from the resulting HTML page based on the above tags, as follows:
    h2 = Name
    address = Address
    telephoneNumber = Phone Number

    and display them in a tabular format.
     
    kolucoms6, Sep 5, 2010 IP
  4. kolucoms6

    #4
    I got this, which works to an extent, but it only shows the text version of the HTML page:

     
    kolucoms6, Sep 6, 2010 IP
  5. MyVodaFone

    MyVodaFone Well-Known Member

    Messages:
    1,048
    Likes Received:
    42
    Best Answers:
    10
    Trophy Points:
    195
    #5
    EDIT: See below...
     
    Last edited: Sep 6, 2010
    MyVodaFone, Sep 6, 2010 IP
  6. kolucoms6

    #6
    Thanks a lot
     
    kolucoms6, Sep 6, 2010 IP
  7. kolucoms6

    #7
    I tried this to display the data in a tabular format:

    echo '<table border=1><tr><td>'.$name.'</td><td>'.$address.'</td><td>'.$telephoneNumber.'</td></tr></table>';

    but the format goes haywire.
     
    kolucoms6, Sep 6, 2010 IP
  8. MyVodaFone

    #8
    Sorry, my bad: there's a problem with my code. I'll post back here when I have it fixed, but if anyone else would like to join in, please do... you all know I'm crap at this regex stuff :)
     
    MyVodaFone, Sep 6, 2010 IP
  9. kolucoms6

    #9
    What I am looking for is:

    Display an option to select the last name and county from a dropdown.

    The user can select at most 25 last names at a time from each county.

    On clicking Submit, it will loop over the selected last names in that particular county and display the results in a tabular format.

    Is it possible?
     
    kolucoms6, Sep 6, 2010 IP
  10. MyVodaFone

    #10
    Ok try this instead and just replace the echo with your table. Note the changes below.

    
    <?php
    
    // Note: despite its name, $url ends up holding the fetched HTML, not the address.
    $url = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=john&Location=london&pageSize=50&pageNumber=1");
    
    // Fetch a page with cURL and return its body as a string.
    function get_data($url)
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the data instead of printing it
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }
    
    // Capture the name, address and telephone number of every result.
    // The groups are lazy (.+?) so that each one stops at the first closing tag.
    $string = preg_match_all('#<h2>(.+?)</h2>.+?<div class="address">(.+?)</div>.+?<div class="telephoneNumber">(.+?)</div>#is', $url, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
        echo '<div class="searchResult regular">
                <h2>'.$item[1].'</h2>
                <div class="address">
                '.$item[2].'
                </div>
                <div class="telephoneNumber">
                '.$item[3].'
                </div>
            </div>
    ';
    }
    ?>
    
    PHP:
    With regard to your search criteria, I guess those will be part of your search URL.

    PS: the address output contains <br /> tags; to remove those, use strip_tags($item[2]).
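
A minimal sketch of the strip_tags() tip, run against the sample address markup from post #3 rather than the live site. The helper name clean_address() is made up for illustration; it first turns each <br /> into a comma so the address still reads as separate parts once the tags are gone.

```php
<?php
// Hypothetical helper (not from the thread): flatten one captured
// address block into a single comma-separated line.
function clean_address(string $rawAddress): string
{
    // Turn each <br>, <br/> or <br /> into a comma so the lines stay separated.
    $withCommas = preg_replace('#<br\s*/?>#i', ',', $rawAddress);
    // Remove any remaining tags, as suggested in the post.
    $text = strip_tags($withCommas);
    // Trim each comma-separated part, drop empty parts, then rejoin.
    $parts = array_filter(array_map('trim', explode(',', $text)), 'strlen');
    return implode(', ', $parts);
}

// Sample address block in the format shown in post #3.
$sample = "\n56 Leathwaite Road<br />\nLondon<br />\nSW11 6RS\n";
echo clean_address($sample), "\n";
```

Inside the foreach loop this would be called as clean_address($item[2]) in place of the raw $item[2].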
     
    MyVodaFone, Sep 6, 2010 IP
  11. kolucoms6

    #11
    I am trying this:

    $arr = array(1, 2, 3, 4);
    foreach ($arr as &$value) {
    $url = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=john&Location=london&pageSize=50&pageNumber=" . $arr);

    But it looks like there is an error, as the page displays nothing!


    Also, echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>';
     
    kolucoms6, Sep 6, 2010 IP
  12. MyVodaFone

    #12

    What's wrong with what I gave you?
    
    <?php
    
    $url = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=john&Location=london&pageSize=50&pageNumber=1");
    
    function get_data($url)
    {
    $ch = curl_init();
    	curl_setopt($ch, CURLOPT_HEADER, 0);
    	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //Set curl to return the data instead of printing it to the browser.
    	curl_setopt($ch, CURLOPT_URL, $url);
    	curl_setopt ($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
     	curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
    	curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
    	$data = curl_exec($ch);
    	curl_close($ch);
    	return $data;
    }
    
    // The capture groups are lazy (.+?) so that each one stops at the first closing tag.
    $string = preg_match_all('#<h2>(.+?)</h2>.+?<div class="address">(.+?)</div>.+?<div class="telephoneNumber">(.+?)</div>#is', $url, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    ?>
    
    PHP:
    Your table just needs fixing up.
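
One possible way to fix up the table, as a sketch only: emit a single <table> and add one <tr> per result, rather than echoing a whole table for every match. The function name render_results_table() and the sample row are made up for illustration, and the code assumes $matches has the PREG_SET_ORDER layout used in this post (index 1 = name, 2 = address, 3 = telephone number).

```php
<?php
// Hypothetical helper (not from the thread): build ONE table with one row
// per result. Each $item is assumed to follow the PREG_SET_ORDER layout:
// index 1 = name, 2 = address, 3 = telephone number.
function render_results_table(array $matches): string
{
    $html = "<table border=\"1\">\n";
    $html .= "<tr><th>Name</th><th>Address</th><th>Phone</th></tr>\n";
    foreach ($matches as $item) {
        $cells = '';
        foreach ([1, 2, 3] as $i) {
            // Strip leftover tags and escape the text before printing it.
            $cells .= '<td>' . htmlspecialchars(trim(strip_tags($item[$i]))) . '</td>';
        }
        $html .= "<tr>$cells</tr>\n";
    }
    return $html . "</table>\n";
}

// Hand-written sample row standing in for real preg_match_all() output.
$sample = [
    ['', 'Bird John', "56 Leathwaite Road<br />\nLondon<br />\nSW11 6RS", '020 7228 5576'],
];
echo render_results_table($sample);
```

In the script above, the foreach loop would then be replaced by a single echo render_results_table($matches);.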
     
    MyVodaFone, Sep 6, 2010 IP
  13. kolucoms6

    #13
    The table thing worked perfectly.

    Now, the above URL is for page 1; I want to loop through the other 3 pages as well, i.e. pages 1 to 4.
     
    kolucoms6, Sep 6, 2010 IP
  14. MyVodaFone

    #14
    You can create your own search form using the following variables:

    
    $name= $_POST['name'];
    $location=$_POST['location'];
    $pageSize=$_POST['pageSize'];
    $pageNumber=$_POST['pageNumber'];
    
    PHP:
    Then you build your $url from them: Name=$name&Location=$location&pageSize=$pageSize&pageNumber=$pageNumber
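
As a rough sketch of that idea, the URL can be assembled with http_build_query(), which also URL-encodes the values, and the page number can come from a loop over pages 1 to 4. The helper name build_search_url() is an assumption for illustration; the base URL and parameter names are copied from the URLs posted in this thread.

```php
<?php
// Hypothetical helper (not from the thread): build the search URL for one page.
// The base URL and parameter names are taken from the URLs in this thread.
function build_search_url(string $name, string $location, int $pageNumber, int $pageSize = 50): string
{
    $query = http_build_query([
        'Supplied'   => 'true',
        'Name'       => $name,
        'Location'   => $location,
        'pageSize'   => $pageSize,
        'pageNumber' => $pageNumber,
    ], '', '&'); // force a plain "&" separator regardless of php.ini settings
    return 'http://www.118.com/people-search.mvc?' . $query;
}

// Build the URLs for pages 1 to 4; each could then be passed to get_data().
$urls = [];
foreach (range(1, 4) as $page) {
    $urls[] = build_search_url('john', 'london', $page);
}
print_r($urls);
```

With a form, the 'john' and 'london' literals would be replaced by $_POST['name'] and $_POST['location'].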
     
    MyVodaFone, Sep 6, 2010 IP
  15. kolucoms6

    #15
    kolucoms6, Sep 6, 2010 IP
  16. MyVodaFone

    #16
    As far as I can tell, you can use something like 5, 10, 15, etc. up to 50. You're probably best off setting that permanently to 50 and just using the page numbers.
     
    MyVodaFone, Sep 6, 2010 IP
  17. kolucoms6

    #17
    What is the problem with the code below?


    
    
    <html>
    <body>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
    <form action="<?php echo $_SERVER["PHP_SELF"]; ?>" method="post">
    
    Last Name 1 : <input type=text name=name><br>
    
    Location : <input type=text name=location><br><br>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </body>
    </html>
    
    <?php
    
    $name= $_POST['name'];
    $location=$_POST['location'];
    
    ?>
    
    <?php
    
    $url1 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name&Location=$london&pageSize=50&pageNumber=1");
    $url2 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name&Location=$london&pageSize=50&pageNumber=2");
    $url3 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name&Location=$london&pageSize=50&pageNumber=3");
    $url4 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name&Location=$london&pageSize=50&pageNumber=4");
    
    function get_data($url)
    {
    $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //Set curl to return the data instead of printing it to the browser.
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt ($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }
    
    $string1 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url1, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string2= preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url2, $matches, PREG_SET_ORDER);
    
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string3= preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url3, $matches, PREG_SET_ORDER);
    
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
    
    }
    $string4= preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url4, $matches, PREG_SET_ORDER);
    
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    ?>
    
    
    Code (markup):

    When I click Submit, it takes me back to the normal page instead of showing the results.
     
    Last edited: Sep 6, 2010
    kolucoms6, Sep 6, 2010 IP
  18. MyVodaFone

    #18
    
    <html>
    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>" method="post">
    
    Last Name 1 : <input type=text name=name><br>
    
    Location : <input type=text name=location><br><br>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    // Fetch a page with cURL and return its body as a string.
    function get_data($url)
    {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the data instead of printing it
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }
    
    // Only run the search once the form has actually been submitted.
    if (isset($_POST['submit'])) {
        // urlencode() keeps spaces and special characters from breaking the query string.
        $name = urlencode($_POST['name']);
        $location = urlencode($_POST['location']);
        $url = "http://www.118.com/people-search.mvc?Supplied=true&Name=$name&Location=$location&pageSize=50&pageNumber=1";
        $url = get_data($url); // $url now holds the fetched HTML
    
        // Lazy groups (.+?) stop at the first closing tag of each element.
        $string = preg_match_all('#<h2>(.+?)</h2>.+?<div class="address">(.+?)</div>.+?<div class="telephoneNumber">(.+?)</div>#is', $url, $matches, PREG_SET_ORDER);
        foreach ($matches as $item) {
            echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>
    ';
        }
    }
    ?>
    PHP:
    If you need anything more complex, try the programming section.
     
    MyVodaFone, Sep 6, 2010 IP
  19. kolucoms6

    #19
    Thanks for the code, but where did I make the mistake, so that I can learn?
     
    kolucoms6, Sep 6, 2010 IP
  20. kolucoms6

    #20
    The address has 3 lines.

    52 Earls Mill Road
    Plymouth
    PL7 2BX

    How do I divide it into 3 different columns?
     
    kolucoms6, Sep 6, 2010 IP
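
One way this could be approached, sketched under the assumption that the address lines are separated by <br /> tags as in the sample markup from post #3: split the captured address on those tags and print each piece in its own <td>. The helper name split_address() is made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): split an address block
// into its separate lines.
function split_address(string $rawAddress): array
{
    // Split on <br>, <br/> or <br /> (case-insensitive).
    $lines = preg_split('#<br\s*/?>#i', $rawAddress);
    // Remove surrounding whitespace and any other tags from each line.
    $lines = array_map(function ($line) {
        return trim(strip_tags($line));
    }, $lines);
    // Drop empty lines and reindex from 0.
    return array_values(array_filter($lines, 'strlen'));
}

// The address from this post, written with the <br /> markup assumed above.
$address = "52 Earls Mill Road<br />\nPlymouth<br />\nPL7 2BX\n";
$cells = '';
foreach (split_address($address) as $line) {
    $cells .= '<td>' . htmlspecialchars($line) . '</td>';
}
echo "<tr>$cells</tr>\n";
```

Not every address will necessarily have exactly three lines; if the table needs a fixed number of columns, array_pad(split_address($address), 3, '') would fill the missing ones with empty strings.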