Extract Data from Web Page

Discussion in 'PHP' started by kolucoms6, Sep 5, 2010.

  1. kolucoms6

    kolucoms6 Active Member

    Messages:
    1,198
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    75
    #21
    Normal Data :

    88 Penrith Gdns
    Plymouth
    PL6 8XE

    A)))

    $item[2]=preg_replace ('/\r\n|\r|\n/', ' ', $item[2]);
    $item[2]=chop($item[2]);

    Tried this, but it didn't work.

    B)))

    $item[2]=str_replace(chr(10), "AAA11", $item[2]);
    $item[2]=str_replace(chr(13), "AAA22", $item[2]);

    This returns:

    AAA22AAA11 88 Penrith Gdns
    AAA22AAA11 Plymouth
    AAA22AAA11 PL6 8XEAAAA22AAA11

    I need

    88 Penrith Gdns Plymouth PL6 8XE
     
    Last edited: Sep 6, 2010
    kolucoms6, Sep 6, 2010 IP
  2. MyVodaFone

    MyVodaFone Well-Known Member

    Messages:
    1,048
    Likes Received:
    42
    Best Answers:
    10
    Trophy Points:
    195
    #22
    If that's all you require, I mentioned it already: run strip_tags() (see a few posts back).
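    As a rough sketch of that idea: the helper name flatten_address() and the sample string are assumptions made for illustration (the real value of $item[2] is not shown in the thread). It turns <br> tags into spaces before calling strip_tags(), then collapses the leftover line breaks and spaces.

```php
<?php
// Hypothetical sample: the scraped address as it might arrive, with <br /> tags
// and CR/LF pairs between the parts (an assumption, not the site's real markup).
$raw = "\r\n 88 Penrith Gdns<br />\r\n Plymouth<br />\r\n PL6 8XE\r\n";

// flatten_address() is an illustrative helper name, not part of the original script.
function flatten_address($html)
{
    // Turn each <br>, <br/> or <br /> into a space so the parts stay separated.
    $text = preg_replace('#<br\s*/?>#i', ' ', $html);
    // Drop any remaining markup.
    $text = strip_tags($text);
    // Collapse runs of whitespace (including \r and \n) into single spaces.
    $text = preg_replace('/\s+/', ' ', $text);
    return trim($text);
}

echo flatten_address($raw), "\n";
```

    In the loop this would be used as $item[2] = flatten_address($item[2]);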
     
    MyVodaFone, Sep 6, 2010 IP
  3. kolucoms6

    kolucoms6 Active Member

    #23
    Perfect, thanks a lot.

    To put each part into its own TD, I did this:

    $item[2]=str_replace(chr(13), "</td><td>", $item[2]);
    $item[3]=trim($item[3]);

    When I run the lines above, I get:

    [IMG]

    When I copy and paste the result into Excel or Notepad, I get the format below:


    Brett B 61 Station Road
    Plymouth
    PL2 1NH 01752 513159

    And :


    Can I put the lines below in a function so that I can call it?


    foreach ($matches as $item) {

    $item[2]=str_replace(chr(13), "</td><td>", $item[2]);
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>';
    }
     
    Last edited: Sep 7, 2010
    kolucoms6, Sep 6, 2010 IP
  4. MyVodaFone

    MyVodaFone Well-Known Member

    #24
    Instead of strip_tags, as I suggested earlier, use the str_replace function to replace each <br /> tag with a closing and an opening <td> tag.

    Something like:

    $item[2] = str_replace('<br />', '</td><td>', $item[2]);
     
    Last edited: Sep 7, 2010
    MyVodaFone, Sep 7, 2010 IP
  5. kolucoms6

    kolucoms6 Active Member

    #25
    I tried it:

    $item[2]=str_replace('</br>'. '</td><td>', $item[2]);

    and it shows me:

    [IMG]
     
    kolucoms6, Sep 7, 2010 IP
  6. MyVodaFone

    MyVodaFone Well-Known Member

    #26
    Think about it this way: for each <br /> tag, instead of starting a new line you want to close the current table cell and open a new one, so the next line goes into its own <td>.

    This is what you have already, correct?

    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>';

    Here the content of $item[2] is already wrapped in a <td> tag.

    So str_replace('<br />', '</td><td>', $item[2]) replaces each <br /> tag with a closing tag and an opening tag for what was the next line. Note that the arguments are separated by commas (your version has a dot after the first one) and that the tag to search for is <br />, not </br>.

    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>';
    Works for me...
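    If the markup ever uses <br> or <br/> instead of <br />, a plain str_replace will miss it. The sketch below is one possible way around that, not the code used in this thread: address_cells() is a made-up helper name, and the sample values are taken from the pasted output earlier in the discussion.

```php
<?php
// address_cells() is a hypothetical helper name used only for this sketch.
function address_cells($addressHtml)
{
    // Split on <br>, <br/> or <br /> (case-insensitive), dropping empty pieces.
    $parts = preg_split('#<br\s*/?>#i', $addressHtml, -1, PREG_SPLIT_NO_EMPTY);
    $cells = '';
    foreach ($parts as $part) {
        $part = trim($part);
        if ($part === '') {
            continue; // ignore pieces that were only whitespace
        }
        // Escape the text (without double-encoding existing entities)
        // so stray characters cannot break the table markup.
        $cells .= '<td>' . htmlspecialchars($part, ENT_QUOTES, 'UTF-8', false) . '</td>';
    }
    return $cells;
}

// Sample row, indexed like $item in the loop above.
$item = array(1 => 'Brett B', 2 => "61 Station Road<br />\nPlymouth<br>\nPL2 1NH", 3 => '01752 513159');
echo '<table border="1"><tr><td>' . $item[1] . '</td>'
    . address_cells($item[2])
    . '<td>' . $item[3] . '</td></tr></table>', "\n";
```

    Each address part ends up in its own cell regardless of which <br> spelling the page uses.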
     
    Last edited: Sep 7, 2010
    MyVodaFone, Sep 7, 2010 IP
  7. kolucoms6

    kolucoms6 Active Member

    #27
    Wow.. Silly of me !!

    :) You are a Perfect Mentor !!

    Can I put the lines below in a function so that I can call it?


    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.$item[2].'</td><td>'.$item[3].'</td></tr></table>';
    }
     
    kolucoms6, Sep 7, 2010 IP
  8. MyVodaFone

    MyVodaFone Well-Known Member

    #28
    Post your entire script as it is now and we will take a look.
     
    MyVodaFone, Sep 7, 2010 IP
  9. kolucoms6

    kolucoms6 Active Member

    #29
    
    
    <?php
    
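    // Read the submitted names; fall back to an empty string before the form is posted.
    $name1 = isset($_POST['name1']) ? trim($_POST['name1']) : '';
    $name2 = isset($_POST['name2']) ? trim($_POST['name2']) : '';
    $name3 = isset($_POST['name3']) ? trim($_POST['name3']) : '';
    $name4 = isset($_POST['name4']) ? trim($_POST['name4']) : '';
    $name5 = isset($_POST['name5']) ? trim($_POST['name5']) : '';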
    
    ?>
    
    <html>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
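    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"], ENT_QUOTES); ?>" method="post">
    
    
    
      Last Name 1 : <input type="text" name="name1" value="<?php echo isset($_POST['name1']) ? htmlspecialchars($_POST['name1'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 2 : <input type="text" name="name2" value="<?php echo isset($_POST['name2']) ? htmlspecialchars($_POST['name2'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 3 : <input type="text" name="name3" value="<?php echo isset($_POST['name3']) ? htmlspecialchars($_POST['name3'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 4 : <input type="text" name="name4" value="<?php echo isset($_POST['name4']) ? htmlspecialchars($_POST['name4'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 5 : <input type="text" name="name5" value="<?php echo isset($_POST['name5']) ? htmlspecialchars($_POST['name5'], ENT_QUOTES) : ''; ?>"><br>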
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label=England> 
    <option>Bath
    <option>Brighton and Hove
    <option>Canterbury
    <option>Chichester
    <option>Durham
    <option>Gloucester
    <option>Lancaster
    <option>Lichfield
    <option>City of London
    <option>Norwich
    <option>Peterborough
    <option>Preston
    <option>Salisbury
    <option>St Albans
    <option>Truro
    <option>Westminster
    <option>Worcester
    <option>Birmingham
    <option>Bristol
    <option>Carlisle
    <option>Coventry
    <option>Ely
    <option>Hereford
    <option>Leeds
    <option>Lincoln
    <option>Manchester
    <option>Nottingham
    <option>Plymouth
    <option>Ripon
    <option>Sheffield
    <option>Stoke-on-Trent
    <option>Wakefield
    <option>Winchester
    <option>York
    <option>Bradford
    <option>Cambridge
    <option>Chester
    <option>Derby
    <option>Exeter
    <option>Kingston upon Hull
    <option>Leicester
    <option>Liverpool
    <option>Newcastle upon Tyne
    <option>Oxford
    <option>Portsmouth
    <option>Salford
    <option>Southampton
    <option>Sunderland
    <option>Wells
    <option>Wolverhampton 
    </optgroup> 
    
    <optgroup label="Northern Ireland"> 
    <option>Armagh
    <option>Lisburn
    <option>Belfast
    <option>Newry
    <option>Londonderry
    </optgroup> 
    
    <optgroup label="Scotland"> 
    <option>Aberdeen
    <option>Glasgow
    <option>Dundee
    <option>Inverness
    <option>Edinburgh
    <option>Stirling
    </optgroup> 
    <optgroup label="Unitary Authorities of Wales"> 
    <option>Bangor
    <option>St Davids
    <option>Cardiff
    <option>Swansea
    <option>Newport
    </optgroup> 
    </select>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    
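    // URL-encode the city so names with spaces (e.g. "Brighton and Hove") are safe in the query string
    $location = isset($_POST['city']) ? urlencode($_POST['city']) : '';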
    
    if ( $name1<>"")
    { 
    $url1 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name1&Location=$location&pageSize=50&pageNumber=1");
    $url2 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name1&Location=$location&pageSize=50&pageNumber=2");
    $url3 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name1&Location=$location&pageSize=50&pageNumber=3");
    $url4 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name1&Location=$location&pageSize=50&pageNumber=4");
    }
    
    if ( $name2<>"")
    { 
    $url5 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name2&Location=$location&pageSize=50&pageNumber=1");
    $url6 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name2&Location=$location&pageSize=50&pageNumber=2");
    $url7 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name2&Location=$location&pageSize=50&pageNumber=3");
    $url8 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name2&Location=$location&pageSize=50&pageNumber=4");
    }
    
    if ( $name3<>"")
    { 
    $url9 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name3&Location=$location&pageSize=50&pageNumber=1");
    $url10 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name3&Location=$location&pageSize=50&pageNumber=2");
    $url11 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name3&Location=$location&pageSize=50&pageNumber=3");
    $url12 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name3&Location=$location&pageSize=50&pageNumber=4");
    }
    
    if ( $name4<>"")
    { 
    $url13 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name4&Location=$location&pageSize=50&pageNumber=1");
    $url14 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name4&Location=$location&pageSize=50&pageNumber=2");
    $url15 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name4&Location=$location&pageSize=50&pageNumber=3");
    $url16 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name4&Location=$location&pageSize=50&pageNumber=4");
    }
    
    if ( $name5<>"")
    { 
    $url17 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name5&Location=$location&pageSize=50&pageNumber=1");
    $url18 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name5&Location=$location&pageSize=50&pageNumber=2");
    $url19 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name5&Location=$location&pageSize=50&pageNumber=3");
    $url20 = get_data("http://www.118.com/people-search.mvc?Supplied=true&Name=$name5&Location=$location&pageSize=50&pageNumber=4");
    }
    
    
    if(isset($url))
    
    {
    $url = get_data($url);
    }
    
    function get_data($url)
    {
    $ch = curl_init();
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //Set curl to return the data instead of printing it to the browser.
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt ($ch, CURLOPT_REFERER, 'http://www.mse360.com/about/bot.php');
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
        curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
        $data = curl_exec($ch);
        curl_close($ch);
        return $data;
    }
    
    
    $string1 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url1, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    $string2 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url2, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string3 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url3, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string4 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url4, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    
    $string5 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url5, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    $string6 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url6, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string7 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url7, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string8 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url8, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    
    
    $string9 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url9, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    $string10 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url10, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string11 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url11, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string12 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url12, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    
    
    $string13 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url13, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    $string14 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url14, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string15 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url15, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string16 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url16, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    
    
    $string17 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url17, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    $string18 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url18, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string19 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url19, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    $string20 = preg_match_all('#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is', $url20, $matches, PREG_SET_ORDER);
    foreach ($matches as $item) {
    echo '<table border=1><tr><td>'.$item[1].'</td><td>'.str_replace('<br />', '</td><td>', $item[2]).'</td><td>'.$item[3].'</td></tr></table>
    ';
    }
    
    ?>
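    The twenty near-identical blocks above all do the same thing, so they could be folded into a function. The following is only a sketch under assumptions: extract_contacts() and print_contacts() are invented names, and $sample is a made-up fragment standing in for what get_data() would return. The pattern itself is copied from the script.

```php
<?php
// extract_contacts() and print_contacts() are hypothetical helper names.
function extract_contacts($html)
{
    // Same pattern as in the script above: name, address and phone per result.
    $pattern = '#<h2>([^"]+)</h2>.+?<div class="address">([^"]+)</div>.+?<div class="telephoneNumber">([^"]+)</div>#is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $contacts = array();
    foreach ($matches as $item) {
        $contacts[] = array(trim($item[1]), trim($item[2]), trim($item[3]));
    }
    return $contacts;
}

function print_contacts($html)
{
    foreach (extract_contacts($html) as $c) {
        echo '<table border=1><tr><td>' . $c[0] . '</td><td>'
            . str_replace('<br />', '</td><td>', $c[1])
            . '</td><td>' . $c[2] . '</td></tr></table>' . "\n";
    }
}

// Made-up page fragment for illustration only.
$sample = '<h2>Brett B</h2> <div class="address">61 Station Road<br />Plymouth<br />PL2 1NH</div> '
    . '<div class="telephoneNumber">01752 513159</div>';
print_contacts($sample);
```

    With this in place, each block in the script would shrink to a single call such as print_contacts($url1);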
    
    
     
    kolucoms6, Sep 7, 2010 IP
  10. vrktech

    vrktech Well-Known Member

    Messages:
    449
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    108
    #30
    Here is an easy method I recently found.

    Requires Simple HTML Dom (http://simplehtmldom.sourceforge.net/)

    Then use the code below:

    <?php
    include_once('simple_html_dom.php');
    $resultdata=array(); // rows collected from all pages
    for($i=1;$i<=4;$i++){
    	$myurl='http://www.118.com/people-search.mvc?Supplied=true&Name=john&Location=london&pageSize=50&pageNumber='.$i;
    	
    	$content='';
    	if (function_exists('curl_init')) {
    		$ch = curl_init();
    	   curl_setopt($ch, CURLOPT_URL, $myurl);
    	   curl_setopt($ch, CURLOPT_HEADER, 0);
    	   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    	   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
    	   $content = curl_exec($ch);
    	   curl_close($ch);
    	}
    	
    	// Skip pages that could not be fetched or parsed
    	$html=$content ? str_get_html($content) : false;
    	if(!$html) continue;
    	foreach($html->find('div[class=searchResult]') as $result){
    		$resultdata[]=array(
    		'name' => $result->find('h2',0)->innertext,
    		'address' => $result->find('div[class=address]',0)->innertext,
    		'phone' => $result->find('div[class=telephoneNumber]',0)->innertext
    		);
    	}
    }
    //YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    echo "<table><tr><th>Name</th><th>Address</th><th>Phone</th></tr>";
    foreach($resultdata as $contact){
    	// Array keys must be quoted strings
    	echo "<tr><td>".$contact['name']."</td><td>".$contact['address']."</td><td>".$contact['phone']."</td></tr>";
    }
    echo "</table>";
    ?>
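    For anyone who cannot install Simple HTML DOM, roughly the same extraction can be sketched with PHP's built-in DOM extension. This is a swapped-in technique, not the method above. parse_results() and text_lines() are invented names, the class names are the ones used in the code above, and $sample is a made-up fragment.

```php
<?php
// Collect the trimmed text nodes directly under $node, one entry per line
// (the <br /> elements between them carry no text and are skipped).
function text_lines($node)
{
    $lines = array();
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $lines[] = $text;
            }
        }
    }
    return $lines;
}

// parse_results() is a hypothetical name for this sketch.
function parse_results($html)
{
    if (trim($html) === '') {
        return array(); // nothing to parse
    }
    $doc = new DOMDocument();
    // Real-world pages are rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = array();
    foreach ($xpath->query('//div[@class="searchResult"]') as $result) {
        $name = $xpath->query('.//h2', $result)->item(0);
        $address = $xpath->query('.//div[@class="address"]', $result)->item(0);
        $phone = $xpath->query('.//div[@class="telephoneNumber"]', $result)->item(0);
        $rows[] = array(
            'name' => $name ? trim($name->textContent) : '',
            'address' => $address ? text_lines($address) : array(),
            'phone' => $phone ? trim($phone->textContent) : '',
        );
    }
    return $rows;
}

// Made-up fragment for illustration only.
$sample = '<div class="searchResult"><h2>Brett B</h2>'
    . '<div class="address">61 Station Road<br />Plymouth<br />PL2 1NH</div>'
    . '<div class="telephoneNumber">01752 513159</div></div>';
print_r(parse_results($sample));
```

    Here the address comes back as an array of lines, so no str_replace on <br /> is needed afterwards.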
     
    vrktech, Sep 7, 2010 IP
  11. kolucoms6

    kolucoms6 Active Member

    #31
    Here is my updated code:

    
    
    
    <?php
    
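    // Read the submitted names; fall back to an empty string before the form is posted.
    $name1 = isset($_POST['name1']) ? trim($_POST['name1']) : '';
    $name2 = isset($_POST['name2']) ? trim($_POST['name2']) : '';
    $name3 = isset($_POST['name3']) ? trim($_POST['name3']) : '';
    $name4 = isset($_POST['name4']) ? trim($_POST['name4']) : '';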
    
    ?>
    
    <html>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
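    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"], ENT_QUOTES); ?>" method="post">
    
    
    
      Last Name 1 : <input type="text" name="name1" value="<?php echo isset($_POST['name1']) ? htmlspecialchars($_POST['name1'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 2 : <input type="text" name="name2" value="<?php echo isset($_POST['name2']) ? htmlspecialchars($_POST['name2'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 3 : <input type="text" name="name3" value="<?php echo isset($_POST['name3']) ? htmlspecialchars($_POST['name3'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 4 : <input type="text" name="name4" value="<?php echo isset($_POST['name4']) ? htmlspecialchars($_POST['name4'], ENT_QUOTES) : ''; ?>"><br>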
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label=England> 
    <option>Bath
    <option>Brighton and Hove
    <option>Canterbury
    <option>Chichester
    <option>Durham
    <option>Gloucester
    <option>Lancaster
    <option>Lichfield
    <option>London
    <option>Norwich
    <option>Peterborough
    <option>Preston
    <option>Salisbury
    <option>St Albans
    <option>Truro
    <option>Westminster
    <option>Worcester
    <option>Birmingham
    <option>Bristol
    <option>Carlisle
    <option>Coventry
    <option>Ely
    <option>Hereford
    <option>Leeds
    <option>Lincoln
    <option>Manchester
    <option>Nottingham
    <option>Plymouth
    <option>Ripon
    <option>Sheffield
    <option>Stoke-on-Trent
    <option>Wakefield
    <option>Winchester
    <option>York
    <option>Bradford
    <option>Cambridge
    <option>Chester
    <option>Derby
    <option>Exeter
    <option>Kingston upon Hull
    <option>Leicester
    <option>Liverpool
    <option>Newcastle upon Tyne
    <option>Oxford
    <option>Portsmouth
    <option>Salford
    <option>Southampton
    <option>Sunderland
    <option>Wells
    <option>Wolverhampton 
    </optgroup> 
    
    <optgroup label="Northern Ireland"> 
    <option>Armagh
    <option>Lisburn
    <option>Belfast
    <option>Newry
    <option>Londonderry
    </optgroup> 
    
    <optgroup label="Scotland"> 
    <option>Aberdeen
    <option>Glasgow
    <option>Dundee
    <option>Inverness
    <option>Edinburgh
    <option>Stirling
    </optgroup> 
    <optgroup label="Unitary Authorities of Wales"> 
    <option>Bangor
    <option>St Davids
    <option>Cardiff
    <option>Swansea
    <option>Newport
    </optgroup> 
    </select>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    
    $location=isset($_POST['city']) ? $_POST['city'] : '';
    include_once('simple_html_dom.php');
    
    // A flat list of names ("=" rather than "[]=", which would nest the array)
    $namedata=array($name1,$name2,$name3,$name4);
    $resultdata=array();
    
    // One loop over the names is enough; the extra for($j...) loop never advanced $j
    foreach($namedata as $name){
    if($name=="") continue; // skip empty name fields
    
    for($i=1;$i<=4;$i++){
    	$myurl='http://www.118.com/people-search.mvc?Supplied=true&Name='.urlencode($name).'&Location='.urlencode($location).'&pageSize=50&pageNumber='.$i;
    	echo $myurl; // debug output
    	$content='';
    	if (function_exists('curl_init')) {
    		$ch = curl_init();
    	   curl_setopt($ch, CURLOPT_URL, $myurl);
    	   curl_setopt($ch, CURLOPT_HEADER, 0);
    	   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    	   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
    	   $content = curl_exec($ch);
    	   curl_close($ch);
    	}
    	
    	// Skip pages that could not be fetched or parsed
    	$html=$content ? str_get_html($content) : false;
    	if(!$html) continue;
    	foreach($html->find('div[class=searchResult]') as $result){
    		$resultdata[]=array(
    		'name' => $result->find('h2',0)->innertext,
    		'address' => $result->find('div[class=address]',0)->innertext,
    		'phone' => $result->find('div[class=telephoneNumber]',0)->innertext
    		);
    	}
    }
    }
    //YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    echo "<table border=1><tr><th>Name</th><th>Address</th><th>City</th><th>Post Code</th><th>Phone</th></tr>";
    foreach($resultdata as $contact){
    	echo "<tr><td>".$contact['name']."</td><td>". str_replace('<br />', '</td><td>', $contact['address']) ."</td><td>".$contact['phone']."</td></tr>";
    }
    echo "</table>";
    ?>
    
    
     
    Last edited: Sep 7, 2010
    kolucoms6, Sep 7, 2010 IP
  12. vrktech

    vrktech Well-Known Member

    #32
    I updated the code so that it automatically runs through as many pages as exist on the source site, and the search button now works as well :)

    Let me know if it helps you :)

    
    
    <?php
    
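    // Read the submitted names; fall back to an empty string before the form is posted.
    $name1 = isset($_POST['name1']) ? trim($_POST['name1']) : '';
    $name2 = isset($_POST['name2']) ? trim($_POST['name2']) : '';
    $name3 = isset($_POST['name3']) ? trim($_POST['name3']) : '';
    $name4 = isset($_POST['name4']) ? trim($_POST['name4']) : '';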
    
    ?>
    
    <html>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
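    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"], ENT_QUOTES); ?>" method="post">
    
    
    
      Last Name 1 : <input type="text" name="name1" value="<?php echo isset($_POST['name1']) ? htmlspecialchars($_POST['name1'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 2 : <input type="text" name="name2" value="<?php echo isset($_POST['name2']) ? htmlspecialchars($_POST['name2'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 3 : <input type="text" name="name3" value="<?php echo isset($_POST['name3']) ? htmlspecialchars($_POST['name3'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 4 : <input type="text" name="name4" value="<?php echo isset($_POST['name4']) ? htmlspecialchars($_POST['name4'], ENT_QUOTES) : ''; ?>"><br>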
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label=England> 
    <option>Bath
    <option>Brighton and Hove
    <option>Canterbury
    <option>Chichester
    <option>Durham
    <option>Gloucester
    <option>Lancaster
    <option>Lichfield
    <option>London
    <option>Norwich
    <option>Peterborough
    <option>Preston
    <option>Salisbury
    <option>St Albans
    <option>Truro
    <option>Westminster
    <option>Worcester
    <option>Birmingham
    <option>Bristol
    <option>Carlisle
    <option>Coventry
    <option>Ely
    <option>Hereford
    <option>Leeds
    <option>Lincoln
    <option>Manchester
    <option>Nottingham
    <option>Plymouth
    <option>Ripon
    <option>Sheffield
    <option>Stoke-on-Trent
    <option>Wakefield
    <option>Winchester
    <option>York
    <option>Bradford
    <option>Cambridge
    <option>Chester
    <option>Derby
    <option>Exeter
    <option>Kingston upon Hull
    <option>Leicester
    <option>Liverpool
    <option>Newcastle upon Tyne
    <option>Oxford
    <option>Portsmouth
    <option>Salford
    <option>Southampton
    <option>Sunderland
    <option>Wells
    <option>Wolverhampton 
    </optgroup> 
    
    <optgroup label="Northern Ireland"> 
    <option>Armagh
    <option>Lisburn
    <option>Belfast
    <option>Newry
    <option>Londonderry
    </optgroup> 
    
    <optgroup label="Scotland"> 
    <option>Aberdeen
    <option>Glasgow
    <option>Dundee
    <option>Inverness
    <option>Edinburgh
    <option>Stirling
    </optgroup> 
    <optgroup label="Unitary Authorities of Wales"> 
    <option>Bangor
    <option>St Davids
    <option>Cardiff
    <option>Swansea
    <option>Newport
    </optgroup> 
    </select>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    
    if(isset($_POST['submit']) && $_POST['submit']=="submit"){
    
    	$location=isset($_POST['city']) ? $_POST['city'] : '';
    	include_once('simple_html_dom.php');
    	
    	// A flat list of names ("=" rather than "[]=", which would nest the array)
    	$namedata=array($name1,$name2,$name3,$name4);
    	$resultdata=array();
    	
    	foreach($namedata as $name){
    		if($name=="") continue; // skip empty name fields
    	
    		$page_exist=true;
    		$i=0;
    		
    		// Keep requesting pages for this name until one comes back without results
    		while($page_exist){
    			$i+=1;
    			// $i is the page number only; the name comes from the foreach above
    			$myurl='http://www.118.com/people-search.mvc?Supplied=true&Name='.urlencode($name).'&Location='.urlencode($location).'&pageSize=50&pageNumber='.$i;
    			$content='';
    			if (function_exists('curl_init')) {
    				$ch = curl_init();
    			   curl_setopt($ch, CURLOPT_URL, $myurl);
    			   curl_setopt($ch, CURLOPT_HEADER, 0);
    			   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    			   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
    			   $content = curl_exec($ch);
    			   curl_close($ch);
    			}
    			
    			// Stop when the page could not be fetched, parsed, or has no results
    			$html=$content ? str_get_html($content) : false;
    			if($html && $html->find('div[class=searchResult]',0)){
    				foreach($html->find('div[class=searchResult]') as $result){
    					$resultdata[]=array(
    					'name' => $result->find('h2',0)->innertext,
    					'address' => $result->find('div[class=address]',0)->innertext,
    					'phone' => $result->find('div[class=telephoneNumber]',0)->innertext
    					);
    				}
    			}else $page_exist=false;
    		}
    	}
    	//YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    	echo "<table border=1><tr><th>Name</th><th>Address</th><th>City</th><th>Post Code</th><th>Phone</th></tr>";
    	foreach($resultdata as $contact){
    		echo "<tr><td>".$contact['name']."</td><td>". str_replace('<br />', '</td><td>', $contact['address']) ."</td><td>".$contact['phone']."</td></tr>";
    	}
    	echo "</table>";
    
    }
    ?>
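    The "keep going until a page has no results" idea can be shown on its own, without any network access. This is only a sketch: collect_pages(), $fetchPage and the cap of 50 pages are assumptions introduced here (the cap is an arbitrary safety limit so a misbehaving site cannot cause an endless loop), and the stub data is made up.

```php
<?php
// collect_pages() is a hypothetical helper. $fetchPage($name, $page) is assumed
// to return the rows found on that page, or an empty array when there are none.
function collect_pages(array $names, $fetchPage, $maxPages = 50)
{
    $all = array();
    foreach ($names as $name) {
        if ($name === '') {
            continue; // skip blank form fields
        }
        for ($page = 1; $page <= $maxPages; $page++) {
            $rows = call_user_func($fetchPage, $name, $page);
            if (empty($rows)) {
                break; // no results: this name is finished
            }
            $all = array_merge($all, $rows);
        }
    }
    return $all;
}

// Stub standing in for the cURL + parsing step: "smith" has two pages, "jones" one.
$fake = array(
    'smith' => array(1 => array('a', 'b'), 2 => array('c')),
    'jones' => array(1 => array('d')),
);
$fetch = function ($name, $page) use ($fake) {
    return isset($fake[$name][$page]) ? $fake[$name][$page] : array();
};

print_r(collect_pages(array('smith', '', 'jones'), $fetch));
```

    In the real script, the callback would build the URL for the given name and page, fetch it, and return the parsed results.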
    
    
     
    vrktech, Sep 7, 2010 IP
  13. kolucoms6

    kolucoms6 Active Member

    #33
    Works like a charm.

    How about adding a facility to export to CSV or Excel?
     
    kolucoms6, Sep 7, 2010 IP
  14. vrktech

    vrktech Well-Known Member

    #34
    This one will create testFile.csv:

    
    <?php
    
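    // Read the submitted names; fall back to an empty string before the form is posted.
    $name1 = isset($_POST['name1']) ? trim($_POST['name1']) : '';
    $name2 = isset($_POST['name2']) ? trim($_POST['name2']) : '';
    $name3 = isset($_POST['name3']) ? trim($_POST['name3']) : '';
    $name4 = isset($_POST['name4']) ? trim($_POST['name4']) : '';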
    
    ?>
    
    <html>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
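    <form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"], ENT_QUOTES); ?>" method="post">
    
    
    
      Last Name 1 : <input type="text" name="name1" value="<?php echo isset($_POST['name1']) ? htmlspecialchars($_POST['name1'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 2 : <input type="text" name="name2" value="<?php echo isset($_POST['name2']) ? htmlspecialchars($_POST['name2'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 3 : <input type="text" name="name3" value="<?php echo isset($_POST['name3']) ? htmlspecialchars($_POST['name3'], ENT_QUOTES) : ''; ?>"><br>
      Last Name 4 : <input type="text" name="name4" value="<?php echo isset($_POST['name4']) ? htmlspecialchars($_POST['name4'], ENT_QUOTES) : ''; ?>"><br>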
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label=England> 
    <option>Bath
    <option>Brighton and Hove
    <option>Canterbury
    <option>Chichester
    <option>Durham
    <option>Gloucester
    <option>Lancaster
    <option>Lichfield
    <option>London
    <option>Norwich
    <option>Peterborough
    <option>Preston
    <option>Salisbury
    <option>St Albans
    <option>Truro
    <option>Westminster
    <option>Worcester
    <option>Birmingham
    <option>Bristol
    <option>Carlisle
    <option>Coventry
    <option>Ely
    <option>Hereford
    <option>Leeds
    <option>Lincoln
    <option>Manchester
    <option>Nottingham
    <option>Plymouth
    <option>Ripon
    <option>Sheffield
    <option>Stoke-on-Trent
    <option>Wakefield
    <option>Winchester
    <option>York
    <option>Bradford
    <option>Cambridge
    <option>Chester
    <option>Derby
    <option>Exeter
    <option>Kingston upon Hull
    <option>Leicester
    <option>Liverpool
    <option>Newcastle upon Tyne
    <option>Oxford
    <option>Portsmouth
    <option>Salford
    <option>Southampton
    <option>Sunderland
    <option>Wells
    <option>Wolverhampton 
    </optgroup> 
    
    <optgroup label="Northern Ireland"> 
    <option>Armagh
    <option>Lisburn
    <option>Belfast
    <option>Newry
    <option>Londonderry
    </optgroup> 
    
    <optgroup label="Scotland"> 
    <option>Aberdeen
    <option>Glasgow
    <option>Dundee
    <option>Inverness
    <option>Edinburgh
    <option>Stirling
    </optgroup> 
    <optgroup label="Unitary Authorities of Wales"> 
    <option>Bangor
    <option>St Davids
    <option>Cardiff
    <option>Swansea
    <option>Newport
    </optgroup> 
    </select>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    
    if($_POST['submit']=="submit"){
    
    	$location=$_POST['city'];
    	include_once('simple_html_dom.php');
    	
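    	// One entry per surname; each one is searched page by page
    	$namedata=array($name1,$name2,$name3,$name4);
    	$resultdata=array();
    	
    	foreach($namedata as $name){
    		// Skip form fields that were left blank
    		if(trim($name)=="") continue;
    	
    		$page_exist=true;
    		$i=0;
    		
    		while($page_exist){
    			$i+=1; // $i is the page number only, not an index into the names
    			$myurl='http://www.118.com/people-search.mvc?Supplied=true&Name='.urlencode($name).'&Location='.urlencode($location).'&pageSize=50&pageNumber='.$i;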
    			if (function_exists('curl_init')) {
    				$ch = curl_init();
    			   curl_setopt($ch, CURLOPT_URL, $myurl);
    			   curl_setopt($ch, CURLOPT_HEADER, 0);
    			   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    			   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
    			   $content = curl_exec($ch);
    			   curl_close($ch);
    			}
    			
    			$html=str_get_html($content);
    			// str_get_html() returns false on empty input; find() returns null when nothing matches
    			if($html && $html->find('div[class=searchResult]',0)){
    				foreach($html->find('div[class=searchResult]') as $result){
    					$resultdata[]=array(
    					'name' => $result->find('h2',0)->innertext,
    					'address' => $result->find('div[class=address]',0)->innertext,
    					'phone' => $result->find('div[class=telephoneNumber]',0)->innertext
    					);
    				}
    			}else $page_exist=false;
    		}
    	}
    	//YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    	$mycsvFile = "testFile.csv";
    	$fh = fopen($mycsvFile, 'w') or die("can't open file");
    	
    
    	echo "<table border=1><tr><th>Name</th><th>Address</th><th>City</th><th>Post Code</th><th>Phone</th></tr>";
    	foreach($resultdata as $contact){
    		// Split the address into street, city and post code; pad so missing parts become empty strings
    		$address=array_pad(array_map('trim',explode("<br />",$contact['address'])),3,'');
    		// fputcsv() handles the quoting and escaping of each field
    		fputcsv($fh, array($contact['name'],$address[0],$address[1],$address[2],trim($contact['phone'])));
    		echo "<tr><td>".$contact['name']."</td><td>". str_replace('<br />', '</td><td>', $contact['address']) ."</td><td>".$contact['phone']."</td></tr>";
    	}
    	echo "</table>";
    	
    	fclose($fh);
    }
    ?>
    
    Code (markup):
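A minimal sketch of the address handling used when writing the CSV, assuming the scraped address arrives as `<br />`-separated street, city and post code. `splitAddress()` is a hypothetical helper (it is not part of simple_html_dom), and the sample row is the one quoted earlier in this thread:

```php
<?php
// Hypothetical helper: split "street<br />city<br />post code" into exactly
// three trimmed parts, padding with empty strings if something is missing.
function splitAddress($address)
{
    $parts = array_map('trim', explode('<br />', $address));
    return array_slice(array_pad($parts, 3, ''), 0, 3);
}

// Sample values taken from the thread; a real run would loop over $resultdata.
$contact = array(
    'name'    => 'Brett B',
    'address' => "61 Station Road<br />\r\nPlymouth<br />\r\nPL2 1NH",
    'phone'   => ' 01752 513159 ',
);

list($street, $city, $postCode) = splitAddress($contact['address']);

// fputcsv() takes care of quoting, so a comma inside a field cannot break the row.
// The delimiter, enclosure and escape arguments are passed explicitly.
$out = fopen('php://memory', 'w+');
fputcsv($out, array($contact['name'], $street, $city, $postCode, trim($contact['phone'])), ',', '"', '\\');
rewind($out);
$csvLine = stream_get_contents($out);
fclose($out);

echo $csvLine;
```

Padding to three parts means a listing without a post code still produces a row with the right number of columns.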
     
    vrktech, Sep 7, 2010 IP
  15. kolucoms6

    #35
    $mycsvFile = "testFile.csv";
    $fh = fopen($mycsvFile, 'w') or die("can't open file");

    I always get "can't open file".
     
    kolucoms6, Sep 7, 2010 IP
  16. vrktech

    #36
    It's working fine for me. You need to check the file permissions; try chmod to 777.

    You can also try a JavaScript function to generate the CSV file at the click of a button. (Sorry, I am not yet good with JavaScript.)

    This might help you in that case: http://codingforums.com/showpost.php?p=783184&postcount=5
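A rough sketch of how the permission problem could be narrowed down before reaching for chmod, assuming the script writes next to itself or into some other directory the web server user may not own. `openCsvForWriting()` is a hypothetical helper name, and the temp directory is used only so the example has somewhere writable to run:

```php
<?php
// Hypothetical helper: open a CSV file for writing and say why it failed,
// instead of dying with a generic "can't open file".
function openCsvForWriting($path)
{
    $dir = dirname($path);
    if (!is_dir($dir)) {
        throw new RuntimeException("Directory does not exist: $dir");
    }
    if (!is_writable($dir)) {
        // This is the case that changing the directory permissions is meant to fix.
        throw new RuntimeException("Web server user cannot write to: $dir");
    }
    $fh = @fopen($path, 'w');
    if ($fh === false) {
        $err = error_get_last();
        throw new RuntimeException("Cannot open $path: " . ($err ? $err['message'] : 'unknown error'));
    }
    return $fh;
}

// Usage: the temp directory is normally writable, so it is a safe place to test.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'testFile.csv';
try {
    $fh = openCsvForWriting($path);
    fwrite($fh, "Name,Address,City,Post Code,Phone\n");
    fclose($fh);
    echo "Wrote $path\n";
} catch (RuntimeException $e) {
    echo $e->getMessage() . "\n";
}
```

If the message points at the directory, only that directory needs to be writable by the web server user.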
     
    vrktech, Sep 7, 2010 IP
  17. kolucoms6

    #37
    Here is my modified code:

    
    <?php
    
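    // The fields are not set on the first page load, so fall back to empty strings
    $name1= isset($_POST['name1']) ? $_POST['name1'] : '';
    $name2= isset($_POST['name2']) ? $_POST['name2'] : '';
    $name3= isset($_POST['name3']) ? $_POST['name3'] : '';
    $name4= isset($_POST['name4']) ? $_POST['name4'] : '';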
    
    ?>
    
    <html>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
    <form action="<?php echo $_SERVER["PHP_SELF"]; ?>" method="post">
    
    <script type="text/javascript">
    
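    // Note: ActiveXObject only exists in Internet Explorer, and only when ActiveX is allowed
    function CreateExcelSheet()
    {
    var x = document.getElementById("myTable").rows;
    
    var xls = new ActiveXObject("Excel.Application");
    xls.Visible = true;
    xls.Workbooks.Add(); // the parentheses are needed to actually call Add
    for (var i = 0; i < x.length; i++)
    {
    var y = x[i].cells;
    
    for (var j = 0; j < y.length; j++)
    {
    xls.Cells(i+1, j+1).Value = y[j].innerText;
    }
    }
    }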
    </script>
    
    
    Last Name 1 : <input type="text" name="name1" value="<?php echo htmlspecialchars($name1); ?>"><br>
    Last Name 2 : <input type="text" name="name2" value="<?php echo htmlspecialchars($name2); ?>"><br>
    Last Name 3 : <input type="text" name="name3" value="<?php echo htmlspecialchars($name3); ?>"><br>
    Last Name 4 : <input type="text" name="name4" value="<?php echo htmlspecialchars($name4); ?>"><br>
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label=England> 
    <option>Bath
    <option>Brighton and Hove
    <option>Canterbury
    <option>Chichester
    <option>Durham
    <option>Gloucester
    <option>Lancaster
    <option>Lichfield
    <option>London
    <option>Norwich
    <option>Peterborough
    <option>Preston
    <option>Salisbury
    <option>St Albans
    <option>Truro
    <option>Westminster
    <option>Worcester
    <option>Birmingham
    <option>Bristol
    <option>Carlisle
    <option>Coventry
    <option>Ely
    <option>Hereford
    <option>Leeds
    <option>Lincoln
    <option>Manchester
    <option>Nottingham
    <option>Plymouth
    <option>Ripon
    <option>Sheffield
    <option>Stoke-on-Trent
    <option>Wakefield
    <option>Winchester
    <option>York
    <option>Bradford
    <option>Cambridge
    <option>Chester
    <option>Derby
    <option>Exeter
    <option>Kingston upon Hull
    <option>Leicester
    <option>Liverpool
    <option>Newcastle upon Tyne
    <option>Oxford
    <option>Portsmouth
    <option>Salford
    <option>Southampton
    <option>Sunderland
    <option>Wells
    <option>Wolverhampton 
    </optgroup> 
    
    <optgroup label="Northern Ireland"> 
    <option>Armagh
    <option>Lisburn
    <option>Belfast
    <option>Newry
    <option>Londonderry
    </optgroup> 
    
    <optgroup label="Scotland"> 
    <option>Aberdeen
    <option>Glasgow
    <option>Dundee
    <option>Inverness
    <option>Edinburgh
    <option>Stirling
    </optgroup> 
    <optgroup label="Unitary Authorities of Wales"> 
    <option>Bangor
    <option>St Davids
    <option>Cardiff
    <option>Swansea
    <option>Newport
    </optgroup> 
    </select></span>
    
    <input type="submit" value="submit" name="submit">
    </form>
    </html>
    
    <?php
    
    if(isset($_POST['submit']) && $_POST['submit']=="submit"){
    
    $location=$_POST['city'];
    include_once('simple_html_dom.php');
    
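    // One entry per surname; each one is searched page by page
    $namedata=array($name1,$name2,$name3,$name4);
    $resultdata=array();
    
    foreach($namedata as $name){
    // Skip form fields that were left blank
    if(trim($name)=="") continue;
    
    $page_exist=true;
    $i=0;
    
    while($page_exist){
    $i+=1; // $i is the page number only, not an index into the names
    $myurl='http://www.118.com/people-search.mvc?Supplied=true&Name='.urlencode($name).'&Location='.urlencode($location).'&pageSize=50&pageNumber='.$i;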
    if (function_exists('curl_init')) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $myurl);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
    $content = curl_exec($ch);
    curl_close($ch);
    }
    
    $html=str_get_html($content);
    // str_get_html() returns false on empty input; find() returns null when nothing matches
    if($html && $html->find('div[class=searchResult]',0)){
    foreach($html->find('div[class=searchResult]') as $result){
    $resultdata[]=array(
    'name' => $result->find('h2',0)->innertext,
    'address' => $result->find('div[class=address]',0)->innertext,
    'phone' => $result->find('div[class=telephoneNumber]',0)->innertext
    );
    }
    }else $page_exist=false;
    }
    }
    //YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    
    //$mycsvFile = "testFile.csv";
    //$fh = fopen($mycsvFile, 'w') or die("can't open file");
    
    ?>
    
    <form><input type="button" onclick="CreateExcelSheet()" value="CreateExcelSheet"></form>
    
    <?php
    echo "<table border=1 id=myTable><tr><th>Name</th><th>Address</th><th>City</th><th>Post Code</th><th>Phone</th></tr>";
    foreach($resultdata as $contact){
    echo "<tr><td>".$contact['name']."</td><td>". str_replace('<br />', '</td><td>', $contact['address']) ."</td><td>".$contact['phone']."</td></tr>";
    }
    echo "</table>";
    
    }
    ?>
    
    Code (markup):
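A minimal sketch of the loop structure on its own: one pass per name, one request per page, stopping when a page comes back empty. `collectResults()` and `$fetchPage` are hypothetical names; `$fetchPage` stands in for the cURL and simple_html_dom step, and the surnames and rows below are made-up sample data. The `$maxPages` cap is an added safety assumption so a page that never comes back empty cannot loop forever:

```php
<?php
// Collect results for every name, page by page, until a page comes back empty.
// $fetchPage is a hypothetical callable: function ($name, $page) returning an
// array of result rows for that page (empty when there are no more results).
function collectResults(array $names, $fetchPage, $maxPages = 20)
{
    $results = array();
    foreach ($names as $name) {
        if (trim($name) === '') {
            continue; // skip form fields that were left blank
        }
        for ($page = 1; $page <= $maxPages; $page++) {
            $rows = call_user_func($fetchPage, $name, $page);
            if (empty($rows)) {
                break; // no more pages for this name
            }
            $results = array_merge($results, $rows);
        }
    }
    return $results;
}

// Stand-in for the real fetch: two pages for "Smith", nothing for anyone else.
$fakeFetch = function ($name, $page) {
    $pages = array('Smith' => array(1 => array('Smith A', 'Smith B'), 2 => array('Smith C')));
    return isset($pages[$name][$page]) ? $pages[$name][$page] : array();
};

$found = collectResults(array('Smith', '', 'Jones'), $fakeFetch);
print_r($found);
```

Keeping the name and the page number as separate loop variables makes sure every page request is made for the same surname.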
     
    kolucoms6, Sep 7, 2010 IP
  18. vrktech

    #38
    Hi, does your modified code work?

    I can't get it to work. There might be some problem with the JavaScript, but I am not able to find it, sorry.
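One possible way around the JavaScript, sketched here as an assumption and not as something tested against the script above, is to let PHP send the CSV straight to the browser as a download. This swaps the ActiveX approach for plain HTTP headers, so nothing is written on the server. `rowsToCsv()` is a hypothetical helper and the row is the sample from earlier in the thread:

```php
<?php
// Hypothetical helper: turn a header row and data rows into CSV text in memory.
function rowsToCsv(array $header, array $rows)
{
    $buffer = fopen('php://temp', 'w+');
    fputcsv($buffer, $header, ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($buffer, $row, ',', '"', '\\');
    }
    rewind($buffer);
    $csv = stream_get_contents($buffer);
    fclose($buffer);
    return $csv;
}

$csv = rowsToCsv(
    array('Name', 'Address', 'City', 'Post Code', 'Phone'),
    array(array('Brett B', '61 Station Road', 'Plymouth', 'PL2 1NH', '01752 513159'))
);

// In a real page these headers must be sent before any HTML output.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/csv; charset=utf-8');
    header('Content-Disposition: attachment; filename="testFile.csv"');
}
echo $csv;
```

Excel opens the downloaded file directly, and this works in any browser because no ActiveX is involved.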
     
    vrktech, Sep 7, 2010 IP
  19. kolucoms6

    #39
    :) I tried it too and it didn't work.

    I'm working on correcting it.
     
    kolucoms6, Sep 7, 2010 IP
  20. kolucoms6

    #40
    kolucoms6, Sep 7, 2010 IP