Extract Data from Web Page

Discussion in 'PHP' started by kolucoms6, Sep 5, 2010.

  1. kolucoms6

    kolucoms6 Active Member

    Messages:
    1,198
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    75
    #61
    kolucoms6, Sep 11, 2010 IP
  2. vrktech

    vrktech Active Member

    Messages:
    449
    Likes Received:
    5
    Best Answers:
    0
    Trophy Points:
    58
    #62
    It looks like you are hitting a PHP timeout. Also, the whitepages site you were using keeps returning the same results for page numbers past the last record. For example,

    http://whitepages.com.au/wp/resSearch.do?subscriberName=williams&location=Sydney&page=1 has only 12 results, and when you go to http://whitepages.com.au/wp/resSearch.do?subscriberName=williams&location=Sydney&page=2 it shows the same results.

    So the script runs endlessly.

    To avoid that, you need to read the number of results displayed and use it in your while loop, something like:
    <?php
    
    ini_set('max_execution_time', 200);
    
    // Collect the submitted last names, skipping empty or missing fields
    $name = array();
    foreach (array('name1', 'name2', 'name3', 'name4', 'name5') as $field) {
        if (!empty($_POST[$field])) $name[] = $_POST[$field];
    }
    ?>
    
    <html>
    <head>
    
    <link rel=stylesheet type="text/css" href="CSS/default0.css">
    
    </head>
    
    
    <body>
    <form action="<?php echo $_SERVER["PHP_SELF"]; ?>" method="post">
    
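    Last Name 1 : <input type=text name=name1 value="<?php echo htmlspecialchars(isset($_POST['name1']) ? $_POST['name1'] : ''); ?>"><br>
    Last Name 2 : <input type=text name=name2 value="<?php echo htmlspecialchars(isset($_POST['name2']) ? $_POST['name2'] : ''); ?>"><br>
    Last Name 3 : <input type=text name=name3 value="<?php echo htmlspecialchars(isset($_POST['name3']) ? $_POST['name3'] : ''); ?>"><br>
    Last Name 4 : <input type=text name=name4 value="<?php echo htmlspecialchars(isset($_POST['name4']) ? $_POST['name4'] : ''); ?>"><br>
    Last Name 5 : <input type=text name=name5 value="<?php echo htmlspecialchars(isset($_POST['name5']) ? $_POST['name5'] : ''); ?>"><br>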
    
    <label for="city">City : </label><span class="field"><select id="city" name="city"><option value=""></option> 
    <optgroup label="New South Wales">
    <option>Sydney
    <option>Newcastle
    <option>Central Coast
    <option>Wollongong
    <option>Maitland
    <option>Wagga Wagga
    <option>Port Macquarie
    <option>Tamworth
    <option>Orange
    <option>Dubbo
    <option>Bathurst
    <option>Nowra-Bomaderry
    <option>Lismore
    <option>Coffs Harbour
    <option>Richmond-Windsor
    <option>Albury-Wodonga
    <option>Darwin
    <option>Palmerston 
    </optgroup> 
    
    <optgroup label=" Queensland"> 
    <option>Brisbane
    <option>Sunshine Coast
    <option>Townsville-Thuringowa
    <option>Cairns
    <option>Toowoomba
    <option>Mackay
    <option>Rockhampton
    <option>Bundaberg
    <option>Hervey Bay
    <option>Gladstone
    <option>Gold Coast-Tweed Heads
    </optgroup> 
    
    <optgroup label="South Australia"> 
    <option>Adelaide
    <option>Mount Gambier
    </optgroup> 
    
    <optgroup label="Tasmania"> 
    <option>Hobart
    <option>Launceston
    </optgroup> 
    
    <optgroup label="Victoria"> 
    <option>Melbourne
    <option>Geelong
    <option>Ballarat
    <option>Bendigo
    <option>Shepparton-Mooroopna
    <option>Melton
    <option>Mildura
    <option>Sunbury
    <option>Warrnambool
    </optgroup>
    
    <optgroup label="Western Australia"> 
    <option>Perth
    <option>Mandurah
    <option>Rockingham
    <option>Bunbury
    <option>Kalgoorlie-Boulder
    <option>Geraldton
    <option>Albany
    </optgroup>
    
    
    </select>
    
    <input type="submit" value="submit" name="submit">
    </form>
    
    <?php
    
    if($_POST['submit']=="submit"){
    if(!$_POST['city']) die('No City selected');
    $location=$_POST['city'];
    include_once('simple_html_dom.php');
    
    
    for($loop=0; $loop<count($name); $loop++){
    
        $i=0;
        $result_page_count=0;
        // Always fetch page 1, then continue only while pages remain
        while($i==0 || $i<$result_page_count){
            $i+=1;
        
            
            
            $myurl='http://whitepages.com.au/wp/resSearch.do?subscriberName='.urlencode($name[$loop]).'&location='.urlencode($location).'&page='.$i;
            
            //$myurl='http://www.118.com/people-search.mvc?Supplied=true&Name='.$name[$loop].'&Location='.$location.'&pageSize=50&pageNumber='.$i;
            echo $myurl."<br>";
            
            if (function_exists('curl_init')) {
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $myurl);
            curl_setopt($ch, CURLOPT_HEADER, 0);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
            $content = curl_exec($ch);
            curl_close($ch);
            }
            
            $html=str_get_html($content);
            
            if(!$result_page_count){
                // Read the total once; guard against the span being missing from the page
                $count_node=$html->find('span[id=listingResults] span[class=importantInformation]',0);
                $result_count=$count_node ? (int)$count_node->innertext : 0;
                // 20 results per page, rounded up
                $result_page_count=(int)ceil($result_count/20);
            }
            //die();
            
            $html=$html->find('div[id=entries]',0);
            foreach($html->find('div[class=entry]') as $result){
                
                $resultdata[]=array(
                'name' => $result->find('h2[class=nameDetails]',0)->innertext,
                'streetLine' => $result->find('span[class=streetLine]',0)->innertext,
                'locality' => $result->find('span[class=locality]',0)->innertext,
                'state' => $result->find('span[class=state]',0)->innertext,
                'postcode' => $result->find('span[class=postcode]',0)->innertext,
                'phone' => $result->find('span[class=phoneNoText]',0)->innertext
                );
                
            }
            
        }
    }
    
    //YOU CAN NOW DO WHATEVER YOU NEED WITH THE RESULT ARRAY
    $stringData='';
    foreach($resultdata as $contact){
    echo $contact['name'];
    // Keys are quoted and match the array built above ('streetLine', not 'streetline')
    $stringData .= $contact['name'].",".$contact['streetLine'].",".$contact['locality'].",".$contact['state'].",".$contact['postcode'].",".$contact['phone']."\n";
    }
    }
    ?>
    
    <textarea rows=50 cols=100 readonly><?php echo htmlspecialchars($stringData); ?></textarea>
    
    </body></html>
    Code (markup):
    Sorry, I have not tested it; I don't have time right now.
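
    The page-count step described in this post can be looked at on its own. Below is a minimal sketch, assuming the site shows 20 results per page as the script does; the function names page_count and pages_to_fetch are hypothetical helpers, not part of simple_html_dom.

```php
<?php
// Hypothetical helper: number of result pages for a given total,
// assuming a fixed number of results per page (20 in the script above).
function page_count($resultCount, $perPage = 20)
{
    if ($resultCount <= 0) {
        return 0;
    }
    return (int) ceil($resultCount / $perPage);
}

// Hypothetical helper: the page numbers worth requesting, starting at 1.
// An empty array means there is nothing to fetch.
function pages_to_fetch($resultCount, $perPage = 20)
{
    $pages = page_count($resultCount, $perPage);
    return $pages > 0 ? range(1, $pages) : array();
}

// 12 results fit on a single page, so only page 1 is requested.
print_r(pages_to_fetch(12));
```

    Looping over a fixed list of page numbers means the script stops even when the site keeps answering requests for pages past the end.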
     
    vrktech, Sep 12, 2010 IP
  3. Eager2Seo

    Eager2Seo Member

    Messages:
    72
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    48
    #63
    nevermind, wrong post
     
    Last edited: Sep 12, 2010
    Eager2Seo, Sep 12, 2010 IP
  4. kolucoms6

    #64
    I tried the script below; it returns a blank page.

     
    kolucoms6, Nov 28, 2010 IP
  5. kolucoms6

    #65
    I am getting

    When I did :

     
    kolucoms6, Dec 1, 2010 IP
  6. kolucoms6

    #66
    Any help?
     
    kolucoms6, Dec 28, 2010 IP
  7. dsdf

    dsdf Peon

    Messages:
    35
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #67
    It looks like your hosting doesn't have allow_url_fopen enabled. Try curl instead, and read the FAQ in the simplehtmldom manual.
     
    dsdf, Dec 30, 2010 IP
  8. kolucoms6

    #68
    Even when I try the same on my local server, it gives the same error.
     
    kolucoms6, Jan 1, 2011 IP
  9. vrktech

    #69
    Hi, sorry, I was away for a long time. Have you checked whether curl is working?

    try
    before
    and check if it returns the correct html source
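
    A minimal sketch of that kind of check, assuming the fetched page is held in a string such as $content in the script earlier in the thread. The helper name describe_fetch is hypothetical, not part of simple_html_dom or curl; it only reports whether the string looks like an HTML page before it is handed to str_get_html().

```php
<?php
// Hypothetical helper (not part of simple_html_dom or curl): summarise
// what a fetch returned before it is passed on to the HTML parser.
function describe_fetch($content)
{
    if ($content === false || $content === null) {
        return 'fetch failed: no content returned';
    }
    if (trim($content) === '') {
        return 'fetch returned an empty body';
    }
    if (stripos($content, '<html') === false) {
        return 'body does not look like an HTML page';
    }
    return 'HTML received, ' . strlen($content) . ' bytes';
}

// Example with a literal string standing in for the curl response.
echo describe_fetch('<html><body>ok</body></html>'), "\n";
```

    In the real script, curl_exec() returns false on failure, and curl_error($ch) gives the reason as long as it is called before curl_close($ch).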
     
    vrktech, Jan 27, 2011 IP
  10. dsdf

    #70
    Even though you tested it on your local server, it will still fail if curl is not enabled.

    Use phpinfo() to check whether the curl library is already loaded. If not, open php.ini and change
    ;extension=php_curl.dll
    to
    extension=php_curl.dll
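
    As an alternative to reading through the phpinfo() output, the same two settings can be checked from a short script. This is only a sketch; the function name transport_report is a hypothetical helper, while extension_loaded(), function_exists(), ini_get() and filter_var() are standard PHP functions.

```php
<?php
// Hypothetical helper: report which ways of fetching a URL
// this PHP installation currently allows.
function transport_report()
{
    return array(
        // true when the curl extension is loaded and usable
        'curl' => extension_loaded('curl') && function_exists('curl_init'),
        // true when fopen()/file_get_contents() may open URLs
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
    );
}

foreach (transport_report() as $feature => $enabled) {
    echo $feature, ': ', $enabled ? 'enabled' : 'disabled', "\n";
}
```

    If curl shows as disabled after editing php.ini, the web server usually needs a restart before the change takes effect.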
     
    dsdf, Jan 30, 2011 IP
  11. kolucoms6

    #71
    Is there anyone here who knows the simplehtmldom object well?

    I need some help.
     
    kolucoms6, Jun 11, 2011 IP