Pull data from other website

Discussion in 'PHP' started by kichus, Aug 25, 2007.

  1. #1
    Hi there,

    How can I pull data from other websites and store it in my database? I need the data for a reporting process.

    Thanks
     
    kichus, Aug 25, 2007 IP
  2. Coder

    Coder Banned

    #2
    What type of data do you need to pull?
     
    Coder, Aug 25, 2007 IP
  3. Kuldeep1952

    Kuldeep1952 Active Member

    #3
    In order to get data from other websites in PHP, you can use cURL. You can find more info at http://au.php.net/curl.
     
    Kuldeep1952, Aug 25, 2007 IP
  4. greenrob

    greenrob Peon

    #4
    Or you can use the PHP file() function and then post-process the data.
     
    greenrob, Aug 26, 2007 IP
  5. Andy Peters

    Andy Peters Peon

    #5
    But don't bother, use curl because it's faster and easier.

    $url = "http://anything";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
    $data = curl_exec($ch); // false on failure
    curl_close($ch);
    // you can do something with $data, like explode() or a preg_match() regex, to get the exact information you need
    echo $data;
    
    PHP:
     
    Andy Peters, Aug 26, 2007 IP
  6. ErectADirectory

    ErectADirectory Guest

    #6
    If you have access to the function, file_get_contents() is faster (to type) than all those extra lines of cURL. Check it out ...

    $text = file_get_contents('http://www.mypage.com/'); // scrape page into variable
    preg_match("/<!--start product-->([^`]*?)<!--end product-->/", $text, $temp); // [^`]*? lazily matches anything except a backtick, newlines included
    echo htmlentities($temp[0]); // prints the first occurrence, markers included; use $temp[1] for just the text between them
    
    PHP:
    It can get more complicated than the above code but it really depends on what you need harvested.


    If you don't have access to file_get_contents() you could write a function to automate all the cURL stuff that will work the same as file_get_contents. I think cURL is a bit faster so it might be smart to go ahead and use it.

    function file_get_the_contents($url) {
      $ch = curl_init();
      $timeout = 10; // set to zero for no timeout
      curl_setopt ($ch, CURLOPT_URL, $url);
      curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
      $file_contents = curl_exec($ch);
      curl_close($ch);
      return $file_contents;
    }
    
    // now start your data harvesting
    $text = file_get_the_contents('http://www.mypage.com/'); // scrape page into variable
    preg_match("/<!--start product-->([^`]*?)<!--end product-->/", $text, $temp); // [^`]*? lazily matches anything except a backtick, newlines included
    echo htmlentities($temp[0]); // prints the first occurrence, markers included; use $temp[1] for just the text between them
    PHP:
     
    ErectADirectory, Aug 26, 2007 IP
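The snippets in post #6 stop at the first match. If a page holds several marked blocks, preg_match_all() may be a better fit. The following is only a rough sketch: the sample markup is made up to stand in for a fetched page, and it swaps the backtick trick for the "s" modifier, which is an editorial choice rather than what the poster used.

```php
<?php
// Sample markup standing in for a fetched page (hypothetical content).
$text = <<<'HTML'
<div>
<!--start product-->Widget A<!--end product-->
<p>unrelated</p>
<!--start product-->Widget B
(blue)<!--end product-->
</div>
HTML;

// The "s" modifier lets "." match newlines, so multi-line blocks are captured.
// The lazy ".*?" stops at the nearest end marker instead of the last one.
$pattern = '/<!--start product-->(.*?)<!--end product-->/s';

// $matches[0] holds the full matches (markers included);
// $matches[1] holds only the text captured between the markers.
$count = preg_match_all($pattern, $text, $matches);

// Trim surrounding whitespace from each captured block.
$products = array_map('trim', $matches[1]);

foreach ($products as $product) {
    echo htmlentities($product), "\n";
}
```

Reading from $matches[1] rather than $matches[0] leaves the comment markers out of the result, which is usually what you want before storing the data.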
  7. ritadebock

    ritadebock Peon

    #7
    nice, good to know
     
    ritadebock, Aug 26, 2007 IP
  8. ssanders82

    ssanders82 Peon

    #8
    The benefit of curl over file_get_contents is curl allows you to do stuff like post data, follow redirects, spoof user agent, accept cookies, etc.
     
    ssanders82, Aug 27, 2007 IP
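To make the options from post #8 concrete, here is a minimal sketch of a cURL option set that posts form data, follows redirects, sets a user agent, and keeps cookies. The helper name build_post_options(), the URL, the form field names, the user-agent string, and the cookie file path are all assumptions made for illustration; the CURLOPT_* constants themselves are standard.

```php
<?php
/**
 * Build a cURL option set for a form POST that keeps cookies and follows redirects.
 * The function name, user-agent string, and cookie path are illustrative assumptions.
 */
function build_post_options(string $url, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,                      // return the body instead of printing it
        CURLOPT_POST           => true,                      // send a POST request
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encoded form body
        CURLOPT_FOLLOWLOCATION => true,                      // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,                         // but not forever
        CURLOPT_USERAGENT      => 'ExampleBot/1.0',          // custom user agent (made-up value)
        CURLOPT_COOKIEJAR      => $cookieFile,               // cookies are written here when the handle closes
        CURLOPT_COOKIEFILE     => $cookieFile,               // cookies are read from here on later requests
        CURLOPT_CONNECTTIMEOUT => 10,
    ];
}

$options = build_post_options(
    'http://www.example.com/login',                   // hypothetical login URL
    ['username' => 'xxxxxx', 'password' => 'yyyyy1'], // hypothetical form field names
    sys_get_temp_dir() . '/cookies.txt'
);

// To actually send the request (requires network access):
// $ch = curl_init();
// curl_setopt_array($ch, $options);
// $data = curl_exec($ch);
// if ($data === false) { echo curl_error($ch); }
// curl_close($ch);
```

Building the options as an array and applying them with curl_setopt_array() keeps the request settings in one place, which helps when the same cookie file has to be reused for the pages fetched after a login.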
  9. kichus

    kichus Peon

    #9
    Hi All,

    I used curl_init() and it worked.

    I used some regex to get the particular piece of information I needed.

    Special thanks to Andy Peters and ErectADirectory.
     
    kichus, Sep 16, 2007 IP
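Nobody in the thread covered the "stored in my database" half of the original question. One possible approach is sketched below, assuming PDO is available. The table name, column names, and sample values are hypothetical, and the in-memory SQLite database is there only so the example runs on its own; a real setup would use its own DSN and credentials.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; swap the DSN for your own database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table for the scraped values.
$pdo->exec('CREATE TABLE scraped_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_url TEXT NOT NULL,
    value TEXT NOT NULL,
    fetched_at TEXT NOT NULL
)');

// Stand-ins for the URL that was fetched and the text the regex extracted.
$sourceUrl = 'http://www.mypage.com/';
$extracted = 'Widget A';

// A prepared statement keeps scraped text from being interpreted as SQL.
$stmt = $pdo->prepare(
    'INSERT INTO scraped_data (source_url, value, fetched_at) VALUES (?, ?, ?)'
);
$stmt->execute([$sourceUrl, $extracted, date('Y-m-d H:i:s')]);

// Read the rows back, as a report would.
$rows = $pdo->query('SELECT source_url, value FROM scraped_data')
            ->fetchAll(PDO::FETCH_ASSOC);
print_r($rows);
```

Scraped text is untrusted input, so binding it through placeholders is safer than concatenating it into the SQL string.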
  10. dados

    dados Greenhorn

    #10
    Can you please give me some advice on this problem? I am building a site in a CMS. I have boxes on the site http://www.istoots.com, and a couple of those boxes must automatically pull information from http://www.dodtracker.com/. Can somebody advise me on how I can do this?

    Thanks.
     
    dados, Jun 3, 2009 IP
  11. radiotiger

    radiotiger Peon

    #11
    Why do you want to pull data from other websites? Won't it be duplicate content?
     
    radiotiger, Jan 29, 2010 IP
  12. radiotiger

    radiotiger Peon

    #12
    Also, there is something called WPRobot for automated posting of content to your WordPress site. Are you talking about such a thing?
     
    radiotiger, Jan 29, 2010 IP
  13. GeorgeBaker

    GeorgeBaker Peon

    #13
    Hi Andy,

    Hope you're still around here sometimes :)

    What would the code look like if I need to log in to a website?

    I have this code but it doesn't work.
    $username = 'xxxxxx';
    $password = 'yyyyy1';

    $url = 'http://www.fracsoft.com';

    $context = stream_context_create(array(
    'http' => array(
    'header' => "Authorization: Basic " . base64_encode("$username:$password")
    )
    ));
    $data = file_get_contents($url, false, $context);
    // echo $data
     
    GeorgeBaker, Jun 20, 2010 IP
  14. JavaDeveloper

    JavaDeveloper Peon

    #14
    I have created a program in Java that pulls data from a website.
    To read in page:

    public static BufferedReader read(String url) throws Exception
    {
        return new BufferedReader
       (
         new InputStreamReader
         (
           new URL(url).openStream()
         )
       );
    }
    
    Code (markup):
    Then I find instances of particular exclusive chars to find the starting point of my data

    int start = line.indexOf(">") + 1;
    Code (markup):
    and then I find the next char that marks the end of the information I am looking for

    int end = line.indexOf("/") - 4; // named end (not start) so the loop below compiles
    Code (markup):
    then I run a loop from the start to the finish and append a String

    String whatIwant = "";
    
    for (int i = start; i < end; i++)
    {
           whatIwant = (whatIwant + line.charAt(i));
    }
    Code (markup):

    Then I finally print that data to a file or screen.

    This may be slow, but I have not had any trouble getting all the data in situations where the pages expose the changing value in the URL... I increment the value (or pull it from a predefined text file) and reinitiate the URL from another section of code... The advantage is that it actually loads the entire page to gather the data, so you are able to capture anything that is sent to the presentation layer without risking 'hacking' the website. Simply put, for them to block this, they would have to block an address for accessing their website too many times. As it stands, I am increasing their ranking anyway.

    Any questions will not be answered unless written on a $20 bill and sent to my address.
    (for those not familiar with Java... this will be wasted... if you are, you have enough information to do what I have done.)
     
    JavaDeveloper, Jan 8, 2012 IP
  15. kris.baj422

    kris.baj422 Peon

    #15
    file_get_contents() is the best option....
     
    kris.baj422, Jan 8, 2012 IP
  16. kris.baj422

    kris.baj422 Peon

    #16
    I think he needs only PHP code.
     
    kris.baj422, Jan 8, 2012 IP
  17. gabstero

    gabstero Peon

    #17
    hi gang,

    sorry to resurrect this thread, but I also would like to pull certain snippets of text from a web page that's wrapped in a div tag that has a class. Something similar to this:

    <span class="OutOfStock">Out of stock</span>
    HTML:
    is this possible?

    thanks,
    gabstero
     
    gabstero, Sep 4, 2012 IP
  18. DomainerHelper

    DomainerHelper Well-Known Member

    #18
    Yes. It is possible to get any data from a page.
     
    DomainerHelper, Sep 4, 2012 IP
  19. gabstero

    gabstero Peon

    #19
    can you point me to a source, or code, please?

    thanks,
    gabstero
     
    gabstero, Sep 4, 2012 IP
  20. DomainerHelper

    DomainerHelper Well-Known Member

    #20
    Way too much to teach you for free bro. You need to learn regular expressions (regex) and functions like preg_match_all(). Any links I send would be via google, which you are capable of.
     
    DomainerHelper, Sep 4, 2012 IP
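For the span-with-a-class question in post #17, regex is one route. PHP's bundled DOM extension is another, and it tends to cope better with attribute order and nesting. What follows is a minimal sketch, not a definitive answer: the surrounding div and its class name are invented sample markup, and only the span comes from the post.

```php
<?php
// Sample markup standing in for a fetched page; the outer div is hypothetical.
$html = '<div class="product"><span class="OutOfStock">Out of stock</span></div>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match any span whose class list contains the token "OutOfStock".
$query = '//span[contains(concat(" ", normalize-space(@class), " "), " OutOfStock ")]';
$nodes = $xpath->query($query);

// Collect the visible text of every matching span.
$texts = [];
foreach ($nodes as $node) {
    $texts[] = trim($node->textContent);
}

print_r($texts);
```

The concat/normalize-space idiom matches the class as a whole token, so a span with class="item OutOfStock" is found while one with class="NotOutOfStockYet" is not.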