parse website for content

Discussion in 'PHP' started by moussa854, Apr 30, 2009.

  1. #1
    Hi,
    I need to crawl a website and extract data (content) from it. Example: crawl "www.manta.com" to build a database of company information, starting from a page like this one:

    http://www.manta.com/coms2/dnbcompany_dn4yk8
     
    moussa854, Apr 30, 2009
  2. mallorcahp

    #2
    Use file_get_contents() or file() to read the page into a string or an array, then extract what you need with a regex or a similar tool.
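
    A minimal sketch of that approach, assuming PHP 7.1 or later. The helper names fetch_page(), extract_title() and extract_links() are hypothetical, as is the inline sample HTML, which is only there so the script runs without network access. Regexes are fine for simple fields like these; for nested markup, an HTML parser such as DOMDocument is usually more robust.

```php
<?php
// Hypothetical helper: download a page as a single string.
// Remote URLs only work when allow_url_fopen is enabled.
function fetch_page(string $url): ?string
{
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Hypothetical helper: pull the text of the <title> element out of raw HTML.
function extract_title(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return trim(html_entity_decode($m[1], ENT_QUOTES, 'UTF-8'));
    }
    return null;
}

// Hypothetical helper: collect the unique href values of all <a> tags.
function extract_links(string $html): array
{
    preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\']+)["\']/i', $html, $m);
    return array_values(array_unique($m[1]));
}

// Inline sample page (made up for illustration).
$sample = '<html><head><title> Acme Corp &amp; Sons </title></head>'
        . '<body><a href="/about">About</a> <a class="x" href="/contact">Contact</a></body></html>';

echo extract_title($sample), "\n";
print_r(extract_links($sample));
```

    To run it against a real page, pass the result of fetch_page($url) to the extract functions instead of $sample, and check for null first.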
     
    mallorcahp, Apr 30, 2009
  3. kusal

    #3
    Some hosts disable remote URL access for file_get_contents() (the allow_url_fopen setting). If so, you can try using cURL instead.
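
    A rough sketch of the cURL alternative, assuming the PHP cURL extension is installed. The function name fetch_with_curl(), the timeout default and the user-agent string are placeholders chosen for illustration, not values required by any site.

```php
<?php
// Hypothetical helper: fetch a URL with the cURL extension.
// Returns the response body, or null on a transport error or HTTP status >= 400.
function fetch_with_curl(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,             // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,             // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'ExampleCrawler/0.1', // placeholder user agent
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

    A call such as fetch_with_curl('http://www.manta.com/coms2/dnbcompany_dn4yk8') would then return the page HTML as a string, or null if the request failed.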
     
    kusal, Apr 30, 2009
  4. Wavfact

    #4
    Are you asking for someone to help build this for you? Because that is a tall order for a forum thread.

    Crawling sites is an easy task if you know PHP or Perl, but it would take many lines of code and many hours of work to assemble.
     
    Wavfact, Apr 30, 2009
  5. deadtrunk

    #5
    I suggest you use cURL.
     
    deadtrunk, May 1, 2009
  6. nayes84

    #6
    nayes84, May 1, 2009