Screen Scraping

Discussion in 'PHP' started by stephan2307, Nov 9, 2012.

  1. #1
    I've been thinking of writing a short ebook about screen scraping with PHP.

    Just wondering if there would be any interest in this topic.
     
    stephan2307, Nov 9, 2012
  2. scottlpool2003

    #2
    I believe there would be interest in screen scraping, but perhaps not so much with PHP. I've played about with it before, and the problem I had was that a lot of hosts disable things such as the DOM extension or put tight limits on memory usage.

    If you have a way of doing this without large memory usage, I'd be interested in learning!
     
    scottlpool2003, Nov 9, 2012
  3. stephan2307

    #3
    I have never had any problems with screen scraping using PHP. I am on a shared server and I have scraped sites with more than 1,000,000 pages.

    Never use DOM; always use preg_match_all and explode.
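
    As a rough sketch of that approach, assuming a page whose markup is regular enough to match by pattern (the sample HTML and the between() helper below are made up for illustration, not taken from a real site):

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<ul><li><a href="/a">First</a></li><li><a href="/b">Second</a></li></ul>';

// preg_match_all: collect every href/text pair in one pass.
preg_match_all('~<a href="([^"]*)">([^<]*)</a>~', $html, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $m) {
    $links[$m[1]] = $m[2]; // href => link text
}

// explode: cut out the text between two known markers.
// Returns null when either marker is missing.
function between(string $haystack, string $start, string $end): ?string
{
    $parts = explode($start, $haystack, 2);
    if (count($parts) < 2) {
        return null; // start marker not found
    }
    $rest = explode($end, $parts[1], 2);
    return count($rest) < 2 ? null : $rest[0];
}

$first = between($html, '<li>', '</li>');
```

    Both calls work on plain strings, so no document tree is built in memory. The trade-off is that the pattern and the markers break as soon as the site changes its markup.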

    I might give private tuition if there is interest.
     
    stephan2307, Nov 9, 2012
  4. NetStar

    #4
    If you are relying heavily on regex for screen scraping, you are using the wrong tool for the job and could probably benefit from reading someone else's tutorial on screen scraping.

    Use DOM to parse the data, and only use regex to clean up or match specific values...
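
    A minimal sketch of that split, assuming PHP's bundled DOM extension is enabled on the host (the sample markup and the "price" class name are invented for illustration):

```php
<?php
// Hypothetical sample markup standing in for a fetched page.
$html = '<div class="price"> $1,299.00 </div><div class="price">$45.50</div>';

// Real-world HTML is rarely valid, so collect libxml warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// DOM and XPath locate the nodes.
$xpath = new DOMXPath($doc);

$prices = [];
foreach ($xpath->query('//div[@class="price"]') as $node) {
    // Regex is only used to clean the extracted text:
    // strip everything except digits and the decimal point.
    $prices[] = (float) preg_replace('/[^0-9.]/', '', $node->textContent);
}
```

    The XPath query keeps working if attributes are reordered or whitespace changes, which is where a pure regex approach tends to break.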
     
    NetStar, Nov 10, 2012
  5. ironmankho

    #5
    I am using explode for screen scraping.
     
    ironmankho, Nov 11, 2012
  6. stephan2307

    #6
    I am doing quite OK, thank you.
     
    stephan2307, Nov 12, 2012