Getting threads from forums

Discussion in 'PHP' started by seokochin, Nov 19, 2008.

  1. #1
    I am developing a new script, and for it I need to fetch each and every thread of a forum.

    I mean, if I enter:

    $url = "http://forums.digitalpoint.com/";

    I should get back all the threads of the DP forum.

    Is it possible?
     
    seokochin, Nov 19, 2008 IP
  2. born2hack

    born2hack Banned

    Messages:
    294
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Threads: 1,070,926, Posts: 9,452,724, Members: 225,426, Active Members: 61,009

    I would suggest not doing it, but it's your choice. Even at 1 thread per second the script would take about 300 hours to load all the threads (roughly 30 hours at 10 threads per second).

    And by the way, IT IS POSSIBLE, but NOT recommended. The code will be pretty large too.
     
    born2hack, Nov 19, 2008 IP
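
The time estimate in post #2 is easy to check. A minimal sketch in PHP, where the thread count is the figure quoted in that post, the fetch rates are assumptions, and `crawlHours` is a hypothetical helper name:

```php
<?php
// Estimate how long a sequential crawl takes at a fixed fetch rate.
function crawlHours(int $threadCount, float $threadsPerSecond): float
{
    if ($threadsPerSecond <= 0) {
        throw new InvalidArgumentException('Rate must be positive.');
    }
    // threads / (threads per second) = seconds; divide by 3600 for hours.
    return $threadCount / $threadsPerSecond / 3600;
}

$threads = 1070926; // thread count quoted in post #2

printf("%.1f hours at 10 threads/second\n", crawlHours($threads, 10)); // 29.7
printf("%.1f hours at 1 thread/second\n", crawlHours($threads, 1));    // 297.5
```

This counts one request per thread only; threads that span several pages would need more requests.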
  3. seokochin

    seokochin Banned

    Messages:
    298
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    I don't want to take the threads from big forums like DP. I just want to know the method or idea.

    Can you please suggest something?
     
    seokochin, Nov 19, 2008 IP
  4. ads2help

    ads2help Peon

    Messages:
    2,142
    Likes Received:
    67
    Best Answers:
    1
    Trophy Points:
    0
    #4
    cURL can do it.
     
    ads2help, Nov 19, 2008 IP
  5. born2hack

    #5
    Yup, cURL can do it easily. I have posted a small tutorial about cURL on my blog, which you can find in my signature, although that will not be enough for retrieving the threads. You will need to use preg_match and preg_match_all to retrieve the sub-category/forum addresses, then again to retrieve the thread page addresses, then again to retrieve each thread page's HTML source. ;) Good luck!
     
    born2hack, Nov 19, 2008 IP
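
For the cURL part of post #5, fetching a single page might look roughly like this. It is only a sketch: `fetchPage` is a hypothetical helper name, the timeout values are arbitrary assumptions, and it needs PHP's curl extension.

```php
<?php
// Fetch one page and return its HTML, or null on any failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 10,   // seconds allowed for connecting (assumed value)
        CURLOPT_TIMEOUT        => 30,   // seconds allowed for the whole request (assumed value)
    ]);

    $html   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // Treat transport errors and HTTP error statuses as "no page".
    if ($html === false || $status >= 400) {
        return null;
    }
    return $html;
}

// Usage (performs a real request, so it is left commented out):
// $html = fetchPage('http://forums.digitalpoint.com/');
```

The returned string is what preg_match and preg_match_all would then be run against.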
  6. michaelhassler

    michaelhassler Guest

    Messages:
    13
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #6
    With cURL you can simulate a web browser almost perfectly. Pages that depend on JavaScript can be difficult to work around, since cURL does not execute scripts, but generally you should be able to do just about anything else.
     
    michaelhassler, Nov 19, 2008 IP
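
As a rough illustration of "simulating a browser" from post #6, these are the kinds of cURL options usually involved. The function names, user-agent string, header value, and redirect limit are all assumptions made for the sketch, and it needs the curl extension.

```php
<?php
// Build cURL options that make requests behave more like a browser session.
function browserLikeOptions(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects as a browser would
        CURLOPT_MAXREDIRS      => 5,           // assumed limit to avoid redirect loops
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleCrawler/0.1)', // hypothetical UA string
        CURLOPT_COOKIEFILE     => $cookieFile, // send cookies stored from earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies when the handle is released
        CURLOPT_ENCODING       => '',          // accept every compression cURL supports
        CURLOPT_HTTPHEADER     => ['Accept-Language: en-US,en;q=0.8'],
    ];
}

// Fetch a URL with the browser-like options; returns the body or null on failure.
function fetchLikeBrowser(string $url, string $cookieFile): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, browserLikeOptions($cookieFile));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

Sharing one cookie file between requests is what keeps a session alive across pages. Content generated by JavaScript will still be missing from the returned HTML.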
  7. seokochin

    #7

    I know cURL, but I am not getting the idea. If you can help, it would be great.


    Your site is not loading.
     
    seokochin, Nov 19, 2008 IP
  8. born2hack

    #8
    Sorry, I'm moving it to my server; it will be fixed soon ;)
     
    born2hack, Nov 19, 2008 IP
  9. seokochin

    #9
    I can't see anything about getting threads on your blog. Please help if you can.
     
    seokochin, Nov 19, 2008 IP
  10. born2hack

    #10
    I haven't posted the exact method. See the cURL basics for how to get pages from the internet, then use preg_match to retrieve the forum and thread URLs. Google "PHP: cURL" and "preg_match" or "preg_match_all" for more info.
     
    born2hack, Nov 19, 2008 IP
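
The difference between the two regex functions named in post #10, as a small sketch. The sample markup and the `showthread.php?t=` link format are assumptions; a real forum's HTML will need its own patterns.

```php
<?php
// Sample markup standing in for a fetched forum page (assumed structure).
$html = '<title>Example Forum</title>'
      . '<a href="showthread.php?t=101">First thread</a>'
      . '<a href="showthread.php?t=102">Second thread</a>';

// preg_match stops at the first match: fine for a single value such as the title.
$title = null;
if (preg_match('~<title>(.*?)</title>~is', $html, $m)) {
    $title = trim($m[1]);
}

// preg_match_all collects every match; PREG_SET_ORDER groups the captures per link.
preg_match_all(
    '~<a\s[^>]*href="(showthread\.php\?t=\d+)"[^>]*>(.*?)</a>~is',
    $html,
    $matches,
    PREG_SET_ORDER
);

$threads = [];
foreach ($matches as $match) {
    $threads[] = [
        'url'   => html_entity_decode($match[1]), // turn &amp; back into &
        'title' => strip_tags($match[2]),
    ];
}

print_r($threads);
```

Regular expressions are brittle against markup changes; PHP's DOMDocument is an alternative if the patterns become hard to maintain.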
  11. seokochin

    #11
    Can you please explain the idea?
     
    seokochin, Nov 19, 2008 IP
  12. elias_sorensen

    elias_sorensen Well-Known Member

    Messages:
    852
    Likes Received:
    20
    Best Answers:
    0
    Trophy Points:
    110
    #12
    Google cURL ;-)
    You can even log in to the DP forum with cURL and crawl the members-only categories :)
     
    elias_sorensen, Nov 20, 2008 IP
  13. seokochin

    #13
    That much I can do...!
    But I want to know how to crawl each and every thread.
     
    seokochin, Nov 20, 2008 IP
  14. born2hack

    #14
    Alright, here it is step by step:

    1) Open the desired forum's index page using cURL.
    2) Use preg_match_all to get the exact links to the sub-forums.
    3) Open each sub-forum and extract the exact links to its threads, again using preg_match_all.
    4) You can either just use the thread URL, or fetch the thread page and retrieve the specific text written by the author using preg_match.

    I know this is a long and tedious job, but good luck ;)
     
    born2hack, Nov 20, 2008 IP
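
The four steps in post #14 can be sketched as follows. Everything specific here is an assumption: the `forumdisplay.php?f=` and `showthread.php?t=` link formats, the helper names, and the `$fetch` callable, which stands for any function that takes a URL and returns HTML or null (for instance a thin wrapper around curl_exec). Pagination inside sub-forums is not handled.

```php
<?php
// Return the unique href values in $html whose value matches $pattern.
function extractLinks(string $html, string $pattern): array
{
    preg_match_all('~<a\s[^>]*href="([^"]+)"~i', $html, $m);
    $links = array_map('html_entity_decode', $m[1]);
    $links = array_filter($links, function ($href) use ($pattern) {
        return preg_match($pattern, $href) === 1;
    });
    return array_values(array_unique($links));
}

// Walk index -> sub-forums -> threads and return the thread URLs found.
function crawlForum(string $baseUrl, callable $fetch, int $delaySeconds = 1): array
{
    $threads = [];

    // Step 1: open the forum index.
    $index = $fetch($baseUrl);
    if ($index === null) {
        return [];
    }

    // Step 2: collect the sub-forum links (assumed URL format).
    foreach (extractLinks($index, '~^forumdisplay\.php\?f=\d+~') as $forumLink) {
        // Step 3: open each sub-forum and collect its thread links.
        $forumHtml = $fetch($baseUrl . $forumLink);
        if ($forumHtml === null) {
            continue;
        }
        foreach (extractLinks($forumHtml, '~^showthread\.php\?t=\d+~') as $threadLink) {
            // Step 4: keep the URL; array keys remove duplicates across sub-forums.
            $threads[$baseUrl . $threadLink] = true;
        }
        sleep($delaySeconds); // pause between sub-forums to limit load on the server
    }

    return array_keys($threads);
}
```

Passing the fetcher in as a callable keeps the crawl logic testable without network access: an in-memory array of pages can stand in for the real site while the patterns are being tuned.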