List All URLs on a webpage

Discussion in 'PHP' started by SNaRe, Nov 2, 2006.

  1. #1
    How can I list all the links on a webpage?
    I tried a few things but couldn't get it to work. Any help would be appreciated.
     
    SNaRe, Nov 2, 2006 IP
  2. T0PS3O

    T0PS3O Feel Good PLC

    #2
    T0PS3O, Nov 2, 2006 IP
  3. SNaRe

    SNaRe Well-Known Member

    #3
    I don't have much PHP experience.
    I see a function here.
    But how do I use it? I expected it to need only the link I want to parse, but it takes 4 arguments.
    Could you write some example code for me?
     
    SNaRe, Nov 2, 2006 IP
  4. nico_swd

    nico_swd Prominent Member

    #4
    It's not easy to help with the information you're providing. You could do something like this. (If I understand you right)

    
    function get_urls_of($url)
    {
    	if (!($content = @file_get_contents($url)))
    	{
    		exit('Could not load page '. $url);
    	}
    	
    	preg_match_all('/((ht|f)tps?:\/\/[^\s\r\n\"\']*)/si', $content, $urls);
    	
    	return $urls[1];
    }
    
    // Usage example:
    
    $urls = get_urls_of('http://www.somepage.com');
    
    print_r($urls);
    
    
    PHP:
    Untested but should work.
     
    nico_swd, Nov 2, 2006 IP
  5. SNaRe

    SNaRe Well-Known Member

    #5
    You understood me correctly; this is what I want. But there is an error with the code:
    Parse error: parse error, unexpected T_STRING in C:\Program Files\xampp\htdocs\parse.php on line 4
    Code (markup):
     
    SNaRe, Nov 2, 2006 IP
  6. nico_swd

    nico_swd Prominent Member

    #6
    You have to remove the line numbers that the forum adds.
     
    nico_swd, Nov 2, 2006 IP
  7. SNaRe

    SNaRe Well-Known Member

    #7
    I have already removed them. That is not the problem.
     
    SNaRe, Nov 2, 2006 IP
  8. nico_swd

    nico_swd Prominent Member

    #8
    Hm, the code works just fine for me. What is line 4 of your code?
     
    nico_swd, Nov 2, 2006 IP
  9. SNaRe

    SNaRe Well-Known Member

    #9
    My code
    <?
    function get_urls_of($url)
    {
        if (!($content = @file_get_contents($url)))
        {
            exit('Could not load page '. $url);
        }
        
        preg_match_all('/((ht|f)tps?:\/\/[^\s\r\n\"\']*)/si', $content, $urls);
        
        return $urls[1];
    }
     
    // Usage example:
     
    $urls = get_urls_of('http://www.google.com');
     
    print_r($urls); 
    ?>
    Code (markup):
    Line 4: if (!($content = @file_get_contents($url)))
     
    SNaRe, Nov 2, 2006 IP
  10. nico_swd

    nico_swd Prominent Member

    #10
    You must be using an old version of PHP or something. It works fine for me.

    Anyway, give this a try.
    
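    <?php
    
    // Returns every http(s)/ftp(s) URL found in the source of the page at $url.
    function get_urls_of($url)
    {
        // "OR" binds more loosely than "=", so exit() runs only when the fetch returns a falsy value.
        $content = @file_get_contents($url) OR exit('Could not load page '. $url);
        
        // Group 1 is the full URL: scheme, "://", then everything up to whitespace or a quote.
        preg_match_all('/((ht|f)tps?:\/\/[^\s\r\n\"\']*)/si', $content, $urls);
        
        return $urls[1];
    }
     
    // Usage example:
     
    $urls = get_urls_of('http://www.google.com');
     
    print_r($urls); 
    ?>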
    
    Code (markup):
     
    nico_swd, Nov 2, 2006 IP
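    The "$content = ... OR exit(...)" line in the code above relies on operator precedence: "or" binds more loosely than "=", whereas "||" binds more tightly. A minimal sketch of the difference follows; failing_load() is a hypothetical stand-in for a call that returns false on failure, as file_get_contents() does.

```php
<?php
// Hypothetical stand-in for a fetch that fails by returning false.
function failing_load()
{
    return false;
}

$fallback_ran = false;

// Parsed as: ($a = failing_load()) or ($fallback_ran = true);
// $a receives false, and the right-hand side runs because the assignment was falsy.
$a = failing_load() or $fallback_ran = true;

// Parsed as: $b = (failing_load() || true);
// $b receives the boolean result of the whole expression, not the return value.
$b = failing_load() || true;

var_dump($a);            // bool(false)
var_dump($fallback_ran); // bool(true)
var_dump($b);            // bool(true)
```

    With "||" in place of "OR", $content would hold a boolean instead of the page source, so the regular expression would have nothing useful to search.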
  11. SNaRe

    SNaRe Well-Known Member

    #11
    It works great now. I couldn't understand why it gave an error, but it's working now. Thank you.
     
    SNaRe, Nov 2, 2006 IP
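    The regular expression in get_urls_of() returns any absolute URL that appears in the page source, including image and script addresses, and it misses relative links such as "/about". If only hyperlinks are wanted, one alternative (not the approach used in this thread) is to parse the HTML and read the href of each <a> element. The sketch below assumes PHP's DOM extension is enabled; get_links_from_html() is a hypothetical helper name, and it takes an HTML string so it can be tried without a network connection.

```php
<?php
// Hypothetical helper: returns the href value of every <a> element in an HTML string.
function get_links_from_html($html)
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip named anchors that carry no href attribute.
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }

    return $links;
}

// Usage example with an inline HTML sample:
$html = '<p><a href="http://example.com/">Home</a> <a href="/about">About</a> <a name="top">no href</a></p>';

print_r(get_links_from_html($html)); // http://example.com/ and /about
```

    To use it on a live page, the string returned by file_get_contents($url) could be passed in, after checking that the fetch did not return false.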
  12. Freewebspace

    Freewebspace Notable Member

    #12
    Thank you very much for the code.
    I wasted 24 hours trying to write my own.

    Only then did I search DP and find this code!

    It would be helpful if someone could explain it.
     
    Freewebspace, Mar 11, 2007 IP
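    As a rough explanation of the pattern used in get_urls_of(), here is the same expression rewritten with the "x" flag so each part can carry a comment. It is a sketch for illustration: find_urls() is a hypothetical name, the delimiter is changed from "/" to "~" so the slashes need no escaping, and the original's "s" flag and "\r\n" are dropped because the pattern contains no "." and "\s" already covers both characters.

```php
<?php
// Hypothetical helper: applies the thread's URL pattern to a string instead of a fetched page.
function find_urls($text)
{
    $pattern = <<<'REGEX'
~
(                 # group 1: the whole URL, returned in $matches[1]
  (ht|f)tps?      # group 2: "ht" or "f", then "tp" and an optional "s" (http, https, ftp, ftps)
  ://             # the literal separator after the scheme
  [^\s"']*        # any run of characters that are not whitespace or a quote
)
~xi
REGEX;

    preg_match_all($pattern, $text, $matches);

    return $matches[1];
}

// Usage example:
$sample = '<a href="http://example.com/a">A</a> <img src=\'https://example.com/b.png\'> see ftp://example.com/c now';

print_r(find_urls($sample));
// Expected: http://example.com/a, https://example.com/b.png, ftp://example.com/c
```

    Because the match stops only at whitespace or a quote, an unquoted attribute such as href=http://example.com/>text would also pull in the ">text" that follows.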
  13. bobby9101

    bobby9101 Peon

    #13
    If you are using JavaScript, you can do this very easily :D (only for the current page).
     
    bobby9101, Mar 11, 2007 IP