How to validate a URL?

Discussion in 'PHP' started by hhheng, May 25, 2007.

  1. #1
    parse_url() can only break a URL into its parts; it can't check whether the URL actually works.

    Which function checks whether a URL is valid?
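    To illustrate the limitation described above, here is a minimal sketch. The hostname is a made-up placeholder under the reserved ".invalid" TLD, so it can never resolve, yet parse_url() still splits it without complaint:

```php
<?php
// parse_url() only splits a string into components; it performs no DNS
// lookup and sends no request. The host below is a hypothetical placeholder:
// the ".invalid" TLD is reserved and never resolves.
$parts = parse_url('http://www.no-such-site.invalid:8080/page.php?id=5#top');

// Prints the scheme, host, port, path, query and fragment components.
print_r($parts);
```

    A successful parse therefore says nothing about whether the site exists.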
     
    hhheng, May 25, 2007
  2. nico_swd

    #2
    What do you mean by "workable"? Do you want to verify that a URL exists, i.e. that it's online and accessible?

    If that's the case, have a look at this topic.


    If you want to verify that a URL is in a valid format, you can use a regular expression, such as:
    
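    function is_valid_url($url)
    {
        // Hyphens sit at the end of each character class so they are read as
        // literals, and no comment is placed inside a class (the x modifier
        // does not ignore "#" or whitespace between [ and ]).
        return (bool) preg_match('/^
            (ht|f)tps?:\/\/                    # scheme
            ([a-z0-9][a-z0-9.-]*[a-z0-9]\.)?   # www or subdomain (optional)
            [a-z0-9][a-z0-9-]*[a-z0-9]         # domain
            \.[a-z]{2,4}                       # extension
            (:\d{1,5})?                        # port (optional)
            \/[\w.~!*\'();:@&=+$,\/?%\[\]-]+   # path and query string
            (\#[\w\s-]+)?                      # anchor (optional)
            $/xi', $url);
    }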
    
    
    Usage example:
    
    $url = 'http://www.google.co.uk:80/url?sa=t&ct=res&cd=2&url=http%3A%2F%2Fwww.spanish-teaching.com%2Fblog%2F_archives%2F2007%2F2%2F5%2F2672413.html&ei=o_RWRunZCoH80gToiKHuBg&usg=AFrqEzeiWV9Iuw547VUbJbrJCGvC-PS7Zg&sig2=h_1c2mSijA6VeWhCHGQTDA#woop';
    
    if (is_valid_url($url))
    {
    	echo 'Yeah';
    }
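
    As an alternative to a hand-written pattern, PHP 5.2 and later ship a built-in format check, filter_var() with FILTER_VALIDATE_URL. The sketch below is not from the thread; the helper name is_valid_url_filter() is an assumption. Because the filter accepts other schemes too, the sketch restricts them to the ones the regular expression above allows:

```php
<?php
// Hypothetical helper name; not part of the original post.
function is_valid_url_filter($url)
{
    // filter_var() returns the filtered URL on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // FILTER_VALIDATE_URL does not limit the scheme, so restrict it to
    // the schemes the regular expression accepts.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, array('http', 'https', 'ftp', 'ftps'), true);
}

var_dump(is_valid_url_filter('http://www.google.co.uk:80/url?sa=t#woop'));
var_dump(is_valid_url_filter('not a url'));
```

    Like the regular expression, this only checks syntax; it does not tell you whether the URL is reachable.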
    
    
    nico_swd, May 25, 2007
  3. projectshifter

    #3
    nico's solution will work in most cases, but assuming you're not running this check on every page load, the best option is fsockopen() (http://php.net/fsockopen). It gives you an error code if it can't open a connection. Otherwise a URL can be valid by syntax yet point to something like www.youSuckDontTalkToMe.com, which is not a real domain. Hope this helps.
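
    A minimal sketch of the fsockopen() approach, under stated assumptions: the helper name url_host_is_reachable(), the 5-second timeout and the fallback to port 80 are all illustrative choices, not something from this thread. It only tests whether a TCP connection to the host can be opened; it does not request the page.

```php
<?php
// Hypothetical helper; name, timeout and port fallback are assumptions.
function url_host_is_reachable($url, $timeout = 5)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no host component, so there is nothing to connect to
    }

    // Use the port from the URL if present, otherwise assume plain HTTP.
    $port = parse_url($url, PHP_URL_PORT);
    if (!$port) {
        $port = 80;
    }

    // @ suppresses the warning fsockopen() raises on failure;
    // $errno and $errstr receive the error details instead.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return false;
    }

    fclose($fp);
    return true;
}

// No host to connect to, so this returns false without touching the network.
var_dump(url_host_is_reachable('not a url'));
```

    Each call can block for up to the timeout, which is why it is better kept off pages that are loaded often.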
     
    projectshifter, May 25, 2007
  4. lwbbs

    #4
    You also need to check the returned page, since it may be a "Page not found" response.

     
    lwbbs, May 25, 2007
  5. nico_swd

    #5
    You can get this information from the response headers, which is what the script in the topic I linked to does.
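
    The linked script is not reproduced here, so the following is only a rough sketch of the header-based idea using PHP's get_headers(), which requires allow_url_fopen to be enabled. Both helper names are assumptions made for illustration.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 404 Not Found". Returns null if the line does not match.
function http_status_from_line($status_line)
{
    if (preg_match('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})/', $status_line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: true if the first response has a 2xx or 3xx status.
function url_responds_ok($url)
{
    // get_headers() returns false on failure; index 0 holds the status
    // line of the first response.
    $headers = @get_headers($url);
    if ($headers === false || !isset($headers[0])) {
        return false;
    }

    $status = http_status_from_line($headers[0]);
    return $status !== null && $status >= 200 && $status < 400;
}

var_dump(http_status_from_line('HTTP/1.1 404 Not Found')); // int(404)
```

    A "Page not found" response is caught this way only when the server sends a proper 404 status; a site that returns an error page with a 200 status would still need its body inspected.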
     
    nico_swd, May 25, 2007