regex help needed - Please

Discussion in 'PHP' started by jumpenjuhosaphat, Feb 7, 2007.

  1. #1
    I need to validate a user-supplied URL. If it doesn't begin with http or https, I would like to add http:// to the front. It can be any type of domain, but anything after the TLD needs to be stripped off, so
    www.disdomain.com/index.html
    Code (markup):
    would become
    http://www.disdomain.com
    Code (markup):
    I have written a regular expression myself, but I'm pretty new to regex, so I don't know whether it would even work:

    ^((http|https|ftp):\/\/){1}
    (
    ([a-zA-Z0-9]([a-zA-Z0-9_-]\.)*)
    (([a-zA-Z0-9][a-zA-Z0-9_-]*)+ \/?)$
    )
    Code (markup):
    If that regex is good, how would I go about stripping the end of the URL, that is, the slash after the TLD and everything that follows it?
     
    jumpenjuhosaphat, Feb 7, 2007
  2. krakjoe

    #2
    
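    // Normalise a user-supplied URL to "scheme://host", dropping path and query.
    function formatURL( $url )
    {
      $url = trim( $url );
      // Prepend a scheme unless the input already starts with http:// or https://
      if( !preg_match( "/^https?:\/\//si", $url ) )
      {
        $url = "http://" . $url;
      }
      $parts = parse_url( $url );
      return $parts['scheme'] . "://" . $parts['host'];
    }
    echo formatURL( "google.com/index.php?req=e" ); // http://google.com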
    
    
    PHP:
    That'll work .....
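
A slightly more defensive variant, offered only as a sketch: parse_url() can return false for seriously malformed input, and the 'host' key may be missing, so this version returns null in those cases instead of raising notices. The name normalizeUrl, the null return value, and the nullable return type (which assumes PHP 7.1 or later) are illustrative assumptions, not something from the thread.

```php
<?php
// Hypothetical helper (not from the thread): returns "scheme://host", or null on failure.
function normalizeUrl(string $url): ?string
{
    $url = trim($url);
    if ($url === '') {
        return null;
    }
    // Prepend a scheme only when the input does not already start with http:// or https://
    if (!preg_match('#^https?://#i', $url)) {
        $url = 'http://' . $url;
    }
    $parts = parse_url($url);
    // parse_url() returns false on seriously malformed URLs; the host may also be absent
    if ($parts === false || empty($parts['host'])) {
        return null;
    }
    return strtolower($parts['scheme']) . '://' . $parts['host'];
}

echo normalizeUrl('www.disdomain.com/index.html'), "\n"; // http://www.disdomain.com
```

The caller can then treat null as "ask the user to re-enter the URL" rather than storing a half-built string.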
     
    krakjoe, Feb 7, 2007
  3. jumpenjuhosaphat

    #3
    That's perfect. Thank you very much.

    I tried giving you rep, but it seems as though I've already given you rep recently.

    Just one more question, if you don't mind. In the regex above, with the /si modifiers, I know the i makes it case-insensitive, but what does the s do?
     
    jumpenjuhosaphat, Feb 7, 2007
  4. krakjoe

    krakjoe Well-Known Member

    Messages:
    1,795
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    135
    #4
    krakjoe, Feb 7, 2007
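
On the /si question from post #3, a minimal sketch of what the s modifier does in PHP's PCRE functions: it is the "dot-all" flag, which lets a dot in the pattern match a newline as well. The test strings below are made up for illustration. Since the pattern in post #2 contains no dot, the s should make no difference there.

```php
<?php
// Illustrative input: a newline sits where the pattern has a "."
$text = "http\nexample";

// Without "s", "." does not match a newline, so there is no match.
$withoutS = preg_match('/^http.example/', $text);

// With "s" (PCRE_DOTALL), "." also matches the newline.
$withS = preg_match('/^http.example/s', $text);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)

// The thread's pattern has no ".", so adding "s" gives the same result.
$same = preg_match('/^http(s)?/i', 'HTTPS://x') === preg_match('/^http(s)?/si', 'HTTPS://x');
var_dump($same);     // bool(true)
```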