PHP URL Validation

Discussion in 'Programming' started by tdd1984, Sep 22, 2007.

  1. #1
    Guys, I'm trying to validate URLs from a form. For example, if someone enters http://yahoo.com, that would not be a valid URL; I want to make sure they insert http://www.yahoo.com. I have found a regular expression for that, but the problem is that http://subdomain.yahoo.com does not work properly.

    So I need them to be able to insert either http://www.yahoo.com or http://subdomain.yahoo.com.

    The reason I'm doing this is that I run a check against the database using parse_url to make sure there are no duplicate entries, so the same URL can't be entered twice. Would you guys have any clue how I can do this?
     
    tdd1984, Sep 22, 2007 IP
  2. k2pi

    #2
    What regexp are you using?

    Your database can have a UNIQUE constraint on the url field, which guarantees uniqueness.
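
A minimal sketch of the UNIQUE-constraint idea, not a definitive implementation. The thread does not say which database is in use, so this assumes PDO with an in-memory SQLite database (the pdo_sqlite extension) and PHP 7 or later; the table name `links`, the column `url`, and the helper `insert_url()` are all hypothetical.

```php
<?php
// Assumption: in-memory SQLite via PDO; the `links` table and `url` column are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE links (id INTEGER PRIMARY KEY, url TEXT NOT NULL UNIQUE)');

/**
 * Try to store a URL. Returns false when the UNIQUE constraint rejects it
 * because the same URL is already present. Hypothetical helper.
 */
function insert_url(PDO $pdo, string $url): bool
{
    try {
        $stmt = $pdo->prepare('INSERT INTO links (url) VALUES (?)');
        $stmt->execute([$url]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint violation.
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e; // anything else is a real error
    }
}
```

With this in place the database itself refuses a second copy of the same string, so the application only has to normalise the URL before inserting it.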
     
    k2pi, Sep 22, 2007 IP
  3. farjam

    #3
    Do you parse the URL for www?

    I mean, if the URL has www the check is true (1), and if the URL doesn't have www it is false (0), so that nobody can enter a URL without www?

    And because of that, your PHP program assumes that a subdomain URL is incorrect?

    Am I right?
     
    farjam, Sep 22, 2007 IP
  4. tdd1984

    #4
    Yes, exactly. The subdomain is where I'm having the problem :(
     
    tdd1984, Sep 22, 2007 IP
  5. farjam

    #5
    It is simple to solve.

    When parsing the URL, the "." is the key.

    For example, http://yahoo.com contains only one ".", but a subdomain URL like http://messenger.yahoo.com contains two.

    So if the URL you are checking has only one ".", it is a bare domain like http://yahoo.com; if it has two or more, like http://messenger.yahoo.com, it is a subdomain.

    There are some exceptions, such as:

    http://yourdoamin.com.au

    It is a domain, not a subdomain, but it has two ".".

    In many cases, though, you can use the assumption I mentioned.
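
The dot-counting assumption above could be sketched roughly as follows. It uses parse_url() to isolate the host and substr_count() to count the dots; the helper names `count_host_dots()` and `has_subdomain()` are hypothetical, and PHP 7 or later is assumed for the type declarations.

```php
<?php
/**
 * Count the dots in the host part of a URL. Hypothetical helper.
 */
function count_host_dots(string $url): int
{
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() gives null (or false for badly malformed input) when there is no host.
    return is_string($host) ? substr_count($host, '.') : 0;
}

/**
 * Heuristic only: two or more dots in the host is treated as "has a subdomain".
 */
function has_subdomain(string $url): bool
{
    return count_host_dots($url) >= 2;
}
```

has_subdomain('http://yahoo.com') is false and has_subdomain('http://messenger.yahoo.com') is true. It also reports true for http://yourdoamin.com.au, which is exactly the exception described above, so the result should be treated as a guess rather than a fact.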
     
    farjam, Sep 22, 2007 IP
  6. krt

    krt Well-Known Member

    Messages:
    829
    Likes Received:
    38
    Best Answers:
    0
    Trophy Points:
    120
    #6
    Why not check for a subdomain and add www. if it is not there?

    PHP:
    $url = rtrim(preg_replace('~^(.+?://)([^./]+\.[^./]+)(/.*)?$~', '$1www.$2$3', $url), '/');

    The pattern is anchored to the whole URL so that a host which already has a subdomain is left unchanged. rtrim() ensures that only one version of the URL is inserted (instead of two, with and without the trailing slash).
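
For illustration, the same add-www-if-missing idea can be wrapped in a function. This is a sketch under assumptions: the name `add_www()` is hypothetical, PHP 7 or later is assumed, and the pattern is anchored to the whole URL with the host labels excluding "/" so that only a bare two-label host such as yahoo.com gets the prefix.

```php
<?php
/**
 * Prefix "www." when the host is a bare two-label name (for example yahoo.com),
 * then strip trailing slashes. Hypothetical helper.
 */
function add_www(string $url): string
{
    // ^scheme://  label.label  optional /path  end of string.
    // A host with more dots does not match, so the URL passes through unchanged.
    $pattern = '~^(.+?://)([^./]+\.[^./]+)(/.*)?$~';

    // preg_replace() returns null on a regex engine error; fall back to the input.
    $result = preg_replace($pattern, '$1www.$2$3', $url) ?? $url;

    return rtrim($result, '/');
}
```

add_www('http://yahoo.com/') gives http://www.yahoo.com, while add_www('http://subdomain.yahoo.com') comes back unchanged. Like the dot-counting approach, it cannot tell that a host such as yourdomain.com.au is a bare domain, so that case would not receive the www. prefix.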
     
    krt, Sep 22, 2007 IP