Need a regex to validate a URL and support %20 and ()

野性不改 2020-12-16 07:49

I'm currently using the following regular expression to validate URLs:

^(?#Protocol)(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?#Username:Password)(?:\w+:\


        
1 Answer
  • 2020-12-16 08:36

    You're validating two things with the same regular expression:

    • Well formed -- Is it syntactically correct?
    • Plausible -- Are the protocol and top-level domain plausible?

    Separating these validations may be fruitful. You can use this regular expression to take the URI apart as a first check that it is well formed. It's from RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, appendix B (p. 50):

    ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
    

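    The RFC offers this expression for breaking a URI reference into its components. Because every part of it is optional, it matches almost any string, so treat a match as a successful split rather than as proof of validity. The match groups give you the various pieces, which are: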

    scheme    = $2
    authority = $4
    path      = $5
    query     = $7
    fragment  = $9
    

    Let's see what comes out of the sample URI you gave:

    2 (scheme)   : "http"
    4 (authority): "somedomain.com"
    5 (path)     : "/users/1234/images/Staff%20Photos%202008/FirstName%20LastName_1%20(Small).jpg"
    7 (query)    : nil
    9 (fragment) : nil
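
The split above can be reproduced in a few lines. This is a minimal sketch in Python, not the answer's own code: the names `URI_RE` and `split_uri` are made up for illustration, and the sample URI is reassembled from the scheme, authority and path shown above.

```python
import re

# RFC 3986, appendix B expression, as quoted in the answer.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five components (None where absent).

    split_uri is a hypothetical helper name, used here for illustration only.
    """
    # Every part of the pattern is optional, so match() succeeds on any string.
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


# Sample URI reassembled from the pieces listed in the answer.
sample = ("http://somedomain.com/users/1234/images/"
          "Staff%20Photos%202008/FirstName%20LastName_1%20(Small).jpg")
for name, value in split_uri(sample).items():
    print(f"{name:9}: {value!r}")
```

Python reports an absent component as `None` where the listing above shows `nil`. The percent-escapes and parentheses need no special handling, because the path group accepts any character other than `?` and `#`.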
    

    Now that you've got the individual pieces, you can check each one for plausibility. For example, to get the TLD from the authority, apply this regular expression to the authority:

    \.([^.]+)$
    

    Group 1 gives you the TLD (com, org, etc.), which you can then check against your list.
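
The plausibility step might look like the sketch below. It is an illustration only: `KNOWN_TLDS` is a made-up three-entry allow-list standing in for "your list", `tld_of` and `is_plausible_authority` are hypothetical names, and the authority is assumed to be a bare host name with no userinfo or port.

```python
import re

# Last dot-separated label of the authority; note the "+" so that
# multi-character TLDs such as "com" are captured.
TLD_RE = re.compile(r"\.([^.]+)$")

# Hypothetical allow-list for illustration; substitute your own list.
KNOWN_TLDS = {"com", "org", "net"}


def tld_of(authority):
    """Return the lowercased TLD of a bare host name, or None if it has no dot."""
    m = TLD_RE.search(authority)
    return m.group(1).lower() if m else None


def is_plausible_authority(authority):
    """True when the authority's TLD appears in KNOWN_TLDS."""
    return tld_of(authority) in KNOWN_TLDS


print(tld_of("somedomain.com"))
print(is_plausible_authority("somedomain.com"))
```

If the authority can carry a port or credentials (for example `user@host.com:8080`), strip those first. Otherwise the captured "TLD" would include the port.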
