How to validate a domain name using Regex & Php?

失恋的感觉 2020-12-03 05:27

I want a solution that validates only domain names, not full URLs. The following example is what I'm looking for:

domain.com -> true
domain.net -> true
do         


        
6 Answers
  •  猫巷女王i
    2020-12-03 06:04

    The accepted answer is incomplete/wrong.

    The regex pattern:

    • should NOT validate domains such as:
      -domain.com, domain--.com, -domain-.-.com, domain.000, etc.

    • should validate domains such as:
      schools.k12, newTLD.clothing, good.photography, etc.

    After some further research, below is the most correct, cross-language, and compact pattern I could come up with:

    ^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$
    

    This pattern conforms with most* of the rules defined in the specs:

    • Each label/level (separated by dots) may contain up to 63 characters.
    • The full domain name may have up to 127 levels.
    • The full domain name may not exceed 253 characters in its textual representation.
    • Each label can consist of letters, digits and hyphens.
    • Labels cannot start or end with a hyphen.
    • The top-level domain (extension) cannot be all-numeric. As written, the (?!\d+) lookahead is slightly stricter than that: it rejects any TLD that begins with a digit.

    Note 1: The full domain length check is not included in the regex. Check it separately with a native function, e.g. strlen($domain) <= 253 in PHP.
    Note 2: This pattern works in most languages, including PHP, JavaScript, Python, etc.
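
A minimal sketch of how the pattern and the separate length check from Note 1 could be combined, written in Python since Note 2 lists it as a supported language. The names `DOMAIN_RE`, `MAX_DOMAIN_LENGTH` and `is_valid_domain` are illustrative and not part of the original answer. Two choices are Python-specific assumptions: `re.ASCII` keeps `\d` limited to 0-9, and `fullmatch` is used because `$` in Python would otherwise also match just before a trailing newline.

```python
import re

# Pattern copied verbatim from the answer; only the flag is added.
# re.ASCII restricts \d to the digits 0-9 (Python's default also
# accepts other Unicode digits).
DOMAIN_RE = re.compile(
    r"^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}"
    r"(?!\d+)[a-zA-Z\d]{1,63}$",
    re.ASCII,
)

# The regex does not limit the total length, so it is checked separately.
MAX_DOMAIN_LENGTH = 253


def is_valid_domain(name: str) -> bool:
    """Return True if `name` passes the length check and the regex."""
    if len(name) > MAX_DOMAIN_LENGTH:
        return False
    # fullmatch rejects input such as "domain.com\n", which match() would accept.
    return DOMAIN_RE.fullmatch(name) is not None
```

With this sketch, `is_valid_domain("schools.k12")` returns `True` and `is_valid_domain("domain--.com")` returns `False`. A PHP version would presumably follow the same shape with `preg_match()` and `strlen()`.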

    See DEMO here (for JS, PHP, Python)

    More Info:

    • The regex above does not support IDNs.

    • There is no spec that says the extension (TLD) should be between 2 and 6 characters. A TLD may actually be up to 63 characters long. See the current TLD list here. Also, some networks internally use custom/pseudo TLDs.

    • Registration authorities might impose extra, specific rules which are not explicitly supported in this regex. For example, .CO.UK and .ORG.UK names must have at least 3 characters, but fewer than 23, not including the extension. These kinds of rules are non-standard and subject to change. Do not implement them if you cannot maintain them.

    • Regular expressions are great, but they are not the most effective or performant solution to every problem. Where the input is a full URL, use a native URL parser instead whenever possible, e.g. Python's urlparse() function or PHP's parse_url() function.

    • After all, this is just a format validation. A regex test does not confirm that a domain name is actually configured or exists! You should test its existence by making a request.
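
The last two points can be sketched in Python as follows. This is a sketch under stated assumptions, not the answer's own method: `extract_hostname` and `resolves` are hypothetical helper names, and a DNS lookup via `socket.getaddrinfo` stands in here for "making a request". The `lookup` parameter exists only so the check can be exercised without network access.

```python
import socket
from typing import Callable, Optional
from urllib.parse import urlsplit


def extract_hostname(value: str) -> Optional[str]:
    """Return the lowercased host part of a URL or bare domain, or None."""
    # urlsplit only recognises a host after "//", so add it for bare input.
    if "//" not in value:
        value = "//" + value
    return urlsplit(value).hostname


def resolves(domain: str, lookup: Callable = socket.getaddrinfo) -> bool:
    """Return True if a DNS lookup for `domain` succeeds."""
    try:
        lookup(domain, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

For example, `extract_hostname("https://Example.COM:8080/path")` gives `"example.com"`, which can then go through the format check before any lookup is attempted. A successful lookup only shows that the name resolves, not that a particular service is running on it.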

    Specs & References:

    • IETF: RFC1035
    • IETF: RFC1123
    • IETF: RFC2181
    • IETF: RFC952
    • Wikipedia: Domain Name System

    UPDATE (2019-12-21): Fixed leading hyphen with subdomains.
