Following up to Regular expression to match hostname or IP Address? and using Restrictions on valid host names as a reference, what is the most readable, concise way to matc
I like the thoroughness of Tim Pietzcker's answer, but I prefer to offload some of the logic from regular expressions for readability. Honestly, I had to look up the meaning of those (?
"extension notation" parts. Additionally, I feel the "double-negative" approach is more obvious in that it limits the responsibility of the regular expression to just finding any invalid character. I do like that re.IGNORECASE allows the regex to be shortened.
So here's another shot; it's longer but it reads kind of like prose. I suppose "readable" is somewhat at odds with "concise". I believe all of the validation constraints mentioned in the thread so far are covered:
def isValidHostname(hostname):
if len(hostname) > 255:
return False
if hostname.endswith("."): # A single trailing dot is legal
hostname = hostname[:-1] # strip exactly one dot from the right, if present
disallowed = re.compile("[^A-Z\d-]", re.IGNORECASE)
return all( # Split by labels and verify individually
(label and len(label) <= 63 # length is within proper range
and not label.startswith("-") and not label.endswith("-") # no bordering hyphens
and not disallowed.search(label)) # contains only legal characters
for label in hostname.split("."))