First, let's define a "URL" according to my requirements.
The only protocols optionally allowed are http:// and https://
then a man
As a starting point you can use this one. It's written for JS, but it's easy to convert it to work with PHP's preg_match.
/^(https?:\/\/)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$/i
For PHP, this one should work:
$reg = '@^(https?\://)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+[a-z]+$@i';
Note that this regexp validates only the domain part, but you can build on it, or split the URL at the first slash '/' (after "://") and validate the domain part and the rest separately.
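The split-and-validate idea could be sketched in JS roughly as follows. The helper names splitUrl and isValidUrl are hypothetical, and REST_RE is only a placeholder assumption (it rejects whitespace and nothing else); swap in whatever rule you actually need for the path and query string.

```javascript
// Domain-only pattern from above, with the slashes escaped for a JS regex literal.
const DOMAIN_RE = /^(https?:\/\/)?(www\.)?([a-z0-9]([a-z0-9]|(-[a-z0-9]))*\.)+[a-z]+$/i;

// Placeholder rule for everything after the domain (an assumption, not a spec):
// accept any run of non-whitespace characters, including the empty string.
const REST_RE = /^\S*$/;

// Hypothetical helper: split at the first "/" that follows the optional "http(s)://".
function splitUrl(url) {
  const scheme = /^https?:\/\//i.exec(url);
  const start = scheme ? scheme[0].length : 0;
  const slash = url.indexOf("/", start);
  if (slash === -1) {
    return { domain: url, rest: "" };
  }
  return { domain: url.slice(0, slash), rest: url.slice(slash) };
}

// Validate the two parts independently.
function isValidUrl(url) {
  const { domain, rest } = splitUrl(url);
  return DOMAIN_RE.test(domain) && REST_RE.test(rest);
}
```

With this split, isValidUrl("http://www.example.com/path?x=1") checks "http://www.example.com" against the domain pattern and "/path?x=1" against the placeholder rule.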
BTW: it would also validate "http://www.domain.com.com",
but this is not an error, because a subdomain URL could look like "http://www.subdomain.domain.com",
and that is valid! And there is almost no way (or at least no operationally easy way) to validate for a proper domain TLD with a regex, because you would have to write all possible TLDs into your regex ONE BY ONE, like this:
/^(https?:\/\/)?(www\.)?([a-z0-9]([a-z0-9]|(\-[a-z0-9]))*\.)+(com|it|net|uk|de)$/i
(This last one, for instance, would validate only domains ending in .com, .it, .net, .de or .uk, which includes .co.uk because "co." matches as an ordinary label.) New TLDs come out all the time, so you would have to adjust your regex every time a new one appears, and that's a pain in the neck!
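If you do go the allow-list route anyway, one way to make the upkeep a little less painful is to build the alternation from an array, so that adding a TLD is a one-line change. This is only a sketch: ALLOWED_TLDS and buildDomainRegex are hypothetical names, the list is purely illustrative, and it assumes every entry is plain letters with no regex metacharacters.

```javascript
// Illustrative list only; a real allow-list would need regular updating.
const ALLOWED_TLDS = ["com", "it", "net", "uk", "de"];

// Hypothetical helper: assemble the same domain pattern with the TLD
// alternation generated from the array (entries assumed to be plain letters).
function buildDomainRegex(tlds) {
  const alternation = tlds.join("|");
  return new RegExp(
    "^(https?://)?(www\\.)?([a-z0-9]([a-z0-9]|(-[a-z0-9]))*\\.)+(" + alternation + ")$",
    "i"
  );
}

const tldRegex = buildDomainRegex(ALLOWED_TLDS);
```

The maintenance problem does not go away, of course; it just moves from the regex into the list.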