Regex match a hostname — not including the TLD

Submitted by 落爺英雄遲暮 on 2019-12-06 03:31:52

Assuming your string is correctly formatted and doesn't include things like a protocol (e.g. http://), you need all characters up to, but not including, the final .tld.

The simplest way to do this is the pattern below. The trick with regular expressions is not to overcomplicate things:

.*(?=\.\w+)

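This says: match as many characters as possible, as long as they are followed by a period and one or more word characters (for example .xxx). Because .* is greedy, the match extends to just before the last period that is followed by word characters, so it returns everything prior to the last period.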
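As a rough illustration of the lookahead pattern, here is a minimal Python sketch. The function name `host_without_tld` and the sample hostnames are made up for this example, and returning the input unchanged when nothing matches is an assumption, not something the answer specifies.

```python
import re

# Greedy .* backtracks to the last spot where "." plus word characters follows.
PATTERN = re.compile(r".*(?=\.\w+)")

def host_without_tld(hostname: str) -> str:
    # Hypothetical helper: falls back to the input when there is no ".tld" part.
    match = PATTERN.match(hostname)
    return match.group(0) if match else hostname

print(host_without_tld("www.example.com"))  # www.example
print(host_without_tld("example.co.uk"))    # example.co
print(host_without_tld("localhost"))        # localhost
```

Note that only the final label is removed, so a two-part suffix such as .co.uk is only partly stripped.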

If you don't have lookahead, it would probably be easiest to use:

(\w+\.)+

which gives you everything up to and including the final '.'; then just trim the trailing '.'.
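A possible Python sketch of the match-then-trim approach follows. The helper name is hypothetical. One caveat worth knowing: `\w` does not match hyphens, so a hostname with a hyphenated label would not match from the start with this pattern; widening the class to `[\w-]` is one way around that, shown here as an assumption rather than part of the original answer.

```python
import re

# The answer's pattern: one or more "label." runs.
LABELS = re.compile(r"(\w+\.)+")
# Assumed variant that also accepts hyphens inside labels.
LABELS_WITH_HYPHEN = re.compile(r"([\w-]+\.)+")

def trim_tld(hostname: str, pattern=LABELS) -> str:
    # Hypothetical helper: match the leading labels, then drop the trailing dot.
    match = pattern.match(hostname)
    if match is None:
        return hostname
    return match.group(0).rstrip(".")

print(trim_tld("www.example.com"))                          # www.example
print(trim_tld("my-site.example.com"))                      # my-site.example.com (no match)
print(trim_tld("my-site.example.com", LABELS_WITH_HYPHEN))  # my-site.example
```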

Try this:

/.+(?=\.\w+$)/

Without support for lookahead (?=...), it would be:

/(.+)\.\w+$/

and then take the contents of the first group.
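To see that the two anchored variants agree, here is a small Python sketch. The function names are invented for illustration, and the slash delimiters from the answer are dropped because Python passes patterns as strings.

```python
import re

WITH_LOOKAHEAD = re.compile(r".+(?=\.\w+$)")
WITH_GROUP = re.compile(r"(.+)\.\w+$")

def via_lookahead(hostname: str):
    # The whole match is already the part before the TLD.
    m = WITH_LOOKAHEAD.search(hostname)
    return m.group(0) if m else None

def via_group(hostname: str):
    # The TLD is consumed by the match, so read the first capture group instead.
    m = WITH_GROUP.search(hostname)
    return m.group(1) if m else None

print(via_lookahead("www.example.com"))  # www.example
print(via_group("www.example.com"))      # www.example
print(via_group("localhost"))            # None
```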

You could just strip off the TLD:

s/\.[^\.]*$//;

Or capture both parts with named groups:

(?<Domain>.*)\.(?<TLD>.*?)$

Or simply:

(.*)\.

This isn't really specific to TLDs; it just gives you everything before the last period in a line. If you want to be strict about valid TLDs, it will have to be written differently.
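The substitution and named-group ideas can be sketched in Python roughly as follows. The helper names are hypothetical, and Python's `re` module writes named groups as `(?P<name>...)` rather than `(?<name>...)`, so the pattern is adapted accordingly.

```python
import re

def strip_tld(hostname: str) -> str:
    # Python equivalent of the substitution s/\.[^\.]*$//
    return re.sub(r"\.[^.]*$", "", hostname)

# Named groups, adapted to Python's (?P<name>...) syntax.
NAMED = re.compile(r"(?P<Domain>.*)\.(?P<TLD>.*?)$")

def split_host(hostname: str):
    # Returns (domain, tld), or None when the string contains no period.
    m = NAMED.match(hostname)
    return (m.group("Domain"), m.group("TLD")) if m else None

print(strip_tld("www.example.com"))   # www.example
print(split_host("www.example.com"))  # ('www.example', 'com')
print(split_host("localhost"))        # None
```

As the answer says, none of this checks that the last label is a real TLD; it only splits at the last period.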

I'm not clear on how you want the match to work, but with the usual extended regex syntax you should be able to match most traditional TLDs with [a-zA-Z]{2,3} (longer ones such as .info need a higher upper bound). So if you're trying to get the whole name other than the TLD, something like

(.*)\.[a-zA-Z]{2,3}$

should be close.
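A quick Python sketch of this stricter variant, assuming the capture group is meant to be `(.*)` so that it grabs the whole name. The function name is made up for the example; the second call shows the cost of limiting the TLD to two or three letters.

```python
import re

# Only accepts a final label of two or three ASCII letters.
STRICT = re.compile(r"(.*)\.[a-zA-Z]{2,3}$")

def name_without_tld(hostname: str):
    # Hypothetical helper: returns None when the last label is not 2-3 letters.
    m = STRICT.search(hostname)
    return m.group(1) if m else None

print(name_without_tld("www.example.com"))  # www.example
print(name_without_tld("example.info"))     # None (four-letter TLD is rejected)
```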
