Extract hostname from string

情歌与酒 2020-11-22 07:15

I would like to match just the root of a URL and not the whole URL from a text string. Given:

http://www.youtube.co         


        
28 answers
  •  春和景丽
    2020-11-22 07:52

    parse-domain - a very solid lightweight library

    npm install parse-domain

    const { fromUrl, parseDomain } = require("parse-domain");
    

    Example 1

    parseDomain(fromUrl("http://www.example.com/12xy45"))
    
    { type: 'LISTED',
      hostname: 'www.example.com',
      labels: [ 'www', 'example', 'com' ],
      icann:
       { subDomains: [ 'www' ],
         domain: 'example',
         topLevelDomains: [ 'com' ] },
      subDomains: [ 'www' ],
      domain: 'example',
      topLevelDomains: [ 'com' ] }
    

    Example 2

    parseDomain(fromUrl("http://subsub.sub.test.ExAmPlE.coM/12xy45"))
    
    { type: 'LISTED',
      hostname: 'subsub.sub.test.example.com',
      labels: [ 'subsub', 'sub', 'test', 'example', 'com' ],
      icann:
       { subDomains: [ 'subsub', 'sub', 'test' ],
         domain: 'example',
         topLevelDomains: [ 'com' ] },
      subDomains: [ 'subsub', 'sub', 'test' ],
      domain: 'example',
      topLevelDomains: [ 'com' ] }
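
If what you actually want is the registrable domain (for example `example.com`), it can be assembled from the fields in the output above. This is only a sketch: `registrableDomain` is a hypothetical helper name, not part of parse-domain, and it assumes a result object shaped like the one shown in Example 2.

```javascript
// Hypothetical helper (not part of parse-domain): joins `domain` and
// `topLevelDomains` from a result object shaped like the output above.
function registrableDomain(result) {
  // Only results of type 'LISTED' carry domain/topLevelDomains in the output shown.
  if (result.type !== "LISTED" || !result.domain) {
    return null;
  }
  return [result.domain, ...result.topLevelDomains].join(".");
}

// Literal copied from Example 2's output, trimmed to the fields used here.
const example2 = {
  type: "LISTED",
  hostname: "subsub.sub.test.example.com",
  subDomains: ["subsub", "sub", "test"],
  domain: "example",
  topLevelDomains: ["com"],
};

console.log(registrableDomain(example2)); // example.com
```

Using a plain object literal here keeps the sketch runnable without installing anything; in real use the argument would be whatever `parseDomain(...)` returns.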
    

    Why?

    Depending on the use case and volume, I strongly recommend against solving this problem yourself with regex or other string manipulation. The core of the problem is that you need to know all the gTLD and ccTLD suffixes to properly split URL strings into domain and subdomains, and these suffixes are updated regularly. This is a solved problem and not one you want to solve yourself (unless you are Google or something). Unless you just need the hostname or domain name in a pinch, don't try to parse your way out of this one.
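
For the "in a pinch" case, a minimal sketch using the built-in `URL` class (available as a global in Node.js and browsers) might look like the following. It assumes the input is an absolute URL that includes its scheme. `extractHostname` and `naiveDomain` are hypothetical helper names, and `naiveDomain` is included only to show how a "last two labels" guess goes wrong on multi-label suffixes.

```javascript
// Sketch: hostname via the built-in WHATWG URL class (no dependencies).
// extractHostname is a hypothetical helper name, not part of any library.
function extractHostname(text) {
  try {
    return new URL(text).hostname;
  } catch (err) {
    // new URL() throws on strings it cannot parse as an absolute URL.
    return null;
  }
}

// Naive guess at the domain: keep the last two labels of the hostname.
// Shown only to illustrate why this breaks without a suffix list.
function naiveDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

console.log(extractHostname("http://www.example.com/12xy45")); // www.example.com
console.log(naiveDomain("www.example.com"));   // example.com
console.log(naiveDomain("www.example.co.uk")); // co.uk (a suffix, not a domain)
```

The hostname part is reliable because the URL parser handles it. The domain part is where the suffix list matters, which is the gap a library like parse-domain fills.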
