How to extract the domain name from a URL?

走了就别回头了 2020-12-13 05:46

How do I extract the domain name from a URL using bash? For example, http://example.com/ should become example.com. It must work for any TLD, not just .com.

12 answers
  • 2020-12-13 06:26

    You can use a simple AWK one-liner to extract the domain name:

    echo http://example.com/index.php | awk -F'[/:]' '{print $4}'
    

    OUTPUT: example.com

    :-)
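
    To see why field 4 is the host, here is a minimal sketch that wraps the same AWK call in a function. The name host_from_url is hypothetical, and the sketch assumes the URL has a scheme and no user:pass@ part (a colon in the credentials would shift the fields).

```shell
# host_from_url is a hypothetical helper name, not part of the original answer.
host_from_url() {
  # Splitting "http://host/path" on "/" or ":" gives the fields
  # "http", "", "", "host", ... so the host is field 4.
  printf '%s\n' "$1" | awk -F'[/:]' '{print $4}'
}

host_from_url 'http://example.com/index.php'        # example.com
host_from_url 'https://example.com:8080/a/b?x=1'    # example.com
```

    A port does not get in the way, because the colon in front of it is just another field separator.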

  • 2020-12-13 06:33

    A solution that covers more cases can be built from sed regular expressions:

    echo http://example.com/index.php | sed -E -e 's#^https?://##' -e 's#:.*##' -e 's#/.*##'

    This works for URLs such as http://example.com/index.php, http://example.com:4040/index.php and https://example.com/index.php.
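
    As a quick check, the three URLs above can be run through a small wrapper. This is only a sketch: strip_to_host is a hypothetical name, and the sed call is a variant of the one above that uses -E with https? in place of the \| alternation.

```shell
# strip_to_host is a hypothetical wrapper name used only for illustration.
strip_to_host() {
  # 1) drop a leading http:// or https://
  # 2) drop everything from the first ":" (the port) onwards
  # 3) drop everything from the first "/" (the path) onwards
  printf '%s\n' "$1" | sed -E -e 's#^https?://##' -e 's#:.*##' -e 's#/.*##'
}

for u in http://example.com/index.php \
         http://example.com:4040/index.php \
         https://example.com/index.php; do
  strip_to_host "$u"
done
```

    Each iteration prints example.com. URLs with user:pass@ credentials are not handled by this variant, since the second expression cuts at the first colon.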

  • 2020-12-13 06:33

    Here's the Node.js way. It works with or without ports and deep paths:

    //get-hostname.js
    'use strict';
    
    // url.parse() is a legacy API; the WHATWG URL class is also exported by the url module
    const { URL } = require('url');
    const parts = new URL(process.argv[2]);
    
    console.log(parts.hostname);
    

    It can be called like this:

    node get-hostname.js http://foo.example.com:8080/test/1/2/3.html
    //foo.example.com
    

    Docs: https://nodejs.org/api/url.html

  • 2020-12-13 06:35
    sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_'
    

    e.g.

    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'http://example.com'
    example.com
    
    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'https://example.com'
    example.com
    
    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'http://example.com:1234/some/path'
    example.com
    
    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'http://user:pass@example.com:1234/some/path'
    example.com
    
    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'http://user:pass@example.com:1234/some/path#fragment'
    example.com
    
    $ sed -E -e 's_.*://([^/@]*@)?([^/:]+).*_\2_' <<< 'http://user:pass@example.com:1234/some/path#fragment?params=true'
    example.com
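
    One thing to be aware of: when the input contains no "://", the substitution does not match and sed prints the line unchanged. A minimal sketch of a stricter variant, assuming you would rather get no output for such lines (print_host is a hypothetical name):

```shell
# print_host is a hypothetical name. With -n and the trailing "p" flag,
# sed prints a line only when the substitution actually matched.
print_host() {
  sed -nE 's_.*://([^/@]*@)?([^/:]+).*_\2_p'
}

printf '%s\n' 'http://user:pass@example.com:1234/some/path' 'not a url' | print_host
```

    Only example.com is printed; the second input line is dropped.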
    
  • 2020-12-13 06:35

    There is very little information about how you obtain these URLs; please include more detail next time (for example, whether the URLs contain parameters). Meanwhile, simple string manipulation is enough for your sample URL.

    e.g.

    $ s="http://example.com/index.php"
    $ s=${s%/*}             # get rid of the last "/" onwards
    $ echo "$s"
    http://example.com
    $ echo "${s#http://}"   # get rid of http://
    example.com
    

    Another way, using sed:

    $ echo "$s" | sed 's|http://||;s|/.*||'
    example.com
    

    Or using awk:

    $ echo "$s" | awk '{gsub("http://|/.*","")}1'
    example.com
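
    The string-manipulation idea can be stretched a bit further than the sample URL. The following is a sketch only: url_host is a hypothetical name, it uses nothing but parameter expansion, and it assumes a plain hostname (a bracketed IPv6 literal would be cut at its first colon).

```shell
# url_host is a hypothetical function name. The body runs in a subshell
# "( ... )" so the helper variable h does not leak into the caller.
url_host() (
  h=${1#*://}    # drop the scheme, if there is one
  h=${h%%/*}     # drop the path: everything from the first "/" onwards
  h=${h##*@}     # drop "user:pass@" credentials
  h=${h%%:*}     # drop ":port"
  printf '%s\n' "$h"
)

url_host 'http://example.com/index.php'
url_host 'https://user:pass@example.com:1234/some/path#fragment'
```

    Both calls print example.com, and no external process is started.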
    
  • 2020-12-13 06:39

    With Ruby, you can use the Domainatrix gem:

    http://www.pauldix.net/2009/12/parse-domains-from-urls-easily-with-domainatrix.html

    require 'rubygems'
    require 'domainatrix'
    s = 'http://www.champa.kku.ac.th/dir1/dir2/file?option1&option2'
    url = Domainatrix.parse(s)
    url.domain
    => "kku"
    

    great tool! :-)
