How to extract the domain name from a URL?

走了就别回头了 2020-12-13 05:46

How do I extract the domain name from a URL using bash? For example, http://example.com/ should become example.com. It must work for any TLD, not just .com.

12 answers
  • 2020-12-13 06:41

    Instead of using a regex, you can use Python's urlparse:

     URL=http://www.example.com
    
     # Python 3: urlparse lives in urllib.parse and print is a function
     python3 -c "from urllib.parse import urlparse
     url = urlparse('$URL')
     print(url.netloc)"
    

    You can use it like this or put it in a small script. However, this still expects a valid scheme identifier, and judging by your comment, your input doesn't necessarily provide one. You can specify a default scheme, but urlparse only recognizes the netloc when it starts with '//':

    url = urlparse('//www.example.com/index.html','http')

    So you will have to prepend it manually, i.e.:

     python3 -c "from urllib.parse import urlparse
     # Prepend '//' when the input has no scheme so the host lands in netloc
     if '://' not in '$URL':
       url = urlparse('//$URL', 'http')
     else:
       url = urlparse('$URL')
     print(url.netloc)"
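The same scheme handling can be wrapped in a small function instead of an inline one-liner. This is only a minimal sketch, assuming Python 3; the name `extract_netloc` and its `default_scheme` parameter are made up for illustration and are not part of the answer above.

```python
from urllib.parse import urlparse


def extract_netloc(url, default_scheme="http"):
    """Return the network location of url (host, plus any port or userinfo)."""
    # urlparse only fills in netloc when the host is introduced by '//',
    # so prepend it for inputs that carry no scheme (assumption: such
    # inputs start directly with the host name).
    if "://" not in url:
        url = "//" + url
    return urlparse(url, default_scheme).netloc


for candidate in ("http://www.example.com", "www.example.com/index.html"):
    print(extract_netloc(candidate))
```

Note that `netloc` still includes a port or `user:password@` prefix when the input has one.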
    
  • 2020-12-13 06:43
    #!/usr/bin/perl
    use strict;
    use warnings;
    
    my $url = $ARGV[0];
    
    # Optional scheme, then the host: the text before the first '/' that contains a dot
    if ($url =~ /([^:]*:\/\/)?([^\/]+\.[^\/]+)/) {
      print "$2\n";
    }
    

    Usage:

    ./test.pl 'https://example.com'
    example.com
    
    ./test.pl 'https://www.example.com/'
    www.example.com
    
    ./test.pl 'example.org/'
    example.org
    
    ./test.pl 'example.org'
    example.org
    
    ./test.pl 'example'  -> no output
    

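    If you want only the domain rather than the full host name, use this instead. It keeps the last two dot-separated labels, so www.example.com gives example.com, but www.example.co.uk gives co.uk: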

    #!/usr/bin/perl
    use strict;
    use warnings;
    
    my $url = $ARGV[0];
    # Skip the scheme and any leading labels, then capture the last two labels
    if ($url =~ /([^:]*:\/\/)?([^\/]*\.)*([^\/\.]+\.[^\/]+)/) {
      print "$3\n";
    }
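For comparison, here is a rough Python port of the two patterns above, in case you'd rather experiment with them interactively. It is a sketch, not a drop-in replacement: the function names `host_of` and `last_two_labels` are hypothetical, and `re.search` is used to mirror Perl's unanchored `=~`.

```python
import re

# The same two patterns as the Perl scripts, written as Python raw strings.
HOST_RE = re.compile(r"([^:]*://)?([^/]+\.[^/]+)")
DOMAIN_RE = re.compile(r"([^:]*://)?([^/]*\.)*([^/.]+\.[^/]+)")


def host_of(url):
    """Return the full host part of url, or None when nothing matches."""
    match = HOST_RE.search(url)
    return match.group(2) if match else None


def last_two_labels(url):
    """Return the last two dot-separated labels of the host, or None."""
    match = DOMAIN_RE.search(url)
    return match.group(3) if match else None


print(host_of("https://www.example.com/"))          # www.example.com
print(last_two_labels("https://www.example.com/"))  # example.com
print(last_two_labels("www.example.co.uk"))         # co.uk
```

The last line shows the limit of the "last two labels" approach: without a list of public suffixes, the pattern cannot tell that co.uk is not a registrable domain.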
    
  • 2020-12-13 06:49
    basename "http://example.com"
    

    Of course, this won't work with a URI such as http://www.example.com/index.html, but you could do the following:

    basename "$(dirname "http://www.example.com/index.html")"
    

    Or for more complex URIs:

    echo "http://www.example.com/somedir/someotherdir/index.html" | cut -d'/' -f3
    

    -d means "delimiter" and -f means "field"; in the above example, the third field delimited by the forward slash '/' is www.example.com.
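To see why the host ends up in field 3, it can help to look at the split itself. The snippet below is a small Python sketch that imitates `cut -d DELIM -f N` for a single line; `cut_field` is a hypothetical helper written for this illustration.

```python
def cut_field(line, delimiter, field):
    """Return the 1-based field of line, roughly like `cut -d DELIM -f N`."""
    # Like cut (without -s), pass the line through unchanged when it
    # contains no delimiter at all.
    if delimiter not in line:
        return line
    parts = line.split(delimiter)
    # A field past the end of the line yields an empty string.
    return parts[field - 1] if field <= len(parts) else ""


url = "http://www.example.com/somedir/someotherdir/index.html"
# The empty second field comes from the two adjacent slashes in '//'.
print(url.split("/")[:3])      # ['http:', '', 'www.example.com']
print(cut_field(url, "/", 3))  # www.example.com
```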

  • 2020-12-13 06:50
    $ URI="http://user:pw@example.com:80/"
    $ echo "$URI" | sed -e 's/[^/]*\/\/\([^@]*@\)\?\([^:/]*\).*/\2/'
    example.com
    

    See http://en.wikipedia.org/wiki/URI_scheme for the general URI syntax. Note that \? is a GNU sed extension.
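If the sed expression is hard to read, the same pattern can be spelled out in Python, where the groups need no backslashes. This is a sketch under the assumption that the input contains `//`; the name `host_from_uri` is hypothetical, and unlike sed (which echoes non-matching input unchanged) it returns None when there is no match.

```python
import re

# [^/]*//      scheme and the two slashes
# ([^@]*@)?    optional userinfo such as 'user:pw@'
# ([^:/]*)     the host, up to a ':' (port) or '/' (path)
URI_RE = re.compile(r"[^/]*//([^@]*@)?([^:/]*)")


def host_from_uri(uri):
    """Return the host captured by the second group, or None."""
    match = URI_RE.search(uri)
    return match.group(2) if match else None


print(host_from_uri("http://user:pw@example.com:80/"))  # example.com
```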

  • 2020-12-13 06:50

    The following will output "example.com":

    URI="http://user@example.com/foo/bar/baz/?lala=foo" 
    ruby -ruri -e "p URI.parse('$URI').host"
    

    For more on what Ruby's URI class can do, consult the docs.

  • 2020-12-13 06:53
    echo "$URL" | cut -d'/' -f3 | cut -d':' -f1
    

    Works for URLs of the following forms (URLs with user:password@ credentials are not handled):

    http://host.example.com
    http://host.example.com/hi/there
    http://host.example.com:2345/hi/there
    http://host.example.com:2345
    