Parse URL in shell script

一生所求 2020-12-02 20:55

I have a URL like:

sftp://user@host.net/some/random/path

I want to extract the user, host, and path from this string. Any part can be of random length.

14 answers
  •  一个人的身影
    2020-12-02 21:40

    [EDIT 2019] This answer is not meant to be a catch-all, works-for-everything solution. It was intended to provide a simple alternative to the Python-based version, and it ended up having more features than the original.


    It answered the basic question in a bash-only way and was then modified multiple times by me to accommodate a handful of requests from commenters. At this point, however, I think adding even more complexity would make it unmaintainable. I know not everything is straightforward (checking for a valid port, for example, requires comparing hostport and host), but I would rather not add even more complexity.
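
    The port check mentioned above could be sketched roughly as follows. This is only an illustration under assumptions, not part of the original answer: `split_hostport` and `candidate` are hypothetical names, the variables `hostport`, `host`, and `port` mirror the ones used in the script below, and bracketed IPv6 literals are not handled.

```shell
#!/bin/bash

# Hypothetical helper (not from the original answer): split "host:port" and
# accept the port only when hostport really contains a ":" separator.
split_hostport() {
    hostport="$1"
    host="${hostport%%:*}"           # everything before the first ":"
    port=""
    if [ "$hostport" != "$host" ]; then
        # hostport and host differ, so a ":" was present
        candidate="${hostport##*:}"  # everything after the last ":"
        case "$candidate" in
            ''|*[!0-9]*) port="" ;;            # empty or non-numeric: no valid port
            *)           port="$candidate" ;;  # digits only: accept it
        esac
    fi
}

split_hostport "host.net:2222"
echo "host=$host port=$port"
```

    Comparing `hostport` with `host` first means that digits inside the host name itself (as in `host1.net`) are never mistaken for a port.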


    [Original answer]

    Assuming your URL is passed as first parameter to the script:

    #!/bin/bash
    
    # extract the protocol (including the trailing ://), if any
    proto="$(echo "$1" | grep :// | sed -e 's,^\(.*://\).*,\1,g')"
    # remove the protocol
    url="${1/$proto/}"
    # extract the user (if any)
    user="$(echo "$url" | grep @ | cut -d@ -f1)"
    # extract the host and port
    hostport="$(echo "${url/$user@/}" | cut -d/ -f1)"
    # by request host without port
    host="$(echo "$hostport" | sed -e 's,:.*,,g')"
    # by request - try to extract the port
    port="$(echo "$hostport" | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
    # extract the path (if any)
    path="$(echo "$url" | grep / | cut -d/ -f2-)"
    
    echo "url: $url"
    echo "  proto: $proto"
    echo "  user: $user"
    echo "  host: $host"
    echo "  port: $port"
    echo "  path: $path"
    

    I must admit this is not the cleanest solution, but it doesn't rely on another scripting language like Perl or Python. (A solution using one of them would be cleaner ;) )

    Using your example the results are:

    url: user@host.net/some/random/path
      proto: sftp://
      user: user
      host: host.net
      port:
      path: some/random/path
    

    This will also work for URLs without a protocol, username, or path. In that case the respective variable will contain an empty string.
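
    For comparison, the same split can be sketched with bash parameter expansion alone, without the `grep`/`sed`/`cut` pipelines. This is not the method used in this answer, just a hedged alternative: `parse_url` and `authority` are hypothetical names, and the sketch assumes that an `@` only counts as a user separator when it appears before the first `/`.

```shell
#!/bin/bash

# Hypothetical function (an assumption, not the answer's script): fills the
# global variables proto, user, host, port and path from a single URL.
parse_url() {
    local url="$1" authority
    proto="" user="" host="" port="" path=""

    # protocol: everything up to and including "://", if present
    if [[ "$url" == *://* ]]; then
        proto="${url%%://*}://"
        url="${url#*://}"
    fi

    # authority is the part before the first "/"; the rest is the path
    authority="${url%%/*}"
    if [[ "$url" == */* ]]; then
        path="${url#*/}"
    fi

    # user: only if "@" occurs inside the authority part
    if [[ "$authority" == *@* ]]; then
        user="${authority%@*}"
        authority="${authority##*@}"
    fi

    # host and optional port
    host="${authority%%:*}"
    if [[ "$authority" == *:* ]]; then
        port="${authority##*:}"
    fi
}

parse_url "sftp://user@host.net/some/random/path"
echo "  proto: $proto"
echo "  user: $user"
echo "  host: $host"
echo "  port: $port"
echo "  path: $path"
```

    As in the script above, any part that is missing from the URL is left as an empty string.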

    [EDIT]
    If your bash version won't cope with the substitutions (${1/$proto/}) try this:

    #!/bin/bash
    
    # extract the protocol
    proto="$(echo "$1" | grep :// | sed -e 's,^\(.*://\).*,\1,g')"
    
    # remove the protocol -- updated
    # (anchored with ^ so an empty $proto still yields a valid sed expression)
    url="$(echo "$1" | sed -e "s,^$proto,,")"
    
    # extract the user (if any)
    user="$(echo "$url" | grep @ | cut -d@ -f1)"
    
    # extract the host and port -- updated
    # (anchored with ^ so only a leading "user@" is removed)
    hostport="$(echo "$url" | sed -e "s,^$user@,," | cut -d/ -f1)"
    
    # by request host without port
    host="$(echo "$hostport" | sed -e 's,:.*,,g')"
    # by request - try to extract the port
    port="$(echo "$hostport" | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
    
    # extract the path (if any)
    path="$(echo "$url" | grep / | cut -d/ -f2-)"
    
