How to split one string into multiple strings separated by at least one space in bash shell?

日久生厌 2020-11-27 09:17

I have a string containing many words with at least one space between each two. How can I split the string into individual words so I can loop through them?

The stri

8 answers
  •  悲&欢浪女
    2020-11-27 10:01

    Probably the easiest and most secure way in BASH 3 and above is:

    var="string    to  split"
    read -ra arr <<<"$var"
    

    (where arr is the array which takes the split parts of the string) or, if there might be newlines in the input and you want more than just the first line:

    var="string    to  split"
    read -ra arr -d '' <<<"$var"
    

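    (Please note the space in -d ''; it cannot be omitted.) Be aware that <<<"$var" implicitly appends an LF, so the last element may end with an unexpected newline when IFS does not contain a newline. Also, read returns a non-zero status in this form, because it reaches the end of the input before it finds the NUL delimiter; the array is still filled.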
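
    The trailing-LF caveat can be made visible with a small sketch. It assumes IFS is set to a single space for the read, so that the newline is not treated as a separator; the names with_lf, without_lf and status are only illustrative.

    ```bash
    var=$'one two\nthree'

    # <<< appends an LF; with IFS=' ' it stays in the last field.
    # read hits end of input before a NUL delimiter, so its status is non-zero.
    IFS=' ' read -ra with_lf -d '' <<<"$var" && status=0 || status=$?

    # Feeding the data through printf '%s' avoids the extra LF.
    IFS=' ' read -ra without_lf -d '' < <(printf '%s' "$var") || true

    echo "${#with_lf[1]} ${#without_lf[1]}"   # 10 9
    echo "read status: $status"               # non-zero
    ```

    Capturing the status this way keeps the snippet usable under set -e.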

    Example:

    touch NOPE
    var="* a  *"
    read -ra arr <<<"$var"
    for a in "${arr[@]}"; do echo "[$a]"; done
    

    Outputs the expected

    [*]
    [a]
    [*]
    

    as this solution (in contrast to all previous solutions here) is not prone to unexpected and often uncontrollable shell globbing.
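
    For comparison, here is a minimal sketch of the globbing problem. It assumes the default IFS, that globbing is enabled, and that the unquoted form arr=($var) is the alternative being compared against; the names unsafe and safe are hypothetical.

    ```bash
    unset IFS                      # make sure the default IFS is in effect
    tmp=$(mktemp -d)               # empty scratch directory
    cd "$tmp"
    touch NOPE

    var="* a  *"
    unsafe=($var)                  # word splitting plus pathname expansion
    read -ra safe <<<"$var"        # splitting only, no pathname expansion

    printf '[%s]' "${unsafe[@]}"; echo   # [NOPE][a][NOPE]
    printf '[%s]' "${safe[@]}"; echo     # [*][a][*]

    cd "$OLDPWD"
    rm -r "$tmp"
    ```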

    Also this gives you the full power of IFS as you probably want:

    Example:

    IFS=: read -ra arr < <(grep "^$USER:" /etc/passwd)
    for a in "${arr[@]}"; do echo "[$a]"; done
    

    Outputs something like:

    [tino]
    [x]
    [1000]
    [1000]
    [Valentin Hilbig]
    [/home/tino]
    [/bin/bash]
    

    As you can see, spaces can be preserved this way, too:

    IFS=: read -ra arr <<<' split  :   this    '
    for a in "${arr[@]}"; do echo "[$a]"; done
    

    outputs

    [ split  ]
    [   this    ]
    

    Please note that the handling of IFS in BASH is a subject on its own, so do your tests; some interesting topics on this:

    • unset IFS: splits on runs of SPC, TAB and NL, and ignores them at the start and end of the line
    • IFS='': no field separation, the whole line is read as one field
    • IFS=' ': splits on runs of SPC (and SPC only)
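
    The three settings can be compared on one input. This is only a sketch; the sample line and the array names are made up for illustration.

    ```bash
    unset IFS                                # default: SPC, TAB and NL
    line=$'  a\tb  c  '

    read -ra default_ifs <<<"$line"          # a | b | c
    IFS='' read -ra empty_ifs <<<"$line"     # one field, all whitespace kept
    IFS=' ' read -ra space_ifs <<<"$line"    # a<TAB>b | c

    echo "${#default_ifs[@]} ${#empty_ifs[@]} ${#space_ifs[@]}"   # 3 1 2
    ```

    The IFS=... prefix applies to that single read only, so the shell's own IFS stays untouched.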

    Some last examples:

    var=$'\n\nthis is\n\n\na test\n\n'
    IFS=$'\n' read -ra arr -d '' <<<"$var"
    i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
    

    outputs

    1 [this is]
    2 [a test]
    

    while

    unset IFS
    var=$'\n\nthis is\n\n\na test\n\n'
    read -ra arr -d '' <<<"$var"
    i=0; for a in "${arr[@]}"; do let i++; echo "$i [$a]"; done
    

    outputs

    1 [this]
    2 [is]
    3 [a]
    4 [test]
    

    BTW:

    • If you are not used to $'ANSI-ESCAPED-STRING' get used to it; it's a timesaver.

    • If you do not include -r (as in read -a arr <<<"$var"), read interprets backslash escapes. This is left as an exercise for the reader.
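
    As a starting point for that exercise, a short sketch of the difference, assuming the default IFS (the names raw and cooked are illustrative):

    ```bash
    unset IFS
    var='one\ two three'

    read -ra raw <<<"$var"      # backslash is kept as an ordinary character
    read -a cooked <<<"$var"    # backslash escapes the space and is removed

    printf '[%s]' "${raw[@]}"; echo      # [one\][two][three]
    printf '[%s]' "${cooked[@]}"; echo   # [one two][three]
    ```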


    For the second question:

    To test for something in a string I usually stick to case, since it can check for several patterns at once, which is needed quite often (pun intended). Note that case only executes the first matching branch; if you need fallthrough, use multiple case statements (or, in Bash 4 and later, the ;& and ;;& terminators):

    case "$var" in
    '')                empty_var;;                # variable is empty
    *' '*)             have_space "$var";;        # have SPC
    *[[:space:]]*)     have_whitespace "$var";;   # have whitespaces like TAB
    *[^-+.,A-Za-z0-9]*) have_nonalnum "$var";;    # non-alphanum-chars found
    *[-+.,]*)          have_punctuation "$var";;  # some punctuation chars found
    *)                 default_case "$var";;      # if all above does not match
    esac
    

    So you can set the return value to check for SPC like this:

    case "$var" in (*' '*) true;; (*) false;; esac
    

    Why case? Because it usually is a bit more readable than regex sequences, and thanks to Shell metacharacters it handles 99% of all needs very well.
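
    The one-line check can also be wrapped in a function so that it reads naturally in an if. This is a sketch only; has_space is a hypothetical helper name, not something defined earlier in this answer.

    ```bash
    # has_space: succeed if the first argument contains at least one SPC.
    has_space() {
        case "$1" in
            (*' '*) return 0 ;;
            (*)     return 1 ;;
        esac
    }

    for s in "no_space" "with space"; do
        if has_space "$s"; then
            echo "[$s] contains a space"
        else
            echo "[$s] has no space"
        fi
    done
    ```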
