Question:
I have a string in a Bash shell script that I want to split into an array of characters, not based on a delimiter but just one character per array index. How can I do this? Ideally it would not use any external programs. Let me rephrase that: my goal is portability, so things like sed that are likely to be on any POSIX-compatible system are fine.
Answer 1:
Try
echo "abcdefg" | fold -w1
Edit: Added a more elegant solution suggested in comments.
echo "abcdefg" | grep -o .
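Both commands above print one character per line rather than building an array. A minimal sketch of capturing that output in a Bash array, assuming Bash 4+ (for mapfile) and an ASCII-only string, since fold may count bytes rather than characters. The variable names str and chars are illustrative:

```bash
str="abcdefg"
# mapfile -t stores one array element per input line and drops the newline.
mapfile -t chars < <(printf '%s\n' "$str" | fold -w1)
echo "${#chars[@]}"   # number of elements
echo "${chars[2]}"    # third character
```

On older Bash versions a while read loop appending to the array would do the same job.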
Answer 2:
You can access each letter individually already without an array conversion:
$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r
If that's not enough, you could use something like this:
$ bar=($(echo $foo|sed 's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a
If you can't even use sed or something like that, you can use the first technique above combined with a while loop using the original string's length (${#foo}) to build the array.
Warning: the code below does not work if the string contains whitespace. I think Vaughn Cato's answer has a better chance at surviving with special chars.
thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
Answer 3:
If your string is stored in variable x, this produces an array y with the individual characters:
i=0
while [ "$i" -lt "${#x}" ]; do
    y[$i]=${x:$i:1}
    i=$((i+1))
done
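Because each element is assigned directly instead of going through word splitting, this loop keeps spaces and glob characters intact. A small sketch wrapping the same idea in a function; the function name split_chars and the global array chars are hypothetical, not part of the answer:

```bash
# Hypothetical helper: fills the global array "chars" with the characters of $1.
split_chars() {
  local s=$1 i
  chars=()
  for ((i = 0; i < ${#s}; i++)); do
    chars[i]=${s:i:1}   # substring of length 1 at offset i
  done
}

split_chars "a b*c"
printf '[%s]\n' "${chars[@]}"
```

The space and the asterisk each end up in their own element, which is the case the command-substitution variants get wrong.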
Answer 4:
As an alternative to iterating over 0 .. ${#string}-1 with a for/while loop, there are two other ways I can think of to do this with only bash: using =~ and using printf. (There's a third possibility using eval and a {..} sequence expression, but this lacks clarity.)
With the correct environment and NLS enabled in bash, these will work with non-ASCII characters as hoped, removing potential sources of failure with older system tools such as sed, if that's a concern. These will work from bash-3.0 (released 2004).
Using =~ and regular expressions, converting a string to an array in a single expression:
string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]]        # splits into array
printf "%s\n" "${BASH_REMATCH[@]:1}"       # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[@]:1}" )  # copy array for later
The way this works is to perform an expansion of string which replaces each single character with (.), then match this generated regular expression with grouping to capture each individual character into BASH_REMATCH[]. Index 0 is set to the entire string. Since that special array is read-only, you cannot remove it; note the :1 when the array is expanded to skip over index 0, if needed. Some quick testing for non-trivial strings (>64 chars) shows this method is substantially faster than one using bash string and array operations.
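To make the generated pattern visible, here is a minimal sketch that stores it in a variable before matching. The names pattern and chars are illustrative only:

```bash
string="a b"
pattern=${string//?/(.)}      # every character becomes one capture group
printf '%s\n' "$pattern"      # prints (.)(.)(.)
if [[ $string =~ $pattern ]]; then
  chars=("${BASH_REMATCH[@]:1}")   # skip index 0, which holds the whole match
fi
printf '[%s]\n' "${chars[@]}"
```

The pattern is left unquoted on the right-hand side of =~ so that it is treated as a regular expression rather than a literal string.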
The above will work with strings containing newlines: =~ supports POSIX ERE where . matches anything except NUL by default, i.e. the regex is compiled without REG_NEWLINE. (The behaviour of POSIX text processing utilities is allowed to be different by default in this respect, and usually is.)
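A quick way to check the newline claim on a given system is sketched below. It assumes the platform's regex library behaves as described; the array name parts is illustrative:

```bash
string=$'a\nb'                       # three characters: a, newline, b
if [[ $string =~ ${string//?/(.)} ]]; then
  parts=("${BASH_REMATCH[@]:1}")     # copy the captures, skipping index 0
fi
echo "${#parts[@]}"                  # number of captured characters
[[ ${parts[1]} == $'\n' ]] && echo "newline captured"
```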
Second option, using printf:
string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do
    ((xx)) && printf "\n" || break
done
This loop increments index ii to print one character at a time, and breaks out when there are no characters left. This would be even simpler if the bash printf returned the number of characters printed (as in C) rather than an error status; instead, the number of characters printed is captured in xx using %n. (This works at least back as far as bash-2.05b.)
With bash-3.1 and printf -v var you have slightly more flexibility, and can avoid falling off the end of the string should you be doing something other than printing the characters, e.g. to create an array:
declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do
    ((xx)) && arr+=("$cc") || break
done
Answer 5:
The most simple, complete and elegant solution:
$ read -a ARRAY
and test
$ echo ${ARRAY[0]}
a
$ echo ${ARRAY[1]}
b
Explanation: read -a reads stdin as an array and assigns it to the variable ARRAY, treating spaces as the delimiter for each array item.
Echoing the string to sed just adds the needed spaces between each character.
We are using Here String (
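A minimal sketch of the technique this explanation describes: sed adds a space after each character, and a here-string feeds the result to read -a. The exact sed expression and the variable str are assumptions, not necessarily what the answer used:

```bash
str="abcdefg"
# Assumed sed expression: "& " re-emits each matched character followed by a space.
# read -a then splits the here-string on whitespace (default IFS) into ARRAY.
read -r -a ARRAY <<< "$(printf '%s\n' "$str" | sed 's/./& /g')"
echo "${ARRAY[0]}"
echo "${ARRAY[1]}"
```

Since read -a splits on whitespace, this approach drops any spaces in the original string.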
Answer 6:
If the text can contain spaces:
eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") )
Answer 7:
$ echo hello | awk NF=NF FS=
h e l l o
Or
$ echo hello | awk '$0=RT' RS=[[:alnum:]]
h
e
l
l
o
Answer 8:
string=hello123
# Indices run from 0 to length-1; stopping at ${#string} would append an empty element.
for i in $(seq 0 $((${#string} - 1)))
do
    array[$i]=${string:$i:1}
done
echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[@]}]"
The zero element of array is [h]. The entire array is [h e l l o 1 2 3].
Answer 9:
If you want to store this in an array, you can do this:
string=foo
unset chars
declare -a chars
while read -N 1
do
    chars[${#chars[@]}]="$REPLY"
done
The final x is necessary to handle the fact that a newline is appended after $string if it doesn't contain one.
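One way the sentinel idea could look in full, as a sketch under assumptions: the loop is fed by a here-string with an x appended, and the two extra elements (the x and the newline the here-string adds) are removed afterwards. The here-string and the cleanup lines are guesses at the intent, and read -N needs Bash 4.1 or later:

```bash
string="fo o"
chars=()
# -N 1 reads exactly one character, including spaces and newlines; -r keeps backslashes.
while IFS= read -r -N 1; do
  chars[${#chars[@]}]=$REPLY
done <<< "${string}x"
# Drop the newline added by the here-string, then the sentinel x.
unset "chars[$(( ${#chars[@]} - 1 ))]"
unset "chars[$(( ${#chars[@]} - 1 ))]"
printf '[%s]\n' "${chars[@]}"
```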
If you want to use NUL-separated characters, you can try this:
echo -n "$string" | while read -N 1
do
    printf %s "$REPLY"
    printf '\0'
done
Answer 10:
For those who landed here searching how to do this in fish:
We can use the builtin string command (since v2.3.0) for string manipulation.
The output is a list, so array operations will work.
Here's a more complex example iterating over the string with an index.
Answer 11:
AWK is quite convenient:
a='123'; echo $a | awk 'BEGIN{FS="";OFS=" "} {print $1,$2,$3}'
where FS and OFS are the delimiters for read-in and print-out, respectively.
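The command above names $1, $2 and $3 explicitly, so it only covers three characters. A sketch that handles any length and stores the result in a Bash array, assuming an awk that splits on an empty FS (gawk and mawk do; POSIX leaves this unspecified) and a string without whitespace. The names spaced and digits are illustrative:

```bash
a='123'
# Assigning $1=$1 forces awk to rebuild the record with OFS between the fields.
spaced=$(printf '%s\n' "$a" | awk 'BEGIN{FS=""; OFS=" "} {$1=$1; print}')
# read -a splits the space-separated characters into the array.
read -r -a digits <<< "$spaced"
echo "${digits[1]}"
```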