Bash: How to tokenize a string variable?

眼角桃花 2020-12-08 13:33

If I have a string variable whose value is "john is 17 years old", how do I tokenize it using spaces as the delimiter? Would I use awk?

5 answers
  • 2020-12-08 14:01

    Use the shell's automatic tokenization of unquoted variables:

    $ string="john is 17 years old"
    $ for word in $string; do echo "$word"; done
    john
    is
    17
    years
    old
    

    If you want to change the delimiter you can set the $IFS variable, which stands for internal field separator. The default value of $IFS is " \t\n" (space, tab, newline).

    $ string="john_is_17_years_old"
    $ (IFS='_'; for word in $string; do echo "$word"; done)
    john
    is
    17
    years
    old
    

    (Note that in this second example I added parentheses around the second line. This creates a sub-shell so that the change to $IFS doesn't persist. You generally don't want to permanently change $IFS as it can wreak havoc on unsuspecting shell commands.)
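One caveat with unquoted expansion is that the shell also performs pathname expansion (globbing) on each resulting word, so a token such as * would be replaced by the file names in the current directory. The following is a minimal sketch, not part of the answer above, that assumes the default $IFS and turns globbing off with set -f inside a subshell so the setting does not leak:

```shell
#!/bin/sh
string="john * 17"

# The command substitution runs in a subshell, so set -f stays local to it.
words=$(
    set -f                      # disable pathname expansion
    for word in $string; do     # intentionally unquoted: split on $IFS
        printf '%s\n' "$word"
    done
)

printf '%s\n' "$words"
# john
# *
# 17
```

Without set -f, the middle token would expand to whatever files happen to be in the working directory.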

  • 2020-12-08 14:04
    $ string="john is 17 years old"
    $ set -- $string
    $ echo $1
    john
    $ echo $2
    is
    $ echo $3
    17
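Note that set -- overwrites the script's own positional parameters. A possible way around that, sketched here with a hypothetical helper named count_and_third, is to do the split inside a function, where the positional parameters are local to the call:

```shell
#!/bin/sh
string="john is 17 years old"

# count_and_third is a hypothetical helper used only for illustration.
count_and_third() {
    set -- $1                     # intentionally unquoted: split on $IFS
    printf '%s %s\n' "$#" "$3"    # number of tokens, then the third token
}

set -- keep these args            # stand-in for the script's real arguments
result=$(count_and_third "$string")

echo "$result"   # 5 17
echo "$#"        # 3 (the caller's arguments are untouched)
```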
    
  • 2020-12-08 14:16

    You can try something like this:

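    #!/bin/bash
    n=0
    a=/home/file.txt
    # Turn every space into a newline, then loop over the resulting words
    for i in $(tr ' ' '\n' < "$a"); do
       str=${str},${i}        # accumulate the tokens, comma-separated
       n=$((n + 1))           # running token count
       var="var${n}"          # label for this token: var1, var2, ...
       echo "$var is ... ${i}"
    done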
    
  • 2020-12-08 14:16

    With sed and an extended regex (-E). Note that \W, and \n in the replacement, are GNU extensions rather than POSIX:

    $ str='a b     c d'
    $ echo "$str" | sed -E 's/\W+/\n/g' | hexdump -C
    00000000  61 0a 62 0a 63 0a 64 0a                           |a.b.c.d.|
    00000008
    

    This is like Python's re.split(r'\W+', str).

    \W matches any non-word character: space, tab, newline, and carriage return (like the bash for tokenizer), but also symbols such as quotes, brackets, and signs.

    The exception is the underscore _, which counts as a word character, so snake_case is one word but kebab-case is two.

    Leading or trailing whitespace produces an empty line.
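Where GNU sed is not available, a similar split can be approximated with tr. This is a sketch under the assumption that "word characters" means alphanumerics plus underscore; it is a swapped-in technique, not the sed command from the answer above:

```shell
#!/bin/sh
str='snake_case kebab-case  "quoted"'

# -c complements the set (everything except alphanumerics and _);
# -s squeezes each run of the resulting newlines into a single one.
words=$(printf '%s\n' "$str" | tr -cs '[:alnum:]_' '\n')

printf '%s\n' "$words"
# snake_case
# kebab
# case
# quoted
```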

  • 2020-12-08 14:19
    $ string="john is 17 years old"
    $ tokens=( $string )
    $ echo ${tokens[*]}
    

    For other delimiters, like ';'

    $ string="john;is;17;years;old"
    $ IFS=';' read -r -a tokens <<< "$string"   # IFS is changed for this read only
    $ echo "${tokens[@]}"
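Once the tokens are in an array, they can be counted, indexed, and looped over. The sketch below assumes Bash (arrays and here-strings are not POSIX sh); the old_ifs variable is there only to illustrate that prefixing read with an IFS assignment does not change $IFS for the rest of the shell:

```shell
#!/usr/bin/env bash
string="john;is;17;years;old"
old_ifs=$IFS

# The IFS=';' prefix applies to the read builtin only.
IFS=';' read -r -a tokens <<< "$string"

echo "${#tokens[@]}"     # 5  (number of tokens)
echo "${tokens[2]}"      # 17 (indices start at 0)

for t in "${tokens[@]}"; do
    printf '[%s]\n' "$t"
done

[ "$IFS" = "$old_ifs" ] && echo "IFS unchanged"
```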
    