Extract substring in Bash

别那么骄傲 2020-11-22 11:02

Given a filename in the form someletters_12345_moreleters.ext, I want to extract the 5 digits and put them into a variable.

So to emphasize the point, I

22 Answers
  •  旧时难觅i
    2020-11-22 11:33

    If we focus on the concept of:
    "A run of (one or several) digits"

    We could use several external tools to extract the numbers.
    We could quite easily erase all other characters with either sed or tr:

    name='someletters_12345_moreleters.ext'
    
    echo "$name" | sed 's/[^0-9]*//g'    # 12345
    echo "$name" | tr -c -d 0-9          # 12345
    

    But if $name contains several runs of digits, the commands above fail, because they concatenate all of the runs:

    With name='someletters_12345_moreleters_323_end.ext':

    echo "$name" | sed 's/[^0-9]*//g'    # 12345323
    echo "$name" | tr -c -d 0-9          # 12345323
    

    We need to use regular expressions (regex).
    To select only the first run (12345, not 323) in sed and perl:

    echo "$name" | sed 's/[^0-9]*\([0-9]\{1,\}\).*$/\1/'
    # Pass the name as an argument instead of splicing it into the Perl source
    perl -e 'my ($num) = $ARGV[0] =~ /(\d+)/; print "$num\n";' "$name"
    

    But we could just as well do it directly in bash (1):

    # The pattern must be quoted: unquoted parentheses are a shell syntax error
    regex='[^0-9]*([0-9]{1,}).*$'
    [[ $name =~ $regex ]] && echo "${BASH_REMATCH[1]}"
    

    This allows us to extract the FIRST run of digits of any length
    surrounded by any other text/characters.
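
    Since the goal is to end up with the digits in a variable, the match above can be wrapped in a small function. This is only a minimal sketch: the function name first_digit_run and the variable num are hypothetical and not part of the original answer.

```bash
#!/usr/bin/env bash
# first_digit_run (hypothetical helper): print the first run of digits in $1.
# Returns non-zero when the string contains no digit at all.
first_digit_run() {
    local regex='[^0-9]*([0-9]{1,}).*$'
    [[ $1 =~ $regex ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

name='someletters_12345_moreleters_323_end.ext'
num=$(first_digit_run "$name")   # capture the result in a variable
echo "$num"                      # 12345
```

    Command substitution starts a subshell, but the match itself still needs no external tool.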

    Note: regex='[^0-9]*([0-9]{5,5}).*$' captures exactly 5 digits. Because the pattern is not anchored, a longer run still matches and only its first five digits are captured. :-)
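
    To accept a run only when it is exactly five digits long, the pattern also has to say what surrounds the digits. A possible sketch, assuming the digits are delimited by underscores as in the question's filename format (the helper name five_digit_field is hypothetical):

```bash
#!/usr/bin/env bash
# five_digit_field (hypothetical helper): print the digits only when they form
# a complete 5-digit field between two underscores; otherwise return non-zero.
five_digit_field() {
    local regex='_([0-9]{5})_'
    [[ $1 =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

five_digit_field 'someletters_12345_moreleters.ext'    # prints 12345
five_digit_field 'someletters_1234567_more.ext' || echo 'no 5-digit field'
```

    The second call prints the fallback message, because a 7-digit run is not followed by an underscore after its fifth digit.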

    (1): Faster than calling an external tool for each short text, but not faster than doing all the processing inside sed or awk for large files.
