Extract substring in Bash


别那么骄傲 2020-11-22 11:02

Given a filename in the form someletters_12345_moreleters.ext, I want to extract the 5 digits and put them into a variable.

So to emphasize the point, I

22 answers

旧时难觅i (original poster)

2020-11-22 11:33
If we focus on the concept of:
"A run of (one or several) digits"

we can use several external tools to extract the numbers.
The simplest approach is to erase all other characters, with either sed or tr:
```
name='someletters_12345_moreleters.ext'

echo "$name" | sed 's/[^0-9]*//g'    # 12345
echo "$name" | tr -c -d 0-9          # 12345
```
But if $name contains several runs of digits, the commands above glue them all together:

If name='someletters_12345_moreleters_323_end.ext', then:
```
echo "$name" | sed 's/[^0-9]*//g'    # 12345323
echo "$name" | tr -c -d 0-9          # 12345323
```
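To see the separate runs that the commands above merge, each run can be printed on its own line. This is only a sketch: it assumes a grep that supports the `-o` option (GNU and BSD grep do, but it is not required by POSIX), and the variable name `runs` is just for illustration.

```shell
name='someletters_12345_moreleters_323_end.ext'

# -o prints every match on its own line (GNU/BSD extension, not POSIX).
runs=$(printf '%s\n' "$name" | grep -oE '[0-9]+')
printf '%s\n' "$runs"
# 12345
# 323
```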
We need to use regular expressions (regex).
To select only the first run (12345, not 323) with sed and perl:
```
echo "$name" | sed 's/[^0-9]*\([0-9]\{1,\}\).*$/\1/'
# Pass the name as an argument instead of splicing it into the perl source.
perl -e 'my ($num) = $ARGV[0] =~ /(\d+)/; print "$num\n";' "$name"
```
But we can just as well do it directly in bash⁽¹⁾:
```
# The pattern must be quoted: unquoted parentheses are a syntax error in an assignment.
regex='[^0-9]*([0-9]{1,}).*$'
[[ $name =~ $regex ]] && echo "${BASH_REMATCH[1]}"
```
This allows us to extract the FIRST run of digits of any length
surrounded by any other text/characters.
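Since the question asks for the digits in a variable, the bash match above could be wrapped in a small function and captured with command substitution. This is a minimal sketch, assuming a bash that supports `[[ … =~ … ]]`; the function name `first_digit_run` and the variable `digits` are hypothetical, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first run of digits found in $1.
# Returns non-zero when the argument contains no digit at all.
first_digit_run() {
    local regex='[^0-9]*([0-9]{1,}).*$'   # same pattern as in the answer
    [[ $1 =~ $regex ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

name='someletters_12345_moreleters_323_end.ext'
digits=$(first_digit_run "$name")   # store the result in a variable
echo "$digits"                      # 12345
```

The non-zero return status lets a caller tell "no digits found" apart from an empty result, for example with `if digits=$(first_digit_run "$name"); then …`.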

Note: regex='[^0-9]*([0-9]{5}).*$' captures exactly 5 digits: the first five digits of the first run that has at least five. :-)
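If the five digits must be a complete run of their own, as in the `someletters_12345_moreleters.ext` shape from the question, one option is to anchor the match on the surrounding underscores. This is a sketch under that assumption (an underscore on each side of the digits); the variable name `number` and the sample filename are made up for illustration.

```shell
#!/usr/bin/env bash
# The 7-digit run comes first, but it is skipped: it is not exactly 5 digits long.
name='someletters_1234567_moreleters_12345_end.ext'

# Assumption: the wanted digits sit between two underscores.
regex='_([0-9]{5})_'
if [[ $name =~ $regex ]]; then
    number=${BASH_REMATCH[1]}
    echo "$number"    # 12345
else
    echo "no 5-digit field found" >&2
fi
```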

⁽¹⁾: faster than calling an external tool for each short text. Not faster than doing all the processing inside sed or awk for large files.