How to extract numbers from a string?

那年仲夏 submitted on 2019-12-03 13:24:55

You can use tr to delete all of the non-digit characters, like so:

echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
cchamberlain

To extract each individual number and print one number per line, pipe the input through:

tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

Breakdown:

  • Replace all line breaks with spaces: tr '\n' ' '
  • Replace all non-digit characters with spaces: sed -e 's/[^0-9]/ /g'
  • Remove leading white space: -e 's/^ *//g'
  • Remove trailing white space: -e 's/ *$//g'
  • Squeeze each run of spaces down to 1 space: tr -s ' '
  • Replace the remaining space separators with line breaks: sed 's/ /\n/g' (\n in the replacement is understood by GNU sed; BSD sed does not treat it as a newline)

Example:

echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

This prints:

20
2
324
2
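
As a shorter sketch of the same idea, tr alone can do the replace-and-squeeze step: -c complements the digit set and -s squeezes each run of replaced characters into a single newline. The trailing grep . is only there to drop the empty line that appears when the input starts with a non-digit. The variable names input and numbers are chosen for illustration and are not part of the original answer.

```shell
input=$' this 20 is 2sen\nten324ce 2 sort of'

# Turn every run of non-digit characters into one newline,
# then drop any empty line left at the start.
numbers=$(printf '%s\n' "$input" | tr -cs '0-9' '\n' | grep .)

printf '%s\n' "$numbers"
```

This avoids relying on \n in a sed replacement, so it should also behave the same with non-GNU tools.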

Parameter expansion would seem to be the order of the day.

$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2

This of course depends on the format of $string. But at least for the example you've provided, it seems to work.

This may be superior to anubhava's awk solution, which requires a subshell for the command substitution. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously far more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not: it is a bracket expression in a shell glob pattern.)

You can read about this form of parameter expansion in the bash man page. Note that ${string//this/that} (as well as the <<< here-string) is a bashism, and is not compatible with traditional Bourne or POSIX shells.
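
If the number of values in the string is not known in advance, the same expansion can feed an array instead of two fixed variables. This is a minimal sketch that assumes bash; the array name nums is made up for the example.

```shell
string="toto.titi.12.tata.2.abc.def"

# Replace every non-digit with a space, then let read split
# the result on whitespace into the array "nums".
read -r -a nums <<< "${string//[^0-9]/ }"

echo "count: ${#nums[@]}"
echo "first: ${nums[0]}, second: ${nums[1]}"
```

Like the original, this depends on the numbers being separated by at least one non-digit character.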

This would be easier to answer if you provided exactly the output you're looking to get. If you mean you want to get just the digits out of the string, and remove everything else, you can do this:

d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122

If you clarify a bit I may be able to help more.

Using awk:

arr=( $(echo "$string" | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}
jderefinko

You can also use sed. Note that this command removes the numbers from the string rather than extracting them:

echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'

Here, sed replaces

  • any digits (class [0-9])
  • repeated any number of times (*)
  • with nothing (nothing between the second and third /),
  • and g stands for globally.

Output will be:

toto.titi..tata..abc.def
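
If the goal is to keep the digits instead, one option is to negate the character class so that sed deletes everything that is not a digit. This is only a sketch of that variation; the variable name digits is an assumption for the example.

```shell
# [^0-9] matches any single character that is NOT a digit,
# so replacing it with nothing leaves only the digits behind.
digits=$(echo "toto.titi.12.tata.2.abc.def" | sed 's/[^0-9]//g')

echo "$digits"
```

The digits come out joined together (122 for this input), so this form does not tell you where one number ends and the next begins.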

Use regular expression matching:

string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
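
Since BASH_REMATCH is only meaningful after a successful match, it may be worth checking the result of [[ ... ]] before reading it. The sketch below does that; storing the regex in a variable named pattern and anchoring it with ^ are choices made for this example, not part of the answer above.

```shell
string="toto.titi.12.tata.2.abc.def"

# Keeping the regex in a variable avoids quoting surprises inside [[ ]].
pattern='^toto\.titi\.([0-9]+)\.tata\.([0-9]+)\.'

if [[ $string =~ $pattern ]]; then
    first_number=${BASH_REMATCH[1]}
    second_number=${BASH_REMATCH[2]}
    echo "$first_number $second_number"
else
    echo "no match" >&2
fi
```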

Here is yet another way to do this, using cut:

echo "$string" | cut -d'.' -f3,5 | tr '.' ' '

This gives you the following output: 12 2

Here is a short one:

string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')

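echo $id   # => output: 12 2

The numbers are printed with a space between them: grep -o emits one match per line, and leaving $id unquoted lets the shell turn those line breaks into spaces. Hope it helps...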
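
To work with the matches one at a time rather than as a single string, the grep output can be loaded into an array. This sketch assumes bash 4 or later, where the mapfile builtin is available; the array name ids is made up for the example.

```shell
string="toto.titi.12.tata.2.abc.def"

# grep -o prints each match on its own line; mapfile -t stores
# one line per array element and strips the trailing newlines.
mapfile -t ids < <(grep -o -E '[0-9]+' <<< "$string")

echo "found ${#ids[@]} numbers"
echo "first: ${ids[0]}, second: ${ids[1]}"
```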
