I have a String variable (basically an English sentence with an unspecified number of numbers) and I'd like to extract all the numbers into an array of integers. I was wond
The fraction and grouping characters used to write real numbers differ between languages, so the same number can look very different depending on the language.
The number two million in German
2.000.000,00
and in English
2,000,000.00
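If you'd rather not hard-code these two characters, Java can report them for a given locale. Here is a minimal sketch using `DecimalFormatSymbols`; the class name `SeparatorDemo` and the helper `separators` are made up for illustration, and the exact symbols depend on the locale data shipped with your JDK:

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class SeparatorDemo {
    // Returns {fraction character, grouping character} for the given locale.
    static char[] separators(Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        return new char[] { symbols.getDecimalSeparator(), symbols.getGroupingSeparator() };
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] { Locale.US, Locale.GERMANY }) {
            char[] s = separators(locale);
            System.out.println(locale + ": fraction='" + s[0] + "', grouping='" + s[1] + "'");
        }
    }
}
```

The two characters returned here are the kind of values you would pass as the fraction and grouping arguments when extracting numbers from locale-specific text.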
A method that extracts all real numbers from a given string in a language-agnostic way, given the fraction and grouping characters:
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public List<BigDecimal> extractDecimals(final String s, final char fraction, final char grouping) {
    List<BigDecimal> decimals = new ArrayList<>();

    // Remove grouping characters for easier regexp extraction
    StringBuilder noGrouping = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // A grouping character only counts as one when it sits between two digits
        boolean isValidGroupingChar = c == grouping
                && i > 0 && Character.isDigit(s.charAt(i - 1))
                && i + 1 < s.length() && Character.isDigit(s.charAt(i + 1));
        if (!isValidGroupingChar) {
            noGrouping.append(c);
        }
    }

    // Quote the fraction character, since characters such as '.' are regex metacharacters
    String fractionRegex = Pattern.quote(String.valueOf(fraction));
    Pattern p = Pattern.compile("-?(\\d+" + fractionRegex + "\\d+|\\d+)");
    Matcher m = p.matcher(noGrouping);
    while (m.find()) {
        // BigDecimal expects '.' as the decimal separator
        String match = m.group().replace(fraction, '.');
        decimals.add(new BigDecimal(match));
    }
    return decimals;
}
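For the simpler case from the question (plain integers, no fractions or grouping), a regex alone is probably enough. A rough sketch, where `IntegerExtractor` and `extractInts` are names chosen just for this example; it assumes every match fits into an `int` and treats a `-` directly before a digit as a minus sign:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IntegerExtractor {
    // An optional minus sign followed by one or more digits
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    static int[] extractInts(String s) {
        List<Integer> found = new ArrayList<>();
        Matcher m = INTEGER.matcher(s);
        while (m.find()) {
            // Throws NumberFormatException if the match is outside the int range
            found.add(Integer.parseInt(m.group()));
        }
        // Unbox the collected values into a primitive int array
        return found.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[] numbers = extractInts("Bought 3 apples and 12 pears at -5 degrees");
        System.out.println(Arrays.toString(numbers));
    }
}
```

Note that a hyphen used as a dash, as in "item-7", would also be read as a minus sign here; drop the `-?` from the pattern if you only want unsigned values.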