I have the following problem: I am trying to replace German umlauts like ä, ö, ü in Java, but it simply does not work. Her
I've just tried to run it and it runs fine.
If you're not using regular expressions, I'd use string.replace rather than string.replaceAll, as it's slightly quicker. The main difference between them is that replaceAll treats its first argument as a regular expression, whereas replace takes it literally.
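To make the difference concrete, here is a minimal sketch; the class and method names are made up for illustration. Both methods replace every occurrence, but only replaceAll interprets the search text as a regex, which matters as soon as it contains a character like "." that has a special meaning.

```java
class ReplaceVsReplaceAll {
    // Literal replacement: the search text is taken as plain characters.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    // Regex replacement: "." is a pattern that matches any character.
    static String regex(String s) {
        return s.replaceAll(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b")); // prints a-b
        System.out.println(regex("a.b"));   // prints ---
    }
}
```

For plain umlaut strings the two calls give the same result, so the choice is mostly about avoiding the regex overhead and accidental metacharacters.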
EDIT: I just noticed that people in the comments said the same before me, so if you've read theirs you can pretty much ignore what I said. As stated, the problem lies elsewhere in your code, since that snippet works as expected.
First, there is a small subtlety in Unicode:
ä
might be one code point, SMALL_LETTER_A_WITH_UMLAUT (U+00E4), or
two code points: SMALL_LETTER_A (U+0061) followed by COMBINING_DIACRITICAL_MARK_UMLAUT (U+0308). To handle this, one may normalize the Unicode text.
s = Normalizer.normalize(s, Normalizer.Form.NFKC);
The C stands for composition, which yields the compact, single-code-point form.
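A small sketch of what normalization does here, assuming the input may arrive in decomposed form; the class and method names are illustrative only. The strings are written with Unicode escapes so that the example does not depend on the source file's encoding.

```java
import java.text.Normalizer;

class NormalizeDemo {
    // Compose base letter + combining mark into a single code point where possible.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String decomposed = "a\u0308"; // 'a' followed by the combining diaeresis
        String composed = "\u00E4";    // precomposed a-umlaut

        System.out.println(decomposed.equals(composed));          // prints false
        System.out.println(compose(decomposed).equals(composed)); // prints true
        System.out.println(compose(decomposed).length());         // prints 1
    }
}
```

Without this step, a replace that searches for the precomposed character silently misses the decomposed form, even though both look identical on screen.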
The second, more prosaic problem is that the encoding the editor uses to save the Java source must be the same as the one given to the compiler via javac -encoding ...
You can test whether the encoding is correct by temporarily using a Unicode escape:
"\u00E4" // instead of ä
My guess is that this might be the problem. UTF-8 seems to have become the de facto standard for Java sources and compilation.
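One way to check what the compiler actually put into a string is to print its code points. The helper below is a hypothetical debugging aid, not part of the original code. If a literal ä in your source shows up as U+00C3 U+00A4 instead of U+00E4, the file was most likely saved as UTF-8 but compiled as Latin-1.

```java
class CodePointDump {
    // Render each code point of s as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("\u00E4"));       // prints U+00E4
        System.out.println(dump("a\u0308"));      // prints U+0061 U+0308
        System.out.println(dump("\u00C3\u00A4")); // prints U+00C3 U+00A4
    }
}
```

Calling dump on the string literal from your own source, and on the input you are trying to convert, should show quickly whether the two actually contain the same characters.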
Furthermore, you can use the plain, non-regex replace, which is faster:
result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
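Putting the pieces together, a complete version might look roughly like this. The contents of UMLAUT_REPLACEMENTS are an assumption, since the original array is not shown; the class and method names are likewise made up. The table uses Unicode escapes, so the result does not depend on the source encoding.

```java
import java.text.Normalizer;

class UmlautReplacer {
    // Assumed mapping (not taken from the question): umlaut -> ASCII transcription.
    private static final String[][] UMLAUT_REPLACEMENTS = {
        {"\u00C4", "Ae"}, {"\u00D6", "Oe"}, {"\u00DC", "Ue"},
        {"\u00E4", "ae"}, {"\u00F6", "oe"}, {"\u00FC", "ue"},
        {"\u00DF", "ss"}
    };

    static String replaceUmlauts(String input) {
        // Compose first so that decomposed umlauts match the table entries.
        String result = Normalizer.normalize(input, Normalizer.Form.NFKC);
        for (int i = 0; i < UMLAUT_REPLACEMENTS.length; i++) {
            // Literal (non-regex) replacement of every occurrence.
            result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(replaceUmlauts("M\u00FCller"));  // prints Mueller
        System.out.println(replaceUmlauts("Mu\u0308ller")); // prints Mueller
        System.out.println(replaceUmlauts("Stra\u00DFe"));  // prints Strasse
    }
}
```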