Java replace German umlauts

爱一瞬间的悲伤 2020-12-31 06:24

I have the following problem: I am trying to replace German umlauts such as ä, ö, ü in Java, but it simply does not work. Her

8 answers
  •  一个人的身影
    2020-12-31 07:18

    First, there is a small subtlety in Unicode:

    • ä may be a single code point, U+00E4 LATIN SMALL LETTER A WITH DIAERESIS, or two code points: U+0061 LATIN SMALL LETTER A followed by U+0308 COMBINING DIAERESIS.

    To handle both forms uniformly, one can normalize the Unicode text first:

    s = Normalizer.normalize(s, Normalizer.Form.NFKC);
    

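    The C stands for composition and yields the compact, precomposed form. (The K additionally applies compatibility mappings; Normalizer.Form.NFC composes without them.)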
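
    As a minimal sketch of what composition does (the class and method names here are illustrative, not from the question), the following compares the two-code-point spelling of ä with the precomposed one before and after normalization. It uses Form.NFC; for this input NFKC gives the same result.

```java
import java.text.Normalizer;

class NormalizeUmlautDemo {
    // Returns the canonically composed (NFC) form of the input.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String decomposed = "a\u0308";  // 'a' followed by COMBINING DIAERESIS (two code points)
        String precomposed = "\u00E4";  // LATIN SMALL LETTER A WITH DIAERESIS (one code point)

        // Without normalization the two spellings are different strings.
        System.out.println(decomposed.equals(precomposed));          // false
        // After composition they compare equal.
        System.out.println(compose(decomposed).equals(precomposed)); // true
        System.out.println(compose(decomposed).length());            // 1
    }
}
```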

    The second, more prosaic problem is that the encoding the editor uses to save the Java source must be the same as the one passed to the compiler with javac -encoding ....

    You can test whether the encoding is correct by temporarily using a Unicode escape:

    "\u00E4" // instead of ä
    

    My guess is that this is the problem. UTF-8 seems to have become the de facto international standard for Java sources and compilation.
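
    One possible way to make such a mismatch visible is to print the code points of a string. The sketch below uses a hypothetical helper, codePoints, and simulates the mismatch by decoding UTF-8 bytes as ISO-8859-1; that pairing is only an assumed example of what a wrongly configured compiler might do.

```java
import java.nio.charset.StandardCharsets;

class EncodingCheckDemo {
    // Formats every code point of s as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Simulates a source file saved as UTF-8 but read as ISO-8859-1.
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String escaped = "\u00E4"; // independent of the source file encoding
        System.out.println(codePoints(escaped));            // U+00E4
        System.out.println(codePoints(misdecode(escaped))); // U+00C3 U+00A4
    }
}
```

    If a literal ä typed directly into the source prints something other than U+00E4 (or U+0061 U+0308 for the decomposed form), the editor and compiler encodings probably disagree.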

    Furthermore you can use

        result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
    

    instead of a regex-based replaceAll; String.replace treats both arguments as literal text and is generally faster.
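
    Putting the pieces together, a rough sketch could look like this. The contents of UMLAUT_REPLACEMENTS are an assumption, since the question's own table is not shown here, and the class and method names are illustrative. The literals are written as Unicode escapes so the example does not depend on the source encoding.

```java
import java.text.Normalizer;

class UmlautReplacer {
    // Assumed replacement table: {umlaut, ASCII transliteration}.
    private static final String[][] UMLAUT_REPLACEMENTS = {
        {"\u00C4", "Ae"}, {"\u00D6", "Oe"}, {"\u00DC", "Ue"},
        {"\u00E4", "ae"}, {"\u00F6", "oe"}, {"\u00FC", "ue"}
    };

    static String replaceUmlauts(String input) {
        // Compose first so that 'a' + combining diaeresis also matches the table.
        String result = Normalizer.normalize(input, Normalizer.Form.NFC);
        for (int i = 0; i < UMLAUT_REPLACEMENTS.length; i++) {
            // Literal replacement, no regex involved.
            result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(replaceUmlauts("M\u00FCller"));  // Mueller
        System.out.println(replaceUmlauts("u\u0308ber"));   // ueber
    }
}
```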
