Java: replace German umlauts

爱一瞬间的悲伤 2020-12-31 06:24

I have the following problem. I am trying to replace German umlauts like ä, ö, ü in Java, but it simply does not work. Her

8 Answers
  • 2020-12-31 07:17

    I've just tried to run it and it runs fine.

    If you're not using regular expressions, I'd use String.replace rather than String.replaceAll, since it's slightly quicker. Both replace every occurrence; the main difference is that replaceAll treats its first argument as a regular expression, while replace matches it literally.

    EDIT: Just noticed that people in the comments have said the same before me, so if you've read theirs you can pretty much ignore what I said. As stated, the problem lies elsewhere in your code, as that snippet works as expected.

  • 2020-12-31 07:18

    First there is a tiny issue in Unicode:

    • ä might be one code point, LATIN SMALL LETTER A WITH DIAERESIS (U+00E4), or two code points: LATIN SMALL LETTER A (U+0061) followed by COMBINING DIAERESIS (U+0308).

    To handle both cases, one can normalize the Unicode text first.

    s = Normalizer.normalize(s, Normalizer.Form.NFKC);
    

    The C stands for composition and yields the compact, precomposed form. (NFKC additionally applies compatibility mappings; NFC composes without them.)
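
    As a rough illustration of the normalization step above, here is a minimal sketch. The class name NormalizeDemo and the helper compose are made up for this example; only Normalizer.normalize and Normalizer.Form.NFKC come from the answer. The input is written with Unicode escapes so the result does not depend on the source file encoding.

    ```java
    import java.text.Normalizer;

    public class NormalizeDemo {
        // Hypothetical helper: composes decomposed sequences into single code points.
        static String compose(String s) {
            return Normalizer.normalize(s, Normalizer.Form.NFKC);
        }

        public static void main(String[] args) {
            // 'a' followed by COMBINING DIAERESIS (U+0308): two chars, rendered as one glyph
            String decomposed = "a\u0308";
            String composed = compose(decomposed);

            System.out.println(decomposed.length());        // 2
            System.out.println(composed.length());          // 1
            System.out.println(composed.equals("\u00E4"));  // true
        }
    }
    ```

    After this step a plain replace on the single-code-point form should also catch text that arrived in decomposed form.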

    The second, more prosaic problem is that the encoding of the Java source file in the editor must match the one passed to the compiler via javac -encoding ....

    You can check whether the encoding is the culprit by temporarily using a Unicode escape:

    "\u00E4" // instead of ä
    

    My guess is that this might be the problem. UTF-8 seems to have become the international norm for Java sources and compilation.

    Furthermore you can use

        result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
    

    This avoids the regex-based replaceAll and is faster.
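
    Putting the last two points together, a minimal sketch could look like this. The contents of UMLAUT_REPLACEMENTS are an assumption, since the question's table is not shown here; the class and method names are invented for illustration, and the ß entry is an optional extra. All non-ASCII characters are written as Unicode escapes so the example does not depend on the source encoding.

    ```java
    public class UmlautReplacer {
        // Assumed replacement table; the original post's table is not shown.
        private static final String[][] UMLAUT_REPLACEMENTS = {
            {"\u00C4", "Ae"}, {"\u00D6", "Oe"}, {"\u00DC", "Ue"},
            {"\u00E4", "ae"}, {"\u00F6", "oe"}, {"\u00FC", "ue"},
            {"\u00DF", "ss"}  // sharp s, not an umlaut, included for convenience
        };

        // Hypothetical helper: applies every literal replacement in turn.
        static String replaceUmlauts(String input) {
            String result = input;
            for (int i = 0; i < UMLAUT_REPLACEMENTS.length; i++) {
                // String.replace matches literally, so no regex is involved
                result = result.replace(UMLAUT_REPLACEMENTS[i][0], UMLAUT_REPLACEMENTS[i][1]);
            }
            return result;
        }

        public static void main(String[] args) {
            // prints "Mueller, Oel, Strasse"
            System.out.println(replaceUmlauts("M\u00FCller, \u00D6l, Stra\u00DFe"));
        }
    }
    ```

    If this version works while the one with literal ä, ö, ü does not, the source file encoding is the likely cause.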
