Converting a Java String to ASCII

暗喜 2020-12-08 08:01

I need to convert Strings that consist of some letters specific to certain languages (like HÄSTDJUR - note Ä) to a String without those special le

2 Answers
  • 2020-12-08 08:45

    I think your question is the same as this one:

    Java - getting rid of accents and converting them to regular letters

    and hence the answer is also the same:

    Solution

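    // Requires: import java.text.Normalizer;
    // NFD splits each accented letter into a base letter plus combining marks;
    // the regex then removes every character outside the ASCII range.
    String convertedString =
           Normalizer
               .normalize(input, Normalizer.Form.NFD)
               .replaceAll("[^\\p{ASCII}]", "");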
    

    References

    See

    • JavaDoc: Normalizer.normalize(String, Normalizer.Form)
    • JavaDoc: Normalizer.Form.NFD
    • Sun Java Tutorial: Normalizer's API

    Example Code:

    final String input = "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ";
    System.out.println(
        Normalizer
            .normalize(input, Normalizer.Form.NFD)
            .replaceAll("[^\\p{ASCII}]", "")
    );
    

    Output:

    This is a funky String
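
One caveat with the approach above: it only helps for letters that Unicode decomposes into a base letter plus a combining mark. Letters with no canonical decomposition, such as ß or Ø, are removed outright instead of being replaced. Here is a minimal sketch that wraps the same two calls in a helper and shows both cases. The class name `AsciiFolder` and the method name `toAscii` are made up for illustration, and the string literals use Unicode escapes so the result does not depend on the source file encoding.

```java
import java.text.Normalizer;

// Hypothetical helper class, not part of the original answer.
final class AsciiFolder {
    private AsciiFolder() {}

    // Decomposes accented letters (NFD), then drops every non-ASCII character,
    // which includes the combining marks produced by the decomposition.
    static String toAscii(String input) {
        return Normalizer
                .normalize(input, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // U+00C4 (A with diaeresis) decomposes to 'A' + combining diaeresis.
        System.out.println(toAscii("H\u00C4STDJUR")); // prints HASTDJUR
        // U+00DF (sharp s) has no decomposition, so it is simply dropped.
        System.out.println(toAscii("Stra\u00DFe"));   // prints Strae
    }
}
```

If dropped letters like these matter for your input, you would need explicit replacements for them in addition to the normalization step.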

  • 2020-12-08 08:57

    I'd suggest a mapping from the special characters to the ones you want.

    Ä --> A
    é --> e
    A --> A (exactly the same)
    etc...
    

    And then you can just call your mapping over your text (in pseudocode):

    for letter in string:
       newString += map(letter)
    

    Effectively, you need to create a set of rules that says which character maps to which ASCII equivalent.
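
The pseudocode above could look roughly like this in Java. This is only a sketch under a few assumptions: the class name `AsciiMapper` and the tiny rule table are hypothetical, `Map.of` needs Java 9 or later, and dropping non-ASCII characters that have no rule is a choice made here, not something the answer specifies. Mapping to a `String` rather than a `char` lets one letter expand to several, for example ß to "ss".

```java
import java.util.Map;

// Hypothetical helper class illustrating the rule-table idea.
final class AsciiMapper {
    private AsciiMapper() {}

    // Deliberately tiny example table; extend it for the languages you need.
    private static final Map<Character, String> RULES = Map.of(
            '\u00C4', "A",   // A with diaeresis
            '\u00E9', "e",   // e with acute
            '\u00DF', "ss"   // sharp s expands to two letters
    );

    static String toAscii(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String mapped = RULES.get(c);
            if (mapped != null) {
                out.append(mapped);   // a rule exists: use its replacement
            } else if (c < 128) {
                out.append(c);        // already ASCII: keep exactly as is
            }
            // Assumption: non-ASCII characters without a rule are dropped.
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("H\u00C4STDJUR")); // prints HASTDJUR
        System.out.println(toAscii("Stra\u00DFe"));   // prints Strasse
    }
}
```

The trade-off is that you control every replacement, but you also have to maintain the table yourself.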
