How to remove high-ASCII characters like ®, ©, ™ from a string in Java

庸人自扰 2020-12-06 05:52

I want to detect and remove high-ASCII characters like ®, ©, ™ from a String in Java. Is there any open-source library that can do this?

4 Answers
  •  再見小時候
    2020-12-06 06:53

    I understand that you want to delete characters such as ç, ã, Ã, but for anyone who needs to convert them instead (ç, ã, Ã ---> c, a, A), have a look at this piece of code:

    Example Code:

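    import java.text.Normalizer; // place at the top of the file

    // NFD splits each accented letter into a base letter plus combining marks;
    // the regex then deletes everything that is not ASCII, including the marks.
    final String input = "Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ";
    System.out.println(
        Normalizer
            .normalize(input, Normalizer.Form.NFD)
            .replaceAll("[^\\p{ASCII}]", "")
    );
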

    Output:

    This is a funky String
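
    For the original question (deleting symbols such as ®, ©, ™ rather than converting accented letters), the regex alone is enough and no extra library is needed. The following is a minimal sketch, not a definitive implementation: the class name `NonAsciiStripper` and the method names `stripNonAscii` and `foldToAscii` are made up for illustration, and the sample string is an assumption. The non-ASCII characters are written as Unicode escapes so the result does not depend on the source file encoding.

    ```java
    import java.text.Normalizer;

    class NonAsciiStripper {

        // Deletes every character outside the 7-bit ASCII range (U+0000 to U+007F).
        static String stripNonAscii(String input) {
            return input.replaceAll("[^\\p{ASCII}]", "");
        }

        // Decomposes accented letters first (NFD), so the base letter survives
        // and only the combining marks and other non-ASCII characters are deleted.
        static String foldToAscii(String input) {
            return stripNonAscii(Normalizer.normalize(input, Normalizer.Form.NFD));
        }

        public static void main(String[] args) {
            // Sample text: "Acme(R) caf(e-acute) (C)2020 Widget(TM)"
            String input = "Acme\u00AE caf\u00E9 \u00A92020 Widget\u2122";
            System.out.println(stripNonAscii(input)); // prints "Acme caf 2020 Widget"
            System.out.println(foldToAscii(input));   // prints "Acme cafe 2020 Widget"
        }
    }
    ```

    With plain deletion the "é" disappears entirely, while the NFD variant keeps its base letter "e". The symbols ®, © and ™ are removed in both cases, because NFD does not break them into ASCII characters.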
