Java - removing strange characters from a String

轮回少年 2020-12-10 03:04

How do I remove strange and unwanted Unicode characters (such as a black diamond with question mark) from a String?

Updated:

Please tell me the Unicode chara

11 answers
  •  悲&欢浪女
    2020-12-10 03:13

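    Most probably the text you received was encoded in something other than UTF-8. The black diamond with a question mark is U+FFFD, the replacement character a lenient decoder substitutes for bytes that are not valid UTF-8. One option is to reject uploads in other encodings (for example Latin-1) by decoding strictly: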

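    // Needs java.io.*, java.nio.charset.*, and IOUtils from Apache Commons IO.
    CharsetDecoder charsetDecoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);

    // try-with-resources closes the file even when decoding fails.
    try (Reader reader = new InputStreamReader(new FileInputStream(filePath), charsetDecoder)) {
      return IOUtils.toString(reader);
    }
    catch (MalformedInputException e) {
      // Invalid UTF-8 bytes: reject the file instead of letting them become U+FFFD.
      throw new IOException("File was not saved with UTF-8 encoding: " + filePath, e);
    }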
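
    The same idea can be tried without a file or Commons IO. This is only a minimal sketch using the JDK alone; the class name StrictUtf8 and the helper names decodeStrict, isValidUtf8 and stripReplacementChars are made up for illustration. It shows strict decoding rejecting Latin-1 bytes, and, as a fallback for text that was already decoded leniently, stripping the U+FFFD characters the question describes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
final class StrictUtf8 {

    // Decodes the bytes as UTF-8 and throws instead of substituting U+FFFD.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Returns true only if the bytes form well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            decodeStrict(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Fallback for text that was already decoded leniently:
    // removes every U+FFFD replacement character.
    static String stripReplacementChars(String s) {
        return s.replace("\uFFFD", "");
    }

    public static void main(String[] args) {
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(isValidUtf8(utf8));
        System.out.println(isValidUtf8(latin1));

        // Lenient decoding turns the stray Latin-1 byte into U+FFFD.
        String lenient = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(stripReplacementChars(lenient));
    }
}
```

    Stripping U+FFFD only hides the symptom, since the original character is already lost at that point; rejecting or re-decoding the input with the correct charset keeps the data intact.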
    
