Remove non-ASCII non-printable characters from a String

轻奢々 2020-12-02 13:56

I receive user input that includes non-ASCII and non-printable characters, such as:

\\xc2d
\\xa0
\\xe7
\\xc3\\ufffdd
\\xc3\\ufffdd
\\xc2\\xa0
\\xc3\\xa7
         


        
6 Answers
  •  抹茶落季
    2020-12-02 14:37

    Input  => "This \u7279text \u7279is what I need"
    Output => "This text is what I need"

    If you are trying to remove literal Unicode escape sequences such as \u7279 from a string, as in the example above, this code will work:

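    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    // Matches a literal backslash, the letter 'u', and four hex digits (e.g. the text \u7279).
    Pattern unicodeCharsPattern = Pattern.compile("\\\\u(\\p{XDigit}{4})");
    Matcher unicodeMatcher = unicodeCharsPattern.matcher(data);
    // replaceAll returns the input unchanged when nothing matches, so no find() check is needed
    // and cleanData is never left null.
    String cleanData = unicodeMatcher.replaceAll("");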
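
The snippet above only removes escape sequences written out as text. For the case in the question, where the raw characters themselves are non-ASCII or non-printable, one possible approach is sketched below. It assumes it is acceptable to drop everything outside the printable ASCII range (0x20 to 0x7E), which also discards legitimate accented letters. The class name `AsciiCleaner`, the method `clean`, and the sample input are made up for illustration.

```java
import java.util.regex.Pattern;

public class AsciiCleaner {
    // Matches any single character outside printable ASCII (space through tilde).
    private static final Pattern NON_PRINTABLE_ASCII = Pattern.compile("[^\\x20-\\x7E]");

    // Returns the input with every non-ASCII or non-printable character removed.
    static String clean(String input) {
        return NON_PRINTABLE_ASCII.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical input: \u00a0 is a non-breaking space, \t is a control character.
        String raw = "user@example.com\u00a0\u00a0\t";
        System.out.println("[" + clean(raw) + "]");
    }
}
```

If accented letters should survive, a narrower pattern such as `\\p{Cntrl}` (control characters only) may be a better fit than the blanket ASCII filter.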
    
