Remove “empty” character from String

不思量自难忘°

I'm using a framework which returns malformed Strings with "empty" characters from time to time.

"foobar", for example, is represented by: [,f,o,o,b,a,r]

T

9 Answers

隐瞒了意图╮

2020-12-14 11:39
Regex would be an appropriate way to sanitize the string from unwanted Unicode characters in this case.
```
String sanitized = dirty.replaceAll("[\uFEFF-\uFFFF]", ""); 
```
This replaces every char in the \uFEFF-\uFFFF range with the empty string.

The [...] construct is called a character class: [aeiou] matches any one of the lowercase vowels, and [^aeiou] matches any character except those.

You can take one of these two approaches:
- replaceAll("[_blacklist]", "")
- replaceAll("[^_whitelist]", "")
References
- regular-expressions.info
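
The two approaches above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `StringSanitizer`, both method names, and the choice of printable ASCII as the whitelist are assumptions made for the example.

```java
class StringSanitizer {

    // Blacklist: remove every char inside the listed range (U+FEFF to U+FFFF).
    static String stripBlacklisted(String dirty) {
        return dirty.replaceAll("[\uFEFF-\uFFFF]", "");
    }

    // Whitelist: remove every char that is NOT printable ASCII (space to tilde).
    // The allowed set is an assumption; widen it to suit your data.
    static String keepWhitelisted(String dirty) {
        return dirty.replaceAll("[^\\x20-\\x7E]", "");
    }

    public static void main(String[] args) {
        String dirty = "\uFEFFfoobar";                          // 7 chars, first one invisible
        System.out.println(stripBlacklisted(dirty).length());  // prints 6
        System.out.println(keepWhitelisted(dirty));            // prints foobar
    }
}
```

The blacklist keeps everything you did not think of, while the whitelist drops everything you did not think of, so the whitelist is the stricter choice when the set of valid characters is known.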
暗喜

2020-12-14 11:41

Trimming left or right only removes whitespace. Is there a colon before the space?

Beyond that, something like a = (long) string[0]; will show you the character code, and you can then use replace() or substring to remove it.
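
In Java, the cast shown above would be written with `charAt` rather than array indexing. A minimal sketch for finding out which character is hiding in the string, where the class name `CharInspector` and the method `describe` are hypothetical names chosen for the example:

```java
class CharInspector {

    // Returns each char of the string as a "U+XXXX" code, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints U+FEFF U+0066 U+006F U+006F
        System.out.println(describe("\uFEFFfoo"));
    }
}
```

Once the code is known, the offending character can be removed with `String.replace` or skipped with `substring(1)` if it only ever appears at the start.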

不知归路

2020-12-14 11:42
Thank you, Johannes Rössel. It actually was '\uFEFF'.

The following code works:
```
// Copy every character except the byte order mark (U+FEFF).
final StringBuilder sb = new StringBuilder();
for (final char character : body.toCharArray()) {
    if (character != '\uFEFF') {
        sb.append(character);
    }
}
final String sanitizedString = sb.toString();
```
Does anyone know of a way to include only a range of valid characters, instead of excluding 95% of the Unicode range?
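
One possible answer to that question is a negated character class built from Unicode categories, so the valid set is named once and everything else is dropped. This is only a sketch: the class name `CategoryWhitelist`, the method `keepAllowed`, and the choice of allowed categories (letters, numbers, punctuation, ASCII whitespace) are assumptions, and note that this particular set also drops symbols such as `+`.

```java
class CategoryWhitelist {

    // Keep letters (\p{L}), numbers (\p{N}), punctuation (\p{P}) and ASCII
    // whitespace (\s); remove everything else. U+FEFF is a format character
    // (category Cf), so it falls outside the allowed set and is removed.
    static String keepAllowed(String dirty) {
        return dirty.replaceAll("[^\\p{L}\\p{N}\\p{P}\\s]", "");
    }

    public static void main(String[] args) {
        // prints foo bar, 42!
        System.out.println(keepAllowed("\uFEFFfoo bar, 42!"));
    }
}
```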