Question
I asked a question earlier but met harsh criticism, so here I pose it again, simplified and rephrased for those who may have been concerned about the way I asked it before.
BACKGROUND: I am parsing some HTML for information. On each line I have stripped away everything except the content I wish to grab and a bunch of spaces after it. To get rid of the spaces, I opted to use trim(), but I have been having trouble. The last few lines of my code are tests:
System.out.println("'" + someString + "'\n'" + someString.trim() + "'");
The results were:
'Sophomore '
'Sophomore '
I was worried I might have a problem with the way I was calling trim(), since we all make mistakes from time to time, so I tested it like this:
String s = " hello ";
System.out.println("'" + s + "'\n'" + s.trim() + "'");
The results were:
' hello '
'hello'
MY QUESTION: What am I doing wrong? What I want is to get 'Sophomore', not 'Sophomore '.
I look forward to your excellent answers (thanks in advance!).
Answer 1:
String.trim() specifically only removes characters before the first character whose code exceeds \u0020, and after the last such character.
This is insufficient to remove all possible white space characters - Unicode defines several more (with code points above \u0020) that will not be matched by .trim().
Perhaps your white space characters aren't the ones you think they are?
EDIT: Comments revealed that the extra characters were indeed "special" whitespace characters, specifically \u00a0, which is the Unicode "non-breaking space". To replace those with normal spaces, use:
str = str.replace('\u00a0', ' ');
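Putting the replacement together with trim(), a minimal sketch might look like the following. The class name NbspTrimDemo, the helper trimWithNbsp, and the sample input are hypothetical, chosen only to imitate the asker's situation; the real string comes from the parsed HTML.

```java
class NbspTrimDemo {
    // Hypothetical helper: turn each non-breaking space (U+00A0) into an
    // ordinary space, then let trim() strip the leading and trailing run.
    static String trimWithNbsp(String s) {
        return s.replace('\u00a0', ' ').trim();
    }

    public static void main(String[] args) {
        // Assumed input: the word followed by non-breaking spaces.
        String someString = "Sophomore\u00a0\u00a0\u00a0";

        // trim() alone leaves the non-breaking spaces in place,
        // because their code is above U+0020.
        System.out.println("'" + someString.trim() + "'");

        // After the replacement, trim() removes them.
        System.out.println("'" + trimWithNbsp(someString) + "'");
    }
}
```

Note that this also converts any non-breaking spaces in the middle of the string into ordinary spaces, which may or may not be what you want.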
Answer 2:
There must be a character in the source string that trim() does not treat as whitespace. Add the following to your code and see what it prints.
for (char ch : someString.toCharArray()) {
System.out.print(Integer.toHexString(ch) + " ");
}
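As a self-contained variant of the loop above, here is a rough sketch that builds the hex dump as a string so it can be inspected or logged. The class name HexDumpDemo, the helper toHexCodes, and the sample input are hypothetical.

```java
class HexDumpDemo {
    // Hypothetical helper: return the hex code of every char in s,
    // separated by single spaces.
    static String toHexCodes(String s) {
        StringBuilder sb = new StringBuilder();
        for (char ch : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(ch));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed input: two letters, a non-breaking space, an ordinary space.
        System.out.println(toHexCodes("So\u00a0 ")); // prints "53 6f a0 20"
    }
}
```

An ordinary space shows up as 20; any trailing value above 20, such as a0, is a character that trim() will leave alone.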
Source: https://stackoverflow.com/questions/12343765/query-about-the-trim-method-in-java