How many spaces will Java String.trim() remove?

帅比萌擦擦* 提交于 2019-12-02 16:26:51
LukeH

All of them.

Returns: A copy of this string with leading and trailing white space removed, or this string if it has no leading or trailing white space.

~ Quoted from Java 1.5.0 docs

(But why didn't you just try it and see for yourself?)

From the source code (decompiled) :

  public String trim()
  {
    int i = this.count;
    int j = 0;
    int k = this.offset;
    char[] arrayOfChar = this.value;
    while ((j < i) && (arrayOfChar[(k + j)] <= ' '))
      ++j;
    while ((j < i) && (arrayOfChar[(k + i - 1)] <= ' '))
      --i;
    return (((j > 0) || (i < this.count)) ? substring(j, i) : this);
  }

The two while that you can see mean all the characters whose unicode is below the space character's, at beginning and end, are removed.

dfa

When in doubt, write a unit test:

@Test
public void trimRemoveAllBlanks(){
    assertThat("    content   ".trim(), is("content"));
}

NB: of course the test (for JUnit + Hamcrest) doesn't fails

One thing to point out, though, is that String.trim has a peculiar definition of "whitespace". It does not remove Unicode whitespace, but also removes ASCII control characters that you may not consider whitespace.

This method may be used to trim whitespace from the beginning and end of a string; in fact, it trims all ASCII control characters as well.

If possible, you may want to use Commons Lang's StringUtils.strip(), which also handles Unicode whitespace (and is null-safe, too).

See API for String class:

Returns a copy of the string, with leading and trailing whitespace omitted.

Whitespace on both sides is removed:

Note that trim() does not change the String instance, it will return a new object:

 String original = "  content  ";
 String withoutWhitespace = original.trim();

 // original still refers to "  content  "
 // and withoutWhitespace refers to "content"
Britc

Based on the Java docs here, the .trim() replaces '\u0020' which is commonly known as whitespace.

But take note, the '\u00A0' (Unicode NO-BREAK SPACE &nbsp; ) is also seen as a whitespace, and .trim() will NOT remove this. This is especially common in HTML.

To remove it, I use :

tmpTrimStr = tmpTrimStr.replaceAll("\\u00A0", "");

An example of this problem was discussed here.

Unmesha SreeVeni

Example of Java trim() removing spaces:

public class Test
{
    public static void main(String[] args)
    {
        String str = "\n\t This is be trimmed.\n\n";

        String newStr = str.trim();     //removes newlines, tabs and spaces.

        System.out.println("old = " + str);
        System.out.println("new = " + newStr);
    }
}

OUTPUT

old = 
 This is a String.


new = This is a String.

From java docs(String class source),

/**
 * Returns a copy of the string, with leading and trailing whitespace
 * omitted.
 * <p>
 * If this <code>String</code> object represents an empty character
 * sequence, or the first and last characters of character sequence
 * represented by this <code>String</code> object both have codes
 * greater than <code>'&#92;u0020'</code> (the space character), then a
 * reference to this <code>String</code> object is returned.
 * <p>
 * Otherwise, if there is no character with a code greater than
 * <code>'&#92;u0020'</code> in the string, then a new
 * <code>String</code> object representing an empty string is created
 * and returned.
 * <p>
 * Otherwise, let <i>k</i> be the index of the first character in the
 * string whose code is greater than <code>'&#92;u0020'</code>, and let
 * <i>m</i> be the index of the last character in the string whose code
 * is greater than <code>'&#92;u0020'</code>. A new <code>String</code>
 * object is created, representing the substring of this string that
 * begins with the character at index <i>k</i> and ends with the
 * character at index <i>m</i>-that is, the result of
 * <code>this.substring(<i>k</i>,&nbsp;<i>m</i>+1)</code>.
 * <p>
 * This method may be used to trim whitespace (as defined above) from
 * the beginning and end of a string.
 *
 * @return  A copy of this string with leading and trailing white
 *          space removed, or this string if it has no leading or
 *          trailing white space.
 */
public String trim() {
int len = count;
int st = 0;
int off = offset;      /* avoid getfield opcode */
char[] val = value;    /* avoid getfield opcode */

while ((st < len) && (val[off + st] <= ' ')) {
    st++;
}
while ((st < len) && (val[off + len - 1] <= ' ')) {
    len--;
}
return ((st > 0) || (len < count)) ? substring(st, len) : this;
}

Note that after getting start and length it calls the substring method of String class.

trim() will remove all leading and trailing blanks. But be aware: Your string isn't changed. trim() will return a new string instance instead.

Farah Nazifa

If your String input is:

String a = "   abc   ";
System.out.println(a);

Yes, output will be, "abc"; But if your String input is:

String b = "    This  is  a  test  "
System.out.println(b);

Output will be This is a test So trim only removes spaces before your first character and after your last character in the string and ignores the inner spaces. This is a piece of my code that slightly optimizes the built in String trim method removing the inner spaces and removes spaces before and after your first and last character in the string. Hope it helps.

public static String trim(char [] input){
    char [] output = new char [input.length];
    int j=0;
    int jj=0;
    if(input[0] == ' ' )    {
        while(input[jj] == ' ') 
            jj++;       
    }
    for(int i=jj; i<input.length; i++){
      if(input[i] !=' ' || ( i==(input.length-1) && input[input.length-1] == ' ')){
        output[j]=input[i];
        j++;
      }
      else if (input[i+1]!=' '){
        output[j]=' ';
        j++;
      }      
    }
    char [] m = new char [j];
    int a=0;
    for(int i=0; i<m.length; i++){
      m[i]=output[a];
      a++;
    }
    return new String (m);
  }

It will remove all spaces on both the sides.

rciafardone

One very important thing is that a string made entirely of "white spaces" will return a empty string.

if a string sSomething = "xxxxx", where x stand for white spaces, sSomething.trim() will return an empty string.

if a string sSomething = "xxAxx", where x stand for white spaces, sSomething.trim() will return A.

if sSomething ="xxSomethingxxxxAndSomethingxElsexxx", sSomething.trim() will return SomethingxxxxAndSomethingxElse, notice that the number of x between words is not altered.

If you want a neat packeted string combine trim() with regex as shown in this post: How to remove duplicate white spaces in string using Java?.

Order is meaningless for the result but trim() first would be more efficient. Hope it helps.

Angelos P

To keep only one instance for the String, you could use the following.

str = "  Hello   ";

or

str = str.trim();

Then the value of the str String, will be str = "Hello"

Trim() works for both sides.

Javadoc for String has all the details. Removes white space (space, tabs, etc ) from both end and returns a new string.

If you want to check what will do some method, you can use BeanShell. It is a scripting language designed to be as close to Java as possible. Generally speaking it is interpreted Java with some relaxations. Another option of this kind is Groovy language. Both these scripting languages provide convenient Read-Eval-Print loop know from interpreted languages. So you can run console and just type:

"     content     ".trim();

You'll see "content" as a result after pressing Enter (or Ctrl+R in Groovy console).

MIke
String formattedStr=unformattedStr;
formattedStr=formattedStr.trim().replaceAll("\\s+", " ");
标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!