How to trim the whitespace from a string? [duplicate]

Submitted by China☆狼群 on 2019-11-26 20:53:43

The simple way to trim leading and trailing whitespace is to call String.trim(). If you want to trim only leading and trailing spaces (rather than all leading and trailing whitespace), Apache Commons Lang has a method called StringUtils.strip(String, String) that can do this; call it with " " as the second argument.
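
As a quick illustration of what String.trim() removes, here is a minimal sketch that uses only the JDK. The class name TrimDemo, the helper bracket, and the sample string are made up for this example. StringUtils.strip is deliberately left out so the snippet runs without Commons Lang on the classpath.

```java
final class TrimDemo {
    // Hypothetical helper: wraps the trimmed string in brackets so the
    // boundaries are visible when printed.
    static String bracket(String s) {
        // trim() drops every leading and trailing char <= U+0020
        // (space, tab, newline and other control characters).
        return "[" + s.trim() + "]";
    }

    public static void main(String[] args) {
        // Interior whitespace is left untouched.
        System.out.println(bracket(" \t hello world \n "));
    }
}
```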

Your attempted code has a number of bugs, and is fundamentally inefficient. If you really want to implement this yourself, then you should:

  1. count the leading space characters
  2. count the trailing space characters
  3. if either count is non-zero, call String.substring(from, end) to create a new string containing the characters you want to keep.
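
The three steps above can be sketched roughly as follows. This is only an illustration under the assumption that "space" means the single character ' ' (U+0020). The class name SpaceTrimmer and the method name trimSpaces are hypothetical.

```java
final class SpaceTrimmer {
    // Hypothetical helper: trims only ' ' (U+0020), not tabs or newlines.
    static String trimSpaces(String s) {
        int start = 0;
        int end = s.length();
        // Step 1: count the leading spaces.
        while (start < end && s.charAt(start) == ' ') {
            start++;
        }
        // Step 2: count the trailing spaces, never moving past start.
        while (end > start && s.charAt(end - 1) == ' ') {
            end--;
        }
        // Step 3: only build a new string if something was trimmed.
        if (start == 0 && end == s.length()) {
            return s;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimSpaces("   abcd ght trip   ") + "]");
    }
}
```

No intermediate arrays are allocated. The only possible copy happens inside substring.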

This approach avoids copying any characters [1].

[1] Actually, that depends on the implementation of String. In some implementations there is no copying; in others a single copy is made. Either way, it is an improvement on your approach, which entails a minimum of 2 copies, and more if there are any characters to trim.

String.trim() is very old; it goes back to at least Java 1.3. Don't you have it?

Apache StringUtils.strip is the best answer here, because it works with all expected whitespace characters (not just space). It is part of the Apache Commons Lang library.

Here's the relevant code, taken from the StringUtils source, in case you want to implement it in your own class. But really, just download and use StringUtils to get more bang for your buck! Note that you can also use StringUtils.stripStart to strip any set of leading characters from a Java string.

public static final int INDEX_NOT_FOUND = -1;

public static String strip(final String str) {
    return strip(str, null);
}

public static String stripStart(final String str, final String stripChars) {
    int strLen;
    if (str == null || (strLen = str.length()) == 0) {
        return str;
    }
    int start = 0;
    if (stripChars == null) {
        while (start != strLen && Character.isWhitespace(str.charAt(start))) {
            start++;
        }
    } else if (stripChars.isEmpty()) {
        return str;
    } else {
        while (start != strLen && stripChars.indexOf(str.charAt(start)) != INDEX_NOT_FOUND) {
            start++;
        }
    }
    return str.substring(start);
}

public static String stripEnd(final String str, final String stripChars) {
    int end;
    if (str == null || (end = str.length()) == 0) {
        return str;
    }

    if (stripChars == null) {
        while (end != 0 && Character.isWhitespace(str.charAt(end - 1))) {
            end--;
        }
    } else if (stripChars.isEmpty()) {
        return str;
    } else {
        while (end != 0 && stripChars.indexOf(str.charAt(end - 1)) != INDEX_NOT_FOUND) {
            end--;
        }
    }
    return str.substring(0, end);
}

public static String strip(String str, final String stripChars) {
    // Inlined null/empty check (the library code calls StringUtils.isEmpty here)
    if (str == null || str.isEmpty()) {
        return str;
    }
    str = stripStart(str, stripChars);
    return stripEnd(str, stripChars);
}

First of all, what others said about String.trim(). Really, don't reinvent the wheel.

But for the record, what's going wrong with your code is that Java arrays aren't resizeable. When you initially set up your target array, you create it as a size 0 array. You then tell System.arraycopy to stuff len - 1 characters in there. That's not going to work. If you wanted it to work, you'd need to set up the array as:

char[] newChars = new char[len - 1];

But that's amazingly inefficient, reallocating and copying a new array each time through the loop. Use the three steps that Stephen C mentioned, ending with a substring.
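
To see the fixed-size behaviour in isolation, here is a small sketch. The class name ArrayCopyDemo and the helper copyFails are invented for illustration and are not the asker's code. It only shows that System.arraycopy rejects a destination that is too small instead of growing it.

```java
final class ArrayCopyDemo {
    // Hypothetical helper: returns true if copying `count` chars into a
    // destination array of size `destSize` is rejected by System.arraycopy.
    static boolean copyFails(int destSize, int count) {
        char[] src = "hello".toCharArray();
        char[] dest = new char[destSize]; // the size is fixed at creation
        try {
            System.arraycopy(src, 0, dest, 0, count);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Thrown when destPos + length exceeds dest.length.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(copyFails(0, 4)); // zero-length destination
        System.out.println(copyFails(4, 4)); // destination sized up front
    }
}
```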

Since JDK 11 you can use the String.strip API, which returns a string whose value is this string, with all leading and trailing whitespace removed. Its Javadoc reads:

/**
 * Returns a string whose value is this string, with all leading
 * and trailing {@link Character#isWhitespace(int) white space}
 * removed.
 * <p>
 * If this {@code String} object represents an empty string,
 * or if all code points in this string are
 * {@link Character#isWhitespace(int) white space}, then an empty string
 * is returned.
 * <p>
 * Otherwise, returns a substring of this string beginning with the first
 * code point that is not a {@link Character#isWhitespace(int) white space}
 * up to and including the last code point that is not a
 * {@link Character#isWhitespace(int) white space}.
 * <p>
 * This method may be used to strip
 * {@link Character#isWhitespace(int) white space} from
 * the beginning and end of a string.
 *
 * @return  a string whose value is this string, with all leading
 *          and trailing white space removed
 *
 * @see Character#isWhitespace(int)
 *
 * @since 11
 */
public String strip()

Some sample cases:

System.out.println("".strip());
System.out.println("  both  ".strip());
System.out.println("  leading".strip());
System.out.println("trailing  ".strip());
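
One practical difference from trim() is worth a sketch. strip() relies on Character.isWhitespace, so it also removes Unicode whitespace such as EM SPACE (U+2003), while trim() only removes characters up to U+0020. The class name StripDemo and the helper stripDiffersFromTrim are made up for this example, which assumes Java 11 or later.

```java
final class StripDemo {
    // Hypothetical helper: true when strip() and trim() disagree on s.
    static boolean stripDiffersFromTrim(String s) {
        return !s.strip().equals(s.trim());
    }

    public static void main(String[] args) {
        String emSpaced = "\u2003both\u2003"; // padded with EM SPACE, not ' '
        System.out.println("[" + emSpaced.strip() + "]");
        System.out.println(stripDiffersFromTrim(emSpaced));
        System.out.println(stripDiffersFromTrim("  both  "));
        // One-sided variants, also added in Java 11:
        System.out.println("[" + "  leading".stripLeading() + "]");
        System.out.println("[" + "trailing  ".stripTrailing() + "]");
    }
}
```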

If you don't want to use the String.trim() method, it can be implemented as shown below. The logic handles spaces, tabs, and other control characters (any char up to and including ' ').

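public static String trim(String str){
    int i = 0;
    int j = str.length();
    char[] charArray = str.toCharArray();
    // skip leading characters <= ' ' (space, tab, newline, other control characters)
    while ((i < j) && charArray[i] <= ' ') {
        i++;
    }
    // skip trailing characters <= ' '; j ends up as an exclusive end index
    while ((i < j) && charArray[j - 1] <= ' ') {
        j--;
    }
    // substring's end index is exclusive, so pass j rather than j + 1
    return str.substring(i, j);
}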

public static void main(String[] args) {
    System.out.println(trim("    abcd ght trip              "));

}

The destination array newChars is not large enough to hold the values copied. You need to initialize it to the length of the data you intend to copy (so, length - 1).

You can use Guava CharMatcher.

String outputString = CharMatcher.whitespace().trimFrom(inputString);

Note: This works because all whitespace characters are in the Basic Multilingual Plane (BMP), so matching on individual char values is sufficient.
