Faster alternatives to replace method in a Java String?

瘦欲@ 提交于 2020-01-09 18:22:50

问题


The fact that the replace method returns a string object rather than replacing the contents of a given string is a little obtuse (but understandable when you know that strings are immutable in Java). I am taking a major performance hit by using a deeply nested replace in some code. Is there something I can replace it with that would make it faster?


回答1:


This is what StringBuilder is meant for. If you're going to be doing a lot of manipulation, do it on a StringBuilder, then turn that into a String whenever you need to.

StringBuilder is described thus:

"A mutable sequence of characters. This class provides an API compatible with StringBuffer, but with no guarantee of synchronization".

It has replace (and append, insert, delete, et al) and you can use toString to morph it into a real String.




回答2:


The previous posts are right, StringBuilder/StringBuffer are a solution.

But, you also have to question if it is a good idea to do the replace on big Strings in memory.

I often have String manipulations that are implemented as a stream, so instead of replacing it in the string and then sending it to an OutputStream, I do the replace at the moment that I send the String to the outputstream. That works much faster than any replace.

This works much faster if you want this replace to implement a template mechanism. Streaming is always faster since you consume less memory and if the clients is slow, you only need to generate at a slow pace - so it scales much better.




回答3:


The follow code is approx. 30 times faster if there is no match and 5 times faster if there is a match.

static String fastReplace( String str, String target, String replacement ) {
    int targetLength = target.length();
    if( targetLength == 0 ) {
        return str;
    }
    int idx2 = str.indexOf( target );
    if( idx2 < 0 ) {
        return str;
    }
    StringBuilder buffer = new StringBuilder( targetLength > replacement.length() ? str.length() : str.length() * 2 );
    int idx1 = 0;
    do {
        buffer.append( str, idx1, idx2 );
        buffer.append( replacement );
        idx1 = idx2 + targetLength;
        idx2 = str.indexOf( target, idx1 );
    } while( idx2 > 0 );
    buffer.append( str, idx1, str.length() );
    return buffer.toString();
}



回答4:


I agree with the above. Use StringBuffer for thread-safety and StringBuilder when working with single threads.




回答5:


Adding to the @paxdiablo answer, here's a sample implementation of a replaceAll using StringBuffers that is a ~3.7 times faster than String.replaceAll():

Code:

public static String replaceAll(final String str, final String searchChars, String replaceChars)
{
  if ("".equals(str) || "".equals(searchChars) || searchChars.equals(replaceChars))
  {
    return str;
  }
  if (replaceChars == null)
  {
    replaceChars = "";
  }
  final int strLength = str.length();
  final int searchCharsLength = searchChars.length();
  StringBuilder buf = new StringBuilder(str);
  boolean modified = false;
  for (int i = 0; i < strLength; i++)
  {
    int start = buf.indexOf(searchChars, i);

    if (start == -1)
    {
      if (i == 0)
      {
        return str;
      }
      return buf.toString();
    }
    buf = buf.replace(start, start + searchCharsLength, replaceChars);
    modified = true;

  }
  if (!modified)
  {
    return str;
  }
  else
  {
    return buf.toString();
  }
}

Test Case -- the output is the following (Delta1 = 1917009502; Delta2 =7241000026):

@Test
public void testReplaceAll() 
{
  String origStr = "1234567890-1234567890-";

  String replacement1 =  StringReplacer.replaceAll(origStr, "0", "a");
  String expectedRep1 = "123456789a-123456789a-";

  String replacement2 =  StringReplacer.replaceAll(origStr, "0", "ab");
  String expectedRep2 = "123456789ab-123456789ab-";

  String replacement3 =  StringReplacer.replaceAll(origStr, "0", "");
  String expectedRep3 = "123456789-123456789-";


  String replacement4 =  StringReplacer.replaceAll(origStr, "012", "a");
  String expectedRep4 = "1234567890-1234567890-";

  String replacement5 =  StringReplacer.replaceAll(origStr, "123", "ab");
  String expectedRep5 = "ab4567890-ab4567890-";

  String replacement6 =  StringReplacer.replaceAll(origStr, "123", "abc");
  String expectedRep6 = "abc4567890-abc4567890-";

  String replacement7 =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  String expectedRep7 = "abcdd4567890-abcdd4567890-";

  String replacement8 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep8 = "4567890-4567890-";

  String replacement9 =  StringReplacer.replaceAll(origStr, "123", "");
  String expectedRep9 = "4567890-4567890-";

  assertEquals(replacement1, expectedRep1);
  assertEquals(replacement2, expectedRep2);
  assertEquals(replacement3, expectedRep3);
  assertEquals(replacement4, expectedRep4);
  assertEquals(replacement5, expectedRep5);
  assertEquals(replacement6, expectedRep6);
  assertEquals(replacement7, expectedRep7);
  assertEquals(replacement8, expectedRep8);
  assertEquals(replacement9, expectedRep9);

  long start1 = System.nanoTime();
  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  StringReplacer.replaceAll(origStr, "123", "abcdd");
  }
  long delta1 = System.nanoTime() -start1;

  long start2= System.nanoTime();

  for (long i = 0; i < 10000000L; i++)
  {
    String rep =  origStr.replaceAll( "123", "abcdd");
  }

  long delta2 = System.nanoTime() -start1;

  assertTrue(delta1 < delta2);

  System.out.printf("Delta1 = %d; Delta2 =%d", delta1, delta2);


}



回答6:


If you have a number of strings to replace (such as XML escape sequences), especially where the replacements are different length from the pattern, FSM lexer type algorithm seems like it might be most efficient, similar to the suggestion of processing in a stream fashion, where the output is incrementally built.

Perhaps a Matcher object could be used to do that efficiently.




回答7:


Just get the char[] of the String and iterate through it. Use a temporary StringBuilder.

Look for the pattern you want to replace while iterating if you don't find the pattern, write the stuff you scanned to the StringBuilder, else write the replacement text to the StringBuilder.




回答8:


All string manipulation in general are very slow. Consider to use StringBuffer, it's not exactly like the String class, but have a lot in common and it's mutable as well.




回答9:


When you're replacing single characters, consider iterating over your character array but replace characters by using a (pre-created) HashMap<Character, Character>().

I use this strategy to convert an integer exponent string by unicode superscript characters.

It's about twice as fast compared to String.replace(char, char). Note that the time associated to creating the hash map isn't included in this comparison.




回答10:


Because String.replace(CharSequence target, CharSequence replacement) has Pattern.compile, matcher, replaceAll inside, one can slightly optimize it by for using precompiled target pattern constant, like this:

private static final Pattern COMMA_REGEX = Pattern.compile(",");
...
COMMA_REGEX.matcher(value).replaceAll(replacement);


来源:https://stackoverflow.com/questions/1010928/faster-alternatives-to-replace-method-in-a-java-string

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!