charAt() or substring? Which is faster?

天命终不由人 2020-12-15 10:07

I want to go through each character in a String and pass each character of the String as a String to another function.

String s = "abcdefg";
for(int i = 0         


        
6 Answers
  • 2020-12-15 10:40

    As usual: it doesn't matter. But if you insist on spending time on micro-optimization, or if you really want to optimize for your very special use case, try this:

    import org.junit.Assert;
    import org.junit.Test;
    
    public class StringCharTest {
    
        // Times:
        // 1. Initialization of "s" outside the loop
        // 2. Init of "s" inside the loop
        // 3. newFunction() actually checks the string length,
        // so the function will not be optimized away by the HotSpot compiler
    
        @Test
        // Fastest: 237ms / 562ms / 2434ms
        public void testCacheStrings() throws Exception {
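            // Cache a one-character String for every possible char value
            // (the array needs MAX_VALUE + 1 slots so that '\uffff' is included)
            String[] char2string = new String[Character.MAX_VALUE + 1];
            for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
                char2string[i] = Character.toString((char) i);
            }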
    
            for (int x = 0; x < 10000000; x++) {
                char[] s = "abcdefg".toCharArray();
                for (int i = 0; i < s.length; i++) {
                    newFunction(char2string[s[i]]);
                }
            }
        }
    
        @Test
        // Fast: 1687ms / 1725ms / 3382ms
        public void testCharToString() throws Exception {
            for (int x = 0; x < 10000000; x++) {
                String s = "abcdefg";
                for (int i = 0; i < s.length(); i++) {
                    // Fast: Creates new String objects, but does not copy an array
                    newFunction(Character.toString(s.charAt(i)));
                }
            }
        }
    
        @Test
        // Very fast: 1331 ms/ 1414ms / 3190ms
        public void testSubstring() throws Exception {
            for (int x = 0; x < 10000000; x++) {
                String s = "abcdefg";
                for (int i = 0; i < s.length(); i++) {
                    // Fast on older JVMs: before Java 7u6, substring() shared the
                    // internal char array; later versions copy the characters
                    newFunction(s.substring(i, i + 1));
                }
            }
        }
    
        @Test
        // Slowest: 2525ms / 2961ms / 4703ms
        public void testNewString() throws Exception {
            char[] value = new char[1];
            for (int x = 0; x < 10000000; x++) {
                char[] s = "abcdefg".toCharArray();
                for (int i = 0; i < s.length; i++) {
                    value[0] = s[i];
                    // Slow! Copies the array
                    newFunction(new String(value));
                }
            }
        }
    
        private void newFunction(String string) {
            // Do something with the one-character string
            Assert.assertEquals(1, string.length());
        }
    
    }
    
  • 2020-12-15 10:42

    The answer is: it doesn't matter.

    Profile your code. Is this your bottleneck?

  • 2020-12-15 10:43

    Does newFunction really need to take a String? It would be better if you could make newFunction take a char and call it like this:

    newFunction(s.charAt(i));
    

    That way, you avoid creating a temporary String object.

    To answer your question: it's hard to say which one is more efficient. In both examples, a String object containing a single character has to be created. Which is more efficient depends on how exactly String.substring(...) and Character.toString(...) are implemented in your particular Java implementation. The only way to find out is to run your program through a profiler and see which version uses more CPU and/or memory. Normally, you shouldn't worry about micro-optimizations like this; only spend time on them once you've discovered that they are the cause of a performance and/or memory problem.

  • 2020-12-15 10:43

    LeetCode seems to prefer the substring option here.

    This is how I solved the strStr problem:

    class Solution {
        public int strStr(String haystack, String needle) {
            if (needle.length() == 0) {
                return 0;
            }
    
            if (haystack.length() == 0) {
                return -1;
            }
    
            for (int i = 0; i <= haystack.length() - needle.length(); i++) {
                int count = 0;
                for (int j = 0; j < needle.length(); j++) {
                    if (haystack.charAt(i + j) == needle.charAt(j)) {
                        count++;
                    }
                }
                if (count == needle.length()) {
                    return i;
                }
            }
            return -1;
        }
    }

    And this is the optimal solution they give:

    class Solution {
        public int strStr(String haystack, String needle) {
            int length;
            int n = needle.length();
            int h = haystack.length();
            if (n == 0) {
                return 0;
            }
            // if (n == h)
            //     length = h;
            // else
            length = h - n;
            if (h == n && haystack.charAt(0) != needle.charAt(0)) {
                return -1;
            }
            for (int i = 0; i <= length; i++) {
                if (haystack.substring(i, i + needle.length()).equals(needle)) {
                    return i;
                }
            }
            return -1;
        }
    }

    Honestly, I can't figure out why it would matter.
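
    One way to see that the two styles are interchangeable in terms of results is to check both against String.indexOf. The following is only a sketch: StrStrCheck and its method names are hypothetical, and the charAt variant exits early on the first mismatch instead of counting matches as in the solution above.

```java
// Hypothetical helper class for illustration; not part of either solution above.
final class StrStrCheck {

    // charAt-based scan: compares character by character and stops
    // at the first mismatch, so no temporary String is created.
    static int strStrCharAt(String haystack, String needle) {
        int n = needle.length();
        for (int i = 0; i <= haystack.length() - n; i++) {
            int j = 0;
            while (j < n && haystack.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == n) {
                return i;
            }
        }
        return -1;
    }

    // substring-based scan: creates one temporary String per candidate
    // position and compares it with equals().
    static int strStrSubstring(String haystack, String needle) {
        int n = needle.length();
        for (int i = 0; i <= haystack.length() - n; i++) {
            if (haystack.substring(i, i + n).equals(needle)) {
                return i;
            }
        }
        return -1;
    }
}
```

    For example, both methods return 2 for ("hello", "ll"), the same as "hello".indexOf("ll"). Which one runs faster would still have to be measured on your own JVM and data.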

  • 2020-12-15 10:44

    Of the two snippets you've posted, I wouldn't want to say which is faster. I'd agree with Will that it is almost certainly irrelevant to the overall performance of your code. If it's not, you can just make the change and determine for yourself which is faster for your data, with your JVM, on your hardware.

    That said, it's likely that the second snippet would be better if you converted the String into a char array first and then iterated over the array. Doing it this way incurs the String overhead only once (when converting to the array) instead of on every call. Additionally, you could then pass the array directly to the String constructor with some indices, which is more efficient than taking a char out of an array to pass it individually (which then gets turned into a one-character array):

    String s = "abcdefg";
    char[] chars = s.toCharArray();
    for(int i = 0; i < chars.length; i++) {
        newFunction(String.valueOf(chars, i, 1));
    }
    

    But to reinforce my first point, when you look at what you're actually avoiding on each call of String.charAt() - it's two bounds checks, a (lazy) boolean OR, and an addition. This is not going to make any noticeable difference. Neither is the difference in the String constructors.
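
    To make that cost concrete, here is a simplified model of such a charAt(). It is a sketch only, assuming the older String layout (a shared char[] plus offset and count fields) that this description appears to refer to. SimpleString is a hypothetical class, not JDK source, and newer JDKs store strings differently.

```java
// Hypothetical, simplified model of an older String layout.
// Not the real java.lang.String implementation.
final class SimpleString {
    private final char[] value; // backing array, possibly shared
    private final int offset;   // index of the first character in value
    private final int count;    // number of characters in this string

    SimpleString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    char charAt(int index) {
        // Two bounds checks joined by a short-circuit OR ...
        if (index < 0 || index >= count) {
            throw new StringIndexOutOfBoundsException(index);
        }
        // ... and one addition to locate the character in the backing array.
        return value[offset + index];
    }
}
```

    Work of this size is tiny compared with whatever newFunction does with the result, which is why it is unlikely to show up in a profile.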

    Essentially, both idioms are fine in terms of performance (neither is immediately obviously inefficient) so you should not spend any more time working on them unless a profiler shows that this takes up a large amount of your application's runtime. And even then you could almost certainly get more performance gains by restructuring your supporting code in this area (e.g. have newFunction take the whole string itself); java.lang.String is pretty well optimised by this point.

  • 2020-12-15 10:51

    I would first obtain a char[] from the source String using String.toCharArray() (which returns a copy of the characters) and then proceed to call newFunction.

    But I do agree with Jesper that it would be best if you could just deal with characters and avoid all the String functions...
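
    A minimal sketch of that approach, assuming newFunction can be changed to accept a char. CharArrayWalk, the sink parameter and the upper-casing body are hypothetical stand-ins, used here only so the example has something observable to do.

```java
// Hypothetical example class; the names are not from the original question.
final class CharArrayWalk {

    // Stand-in for the question's newFunction, taking a primitive char
    // so that no one-character String is allocated per call.
    static void newFunction(char c, StringBuilder sink) {
        sink.append(Character.toUpperCase(c));
    }

    static String walk(String s) {
        StringBuilder sink = new StringBuilder(s.length());
        // toCharArray() copies the characters once, up front.
        for (char c : s.toCharArray()) {
            newFunction(c, sink);
        }
        return sink.toString();
    }
}
```

    Calling CharArrayWalk.walk("abcdefg") returns "ABCDEFG". Whether the single array copy is cheaper than repeated charAt() calls is, again, something to measure rather than assume.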
