Python equivalent of Java StringBuffer?

半阙折子戏 2020-12-04 21:10

Is there anything in Python like Java's StringBuffer? Since strings are immutable in Python too, editing them in loops would be inefficient.

8 answers
  •  暗喜 (original poster)
    2020-12-04 21:15

    "Efficient String Concatenation in Python" is a rather old article, and its main claim, that naive concatenation is far slower than joining, is no longer valid, because CPython has optimized this case since then:

    CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations. @ http://docs.python.org/2/library/stdtypes.html
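
    As a minimal sketch of the pattern the quoted note recommends (collect the pieces in a list, then call str.join() once), something like the following could be used. The function name build_csv_line and the sample input are hypothetical and chosen only for illustration.

```python
def build_csv_line(values):
    """Join the string form of each value with commas."""
    pieces = []  # the list plays the role of the mutable buffer
    for value in values:
        pieces.append(str(value))  # appending to a list is amortized O(1)
    # One join at the end copies each piece once into the final string.
    return ",".join(pieces)


line = build_csv_line(range(5))
print(line)
```

    The list is only a staging area; the final string is created once, which is why the cost stays linear regardless of the interpreter.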

    I've adapted their code a bit and got the following results on my machine:

    from cStringIO import StringIO
    from UserString import MutableString
    from array import array
    
    import sys, timeit
    
    def method1():
        out_str = ''
        for num in xrange(loop_count):
            out_str += `num`
        return out_str
    
    def method2():
        out_str = MutableString()
        for num in xrange(loop_count):
            out_str += `num`
        return out_str
    
    def method3():
        char_array = array('c')
        for num in xrange(loop_count):
            char_array.fromstring(`num`)
        return char_array.tostring()
    
    def method4():
        str_list = []
        for num in xrange(loop_count):
            str_list.append(`num`)
        out_str = ''.join(str_list)
        return out_str
    
    def method5():
        file_str = StringIO()
        for num in xrange(loop_count):
            file_str.write(`num`)
        out_str = file_str.getvalue()
        return out_str
    
    def method6():
        out_str = ''.join([`num` for num in xrange(loop_count)])
        return out_str
    
    def method7():
        out_str = ''.join(`num` for num in xrange(loop_count))
        return out_str
    
    
    loop_count = 80000
    
    print sys.version
    
    print 'method1=', timeit.timeit(method1, number=10)
    print 'method2=', timeit.timeit(method2, number=10)
    print 'method3=', timeit.timeit(method3, number=10)
    print 'method4=', timeit.timeit(method4, number=10)
    print 'method5=', timeit.timeit(method5, number=10)
    print 'method6=', timeit.timeit(method6, number=10)
    print 'method7=', timeit.timeit(method7, number=10)
    

    Results:

    2.7.1 (r271:86832, Jul 31 2011, 19:30:53) 
    [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)]
    method1= 0.171155929565
    method2= 16.7158739567
    method3= 0.420584917068
    method4= 0.231794118881
    method5= 0.323612928391
    method6= 0.120429992676
    method7= 0.145267963409
    

    Conclusions:

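    • join still wins over concat, but only marginally, and only when the input is built by a comprehension (method6, method7); the append-then-join loop (method4) was slower than plain concatenation (method1)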
    • list comprehensions are faster than loops (when building a list)
    • joining generators is slower than joining lists
    • other methods are of no use (unless you're doing something special)

    The same benchmark for Python 3:

    import sys
    import timeit
    from io import StringIO
    from array import array
    
    
    def test_concat():
        out_str = ''
        for _ in range(loop_count):
            out_str += 'abc'
        return out_str
    
    
    def test_join_list_loop():
        str_list = []
        for _ in range(loop_count):
            str_list.append('abc')
        return ''.join(str_list)
    
    
    def test_array():
        char_array = array('b')
        for _ in range(loop_count):
            char_array.frombytes(b'abc')
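        # tobytes() replaces tostring(), which was removed in Python 3.9;
        # decode() returns the text itself rather than the repr of the bytes.
        return char_array.tobytes().decode('ascii')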
    
    
    def test_string_io():
        file_str = StringIO()
        for _ in range(loop_count):
            file_str.write('abc')
        return file_str.getvalue()
    
    
    def test_join_list_compr():
        return ''.join(['abc' for _ in range(loop_count)])
    
    
    def test_join_gen_compr():
        return ''.join('abc' for _ in range(loop_count))
    
    
    loop_count = 80000
    
    print(sys.version)
    
    res = {}
    
    for k, v in dict(globals()).items():
        if k.startswith('test_'):
            res[k] = timeit.timeit(v, number=10)
    
    for k, v in sorted(res.items(), key=lambda x: x[1]):
        print('{:.5f} {}'.format(v, k))
    

    Results:

    3.7.5 (default, Nov  1 2019, 02:16:32) 
    [Clang 11.0.0 (clang-1100.0.33.8)]
    0.03738 test_join_list_compr
    0.05681 test_join_gen_compr
    0.09425 test_string_io
    0.09636 test_join_list_loop
    0.11976 test_concat
    0.19267 test_array
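
    For code that wants an interface closer to Java's StringBuffer, io.StringIO can be wrapped in a small class. This is only a sketch: the StringBuilder class and its append method are hypothetical names, not part of the standard library, and the timings above suggest that a plain list with ''.join() is usually at least as fast.

```python
from io import StringIO


class StringBuilder:
    """Hypothetical helper that mimics a small part of Java's StringBuffer."""

    def __init__(self):
        self._buffer = StringIO()  # in-memory text buffer from the stdlib

    def append(self, value):
        # Convert to str first so numbers and other objects can be appended.
        self._buffer.write(str(value))
        return self  # returning self allows chained calls, as in Java

    def __str__(self):
        return self._buffer.getvalue()


sb = StringBuilder()
sb.append("abc").append(42).append("!")
result = str(sb)
print(result)
```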
    
