Avoiding unnecessary slice copying in Python

随声附和 posted on 2019-12-12 10:39:30

Question


Is there a common idiom for avoiding pointless slice copying for cases like this:

>>> a = bytearray(b'hello')
>>> b = bytearray(b'goodbye, cruel world.')
>>> a.extend(b[14:20])
>>> a
bytearray(b'hello world')

It seems to me that an unnecessary copy happens when the b[14:20] slice is created. Rather than create a new object in memory just to hand it to extend, I want to say "use only this range of the current object".

Some methods will help you out with slice parameters, for example count:

>>> a = bytearray(1000000)       # a million zero bytes
>>> a[0:900000].count(b'\x00')   # expensive temporary slice
900000
>>> a.count(b'\x00', 0, 900000)  # helpful start and end parameters
900000

but many, like extend in my first example, don't have this feature.
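For reference, a short sketch of a few other bytearray methods that accept the same optional start and end arguments as count (find, startswith and endswith are standard methods; the variable names are only for illustration):

```python
data = bytearray(b'goodbye, cruel world.')

# Each call searches a range of data in place, without a temporary slice.
o_count = data.count(b'o', 0, 8)             # 'o' occurrences within 'goodbye,'
world_at = data.find(b'world', 9)            # search starts at index 9
is_cruel = data.startswith(b'cruel', 9)      # prefix test at an offset
ends_ok = data.endswith(b'world', 0, 20)     # suffix test on data[0:20]

print(o_count, world_at, is_cruel, ends_ok)  # 2 15 True True
```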

I realise that for many applications what I'm talking about would be a micro-optimisation, so before anyone asks - yes, I have profiled my application, and it is something worth worrying about for my case.

I have one 'solution' below, but any better ideas are most welcome.


Answer 1:


Creating a buffer object avoids copying the slice, but for short slices it's more efficient to just make the copy:

>>> a.extend(buffer(b, 14, 6))
>>> a
bytearray(b'hello world')

Here there's only one copy made of the memory, but the cost of creating the buffer object more than obliterates the saving. It should be better for larger slices though. I'm not sure how large the slice would have to be for this method to be more efficient overall.
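One way to look for the break-even point is to time both approaches over a range of slice sizes. The following is only a rough sketch: it uses memoryview (the Python 3 counterpart of buffer), the helper name compare and the chosen sizes are made up for illustration, and the numbers will vary by machine and interpreter version.

```python
import timeit

def compare(size, repeats=1000):
    """Return (slice_seconds, view_seconds) for extending by `size` bytes."""
    src = bytearray(size * 2)

    def with_slice():
        dst = bytearray()
        dst.extend(src[0:size])              # temporary copy, then a second copy

    def with_view():
        dst = bytearray()
        dst.extend(memoryview(src)[0:size])  # view objects created, one copy

    return (timeit.timeit(with_slice, number=repeats),
            timeit.timeit(with_view, number=repeats))

for size in (6, 1000, 1000000):
    slice_s, view_s = compare(size, repeats=200)
    print(size, slice_s, view_s)
```

Running this on the real slice sizes from the application in question is more informative than any general rule.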

Note that for Python 3 (and optionally in Python 2.7) you'd need a memoryview object instead:

>>> a.extend(memoryview(b)[14:20])
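A small sketch of what the memoryview slice is doing, assuming Python 3.2 or later (where memoryview works as a context manager). It shows that the slice shares b's memory rather than copying it, and that a bytearray cannot be resized while a view onto it is still alive. The variable names are illustrative only.

```python
a = bytearray(b'hello')
b = bytearray(b'goodbye, cruel world.')

with memoryview(b) as view, view[14:20] as window:
    # Slicing a memoryview gives another view onto b's buffer; no bytes are copied.
    b[15] = ord('W')           # in-place change to b ...
    shared = bytes(window)     # ... is visible through the window
    try:
        b.extend(b'!')         # resizing is refused while a view is exported
        resize_blocked = False
    except BufferError:
        resize_blocked = True
    a.extend(window)           # the single copy: from b's buffer into a

b.extend(b'!')                 # both views are released, so resizing works again
print(a, shared, resize_blocked)
```

The resize restriction is worth keeping in mind: holding on to a view for longer than necessary can cause BufferError in unrelated code that tries to grow the underlying bytearray.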



Answer 2:


itertools has islice. An islice object has no count method, so it is mainly useful in the other cases where you wish to avoid copying the slice; as you pointed out, count already has a mechanism for that.

>>> from itertools import islice
>>> a = bytearray(1000000)
>>> sum(1 for x in islice(a,0,900000) if x==0)
900000
>>> len(list(filter((0).__eq__, islice(a, 0, 900000))))
900000

>>> a=bytearray(b"hello")
>>> b = bytearray(b'goodbye, cruel world.')
>>> a.extend(islice(b,14,20))
>>> a
bytearray(b'hello world')
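One caveat: islice works on a plain iterator, so it has to step through and discard the first start items before yielding anything, and extend then receives the bytes one int at a time. For a sequence that supports indexing, a small generator can jump straight to the range instead. This is a hypothetical helper (iter_window is not part of itertools), sketched under the assumption that start and stop are non-negative:

```python
from itertools import islice

def iter_window(seq, start, stop):
    """Yield seq[start:stop] item by item, without a slice or a skipped prefix."""
    for i in range(start, min(stop, len(seq))):
        yield seq[i]

a = bytearray(b'hello')
b = bytearray(b'goodbye, cruel world.')
a.extend(iter_window(b, 14, 20))
print(a)  # bytearray(b'hello world')
```

Both this and islice avoid the temporary slice, but they run a Python-level loop per byte, so they are unlikely to beat a bulk copy for large ranges; measuring is the only way to know.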


Source: https://stackoverflow.com/questions/2328171/avoiding-unnecessary-slice-copying-in-python
