Does reusing a list slice to get length cost additional memory?

Question

I proposed something in a comment on this answer. Martijn Pieters said that my suggestion would be memory intensive, and he's usually right, but I like to see things for myself, so I tried to profile it. Here's what I got:

#!/usr/bin/env python
""" interpolate.py """

from memory_profiler import profile

@profile
def interpolate1(alist):
    length = (1 + len(alist)) // 2
    alist[::2] = [0] * length

@profile
def interpolate2(alist):
    length = len(alist[::2])
    alist[::2] = [0] * length

a = []
b = []
for i in range(5, 9):
    print(i)
    exp = 10**i
    a[:] = range(exp)
    b[:] = range(exp)
    interpolate1(a)
    interpolate2(b)

I don't see any incremental difference in memory cost for the slice solution, but I sometimes see one for the arithmetic solution. Take the results at exp = 7, for example:

7
Filename: interpolate.py

Line #    Mem usage    Increment   Line Contents
================================================
     5    750.1 MiB      0.0 MiB   @profile
     6                             def interpolate1(alist):
     7    750.1 MiB      0.0 MiB       length = (1 + len(alist)) // 2
     8    826.4 MiB     76.3 MiB       alist[::2] = [0] * length


Filename: interpolate.py

Line #    Mem usage    Increment   Line Contents
================================================
    10    826.4 MiB      0.0 MiB   @profile
    11                             def interpolate2(alist):
    12    826.4 MiB      0.0 MiB       length = len(alist[::2])
    13    826.4 MiB      0.0 MiB       alist[::2] = [0] * length

I tried a few other approaches to profiling, including running interpolate2 before interpolate1, randomizing the run order, and much smaller lists, but the results are pretty consistent.

I can postulate that the results are because the memory is being allocated for the list slice either way, whether it's on the right or left side of the assignment, but any way you slice it, it looks like the slice solution breaks even with the arithmetic solution. Am I interpreting these results correctly?

Answer 1:

Yes, additional memory will be reserved for a new list object that is created just for the slice.

However, the list object is discarded again right after its length is queried. You created a whole list object only to calculate how long half a list would be.

Memory allocations are relatively expensive, even if you then discard the object again. It is that cost I was referring to, whereas you are looking for a permanent increase in memory footprint. However transient the list object might be, memory still has to be allocated for it.
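The difference between a transient allocation and a permanent footprint can be made visible by comparing peak memory with retained memory. The following is a minimal sketch, assuming Python 3.4 or later (for the standard-library tracemalloc module, which the original post does not use); the helper name peak_and_retained is made up for illustration.

```python
import tracemalloc

def calculate(alist):
    return (1 + len(alist)) // 2

def allocate(alist):
    return len(alist[::2])

def peak_and_retained(func, alist):
    """Hypothetical helper: return (retained, peak) bytes for one call of func."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    func(alist)
    # current = bytes still allocated now, peak = highest point since start()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current - before, peak - before

testlist = list(range(10**5))
calc_retained, calc_peak = peak_and_retained(calculate, testlist)
alloc_retained, alloc_peak = peak_and_retained(allocate, testlist)
print("calculate: retained=%d peak=%d" % (calc_retained, calc_peak))
print("allocate:  retained=%d peak=%d" % (alloc_retained, alloc_peak))
```

On a typical CPython build the slice version should show a peak of several hundred kilobytes while retaining almost nothing, which is consistent with a line-by-line profiler reporting no increment for that line.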

The cost is immediately apparent when you use timeit to compare the two approaches:

>>> import timeit
>>> def calculate(alist):
...     (1 + len(alist)) // 2
... 
>>> def allocate(alist):
...     len(alist[::2])
... 
>>> testlist = list(range(10**5))
>>> timeit.timeit('f(testlist)', 'from __main__ import testlist, calculate as f', number=10000)
0.003368854522705078
>>> timeit.timeit('f(testlist)', 'from __main__ import testlist, allocate as f', number=10000)
2.7687110900878906

The slice only has to create a list object and copy across half the references, but that operation takes more than 800 times as long as simply calculating the length from the existing list.

Note that I actually had to reduce the timeit repetition count; with the default 1 million repetitions, the slice version was going to take an additional 4.5 minutes, and I wasn't going to wait that long. The straight calculation, by contrast, took a mere 0.18 seconds.
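If the appeal of len(alist[::2]) is that it works for any slice without hand-written arithmetic, the same length can probably be obtained without building the list. This is only a sketch, assuming Python 3, where range objects are lazy (on Python 2, xrange would be needed instead); slice_length is a hypothetical helper name, not something from the original post.

```python
def slice_length(alist, start=None, stop=None, step=None):
    """Length of alist[start:stop:step], computed without building the slice."""
    # slice.indices() clamps start, stop and step to the actual list length
    bounds = slice(start, stop, step).indices(len(alist))
    # a Python 3 range computes its length arithmetically; no elements are stored
    return len(range(*bounds))

testlist = list(range(11))
print(slice_length(testlist, step=2))
print(len(testlist[::2]))
```

With such a helper, the assignment from the question could be written as alist[::2] = [0] * slice_length(alist, step=2), keeping the slice notation's generality while avoiding the throwaway list.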

Source: https://stackoverflow.com/questions/27465514/does-reusing-a-list-slice-to-get-length-cost-additional-memory

Tags

python

list

memory-management