Why do tuples take less space in memory than lists?

上瘾入骨i · 2020-12-04 09:52

A tuple takes less memory space in Python:

>>> a = (1,2,3)
>>> a.__sizeof__()
48

whereas a list with the same elements takes more:

>>> b = [1,2,3]
>>> b.__sizeof__()
64

4 Answers
  •  误落风尘
    2020-12-04 10:07

    I'll take a deeper dive into the CPython codebase so we can see how the sizes are actually calculated. In your specific example, no over-allocations have been performed, so I won't touch on that.

    I'm going to use 64-bit values here, as you are.


    The size for lists is calculated from the following function, list_sizeof:

    static PyObject *
    list_sizeof(PyListObject *self)
    {
        Py_ssize_t res;
    
        res = _PyObject_SIZE(Py_TYPE(self)) + self->allocated * sizeof(void*);
        return PyLong_FromSsize_t(res);
    }
    

    Here Py_TYPE(self) is a macro that grabs the ob_type of self (returning PyList_Type) while _PyObject_SIZE is another macro that grabs tp_basicsize from that type. tp_basicsize is calculated as sizeof(PyListObject) where PyListObject is the instance struct.
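
    Since tp_basicsize is mirrored onto the type object as the __basicsize__ attribute, the struct size can be checked without reading any C (assuming a 64-bit CPython build, as throughout this answer):

```python
# tp_basicsize is exposed at the Python level as __basicsize__.
# Assuming a 64-bit CPython build, sizeof(PyListObject) is 40 bytes.
print(list.__basicsize__)  # -> 40 on 64-bit builds
```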

    The PyListObject structure has three fields:

    PyObject_VAR_HEAD     # 24 bytes 
    PyObject **ob_item;   #  8 bytes
    Py_ssize_t allocated; #  8 bytes
    

    These fields carry explanatory comments in the CPython source (which I trimmed here). PyObject_VAR_HEAD expands into three 8-byte fields (ob_refcnt, ob_type and ob_size), for a 24-byte contribution.

    So for now res is:

    sizeof(PyListObject) + self->allocated * sizeof(void*)
    

    or:

    40 + self->allocated * sizeof(void*)
    

    If the list instance has allocated elements, the second part calculates their contribution. self->allocated, as its name implies, holds the number of allocated element slots.

    Without any elements, the size of lists is calculated to be:

    >>> [].__sizeof__()
    40
    

    i.e. the size of the instance struct.


    tuple objects don't define a tuple_sizeof function. Instead, they use object_sizeof to calculate their size:

    static PyObject *
    object_sizeof(PyObject *self, PyObject *args)
    {
        Py_ssize_t res, isize;
    
        res = 0;
        isize = self->ob_type->tp_itemsize;
        if (isize > 0)
            res = Py_SIZE(self) * isize;
        res += self->ob_type->tp_basicsize;
    
        return PyLong_FromSsize_t(res);
    }
    

    This, as for lists, grabs the tp_basicsize and, if the object has a non-zero tp_itemsize (meaning it has variable-length instances), multiplies the number of items in the tuple (obtained via Py_SIZE) by tp_itemsize.

    tp_basicsize again uses sizeof(PyTupleObject) where the PyTupleObject struct contains:

    PyObject_VAR_HEAD       # 24 bytes 
    PyObject *ob_item[1];   # 8  bytes
    

    So, without any elements (that is, Py_SIZE returns 0) the size of empty tuples is equal to sizeof(PyTupleObject):

    >>> ().__sizeof__()
    24
    

    Huh? Well, here's an oddity for which I haven't found an explanation: the tp_basicsize of tuples is actually calculated as follows:

    sizeof(PyTupleObject) - sizeof(PyObject *)
    

    Why an additional 8 bytes is removed from tp_basicsize is something I haven't been able to find out; a plausible reason is that the struct declares ob_item[1], i.e. it already reserves space for one element, and the subtraction compensates for that before Py_SIZE(self) * tp_itemsize is added. (See MSeifert's comment for a possible explanation.)
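
    Both slots in the calculation are exposed at the Python level, tp_basicsize as __basicsize__ and tp_itemsize as __itemsize__, so object_sizeof can be sketched in pure Python (a hypothetical re-implementation, valid for types like tuple that don't override __sizeof__ and whose Py_SIZE equals len()):

```python
# Hypothetical pure-Python sketch of CPython's object_sizeof, using the
# struct sizes exposed as __basicsize__ (tp_basicsize) and __itemsize__
# (tp_itemsize). len(obj) stands in for Py_SIZE, which matches for tuples.
def object_sizeof(obj):
    t = type(obj)
    res = 0
    if t.__itemsize__ > 0:                # variable-length instances
        res = len(obj) * t.__itemsize__   # Py_SIZE(self) * tp_itemsize
    res += t.__basicsize__
    return res

print(object_sizeof(()) == ().__sizeof__())                # True
print(object_sizeof((1, 2, 3)) == (1, 2, 3).__sizeof__())  # True
```

    Lists don't take this path, since they provide their own list_sizeof shown above, which is why list.__itemsize__ is 0 and the allocated count matters instead.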


    But this is basically the difference in your specific example. Lists also keep around a count of allocated elements, which helps determine when to over-allocate again.

    Now, when additional elements are appended, lists do indeed perform this over-allocation in order to achieve O(1) appends. This results in greater sizes, as MSeifert covers nicely in his answer.
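
    The over-allocation is easy to observe from Python. As a sketch (the exact growth pattern is a CPython implementation detail and varies between versions), appending one element at a time shows __sizeof__ staying flat while spare slots are consumed, then jumping when the list resizes:

```python
# Observe list over-allocation: __sizeof__ stays flat while spare
# allocated slots are consumed, then jumps when the list resizes.
# The exact sizes are a CPython implementation detail.
l = []
sizes = []
for i in range(16):
    l.append(i)
    sizes.append(l.__sizeof__())
print(sizes)  # runs of repeated sizes, punctuated by occasional jumps
```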
