Why do tuples take less space in memory than lists?

上瘾入骨i · 2020-12-04 09:52

A tuple takes less memory space in Python:

>>> a = (1,2,3)
>>> a.__sizeof__()
48

whereas a list with the same elements takes more:

>>> b = [1,2,3]
>>> b.__sizeof__()
64

4 Answers
  •  误落风尘
    2020-12-04 10:07

    I'll take a deeper dive into the CPython codebase so we can see how the sizes are actually calculated. In your specific example, no over-allocations have been performed, so I won't touch on that.

    I'm going to use 64-bit values here, as you are.


    The size for lists is calculated from the following function, list_sizeof:

    static PyObject *
    list_sizeof(PyListObject *self)
    {
        Py_ssize_t res;
    
        res = _PyObject_SIZE(Py_TYPE(self)) + self->allocated * sizeof(void*);
        return PyLong_FromSsize_t(res);
    }
    

    Here Py_TYPE(self) is a macro that grabs the ob_type of self (returning PyList_Type) while _PyObject_SIZE is another macro that grabs tp_basicsize from that type. tp_basicsize is calculated as sizeof(PyListObject) where PyListObject is the instance struct.
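
    Since tp_basicsize is mirrored onto the type object as the __basicsize__ attribute, the struct size can be checked without reading any C (assuming a 64-bit CPython build, as throughout this answer):

```python
# tp_basicsize is exposed at the Python level as __basicsize__.
# Assuming a 64-bit CPython build, sizeof(PyListObject) is 40 bytes.
print(list.__basicsize__)  # -> 40 on 64-bit builds
```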

    The PyListObject structure has three fields:

    PyObject_VAR_HEAD     # 24 bytes 
    PyObject **ob_item;   #  8 bytes
    Py_ssize_t allocated; #  8 bytes
    

    These fields carry explanatory comments in the CPython source (which I trimmed here). PyObject_VAR_HEAD expands into three 8-byte fields (ob_refcnt, ob_type and ob_size), for a 24-byte contribution.

    So for now res is:

    sizeof(PyListObject) + self->allocated * sizeof(void*)
    

    or:

    40 + self->allocated * sizeof(void*)
    

    If the list instance has allocated elements, the second part calculates their contribution. self->allocated, as its name implies, holds the number of allocated element slots.

    Without any elements, the size of lists is calculated to be:

    >>> [].__sizeof__()
    40
    

    i.e. the size of the instance struct.


    tuple objects don't define a tuple_sizeof function. Instead, they use object_sizeof to calculate their size:

    static PyObject *
    object_sizeof(PyObject *self, PyObject *args)
    {
        Py_ssize_t res, isize;
    
        res = 0;
        isize = self->ob_type->tp_itemsize;
        if (isize > 0)
            res = Py_SIZE(self) * isize;
        res += self->ob_type->tp_basicsize;
    
        return PyLong_FromSsize_t(res);
    }
    

    This, as for lists, grabs the tp_basicsize and, if the object has a non-zero tp_itemsize (meaning it has variable-length instances), multiplies the number of items in the tuple (obtained via Py_SIZE) by tp_itemsize.

    tp_basicsize again uses sizeof(PyTupleObject) where the PyTupleObject struct contains:

    PyObject_VAR_HEAD       # 24 bytes 
    PyObject *ob_item[1];   # 8  bytes
    

    So, without any elements (that is, Py_SIZE returns 0) the size of empty tuples is equal to sizeof(PyTupleObject):

    >>> ().__sizeof__()
    24
    

    Huh? Well, here's an oddity for which I haven't found an explanation: the tp_basicsize of tuples is actually calculated as follows:

    sizeof(PyTupleObject) - sizeof(PyObject *)
    

    Why an additional 8 bytes is removed from tp_basicsize is something I haven't been able to find out; a plausible reason is that the struct declares ob_item[1], i.e. it already reserves space for one element, and the subtraction compensates for that before Py_SIZE(self) * tp_itemsize is added. (See MSeifert's comment for a possible explanation.)
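
    Both slots in the calculation are exposed at the Python level, tp_basicsize as __basicsize__ and tp_itemsize as __itemsize__, so object_sizeof can be sketched in pure Python (a hypothetical re-implementation, valid for types like tuple that don't override __sizeof__ and whose Py_SIZE equals len()):

```python
# Hypothetical pure-Python sketch of CPython's object_sizeof, using the
# struct sizes exposed as __basicsize__ (tp_basicsize) and __itemsize__
# (tp_itemsize). len(obj) stands in for Py_SIZE, which matches for tuples.
def object_sizeof(obj):
    t = type(obj)
    res = 0
    if t.__itemsize__ > 0:                # variable-length instances
        res = len(obj) * t.__itemsize__   # Py_SIZE(self) * tp_itemsize
    res += t.__basicsize__
    return res

print(object_sizeof(()) == ().__sizeof__())                # True
print(object_sizeof((1, 2, 3)) == (1, 2, 3).__sizeof__())  # True
```

    Lists don't take this path, since they provide their own list_sizeof shown above, which is why list.__itemsize__ is 0 and the allocated count matters instead.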


    But this is basically the difference in your specific example. Lists also keep around a count of allocated elements, which helps determine when to over-allocate again.

    Now, when additional elements are appended, lists do indeed perform this over-allocation in order to achieve O(1) appends. This results in greater sizes, as MSeifert covers nicely in his answer.
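
    The over-allocation is easy to observe from Python. As a sketch (the exact growth pattern is a CPython implementation detail and varies between versions), appending one element at a time shows __sizeof__ staying flat while spare slots are consumed, then jumping when the list resizes:

```python
# Observe list over-allocation: __sizeof__ stays flat while spare
# allocated slots are consumed, then jumps when the list resizes.
# The exact sizes are a CPython implementation detail.
l = []
sizes = []
for i in range(16):
    l.append(i)
    sizes.append(l.__sizeof__())
print(sizes)  # runs of repeated sizes, punctuated by occasional jumps
```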
