python copy.deepcopy lists seems shallow

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-01 12:07:21

The problem is that deepcopy keeps a memo of every object it has already copied. That avoids infinite recursion and preserves intentionally shared objects. So when it reaches the second sublist it sees that it has already copied that object (as the first sublist) and just inserts the copy of the first sublist again. In short, deepcopy doesn't solve the "shared sublist" problem!
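A small check of that behaviour might look like the sketch below. It assumes the question built its list with nested list multiplication (m = [[0]*3]*3), which is not shown on this page; the variable names are only for illustration.

```python
from copy import deepcopy

# Assumed reconstruction of the question's setup:
# one inner list referenced three times by the outer list.
m = [[0] * 3] * 3
c = deepcopy(m)

# The copy does not share anything with the original ...
shares_with_original = c[0] is m[0]          # False
# ... but the sharing *inside* the structure is reproduced via the memo.
rows_shared_in_copy = c[0] is c[1] is c[2]   # True

c[0][0] = 1
print(m)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(c)  # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
```

So the copy is "deep" with respect to the original, yet assigning through one row of the copy still shows up in all three rows of the copy.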

To quote the documentation:

Two problems often exist with deep copy operations that don’t exist with shallow copy operations:

  • Recursive objects (compound objects that, directly or indirectly, contain a reference to themselves) may cause a recursive loop.
  • Because deep copy copies everything it may copy too much, such as data which is intended to be shared between copies.

The deepcopy() function avoids these problems by:

  • keeping a “memo” dictionary of objects already copied during the current copying pass; and
  • letting user-defined classes override the copying operation or the set of components copied.

(emphasis mine)

That means that deepcopy treats shared references as intentional. For example, consider the class:

from copy import deepcopy

class A(object):
    def __init__(self, x):
        self.x = x
        self.x1 = x[0]  # intentional sharing of the sublist with x attribute
        self.x2 = x[1]  # intentional sharing of the sublist with x attribute

a1 = A([[1, 2], [2, 3]])
a2 = deepcopy(a1)
a2.x1[0] = 10
print(a2.x)
# [[10, 2], [2, 3]]

Setting aside that the class doesn't make much sense as is, it intentionally shares references between its x attribute and its x1 and x2 attributes. It would be weird if deepcopy broke those shared references by making a separate copy of each of them. That's why the documentation mentions the memo as a "solution" to the problem of "copy too much, such as data which is intended to be shared between copies".

Back to your example: If you don't want to have shared references it would be better to avoid them completely:

m = [[0]*3 for _ in range(3)]

In your case the inner elements are immutable because 0 is immutable, but if you deal with mutable instances inside the innermost lists you have to avoid the inner list multiplication as well:

m = [[0 for _ in range(3)] for _ in range(3)] 
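If the shared structure already exists (say it was handed to you by other code), one possible way to get independent rows is to copy each row separately instead of copying the outer list in one pass. This is only a sketch for a two-level list; the names shared, independent and independent_deep are made up for the example.

```python
from copy import deepcopy

shared = [[0] * 3] * 3   # three references to one row

# Shallow-copy each row on its own; enough when the row elements are immutable.
independent = [list(row) for row in shared]

# For rows holding mutable objects, a separate deepcopy call per row
# starts with a fresh memo each time, so the rows come out distinct.
independent_deep = [deepcopy(row) for row in shared]

independent[0][0] = 1
independent_deep[1][1] = 2
print(independent)       # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(independent_deep)  # [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
print(shared)            # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Note that this deliberately throws away the sharing, so it is only appropriate when the sharing was an accident in the first place.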
lior_13

The thing is that you create a list that contains the same object 3 times, so when you assign a value in one of the sublists, it affects all of them (because they are the same list).

Try to do:

import copy
a = [3*[0] for i in range(3)]  # one new inner list per iteration
m = copy.deepcopy(a)

Here you create "a", which is a list of 3 separate lists of size 3, initialized with 0's. Deep copying "a" will give you "m": equal to "a", but a different object, so that changing "a" will not affect "m" and vice versa.

After I read several answers I thought a bit more about this question. The problem lies in the fact that a recursive (circular) object cannot be deeply copied by plain recursion alone, without some record of what has already been copied! For instance:

x = [1]
x.append(x)

produces an object which behaves like an infinite sequence of 1's which Python prints as:

[1, [...]]

The same happens to deepcopy(x). In my opinion the Python implementation adopted a solution which avoids infinite loops but may produce surprising results for objects without circularities but with shared components. I'd rather see my program loop forever and fix it than have to search for an obscure bug!
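For reference, here is a quick check of what deepcopy does with the circular list above. It is just a sketch using the standard copy module; y is a name introduced for the example.

```python
from copy import deepcopy

x = [1]
x.append(x)            # x now contains itself

y = deepcopy(x)
print(y)               # [1, [...]]
print(y is x)          # False: y is a new list
print(y[1] is y)       # True: the cycle points back at the copy
print(y[1] is x)       # False: nothing in the copy refers to the original
```

The memo is what makes this terminate: the copy of x is registered before its elements are copied, so the inner reference to x is resolved to the copy that is already under construction.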
