Multiprocessing: why is a numpy array shared with the child processes, while a list is copied?

Posted by 烂漫一生 on 2019-12-04 12:52:25

They're all copy-on-write. What you're missing is that when you do, e.g.,

for x in data:
    pass

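the reference count on every object contained in data is temporarily incremented by 1, one at a time, as x is bound to each object in turn. In CPython the refcount is stored in each object's own header, so updating it writes to the memory page that holds the object, and the operating system then copies that page for the child process (from its point of view you did mutate the object, because the refcount changed). A list of many distinct int objects therefore ends up copied almost in full, whereas a numeric numpy array such as the one from numpy.ones keeps its elements as raw values in a single buffer with no per-element refcounts, so reading it leaves the buffer's pages shared.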
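The refcount change itself can be observed from Python. The following is a minimal sketch, assuming a standard CPython build; the helper name refcount_delta_when_bound is made up for illustration, and it only shows the bookkeeping, not the page copy that the operating system performs in a forked child.

```python
import sys


def refcount_delta_when_bound(obj):
    """Return how much obj's refcount rises while a loop variable is bound to it.

    Hypothetical helper for illustration only; relies on CPython's
    sys.getrefcount, whose absolute values are implementation details.
    """
    container = [obj]
    before = sys.getrefcount(obj)      # baseline, taken outside the loop
    for x in container:
        during = sys.getrefcount(obj)  # taken while x also refers to obj
    del x                              # drop the loop variable's reference
    return during - before


probe = object()                       # a fresh, non-cached object
delta = refcount_delta_when_bound(probe)
print("refcount rose by", delta, "while the loop variable was bound")
```

Both measurements are taken the same way, so the extra temporary reference that sys.getrefcount adds cancels out and only the reference held by x remains in the difference.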

To make something more analogous to the numpy.ones case, try, e.g.,

data = [1] * 10**8

Then there's only a single unique object referenced many (10**8) times by the list, so there's very little to copy (the same object's refcount gets incremented and decremented many times).
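To see the difference between the two kinds of list, one can count the distinct objects each one references. This is a rough sketch with smaller sizes than 10**8 so that it runs quickly; the variable names are illustrative, and the starting value 1000 is chosen only because CPython does not cache ints that large.

```python
# One object referenced many times versus many separate objects.
n = 10**5

repeated = [1] * n                      # every slot points at the same int
distinct = list(range(1000, 1000 + n))  # every slot points at its own int

# id() is unique among objects that are alive at the same time,
# so the size of each set is the number of distinct objects.
unique_in_repeated = len({id(obj) for obj in repeated})
unique_in_distinct = len({id(obj) for obj in distinct})

print("objects behind [1] * n:   ", unique_in_repeated)
print("objects behind range list:", unique_in_distinct)
```

Iterating over the first list touches the refcount of one object only, while iterating over the second touches the header of every element, which is why the second is the one that gets copied page by page in a forked child.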
