Checking fork behaviour in Python multiprocessing on Linux systems

时光毁灭记忆、已成空白 submitted on 2019-12-06 13:41:11

Your confusion seems to be caused by a misunderstanding of how processes and fork work. Each process has its own address space, so two processes can use the same addresses without conflict. This also means a process can't access the memory of another process unless the same memory is mapped into both processes.

When a process invokes the fork system call, the operating system creates a new child process that's a clone of the parent process. This clone, like any other process, has its own address space distinct from its parent's. However, the contents of that address space are an exact copy of the parent's. This used to be accomplished by copying the memory of the parent process into new memory allocated for the child. This means that once the child and parent resume executing after the fork, any modifications either process makes to its own memory don't affect the other.
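The isolation described above can be observed directly with os.fork. The following is only a minimal sketch, assuming a Unix-like system; the function name fork_and_modify and the variable names are made up for illustration. The child appends to its copy of a list and reports what it sees through a pipe, while the parent's list stays unchanged.

```python
import os


def fork_and_modify():
    """Fork, modify a list in the child, and return both views of it."""
    data = [1, 2, 3]
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child process: it has its own copy of `data`.
        os.close(read_fd)
        data.append(4)
        os.write(write_fd, repr(data).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately, skipping the parent's cleanup code

    # Parent process: read what the child reported, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_view = reader.read()
    os.waitpid(pid, 0)
    return data, child_view


if __name__ == "__main__":
    parent_view, child_view = fork_and_modify()
    print("parent sees:", parent_view)
    print("child saw:  ", child_view)
```

The pipe is needed precisely because the two processes no longer share the list: the only way for the parent to learn what the child did is some explicit form of inter-process communication.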

However, copying the entire address space of a process is an expensive operation, and is usually a waste. Most of the time the new process immediately executes a new program, which results in the child's address space being replaced completely. So instead, modern Unix-like operating systems use a "copy-on-write" fork implementation. Instead of copying the memory of the parent process, the parent's memory is mapped into the child so that they share the same memory. However, the old semantics are still maintained. If either the child or the parent modifies the shared memory, the modified page is copied so that the two processes no longer share that page of memory.

When the multiprocessing module calls your f function, it does so in a child process that was created by using the fork system call. Since this child process is a clone of the parent, it also has a global variable named l, which refers to a list that has the same ID (address) and the same contents in both processes. That is, until you modify the list referred to by l in the child process. The ID doesn't (and can't) change, but the child's version of the list is no longer the same as the parent's. The contents of the parent's list are unaffected by the modification made by the child.
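Since the original code isn't reproduced here, the following is only a rough reconstruction of that situation, not the asker's actual program. The names f and l follow the discussion above; the queue, the run helper, and the explicit "fork" start method are assumptions added so the result can be inspected. It assumes a Unix-like system where the fork start method is available.

```python
import multiprocessing as mp

l = [0]  # global list, standing in for the `l` discussed above


def f(queue):
    """Runs in the child: modify `l` and report its ID and contents."""
    id_before = id(l)
    l.append(1)
    queue.put((id_before, id(l), list(l)))


def run():
    # Request the fork start method explicitly (Unix only), because the
    # default start method depends on the platform and Python version.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    child = ctx.Process(target=f, args=(queue,))
    child.start()
    child_id_before, child_id_after, child_contents = queue.get()
    child.join()
    return id(l), child_id_before, child_id_after, child_contents, list(l)


if __name__ == "__main__":
    parent_id, id_before, id_after, child_contents, parent_contents = run()
    print("same ID in both processes:", parent_id == id_before == id_after)
    print("child's list: ", child_contents)
    print("parent's list:", parent_contents)
```

With the fork start method the ID reported by the child matches the parent's, because the child's address space starts out as a clone of the parent's, yet the appended element shows up only in the child's list.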

Note that the behaviour described in the previous paragraph is the same whether fork uses copy-on-write or not. As far as the multiprocessing module and Python in general are concerned, that's just an implementation detail. The effective result is the same regardless. This means you can't really test in a Python program which fork implementation is used.
