Python swapping out sys.modules does not work as intuited

淺唱寂寞╮ submitted on 2019-12-24 01:17:53

Question


I was experimenting with setting the dictionary sys.modules while working on an answer to another question and came across something interesting. The linked question deals with removing all the effects of importing a module. Based on another post, I came up with the idea of deleting all new modules from sys.modules after an import. My initial implementation was to do the following (testing with numpy as the module to load and unload):

# Load the module
import sys
mod_copy = sys.modules.copy()
print('numpy' in mod_copy, 'numpy' in sys.modules) # False False
import numpy
print('numpy' in mod_copy, 'numpy' in sys.modules) # False True
print(id(numpy)) # 45138472

The printouts show that numpy was imported successfully and that the shallow copy does not contain it, as expected.

My idea was to unload the module by swapping mod_copy back into sys.modules and then deleting the local reference to the module. In theory that should remove all references to it (and possibly it does):

sys.modules = mod_copy
del numpy
print('numpy' in sys.modules) # False

This should be enough to be able to re-import the module, but when I do

import numpy
print('numpy' in sys.modules) # False
print(id(numpy)) # 45138472

It appears that the numpy module is not reloaded since it has the same id as before. It does not show up in sys.modules, despite the fact that the import statement raises no errors and appears to complete successfully (i.e., a numpy module exists in the local namespace).

On the other hand, the implementation that I made in my answer to the linked question does appear to work fine. It modifies the dictionary directly instead of swapping it out:

import sys
mod_copy = sys.modules.copy()
print('numpy' in mod_copy, 'numpy' in sys.modules) # False False
import numpy
print('numpy' in mod_copy, 'numpy' in sys.modules) # False True
print(id(numpy)) # 35963432

for m in list(sys.modules):
    if m not in mod_copy:
        del sys.modules[m]
del numpy
print('numpy' in sys.modules) # False

import numpy
print('numpy' in sys.modules) # True
print(id(numpy)) # 54941000 (differs from 35963432)

I am using Python 3.5.2 on an Anaconda install. I am most interested in explanations focusing on Python 3, but I am curious about Python 2.7+ as well.

The only thing I can think of that is happening here is that sys maintains another reference to sys.modules and uses that internal reference regardless of what I do to the public one. I am not sure that this covers everything though, so I would like to know what is really going on.


Answer 1:


Even in Python 3.5, part of the import implementation is still written in C, and that part uses PyThreadState_GET()->interp->modules to retrieve the module cache, rather than going through the sys.modules attribute. Your import is finding numpy in the old sys.modules through one of those code paths.
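The effect described above can be reproduced without numpy. The following is a minimal sketch, not a definitive test: it assumes CPython, where the import statement first consults the interpreter's own module dict, and it uses the small standard-library module colorsys as a stand-in for numpy. The helper name import_after_swap is made up for illustration.

```python
import sys
import colorsys  # small pure-Python stdlib module, standing in for numpy


def import_after_swap():
    """Rebind sys.modules to a copy without colorsys, then import it again."""
    original = sys.modules              # the dict the interpreter itself holds
    first = original["colorsys"]
    replacement = {k: v for k, v in original.items() if k != "colorsys"}
    sys.modules = replacement           # rebinds only the attribute on sys
    try:
        import colorsys as again        # assumed to be served from `original`
        return again is first, "colorsys" in replacement
    finally:
        sys.modules = original          # always restore the real dict


same_object, in_replacement = import_after_swap()
print(same_object, in_replacement)
```

If the assumption holds, the re-import hands back the original module object and never adds an entry to the replacement dict, which matches the symptoms in the question.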

sys.modules isn't designed to be replaced. The docs mention that replacing it may behave unexpectedly:

This can be manipulated to force reloading of modules and other tricks. However, replacing the dictionary will not necessarily work as expected and deleting essential items from the dictionary may cause Python to fail.
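Mutating the existing dictionary in place, as the second snippet in the question does, stays within what the documentation describes. Below is one possible way to package that approach as a context manager. The name forget_new_modules is hypothetical, and colorsys is again only a stand-in for the module being unloaded. The sketch removes sys.modules entries only. It does not undo other import side effects, and C extension modules such as numpy may not tolerate being imported a second time.

```python
import contextlib
import sys


@contextlib.contextmanager
def forget_new_modules():
    """Remove modules first imported inside the block from sys.modules, in place."""
    before = set(sys.modules)
    try:
        yield
    finally:
        # Iterate over a snapshot of the keys, since entries are deleted as we go.
        for name in list(sys.modules):
            if name not in before:
                del sys.modules[name]


sys.modules.pop("colorsys", None)    # make sure the module starts out unloaded

with forget_new_modules():
    import colorsys
    first = colorsys                 # keep the old module object for comparison

print("colorsys" in sys.modules)     # entry was deleted when the block exited

import colorsys                      # a genuine re-import this time
print(colorsys is not first)
```

Because the dictionary object itself is never replaced, both the C code and the Python-level import machinery keep looking at the same mapping.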



Source: https://stackoverflow.com/questions/42142660/python-swapping-out-sys-modules-does-not-work-as-intuited
