Unpickling Python objects with a changed module path

梦如初夏 2020-12-14 02:29

I'm trying to integrate a project, Project A, built by a colleague into another Python project. Now this colleague has not used relative imports in his code but

4 answers
  • 2020-12-14 02:39

    In addition to @MartinPieters' answer, another way of doing this is to set the find_global attribute of a cPickle.Unpickler instance, or to subclass pickle.Unpickler (where the equivalent hook is the find_class method).

    import cPickle as pickle  # Python 2 only

    def map_path(mod_name, kls_name):
        """Resolve a pickled class reference, redirecting old module names."""
        if mod_name.startswith('packageA'):  # catch all old module names
            mod_name = 'WrapperPackage.%s' % mod_name
        # A non-empty fromlist makes __import__ return the submodule itself
        # rather than the top-level package.
        mod = __import__(mod_name, fromlist=[kls_name])
        return getattr(mod, kls_name)

    # Pickle files hold binary data, so open them in 'rb' / 'wb' mode.
    with open('dump.pickle', 'rb') as fh:
        unpickler = pickle.Unpickler(fh)
        unpickler.find_global = map_path
        obj = unpickler.load()  # classes are now resolved via their new paths

    with open('dump-new.pickle', 'wb') as fh:
        pickle.dump(obj, fh)  # ClassA is written with its new path in 'dump-new.pickle'
    

    A more detailed explanation of the process for both pickle and cPickle can be found here.
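
    On Python 3 there is no cPickle and no find_global; the corresponding hook is the find_class method of pickle.Unpickler. What follows is only a minimal sketch of the same remapping idea. The names old_pkg, new_pkg, and Point are hypothetical (they do not come from the question), and stand-in module objects are used so that the snippet runs on its own:

```python
import io
import pickle
import sys
import types

# Hypothetical rename table: old top-level package -> new top-level package.
RENAMED_PREFIXES = {"old_pkg": "new_pkg"}

class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites stale module paths before importing them."""

    def find_class(self, module, name):
        for old, new in RENAMED_PREFIXES.items():
            if module == old or module.startswith(old + "."):
                module = new + module[len(old):]
                break
        return super().find_class(module, name)

def register(name, **attrs):
    """Publish a stand-in module object in sys.modules (demo only)."""
    mod = types.ModuleType(name)
    vars(mod).update(attrs)
    sys.modules[name] = mod

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

# Pickle an instance while Point still lives at old_pkg.models.
Point.__module__ = "old_pkg.models"
register("old_pkg")
register("old_pkg.models", Point=Point)
payload = pickle.dumps(Point(1, 2))

# Simulate the move: the old path disappears, the new one appears.
del sys.modules["old_pkg"], sys.modules["old_pkg.models"]
Point.__module__ = "new_pkg.models"
register("new_pkg")
register("new_pkg.models", Point=Point)

obj = RenamingUnpickler(io.BytesIO(payload)).load()
resaved = pickle.dumps(obj)  # written with the new module path
```

    In a real project the register() calls and the __module__ juggling would not be needed; only the RenamingUnpickler subclass and the rename table would remain.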

  • 2020-12-14 02:52

    You'll need to create an alias for the pickle import to work; add the following to the __init__.py file of the WrapperPackage package:

    from .packageA import * # Ensures that all the modules have been loaded in their new locations *first*.
    from . import packageA  # imports WrapperPackage/packageA
    import sys
    sys.modules['packageA'] = packageA  # creates a packageA entry in sys.modules
    

    It may be that you'll need to create additional entries though:

    sys.modules['packageA.moduleA'] = moduleA
    # etc.
    

    Now cPickle will find packageA.moduleA and packageA.moduleB again at their old locations.

    You may want to rewrite the pickle file afterwards; the new module location will be used at that time. The additional aliases created above should ensure that the modules in question have the new location name for cPickle to pick up when writing the classes again.
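
    The aliasing and the rewrite step can be tried out in isolation. The sketch below is illustrative only: it uses Python 3's pickle, fakes the old and new package layouts with empty module objects, and the register() helper and ClassA's contents are assumptions made for the demo.

```python
import pickle
import sys
import types

def register(name, **attrs):
    """Create a stand-in module and publish it in sys.modules (demo only)."""
    module = types.ModuleType(name)
    vars(module).update(attrs)
    sys.modules[name] = module
    return module

class ClassA:
    def __init__(self, value):
        self.value = value

# Old layout: ClassA is importable as packageA.moduleA.ClassA.
ClassA.__module__ = "packageA.moduleA"
register("packageA")
register("packageA.moduleA", ClassA=ClassA)
old_payload = pickle.dumps(ClassA(42))

# New layout: the package now lives under WrapperPackage.
del sys.modules["packageA"], sys.modules["packageA.moduleA"]
ClassA.__module__ = "WrapperPackage.packageA.moduleA"
register("WrapperPackage")
new_pkg = register("WrapperPackage.packageA")
new_mod = register("WrapperPackage.packageA.moduleA", ClassA=ClassA)

# The aliasing step: the old names point at the new module objects.
sys.modules["packageA"] = new_pkg
sys.modules["packageA.moduleA"] = new_mod

obj = pickle.loads(old_payload)   # resolved through the alias
new_payload = pickle.dumps(obj)   # records the class's current module path
```

    The re-dumped payload names WrapperPackage.packageA.moduleA, so once every file has been rewritten the aliases are no longer needed.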

  • 2020-12-14 02:53

    One possible solution is to edit the pickle file directly (if you have access). I ran into this same problem of a changed module path. I had saved the files with pickle.HIGHEST_PROTOCOL, so in theory they should be binary, but the module path was sitting at the top of the pickle file in plain text. So I just did a find-and-replace on all instances of the old module path with the new one and voilà, they loaded correctly.

    I'm sure this solution is not for everyone, especially if you have a very complex pickled object, but it is a quick and dirty data fix that worked for me!
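
    Why this can work: pickle protocols 0 to 2 (the only ones Python 2 has) store a class reference as a GLOBAL opcode, the byte "c" followed by the newline-terminated module name and class name, so the path is plain text even in a "binary" pickle. Protocol 4 and later use length-prefixed strings instead, where a replacement of a different length would corrupt the file. A rough sketch of the replacement for a protocol 2 pickle, with hypothetical module names and stand-in modules so it runs on its own:

```python
import pickle
import sys
import types

OLD, NEW = b"old_pkg.models", b"renamed_pkg.models"  # hypothetical paths

def rewrite_global_refs(payload, old, new):
    """Rewrite GLOBAL opcodes ('c<module>\\n') that name the module `old`.

    Plain byte replacement: it could also hit pickled data that happens
    to contain the same byte sequence, so check the result.
    """
    return payload.replace(b"c" + old + b"\n", b"c" + new + b"\n")

def register(name, **attrs):
    """Publish a stand-in module object in sys.modules (demo only)."""
    mod = types.ModuleType(name)
    vars(mod).update(attrs)
    sys.modules[name] = mod

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

# Write a protocol 2 pickle while Point lives at the old path.
Point.__module__ = "old_pkg.models"
register("old_pkg")
register("old_pkg.models", Point=Point)
payload = pickle.dumps(Point(1, 2), protocol=2)

# Move the class, then patch the bytes instead of the import system.
del sys.modules["old_pkg"], sys.modules["old_pkg.models"]
Point.__module__ = "renamed_pkg.models"
register("renamed_pkg")
register("renamed_pkg.models", Point=Point)

fixed = rewrite_global_refs(payload, OLD, NEW)
obj = pickle.loads(fixed)
```

    Working on a copy of the file and loading the result before discarding the original is a sensible precaution.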

  • 2020-12-14 02:53

    This is my basic pattern for flexible unpickling via an unambiguous and fast transition map, since there are usually just a few known classes besides the primitive data types relevant for pickling. It also protects unpickling against erroneous or maliciously constructed data, which after all can execute arbitrary Python code (!) upon a simple pickle.load() (with or without error-prone sys.modules fiddling).

    Python 2 & 3:

    from __future__ import print_function
    import sys  # needed by test() below
    try:
        import cPickle as pickle, copy_reg as copyreg  # Python 2
    except ImportError:
        import pickle, copyreg  # Python 3
    
    class OldZ:
        a = 1
    class Z(object):
        a = 2
    class Dangerous:
        pass   
    
    _unpickle_map_safe = {
        # all possible and allowed (!) classes & upgrade paths    
        (__name__, 'Z')         : Z,    
        (__name__, 'OldZ')      : Z,
        ('old.package', 'OldZ') : Z,
        ('__main__', 'Z')       : Z,
        ('__main__', 'OldZ')    : Z,
        # basically required
        ('copy_reg', '_reconstructor') : copyreg._reconstructor,
        ('__builtin__', 'object')      : object,  # base class argument passed to _reconstructor
        }
    
    def unpickle_find_class(modname, clsname):
        print("DEBUG unpickling: %(modname)s . %(clsname)s" % locals())
        try: 
            return _unpickle_map_safe[(modname, clsname)]
        except KeyError:
            raise pickle.UnpicklingError(
                "%(modname)s . %(clsname)s not allowed" % locals())
    if pickle.__name__ == 'cPickle':  # PY2
        def SafeUnpickler(f):
            u = pickle.Unpickler(f)
            u.find_global = unpickle_find_class
            return u
    else:  # PY3 & Python2-pickle.py
        class SafeUnpickler(pickle.Unpickler):  
            find_class = staticmethod(unpickle_find_class)
    
    def test(fn='./z.pkl'):
        z = OldZ()
        z.b = 'teststring' + sys.version
        with open(fn, 'wb') as f:
            pickle.dump(z, f, 2)
        with open(fn + 'D', 'wb') as f:
            pickle.dump(Dangerous(), f, 2)
        # load again: OldZ is upgraded to Z via the map
        with open(fn, 'rb') as f:
            o = SafeUnpickler(f).load()
        print(pickle, "loaded:", o, o.a, o.b)
        assert o.__class__ is Z
        # Dangerous is not in the map, so loading it must fail
        try:
            with open(fn + 'D', 'rb') as f:
                SafeUnpickler(f).load()
        except pickle.UnpicklingError:
            print('OK: Dangerous not allowed')
        else:
            raise AssertionError('Dangerous was unpickled')
    
    if __name__ == '__main__':
        test()
    