How to pickle a namedtuple instance correctly

可紊 submitted on 2019-11-28 06:43:33

Create the named tuple outside of the function:

from collections import namedtuple
import pickle

P = namedtuple("P", "one two three four")

def pickle_test():
    my_list = []
    abe = P("abraham", "lincoln", "vampire", "hunter")
    my_list.append(abe)
    # Pickle data is binary, so the file must be opened in 'wb' mode
    with open('abe.pickle', 'wb') as f:
        pickle.dump(abe, f)

pickle_test()

Now pickle can find the class, because it is a module global. When unpickling, all the pickle module has to do is locate __main__.P again. In your version, P is local to the pickle_test() function, so it is neither introspectable nor importable.

It is important to remember that namedtuple() is a class factory; you give it parameters and it returns a class object for you to create instances from. pickle only stores the data contained in the instances, plus a string reference to the original class to reconstruct the instances again.
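The difference between a module-level and a function-local namedtuple can be checked directly. The following is a minimal sketch, assuming it is run as a script; Point, make_local_point and can_pickle are illustrative names and do not come from the question.

```python
import pickle
from collections import namedtuple

# Module-level class: pickle can find it again as <module>.Point.
Point = namedtuple("Point", "x y")


def make_local_point():
    # Class created inside a function: it is never bound to a module attribute.
    LocalPoint = namedtuple("LocalPoint", "x y")
    return LocalPoint(1, 2)


def can_pickle(obj):
    """Return True if obj can be pickled, False if pickle cannot locate its class."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        return False
    return True


data = pickle.dumps(Point(1, 2))
restored = pickle.loads(data)

print(restored == Point(1, 2))         # round trip of the module-level class
print(b"Point" in data)                # the class is stored by name, as a string
print(can_pickle(make_local_point()))  # the function-local class cannot be found
```

The class name appears in the pickled bytes because pickle records where to find the class rather than the class itself.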

After I added my question as a comment to the main answer I found a way to solve the problem of making a dynamically created namedtuple pickle-able. This is required in my case because I'm figuring out its fields only at runtime (after a DB query).

All I do is monkey patch the namedtuple by effectively moving it to the __main__ module:

import collections

def _CreateNamedOnMain(*args):
    import __main__
    namedtupleClass = collections.namedtuple(*args)
    setattr(__main__, namedtupleClass.__name__, namedtupleClass)
    namedtupleClass.__module__ = "__main__"
    return namedtupleClass

Keep in mind that the namedtuple name (the first argument in args) might overwrite another member of __main__ if you're not careful.

Ruvalcaba

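I found this answer in another thread. It comes down to naming: the variable the named tuple is assigned to must match the typename passed to namedtuple(), because pickle looks the class up by its typename. This worked for me: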

group_t = namedtuple('group_t', 'field1, field2')             # names match: instances can be pickled
mismatched_group_t = namedtuple('group_t', 'field1, field2')  # names differ: pickling an instance raises an error
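The effect of the mismatch can be observed with a small check. This is a sketch, assuming it is run as a script; the helper pickling_error is an illustrative name, not part of the pickle module.

```python
import pickle
from collections import namedtuple

# Variable name and typename agree: pickle resolves "group_t" to this class.
group_t = namedtuple("group_t", "field1, field2")

# Variable name differs from the typename: pickle looks up "group_t",
# does not get this exact class back, and refuses to pickle the instance.
mismatched_group_t = namedtuple("group_t", "field1, field2")


def pickling_error(obj):
    """Return None if obj pickles cleanly, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        return exc
    return None


ok = pickling_error(group_t(1, 2))
bad = pickling_error(mismatched_group_t(1, 2))

print(ok)                  # no error for the matching name
print(type(bad).__name__)  # an error type for the mismatched name
```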

Alternatively, you can use cloudpickle or dill for serialization; both can serialize a namedtuple class that was created inside a function:

from collections import namedtuple

import cloudpickle
import dill



def dill_test(dynamic_names):
    P = namedtuple('P', dynamic_names)
    my_list = []
    abe = P("abraham", "lincoln", "vampire", "hunter")
    my_list.append(abe)
    with open('deleteme.cloudpickle', 'wb') as f:
        cloudpickle.dump(abe, f)
    with open('deleteme.dill', 'wb') as f:
        dill.dump(abe, f)


dill_test("one two three four")

The issue here is that the child processes aren't able to import the class of the object (in this case, the class P). In a multi-module project, the class P should be importable anywhere the child processes are used.

A quick workaround is to make it importable by assigning it to globals():

globals()["P"] = P
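A minimal sketch of this workaround within a single process, assuming it is run as a script. Record and make_record are illustrative names; the key used in globals() has to equal the typename given to namedtuple().

```python
import pickle
from collections import namedtuple


def make_record(field_names):
    # The class is created at runtime, inside a function.
    Record = namedtuple("Record", field_names)
    # Bind it in this module's globals under its own typename so that
    # pickle can resolve <module>.Record when dumping and loading.
    globals()["Record"] = Record
    return Record(*range(len(Record._fields)))


record = make_record("one two three four")
restored = pickle.loads(pickle.dumps(record))
print(restored == record)
```

The registration only affects the process in which it runs. A child process that starts from a fresh interpreter would presumably need to run the same registration before it unpickles the object.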