Python: can't pickle module objects error

不思量自难忘° 2020-12-16 13:24

I'm trying to pickle a big class and getting

TypeError: can't pickle module objects

despite looking around the web, I can't e

4 Answers
  •  余生分开走
    2020-12-16 14:00

    Recursively Find Pickle Failure

    Inspired by wump's comment: Python: can't pickle module objects error

    Here is some quick code that helped me find the culprit recursively.

    It checks the object in question to see if it fails pickling.

    If it does fail, it recurses into each value in the object's __dict__ and returns a nested dict containing only the attributes that failed to pickle.

    Code Snippet

    import pickle


    def pickle_trick(obj, max_depth=10):
        """Return a nested dict describing why obj fails to pickle, or {} if it pickles."""
        output = {}

        # Stop recursing once the depth budget is used up
        if max_depth <= 0:
            return output

        try:
            pickle.dumps(obj)
        except (pickle.PicklingError, TypeError) as e:
            failing_children = []

            # Recurse into the attribute values to find which ones fail
            if hasattr(obj, "__dict__"):
                for v in obj.__dict__.values():
                    result = pickle_trick(v, max_depth=max_depth - 1)
                    if result:
                        failing_children.append(result)

            output = {
                "fail": obj,
                "err": e,
                "depth": max_depth,
                "failing_children": failing_children,
            }

        return output
    
    

    Example Program

    import redis
    
    import pickle
    from pprint import pformat as pf
    
    
    def pickle_trick(obj, max_depth=10):
        output = {}
    
        if max_depth <= 0:
            return output
    
        try:
            pickle.dumps(obj)
        except (pickle.PicklingError, TypeError) as e:
            failing_children = []
    
            if hasattr(obj, "__dict__"):
                for k, v in obj.__dict__.items():
                    result = pickle_trick(v, max_depth=max_depth - 1)
                    if result:
                        failing_children.append(result)
    
            output = {
                "fail": obj, 
                "err": e, 
                "depth": max_depth, 
                "failing_children": failing_children
            }
    
        return output
    
    
    if __name__ == "__main__":
        r = redis.Redis()
        print(pf(pickle_trick(r)))
    
    

    Example Output

    $ python3 pickle-trick.py
    {'depth': 10,
     'err': TypeError("can't pickle _thread.lock objects"),
     'fail': Redis>>,
     'failing_children': [{'depth': 9,
                           'err': TypeError("can't pickle _thread.lock objects"),
                           'fail': ConnectionPool>,
                           'failing_children': [{'depth': 8,
                                                 'err': TypeError("can't pickle _thread.lock objects"),
                                                 'fail': ,
                                                 'failing_children': []},
                                                {'depth': 8,
                                                 'err': TypeError("can't pickle _thread.RLock objects"),
                                                 'fail': ,
                                                 'failing_children': []}]},
                          {'depth': 9,
                           'err': PicklingError("Can't pickle  at 0x10c1e8710>: attribute lookup Redis. on redis.client failed"),
                           'fail': {'ACL CAT':  at 0x10c1e89e0>,
                                    'ACL DELUSER': ,
    0x10c1e8170>,
                                    .........
                                    'ZSCORE': },
                           'failing_children': []}]}
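
The nested result shows which objects fail, but not the attribute names that lead to them, because the attribute name is discarded during the recursion. A minimal sketch of a variant that records the attribute path follows. The names `find_unpicklable` and `Holder` are hypothetical and not part of the original snippet, and treating modules as leaves is an assumption made here so the search does not descend into a module's globals.

```python
import json
import pickle
import threading
import types


def find_unpicklable(obj, path="obj", max_depth=10):
    """Return (attribute_path, error) pairs for the deepest attributes that fail to pickle."""
    if max_depth <= 0:
        return []
    try:
        pickle.dumps(obj)
        return []  # pickles fine, nothing to report
    except (pickle.PicklingError, TypeError, AttributeError) as err:
        # Assumption: a module is reported as a leaf instead of being searched.
        if isinstance(obj, types.ModuleType):
            attrs = {}
        else:
            attrs = getattr(obj, "__dict__", {})
        leaves = []
        for name, value in attrs.items():
            leaves.extend(find_unpicklable(value, f"{path}.{name}", max_depth - 1))
        # If no attribute explains the failure, this object itself is the culprit.
        return leaves or [(path, err)]


class Holder:
    """Hypothetical class with one picklable and two unpicklable attributes."""

    def __init__(self):
        self.name = "ok"
        self.lock = threading.Lock()  # thread locks cannot be pickled
        self.codec = json  # module objects cannot be pickled


result = find_unpicklable(Holder())
print([path for path, _ in result])  # ['obj.lock', 'obj.codec']
```

Each returned pair names one attribute to remove or recreate, which is usually easier to act on than the nested dict when the object graph is deep.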
    

    Root Cause - Redis can't pickle _thread.lock

    In my case, creating an instance of Redis that I saved as an attribute of an object broke pickling.

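    When you create an instance of Redis, it also creates a connection_pool that holds thread locks (the _thread.lock and _thread.RLock in the output above), and thread locks cannot be pickled.

    I had to create and clean up the Redis instance within the multiprocessing.Process, rather than holding it as an attribute when the object was pickled.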
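
A rough sketch of that pattern, under stated assumptions: `FakeClient` is a hypothetical stand-in for redis.Redis (it only holds a lock so that it is unpicklable), and `Job` is a hypothetical class, not the author's code. The object keeps plain connection settings, and the client exists only inside `run()`, which is meant to be called in the child process.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for redis.Redis: it holds a lock, so it cannot be pickled."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self._lock = threading.Lock()

    def close(self):
        pass


class Job:
    """Stores only plain connection settings, never the client itself."""

    def __init__(self, host="localhost", port=6379):
        self.host = host
        self.port = port

    def run(self):
        # Meant to be called inside the child process: the client is created,
        # used, and closed here, and never becomes an attribute of the Job.
        client = FakeClient(self.host, self.port)
        try:
            return f"connected to {client.host}:{client.port}"
        finally:
            client.close()


job = Job()
restored = pickle.loads(pickle.dumps(job))  # works: the Job holds no lock
print(restored.run())
```

Because `Job` carries only strings and integers, it survives the pickling that multiprocessing may perform when handing work to a child process, while the unpicklable client is confined to the body of `run()`.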

    Testing

    In my case, the class that I was trying to pickle must be picklable. So I added a unit test that creates an instance of the class and pickles it. That way, if anyone modifies the class so that it can no longer be pickled, breaking its ability to be used in multiprocessing (and pyspark), we will detect the regression straight away.

    import pickle


    def test_can_pickle():
        # Given
        obj = MyClassThatMustPickle()

        # When: pickle.dumps raises if the object is no longer picklable,
        # which fails the test
        pkl = pickle.dumps(obj)

        # Then
        assert isinstance(pkl, bytes)
    
    
