Create a class that supports JSON serialization for use with Celery

被刻印的时光 ゝ, submitted on 2021-01-27 13:29:02

Question


I'm using Celery to run some background tasks. One of the tasks returns an instance of a Python class I created. I want to use JSON to serialize and deserialize this class, given the warnings about using pickle.

Is there a simple built-in way to achieve this?

The class is very simple: it contains three attributes, all of which are lists of named tuples, and a couple of methods that perform some calculations on the attributes.

My idea is to serialize/deserialize the three attributes, since they define the class.

This is my idea for the encoder, but I'm not sure how to decode the data again:

import json

class JSONSerializable(object):
    def __repr__(self):
        return json.dumps(self.__dict__)

class MySimpleClass(JSONSerializable):
    def __init__(self, p1, p2, p3): # I only care about p1, p2, p3
        self.p1 = p1
        self.p2 = p2
        self.p3 = p3
        self.abc = p1 + p2 + p3

    def some_calc(self):
        ...

Answer 1:


First, and no less important: the warnings against pickle mainly apply when third parties could inject pickled data into your worker stream. If you are certain that your own system creates all the pickled data it consumes, there is no security problem. As for compatibility, it is relatively easy to handle, and automatic if the producers and consumers of your pickled data run the same Python version.
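To make the risk concrete: unpickling can call any callable that the payload names. The following is a minimal and deliberately harmless sketch; the class name `Payload` is made up for illustration, and `sorted` stands in for whatever function an attacker would actually pick.

```python
import pickle


class Payload:
    """Hypothetical class whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        # A hostile payload could name a far more dangerous callable here.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading the bytes runs the callable; no Payload instance comes back.
result = pickle.loads(blob)
print(result)  # [1, 2, 3]
```

This is why the origin of the pickled bytes matters more than their content: the receiver executes whatever the sender asked for.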

That said, for JSON you have to create subclasses of Python's json.JSONEncoder and json.JSONDecoder, each of which must be passed as the cls argument to all your json.dump(s) and json.load(s) calls.

One suggestion is for the encoder's default method to return a dictionary holding the class's __module__, its __name__, and an identifier key, say __custom__, which marks the dictionary for custom decoding, plus the object's data under a "data" key.

In the decoder, you check for the __custom__ key, then instantiate the class using its __new__ method and populate the instance's __dict__. As with pickle, side effects triggered by the class's __init__ won't run.
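The effect of going through __new__ can be seen in isolation. This is a small sketch with a hypothetical class `Noisy`, whose `__init__` has a side effect (a counter) that we can observe:

```python
class Noisy:
    """Hypothetical class whose __init__ has an observable side effect."""

    init_calls = 0

    def __init__(self, value):
        Noisy.init_calls += 1
        self.value = value


# Allocate an instance without running __init__ ...
restored = Noisy.__new__(Noisy)
# ... then restore its attributes from the decoded "data" dictionary.
restored.__dict__.update({"value": 42})

print(restored.value)    # 42
print(Noisy.init_calls)  # 0
```

If a class relies on `__init__` to compute derived attributes or validate input, that work is skipped too, so such classes need their own decoding logic.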

You can later on enhance your decoder and encoder so that, for example, they search the class for a __json_encode__ method that could handle only the desired attributes.

Sample implementation:

import json
import sys


class GenericJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        # json calls default() only for objects it cannot serialize natively.
        try:
            return super().default(obj)
        except TypeError:
            pass
        cls = type(obj)
        result = {
            '__custom__': True,  # marker checked by the decoder
            '__module__': cls.__module__,
            '__name__': cls.__name__,
            # Use the optional __json_encode__ method when the class defines one.
            'data': obj.__json_encode__() if hasattr(cls, '__json_encode__') else obj.__dict__
        }
        return result


class GenericJSONDecoder(json.JSONDecoder):
    def decode(self, s):
        result = super().decode(s)
        # Only the top-level object is inspected here.
        if not isinstance(result, dict) or not result.get('__custom__', False):
            return result
        module = result['__module__']
        if module not in sys.modules:
            __import__(module)
        cls = getattr(sys.modules[module], result['__name__'])
        if hasattr(cls, '__json_decode__'):
            return cls.__json_decode__(result['data'])
        # Bypass __init__ and restore the saved attributes directly.
        instance = cls.__new__(cls)
        instance.__dict__.update(result['data'])
        return instance

Interactive test on the console:

In [36]: class A:
    ...:     def __init__(self, a):
    ...:         self.a = a
    ...:         

In [37]: a = A('test')

In [38]: b = json.loads(json.dumps(a, cls=GenericJSONEncoder),  cls=GenericJSONDecoder)

In [39]: b.a
Out[39]: 'test'
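For the class in the question, the `__json_encode__` / `__json_decode__` hooks could look roughly like the sketch below. Several things here are assumptions and not part of the answer: the named tuple `Point`, the `REGISTRY` dictionary with its `register` decorator (an explicit list of decodable classes, used in place of importing by module name), and the use of the `default=` and `object_hook=` arguments of the json module in place of encoder/decoder subclasses, purely to keep the example short.

```python
import json
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical named tuple

REGISTRY = {}  # class name -> class; only registered classes are decoded


def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls


@register
class MySimpleClass:
    def __init__(self, p1, p2, p3):
        self.p1, self.p2, self.p3 = p1, p2, p3
        self.abc = p1 + p2 + p3  # derived, so not worth serializing

    def __json_encode__(self):
        # Serialize only the three defining attributes.
        return {"p1": self.p1, "p2": self.p2, "p3": self.p3}

    @classmethod
    def __json_decode__(cls, data):
        # JSON turns named tuples into plain lists, so rebuild them here.
        p1, p2, p3 = ([Point(*item) for item in data[key]]
                      for key in ("p1", "p2", "p3"))
        return cls(p1, p2, p3)


def encode_custom(obj):
    """`default` hook: called by json for objects it cannot serialize."""
    cls = type(obj)
    if cls.__name__ not in REGISTRY:
        raise TypeError(f"{cls.__name__} is not JSON serializable")
    return {"__custom__": True, "__name__": cls.__name__,
            "data": obj.__json_encode__()}


def decode_custom(dct):
    """`object_hook`: called by json for every decoded JSON object."""
    if dct.get("__custom__"):
        return REGISTRY[dct["__name__"]].__json_decode__(dct["data"])
    return dct


original = MySimpleClass([Point(1, 2)], [Point(3, 4)], [Point(5, 6)])
text = json.dumps(original, default=encode_custom)
restored = json.loads(text, object_hook=decode_custom)
print(restored.p3)                   # [Point(x=5, y=6)]
print(restored.abc == original.abc)  # True
```

Because `__json_decode__` goes through the normal constructor, the derived attribute `abc` is recomputed and never has to travel over the wire.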



Answer 2:


Here is an improved version of the great solution provided by @jsbueno which also works with nested custom types.

import json
from collections.abc import Iterable


def is_iterable(arg):
    # Strings are iterable too, but must be treated as scalar values here.
    return isinstance(arg, Iterable) and not isinstance(arg, str)


class GenericJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return super().default(obj)
        except TypeError:
            pass
        cls = type(obj)
        result = {
            '__custom__': True,
            '__module__': cls.__module__,
            '__name__': cls.__name__,
            'data': obj.__dict__ if not hasattr(cls, '__json_encode__') else obj.__json_encode__()
        }
        return result


class GenericJSONDecoder(json.JSONDecoder):
    def decode(self, s):
        result = super().decode(s)
        return GenericJSONDecoder.instantiate_object(result)

    @staticmethod
    def instantiate_object(result):
        if not isinstance(result, dict):
            if is_iterable(result):
                return [GenericJSONDecoder.instantiate_object(v) for v in result]
            else:
                return result

        if not result.get('__custom__', False):
            return {k: GenericJSONDecoder.instantiate_object(v) for k, v in result.items()}

        import sys
        module = result['__module__']
        if module not in sys.modules:
            __import__(module)
        cls = getattr(sys.modules[module], result['__name__'])
        if hasattr(cls, '__json_decode__'):
            return cls.__json_decode__(result['data'])
        instance = cls.__new__(cls)
        data = {k: GenericJSONDecoder.instantiate_object(v) for k, v in result['data'].items()}
        instance.__dict__.update(data)
        return instance


class C:

    def __init__(self):
        self.c = 133

    def __repr__(self):
        return "C<" + str(self.__dict__) + ">"


class B:

    def __init__(self):
        self.b = {'int': 123, "c": C()}
        self.l = [123, C()]
        self.t = (234, C())
        self.s = "Blah"

    def __repr__(self):
        return "B<" + str(self.__dict__) + ">"


class A:
    class_y = 13

    def __init__(self):
        self.x = B()

    def __repr__(self):
        return "A<" + str(self.__dict__) + ">"


def dumps(obj, *args, **kwargs):
    return json.dumps(obj, *args, cls=GenericJSONEncoder, **kwargs)


def dump(obj, *args, **kwargs):
    return json.dump(obj, *args, cls=GenericJSONEncoder, **kwargs)


def loads(obj, *args, **kwargs):
    return json.loads(obj, *args, cls=GenericJSONDecoder, **kwargs)


def load(obj, *args, **kwargs):
    return json.load(obj, *args, cls=GenericJSONDecoder, **kwargs)

Check it out:

e = dumps(A())
print("ENCODED:\n\n", e)
b = loads(e)  # same as json.loads(e, cls=GenericJSONDecoder)
print("\nDECODED:\n\n", b)

Prints (decoded part):

 A<{'x': B<{'b': {'int': 123, 'c': C<{'c': 133}>}, 'l': [123, C<{'c': 133}>], 't': [234, C<{'c': 133}>], 's': 'Blah'}>}>

The original version reconstructs only the top-level A correctly; the nested instances of B and C are not instantiated and are left as dicts:

A<{'x': {'__custom__': True, '__module__': '__main__', '__name__': 'B', 'data': {'b': {'int': 123, 'c': {'__custom__': True, '__module__': '__main__', '__name__': 'C', 'data': {'c': 133}}}, 'l': [123, {'__custom__': True, '__module__': '__main__', '__name__': 'C', 'data': {'c': 133}}], 't': [234, {'__custom__': True, '__module__': '__main__', '__name__': 'C', 'data': {'c': 133}}], 's': 'Blah'}}}>

Note that if the type contains a collection such as a list or tuple, the actual type of the collection cannot be restored during decoding. This is because all those collections are converted into lists when encoded to JSON.
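This matters for the question, since named tuples are tuples as well. A short sketch of the behavior, using a hypothetical named tuple `Point` and a recording `default` hook to show that the json module never asks the hook about tuples:

```python
import json
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical named tuple

calls = []


def default(obj):
    # Record any object that json hands to the fallback hook.
    calls.append(obj)
    raise TypeError(f"cannot serialize {type(obj).__name__}")


text = json.dumps({"t": (1, 2), "p": Point(3, 4)}, default=default)
print(text)   # {"t": [1, 2], "p": [3, 4]}
print(calls)  # [] (the hook never saw the tuples)

decoded = json.loads(text)
print(decoded["p"])          # [3, 4]
print(Point(*decoded["p"]))  # Point(x=3, y=4), rebuilt by hand
```

So the original tuple type has to be restored by code that knows what to expect, for example in a class-specific decode method.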



Source: https://stackoverflow.com/questions/43092113/create-a-class-that-support-json-serialization-for-use-with-celery
