Python dataclass from a nested dict

孤街浪徒 2020-12-22 23:38

The standard library in 3.7 can recursively convert a dataclass into a dict (example from the docs):

from dataclasses import dataclass, asdict
from typing im         


        
10 answers
  •  执念已碎
    2020-12-23 00:03

    If your goal is to convert existing, predefined dataclasses to JSON and back, just write custom encoder and decoder hooks. Do not use dataclasses.asdict() here; instead, record in the JSON a (safe) reference to the original dataclass.

    jsonpickle is not safe because it stores references to arbitrary Python objects and passes data to their constructors. With such references I can get jsonpickle to reference internal Python data structures and create and execute functions, classes and modules at will. But that doesn't mean you can't handle such references safely. Just make sure that you only import (not call) the referenced object, and then verify that it is an actual dataclass type before you use it.

    The framework can be made generic while still being limited to JSON-serialisable types plus dataclass instances:

    import dataclasses
    import importlib
    import sys
    
    def dataclass_object_dump(ob):
        """json `default` hook: encode a dataclass instance as a dict plus a class reference."""
        datacls = type(ob)
        if not dataclasses.is_dataclass(datacls):
            raise TypeError(f"Expected dataclass instance, got '{datacls!r}' object")
        # Only emit references that can be resolved again on loading.
        mod = sys.modules.get(datacls.__module__)
        if mod is None or not hasattr(mod, datacls.__qualname__):
            raise ValueError(f"Can't resolve '{datacls!r}' reference")
        ref = f"{datacls.__module__}.{datacls.__qualname__}"
        fields = (f.name for f in dataclasses.fields(ob))
        # Shallow copy of the fields; json calls this hook again for nested dataclasses.
        return {**{f: getattr(ob, f) for f in fields}, '__dataclass__': ref}
    
    def dataclass_object_load(d):
        """json `object_hook`: rebuild a dataclass instance from a dict carrying a class reference."""
        ref = d.pop('__dataclass__', None)
        if ref is None:
            return d  # plain JSON object without a class hint
        try:
            modname, _, qualname = ref.rpartition('.')
            module = importlib.import_module(modname)
            datacls = getattr(module, qualname)
            # Import only; never call anything until it is confirmed to be a dataclass type.
            if not dataclasses.is_dataclass(datacls) or not isinstance(datacls, type):
                raise ValueError
            return datacls(**d)
        except (ModuleNotFoundError, ValueError, AttributeError, TypeError):
            raise ValueError(f"Invalid dataclass reference {ref!r}") from None
    

    This uses JSON-RPC-style class hints to name the dataclass; on loading, the reference is verified to still be a dataclass with the same fields. No type checking is done on the values of the fields (that's a whole different kettle of fish).
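To make the caveat about field values concrete, here is a small sketch. Point is a hypothetical two-field dataclass mirroring the one in the example output; the standard library does not enforce annotations at runtime, so a mismatched value passes through untouched, while a dict with an unexpected key is rejected by the generated __init__ with a TypeError.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical stand-in for the dataclass being loaded
    x: int
    y: int


# Annotations are not enforced at runtime: wrong value types are accepted as-is.
loose = Point(**{"x": "not an int", "y": None})

# A dict whose keys do not match the fields is rejected by the generated __init__.
try:
    Point(**{"x": 1, "y": 2, "z": 3})
    rejected = False
except TypeError:
    rejected = True
```

If value validation matters, it has to be layered on top, for example in __post_init__ or with a separate validation library.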

    Use dataclass_object_dump and dataclass_object_load as the default and object_hook arguments to json.dump[s]() and json.load[s]() respectively:

    >>> print(json.dumps(c, default=dataclass_object_dump, indent=4))
    {
        "mylist": [
            {
                "x": 0,
                "y": 0,
                "__dataclass__": "__main__.Point"
            },
            {
                "x": 10,
                "y": 4,
                "__dataclass__": "__main__.Point"
            }
        ],
        "__dataclass__": "__main__.C"
    }
    >>> json.loads(json.dumps(c, default=dataclass_object_dump), object_hook=dataclass_object_load)
    C(mylist=[Point(x=0, y=0), Point(x=10, y=4)])
    >>> json.loads(json.dumps(c, default=dataclass_object_dump), object_hook=dataclass_object_load) == c
    True
    

    or create instances of the JSONEncoder and JSONDecoder classes with those same hooks.
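A minimal sketch of the JSONEncoder / JSONDecoder variant. It assumes Point and C are defined as in the dataclasses documentation example that the question quotes (the field types are taken from that example, not from this page), and it repeats the two hooks in condensed form only so that the block runs on its own.

```python
import dataclasses
import importlib
import json
import sys
from dataclasses import dataclass
from typing import List


def dataclass_object_dump(ob):
    # Condensed copy of the dump hook: dataclass fields plus a class reference.
    datacls = type(ob)
    if not dataclasses.is_dataclass(datacls):
        raise TypeError(f"Expected dataclass instance, got '{datacls!r}' object")
    mod = sys.modules.get(datacls.__module__)
    if mod is None or not hasattr(mod, datacls.__qualname__):
        raise ValueError(f"Can't resolve '{datacls!r}' reference")
    ref = f"{datacls.__module__}.{datacls.__qualname__}"
    fields = {f.name: getattr(ob, f.name) for f in dataclasses.fields(ob)}
    return {**fields, "__dataclass__": ref}


def dataclass_object_load(d):
    # Condensed copy of the load hook: resolve, verify, then instantiate.
    ref = d.pop("__dataclass__", None)
    if ref is None:
        return d
    try:
        modname, _, qualname = ref.rpartition(".")
        datacls = getattr(importlib.import_module(modname), qualname)
        if not dataclasses.is_dataclass(datacls) or not isinstance(datacls, type):
            raise ValueError
        return datacls(**d)
    except (ModuleNotFoundError, ValueError, AttributeError, TypeError):
        raise ValueError(f"Invalid dataclass reference {ref!r}") from None


@dataclass
class Point:
    x: int
    y: int


@dataclass
class C:
    mylist: List[Point]


# Bind the hooks once and reuse the configured objects.
encoder = json.JSONEncoder(default=dataclass_object_dump)
decoder = json.JSONDecoder(object_hook=dataclass_object_load)

c = C([Point(0, 0), Point(10, 4)])
text = encoder.encode(c)
restored = decoder.decode(text)
```

Configured encoder and decoder objects are convenient when the same hooks are needed in several places, since callers no longer have to pass default and object_hook on every call.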

    Instead of using fully qualified module and class names, you could also use a separate registry that maps permissible type names to dataclasses; check against the registry on encoding, and again on decoding, so that you don't forget to register dataclasses as you develop.
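The registry idea could look roughly like this. The names DATACLASS_REGISTRY, register, registry_dump and registry_load are hypothetical, as is the choice of the bare class name as the key; the point is only that both directions consult the same allow-list, so nothing is imported based on the JSON input.

```python
import dataclasses
import json
from dataclasses import dataclass
from typing import List

# Hypothetical allow-list: type name stored in the JSON -> dataclass type.
DATACLASS_REGISTRY = {}


def register(cls):
    """Class decorator that adds a dataclass type to the allow-list."""
    if not (isinstance(cls, type) and dataclasses.is_dataclass(cls)):
        raise TypeError(f"{cls!r} is not a dataclass type")
    DATACLASS_REGISTRY[cls.__name__] = cls
    return cls


def registry_dump(ob):
    # Checking on encoding surfaces unregistered dataclasses during development.
    name = type(ob).__name__
    if DATACLASS_REGISTRY.get(name) is not type(ob):
        raise TypeError(f"{type(ob)!r} is not a registered dataclass")
    fields = {f.name: getattr(ob, f.name) for f in dataclasses.fields(ob)}
    return {**fields, "__dataclass__": name}


def registry_load(d):
    name = d.pop("__dataclass__", None)
    if name is None:
        return d  # plain JSON object without a class hint
    # Only names present in the registry are accepted; nothing is imported.
    datacls = DATACLASS_REGISTRY.get(name) if isinstance(name, str) else None
    if datacls is None:
        raise ValueError(f"Unknown dataclass name {name!r}")
    try:
        return datacls(**d)
    except TypeError:
        raise ValueError(f"Fields do not match dataclass {name!r}") from None


@register
@dataclass
class Point:
    x: int
    y: int


@register
@dataclass
class C:
    mylist: List[Point]


@dataclass
class Unregistered:  # deliberately left out of the registry
    value: int


c = C([Point(0, 0), Point(10, 4)])
text = json.dumps(c, default=registry_dump)
restored = json.loads(text, object_hook=registry_load)
```

Keying on the bare class name keeps the JSON independent of module layout, at the cost of requiring unique names across all registered dataclasses.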
