Class inheritance in Python 3.7 dataclasses

离开以前 2020-11-28 22:12

I'm currently trying my hand at the new dataclass constructions introduced in Python 3.7. I am stuck trying to inherit from a parent class. It looks

8 answers
  • 2020-11-28 22:29

    You can use attributes with defaults in parent classes if you exclude them from the init function. If you need the possibility to override the default at init, extend the code with the answer of Praveen Kulkarni.

    from dataclasses import dataclass, field
    
    @dataclass
    class Parent:
        name: str
        age: int
        ugly: bool = field(default=False, init=False)
    
    @dataclass
    class Child(Parent):
        school: str
    
    jack = Parent('jack snr', 32)
    jack_son = Child('jack jnr', 12, school = 'havard')
    jack_son.ugly = True
    
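A minimal sketch of what the init=False approach gives you, reusing the classes from the answer above. The `signature` inspection and the try/except are illustrative additions, not part of the original answer: `ugly` disappears from the generated `__init__`, so `school` can remain a required field, but the default can then only be changed after construction.

```python
from dataclasses import dataclass, field
from inspect import signature


@dataclass
class Parent:
    name: str
    age: int
    # init=False keeps `ugly` out of the generated __init__ entirely
    ugly: bool = field(default=False, init=False)


@dataclass
class Child(Parent):
    school: str  # allowed: no __init__ parameter with a default precedes it


# Names of the parameters the generated __init__ accepts
init_params = list(signature(Child).parameters)
print(init_params)

jack_son = Child('jack jnr', 12, school='harvard')
print(jack_son.ugly)  # falls back to the class-level default

# Passing `ugly` to the constructor is rejected, because it is not a parameter
rejected = None
try:
    Child('jack jnr', 12, school='harvard', ugly=True)
except TypeError as exc:
    rejected = str(exc)
    print(rejected)
```

The trade-off is that callers must assign `jack_son.ugly = True` afterwards, which does not work for frozen dataclasses.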
  • 2020-11-28 22:36

    Based on Martijn Pieters' solution, I did the following:

    1) Create a mixin implementing __post_init__:

    from dataclasses import dataclass
    
    no_default = object()
    
    
    @dataclass
    class NoDefaultAttributesPostInitMixin:
    
        def __post_init__(self):
            for key, value in self.__dict__.items():
                if value is no_default:
                    raise TypeError(
                        f"__init__ missing 1 required argument: '{key}'"
                    )
    

    2) Then in the classes with the inheritance problem:

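    from dataclasses import dataclass

    # DataclassWithDefaults stands for any existing dataclass whose fields all have defaults.
    from src.utils import no_default, NoDefaultAttributesPostInitMixin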
    
    @dataclass
    class MyDataclass(DataclassWithDefaults, NoDefaultAttributesPostInitMixin):
        attr1: str = no_default
    

    EDIT:

    After a while I also ran into problems with this solution under mypy; the following code fixes the issue.

    from dataclasses import dataclass
    from typing import TypeVar, Generic, Union
    
    T = TypeVar("T")
    
    
    class NoDefault(Generic[T]):
        ...
    
    
    NoDefaultVar = Union[NoDefault[T], T]
    no_default: NoDefault = NoDefault()
    
    
    @dataclass
    class NoDefaultAttributesPostInitMixin:
        def __post_init__(self):
            for key, value in self.__dict__.items():
                if value is no_default:
                    raise TypeError(f"__init__ missing 1 required argument: '{key}'")
    
    
    @dataclass
    class Parent(NoDefaultAttributesPostInitMixin):
        a: str = ""
    
    @dataclass
    class Child(Parent):
        b: NoDefaultVar[str] = no_default
    
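A self-contained sketch of the typed sentinel mixin from the answer above, with a usage check added for illustration. It assumes the `__post_init__` test compares each value against the `no_default` instance (comparing against the `NoDefault` class would never match), and that `Child` derives from `Parent`.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


class NoDefault(Generic[T]):
    """Marker type meaning 'the caller did not supply a value'."""


NoDefaultVar = Union[NoDefault[T], T]
no_default: NoDefault = NoDefault()


@dataclass
class NoDefaultAttributesPostInitMixin:
    def __post_init__(self):
        # Any attribute still holding the sentinel was not passed to __init__
        for key, value in self.__dict__.items():
            if value is no_default:
                raise TypeError(f"__init__ missing 1 required argument: '{key}'")


@dataclass
class Parent(NoDefaultAttributesPostInitMixin):
    a: str = ""


@dataclass
class Child(Parent):
    b: NoDefaultVar[str] = no_default  # "required", enforced at runtime


child = Child(a="x", b="y")

error = None
try:
    Child(a="x")  # b omitted, so the sentinel survives until __post_init__
except TypeError as exc:
    error = str(exc)
print(error)
```

Note that the check runs only at instantiation time; a static type checker will not flag the missing argument.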
  • 2020-11-28 22:38

    A possible workaround is to use monkey-patching to append the parent's fields:

    import dataclasses as dc
    
    def add_args(parent): 
        def decorator(orig):
            "Append parent's fields AFTER orig's fields"
    
            # Aggregate fields
            ff  = [(f.name, f.type, f) for f in dc.fields(dc.dataclass(orig))]
            ff += [(f.name, f.type, f) for f in dc.fields(dc.dataclass(parent))]
    
            new = dc.make_dataclass(orig.__name__, ff)
            new.__doc__ = orig.__doc__
    
            return new
        return decorator
    
    class Animal:
        age: int = 0 
    
    @add_args(Animal)
    class Dog:
        name: str
        noise: str = "Woof!"
    
    @add_args(Animal)
    class Bird:
        name: str
        can_fly: bool = True
    
    Dog("Dusty", 2)               # --> Dog(name='Dusty', noise=2, age=0)
    b = Bird("Donald", False, 40) # --> Bird(name='Donald', can_fly=False, age=40)
    

    It's also possible to prepend non-default fields, by checking if f.default is dc.MISSING, but this is probably too dirty.

    While monkey-patching lacks some features of inheritance, it can still be used to add methods to all pseudo-child classes.

    For more fine-grained control, set the default values using dc.field(compare=False, repr=True, ...)

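The idea of checking `f.default is dc.MISSING` mentioned in this answer could be sketched roughly as follows. The decorator name `add_args_sorted` and the helper `has_default` are hypothetical, and the sketch takes a slightly different route from prepending: it does a stable sort that moves every field without a default ahead of those with one.

```python
import dataclasses as dc


def add_args_sorted(parent):
    """Hypothetical variant of add_args: required fields first, defaults last."""
    def decorator(orig):
        specs = [(f.name, f.type, f) for f in dc.fields(dc.dataclass(orig))]
        specs += [(f.name, f.type, f) for f in dc.fields(dc.dataclass(parent))]

        def has_default(spec):
            f = spec[2]
            return (f.default is not dc.MISSING
                    or f.default_factory is not dc.MISSING)

        # sorted() is stable, so definition order is kept within each group
        ordered = sorted(specs, key=has_default)

        new = dc.make_dataclass(orig.__name__, ordered)
        new.__doc__ = orig.__doc__
        return new
    return decorator


class Animal:
    age: int  # no default, unlike the example above


@add_args_sorted(Animal)
class Dog:
    name: str
    noise: str = "Woof!"


dog = Dog("Dusty", 2)
field_names = [f.name for f in dc.fields(Dog)]
print(field_names, dog)
```

As with the original decorator, the result is not a real subclass of `Animal`, so `isinstance` checks against the parent will fail.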
  • 2020-11-28 22:41

    I came back to this question after discovering that dataclasses may be getting a decorator parameter that allows fields to be reordered. This is certainly a promising development, though development on this feature seems to have stalled somewhat.

    Right now, you can get this behaviour, plus some other niceties, by using dataclassy, my reimplementation of dataclasses that overcomes frustrations like this. Using from dataclassy in place of from dataclasses in the original example means it runs without errors.

    Using inspect to print the signature of Child makes what is going on clear; the result is (name: str, age: int, school: str, ugly: bool = True). Fields are always reordered so that fields with default values come after fields without them in the parameters to the initializer. Both lists (fields without defaults, and those with them) are still ordered in definition order.

    Coming face to face with this issue was one of the factors that prompted me to write a replacement for dataclasses. The workarounds detailed here, while helpful, require code to be contorted to such an extent that they completely negate the readability advantage dataclasses' naive approach (whereby field ordering is trivially predictable) offers.

  • You can use a modified version of dataclasses, which will generate a keyword-only __init__ method:

    import dataclasses
    
    
    def _init_fn(fields, frozen, has_post_init, self_name):
        # fields contains both real fields and InitVar pseudo-fields.
        globals = {'MISSING': dataclasses.MISSING,
                   '_HAS_DEFAULT_FACTORY': dataclasses._HAS_DEFAULT_FACTORY}
    
        body_lines = []
        for f in fields:
            line = dataclasses._field_init(f, frozen, globals, self_name)
            # line is None means that this field doesn't require
            # initialization (it's a pseudo-field).  Just skip it.
            if line:
                body_lines.append(line)
    
        # Does this class have a post-init function?
        if has_post_init:
            params_str = ','.join(f.name for f in fields
                                  if f._field_type is dataclasses._FIELD_INITVAR)
            body_lines.append(f'{self_name}.{dataclasses._POST_INIT_NAME}({params_str})')
    
        # If no body lines, use 'pass'.
        if not body_lines:
            body_lines = ['pass']
    
        locals = {f'_type_{f.name}': f.type for f in fields}
        return dataclasses._create_fn('__init__',
                          [self_name, '*'] + [dataclasses._init_param(f) for f in fields if f.init],
                          body_lines,
                          locals=locals,
                          globals=globals,
                          return_type=None)
    
    
    def add_init(cls, frozen):
        fields = getattr(cls, dataclasses._FIELDS)
    
        # Does this class have a post-init function?
        has_post_init = hasattr(cls, dataclasses._POST_INIT_NAME)
    
        # Include InitVars and regular fields (so, not ClassVars).
        flds = [f for f in fields.values()
                if f._field_type in (dataclasses._FIELD, dataclasses._FIELD_INITVAR)]
        dataclasses._set_new_attribute(cls, '__init__',
                           _init_fn(flds,
                                    frozen,
                                    has_post_init,
                                    # The name to use for the "self"
                                    # param in __init__.  Use "self"
                                    # if possible.
                                    '__dataclass_self__' if 'self' in fields
                                    else 'self',
                                    ))
    
        return cls
    
    
    # a dataclass with a constructor that only takes keyword arguments
    def dataclass_keyword_only(_cls=None, *, repr=True, eq=True, order=False,
                  unsafe_hash=False, frozen=False):
        def wrap(cls):
            cls = dataclasses.dataclass(
                cls, init=False, repr=repr, eq=eq, order=order, unsafe_hash=unsafe_hash, frozen=frozen)
            return add_init(cls, frozen)
    
        # See if we're being called as @dataclass or @dataclass().
        if _cls is None:
            # We're called with parens.
            return wrap
    
        # We're called as @dataclass without parens.
        return wrap(_cls)
    

    (also posted as a gist, tested with Python 3.6 backport)

    This requires defining the child class as

    @dataclass_keyword_only
    class Child(Parent):
        school: str
        ugly: bool = True
    

    This generates __init__(self, *, name:str, age:int, ugly:bool=True, school:str) (which is valid Python). The only caveat is that objects can no longer be initialized with positional arguments, but otherwise it's a completely regular dataclass with no ugly hacks.

  • 2020-11-28 22:44

    The way dataclasses combines attributes prevents you from being able to use attributes with defaults in a base class and then use attributes without a default (positional attributes) in a subclass.

    That's because the attributes are combined by starting from the bottom of the MRO, and building up an ordered list of the attributes in first-seen order; overrides are kept in their original location. So Parent starts out with ['name', 'age', 'ugly'], where ugly has a default, and then Child adds ['school'] to the end of that list (with ugly already in the list). This means you end up with ['name', 'age', 'ugly', 'school'] and because school doesn't have a default, this results in an invalid argument listing for __init__.

    This is documented in PEP-557 Dataclasses, under inheritance:

    When the Data Class is being created by the @dataclass decorator, it looks through all of the class's base classes in reverse MRO (that is, starting at object) and, for each Data Class that it finds, adds the fields from that base class to an ordered mapping of fields. After all of the base class fields are added, it adds its own fields to the ordered mapping. All of the generated methods will use this combined, calculated ordered mapping of fields. Because the fields are in insertion order, derived classes override base classes.

    and under Specification:

    TypeError will be raised if a field without a default value follows a field with a default value. This is true either when this occurs in a single class, or as a result of class inheritance.
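A short sketch of the two rules quoted above, using the Parent class from the question. The class name `ChildWithDefault` is made up for illustration. The first subclass fails at class-creation time; the second shows that an overridden field keeps its original position in the field list.

```python
from dataclasses import dataclass, fields


@dataclass
class Parent:
    name: str
    age: int
    ugly: bool = False


# A required field after an inherited default: @dataclass itself raises
message = None
try:
    @dataclass
    class Child(Parent):
        school: str
except TypeError as exc:
    message = str(exc)
print(message)


# Overriding `ugly` does not move it; new fields are appended after it
@dataclass
class ChildWithDefault(Parent):
    school: str = "unknown"
    ugly: bool = True


field_names = [f.name for f in fields(ChildWithDefault)]
print(field_names)
```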

    You do have a few options here to avoid this issue.

    The first option is to use separate base classes to force fields with defaults into a later position in the MRO order. At all costs, avoid setting fields directly on classes that are to be used as base classes, such as Parent.

    The following class hierarchy works:

    from dataclasses import dataclass

    # base classes with fields; fields without defaults separate from fields with.
    @dataclass
    class _ParentBase:
        name: str
        age: int
    
    @dataclass
    class _ParentDefaultsBase:
        ugly: bool = False
    
    @dataclass
    class _ChildBase(_ParentBase):
        school: str
    
    @dataclass
    class _ChildDefaultsBase(_ParentDefaultsBase):
        ugly: bool = True
    
    # public classes, deriving from base-with, base-without field classes
    # subclasses of public classes should list their own defaults base first,
    # then the public base class, so the subclass defaults are collected last.
    
    @dataclass
    class Parent(_ParentDefaultsBase, _ParentBase):
        def print_name(self):
            print(self.name)
    
        def print_age(self):
            print(self.age)
    
        def print_id(self):
            print(f"The Name is {self.name} and {self.name} is {self.age} year old")
    
    @dataclass
    class Child(_ChildDefaultsBase, Parent, _ChildBase):
        pass
    

    By splitting the fields into separate base classes, one for fields without defaults and one for fields with defaults, and carefully choosing the inheritance order, you can produce an MRO that puts all fields without defaults before those with defaults. The reversed MRO (ignoring object) for Child is:

    _ParentBase
    _ChildBase
    _ParentDefaultsBase
    Parent
    _ChildDefaultsBase
    

    Note that while Parent doesn't set any new fields, it does carry the fields it inherited from _ParentDefaultsBase, so it must not end up 'last' in the field listing order; if it did, its ugly = False would replace the override from _ChildDefaultsBase. The order above puts _ChildDefaultsBase last, so its default wins. The classes with fields without defaults (_ParentBase and _ChildBase) still precede the classes with fields with defaults (_ParentDefaultsBase and _ChildDefaultsBase).
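The effect of the base-class order can be checked directly. The sketch below is an illustration under the assumption that `@dataclass` copies each base's complete field mapping while walking the reverse MRO. It builds the hierarchy twice, using the hypothetical names `ChildParentFirst` and `ChildDefaultsFirst`, and compares which `ugly` default ends up in the generated `__init__`.

```python
from dataclasses import dataclass, fields


@dataclass
class _ParentBase:
    name: str
    age: int


@dataclass
class _ParentDefaultsBase:
    ugly: bool = False


@dataclass
class _ChildBase(_ParentBase):
    school: str


@dataclass
class _ChildDefaultsBase(_ParentDefaultsBase):
    ugly: bool = True


@dataclass
class Parent(_ParentDefaultsBase, _ParentBase):
    pass


# Parent listed first: Parent is visited last and re-adds ugly=False
@dataclass
class ChildParentFirst(Parent, _ChildDefaultsBase, _ChildBase):
    pass


# _ChildDefaultsBase listed first: its ugly=True is visited last
@dataclass
class ChildDefaultsFirst(_ChildDefaultsBase, Parent, _ChildBase):
    pass


def collection_order(cls):
    """Bases in the order @dataclass visits them (reverse MRO, minus object)."""
    return [b.__name__ for b in cls.__mro__[-1:0:-1] if b is not object]


def field_defaults(cls):
    return {f.name: f.default for f in fields(cls)}


for cls in (ChildParentFirst, ChildDefaultsFirst):
    print(cls.__name__, collection_order(cls), field_defaults(cls)["ugly"])
```

Both variants have the field order name, age, school, ugly; only the second keeps the subclass default of True.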

    The result is Parent and Child classes with a sane field order, while Child is still a subclass of Parent:

    >>> from inspect import signature
    >>> signature(Parent)
    <Signature (name: str, age: int, ugly: bool = False) -> None>
    >>> signature(Child)
    <Signature (name: str, age: int, school: str, ugly: bool = True) -> None>
    >>> issubclass(Child, Parent)
    True
    

    and so you can create instances of both classes:

    >>> jack = Parent('jack snr', 32, ugly=True)
    >>> jack_son = Child('jack jnr', 12, school='havard', ugly=True)
    >>> jack
    Parent(name='jack snr', age=32, ugly=True)
    >>> jack_son
    Child(name='jack jnr', age=12, school='havard', ugly=True)
    

    Another option is to only use fields with defaults; you can still make it an error to not supply a school value, by raising one in __post_init__:

    _no_default = object()
    
    @dataclass
    class Child(Parent):
        school: str = _no_default
        ugly: bool = True
    
        def __post_init__(self):
            if self.school is _no_default:
                raise TypeError("__init__ missing 1 required argument: 'school'")
    

    but this does alter the field order; school ends up after ugly:

    <Signature (name: str, age: int, ugly: bool = True, school: str = <object object at 0x1101d1210>) -> None>
    

    and a type hint checker will complain about _no_default not being a string.

    You can also use the attrs project, which was the project that inspired dataclasses. It uses a different inheritance merging strategy; it pulls overridden fields in a subclass to the end of the fields list, so ['name', 'age', 'ugly'] in the Parent class becomes ['name', 'age', 'school', 'ugly'] in the Child class; by overriding the field with a default, attrs allows the override without needing to do an MRO dance.

    attrs supports defining fields without type hints, but let's stick to the supported type hinting mode by setting auto_attribs=True:

    import attr
    
    @attr.s(auto_attribs=True)
    class Parent:
        name: str
        age: int
        ugly: bool = False
    
        def print_name(self):
            print(self.name)
    
        def print_age(self):
            print(self.age)
    
        def print_id(self):
            print(f"The Name is {self.name} and {self.name} is {self.age} year old")
    
    @attr.s(auto_attribs=True)
    class Child(Parent):
        school: str
        ugly: bool = True
    