Class Inheritance in Python 3.7 Dataclasses

Class inheritance in Python 3.7 dataclasses

The way dataclasses combines attributes prevents you from being able to use attributes with defaults in a base class and then use attributes without a default (positional attributes) in a subclass.

That's because the attributes are combined by starting from the bottom of the MRO, and building up an ordered list of the attributes in first-seen order; overrides are kept in their original location. So Parent starts out with ['name', 'age', 'ugly'], where ugly has a default, and then Child adds ['school'] to the end of that list (with ugly already in the list). This means you end up with ['name', 'age', 'ugly', 'school'] and because school doesn't have a default, this results in an invalid argument listing for __init__.

This is documented in PEP-557 Dataclasses, under inheritance:

When the Data Class is being created by the @dataclass decorator, it looks through all of the class's base classes in reverse MRO (that is, starting at object) and, for each Data Class that it finds, adds the fields from that base class to an ordered mapping of fields. After all of the base class fields are added, it adds its own fields to the ordered mapping. All of the generated methods will use this combined, calculated ordered mapping of fields. Because the fields are in insertion order, derived classes override base classes.

and under Specification:

TypeError will be raised if a field without a default value follows a field with a default value. This is true either when this occurs in a single class, or as a result of class inheritance.

You do have a few options here to avoid this issue.

The first option is to use separate base classes to force fields with defaults into a later position in the MRO order. At all cost, avoid setting fields directly on classes that are to be used as base classes, such as Parent.

The following class hierarchy works:

# base classes with fields; fields without defaults separate from fields with.
@dataclass
class _ParentBase:
name: str
age: int

@dataclass
class _ParentDefaultsBase:
ugly: bool = False

@dataclass
class _ChildBase(_ParentBase):
school: str

@dataclass
class _ChildDefaultsBase(_ParentDefaultsBase):
ugly: bool = True

# public classes, deriving from base-with, base-without field classes
# subclasses of public classes should put the public base class up front.

@dataclass
class Parent(_ParentDefaultsBase, _ParentBase):
def print_name(self):
print(self.name)

def print_age(self):
print(self.age)

def print_id(self):
print(f"The Name is {self.name} and {self.name} is {self.age} year old")

@dataclass
class Child(_ChildDefaultsBase, Parent, _ChildBase):
pass

By pulling out fields into separate base classes with fields without defaults and fields with defaults, and a carefully selected inheritance order, you can produce an MRO that puts all fields without defaults before those with defaults. The reversed MRO (ignoring object) for Child is:

_ParentBase
_ChildBase
_ParentDefaultsBase
Parent
_ChildDefaultsBase

Note that while Parent doesn't set any new fields, it does inherit the fields from _ParentDefaultsBase and should not end up 'last' in the field listing order; the above order puts _ChildDefaultsBase last so its fields 'win'. The dataclass rules are also satisfied; the classes with fields without defaults (_ParentBase and _ChildBase) precede the classes with fields with defaults (_ParentDefaultsBase and _ChildDefaultsBase).

The result is Parent and Child classes with a sane field older, while Child is still a subclass of Parent:

>>> from inspect import signature
>>> signature(Parent)
<Signature (name: str, age: int, ugly: bool = False) -> None>
>>> signature(Child)
<Signature (name: str, age: int, school: str, ugly: bool = True) -> None>
>>> issubclass(Child, Parent)
True

and so you can create instances of both classes:

>>> jack = Parent('jack snr', 32, ugly=True)
>>> jack_son = Child('jack jnr', 12, school='havard', ugly=True)
>>> jack
Parent(name='jack snr', age=32, ugly=True)
>>> jack_son
Child(name='jack jnr', age=12, school='havard', ugly=True)

Another option is to only use fields with defaults; you can still make in an error to not supply a school value, by raising one in __post_init__:

_no_default = object()

@dataclass
class Child(Parent):
school: str = _no_default
ugly: bool = True

def __post_init__(self):
if self.school is _no_default:
raise TypeError("__init__ missing 1 required argument: 'school'")

but this does alter the field order; school ends up after ugly:

<Signature (name: str, age: int, ugly: bool = True, school: str = <object object at 0x1101d1210>) -> None>

and a type hint checker will complain about _no_default not being a string.

You can also use the attrs project, which was the project that inspired dataclasses. It uses a different inheritance merging strategy; it pulls overridden fields in a subclass to the end of the fields list, so ['name', 'age', 'ugly'] in the Parent class becomes ['name', 'age', 'school', 'ugly'] in the Child class; by overriding the field with a default, attrs allows the override without needing to do a MRO dance.

attrs supports defining fields without type hints, but lets stick to the supported type hinting mode by setting auto_attribs=True:

import attr

@attr.s(auto_attribs=True)
class Parent:
name: str
age: int
ugly: bool = False

def print_name(self):
print(self.name)

def print_age(self):
print(self.age)

def print_id(self):
print(f"The Name is {self.name} and {self.name} is {self.age} year old")

@attr.s(auto_attribs=True)
class Child(Parent):
school: str
ugly: bool = True

Python: Dataclass that inherits from base Dataclass, how do I upgrade a value from base to the new class?

What you are asking for is realized by the factory method pattern, and can be implemented in python classes straight forwardly using the @classmethod keyword.

Just include a dataclass factory method in your base class definition, like this:

import dataclasses

@dataclasses.dataclass
class Person:
name: str
smell: str = "good"

@classmethod
def from_instance(cls, instance):
return cls(**dataclasses.asdict(instance))

Any new dataclass that inherit from this baseclass can now create instances of each other[1] like this:

@dataclasses.dataclass
class Friend(Person):
def say_hi(self):
print(f'Hi {self.name}')

random_stranger = Person(name = 'Bob', smell='OK')
friend = Friend.from_instance(random_stranger)
print(friend.say_hi())
# "Hi Bob"

[1] It won't work if your child classes introduce new fields with no default values, you try to create parent class instances from child class instances, or your parent class has init-only arguments.

dataclass inheritance: Fields without default values cannot appear after fields with default values

Here is a working solution for python > 3.10

@dataclass(kw_only=True)
class TableMetadata:
"""
- entity: business entity represented by the table
- origin: path / query / url from which data withdrawn
- id: field to be used as ID (unique)
- historicity: full, delta
- upload: should the table be uploaded
"""

entity: str
origin: str
view: str
id: str = None
historicity: str = "full"
upload: bool = True
columns: list = field(default_factory=list)

@dataclass(kw_only=True)
class RestTableMetadata(TableMetadata):
"""
- method: HTTP method to be used
- payloadpath: portion of the response payload to use to build the dataframe
"""

method: str
payloadpath: str = None


Related Topics



Leave a reply



Submit