Why Aren't Superclass __init__ Methods Automatically Invoked?

Why aren't superclass __init__ methods automatically invoked?

The crucial distinction between Python's __init__ and those other languages' constructors is that __init__ is not a constructor: it's an initializer (the actual constructor, if any, is __new__, and it works completely differently again; but see later ;-). While constructing all superclasses (and, no doubt, doing so "before" you continue constructing downwards) is obviously part of saying you're constructing a subclass's instance, that is clearly not the case for initializing: there are many use cases in which a superclass's initialization needs to be skipped, altered, controlled, happening (if at all) "in the middle" of the subclass initialization, and so forth.

Basically, superclass delegation of the initializer is not automatic in Python for exactly the same reasons such delegation is also not automatic for any other method. Note that those "other languages" don't do automatic superclass delegation for any other method either, just for the constructor (and, if applicable, the destructor), which, as I mentioned, is not what Python's __init__ is. (The behavior of __new__ is also quite peculiar, though not directly related to your question: __new__ is such a peculiar constructor that it doesn't necessarily need to construct anything, and could perfectly well return an existing instance, or even a non-instance. Clearly, Python offers you a lot more control of the mechanics than the "other languages" you have in mind, which also includes having no automatic delegation in __new__ itself.)

Should __init__() call the parent class's __init__()?

In Python, calling the superclass's __init__ is optional. If you do call it, it is then also optional whether to use the super identifier, or to explicitly name the superclass:

object.__init__(self)

In the case of object, calling the super method is not strictly necessary, since the super method is empty. The same goes for __del__.

On the other hand, for __new__, you should indeed call the super method, and use its return value as the newly created object - unless you explicitly want to return something different.
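A minimal sketch of that rule, with a hypothetical Point class: __new__ delegates actual construction to the superclass and returns the result.

```python
class Point:
    def __new__(cls, *args, **kwargs):
        # Delegate construction to the superclass (ultimately object)
        # and use its return value as the new instance.
        instance = super().__new__(cls)
        return instance  # forgetting this return is a classic bug

    def __init__(self, x, y):
        self.x = x
        self.y = y
```

Returning something other than an instance of cls is allowed, but then __init__ will not be run on the returned object.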

Python: Inherit the superclass __init__

super(SubClass, self).__init__(...)

Consider using *args and **kw if it helps solve your variable nightmare.
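For example (hypothetical Base/Sub names), a subclass can accept only the arguments it cares about and forward the rest upward unchanged:

```python
class Base:
    def __init__(self, a, b, c=0):
        self.a, self.b, self.c = a, b, c

class Sub(Base):
    def __init__(self, *args, extra=None, **kw):
        # Forward everything we don't handle ourselves to the superclass.
        super().__init__(*args, **kw)
        self.extra = extra
```

Sub(1, 2, c=3, extra='x') then initializes Base's attributes without Sub having to re-declare Base's whole signature.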

Do I need to pass attributes from superclass's __init__ to subclass's __init__ manually, even if I use super()?

While your wording wasn't quite right (the arguments are actually passed from the subclass to the superclass), the answer is yes: you need to manually pass the arguments in via super().__init__(*args, **kwargs). Otherwise you will encounter a TypeError complaining about missing required positional/keyword arguments.

In your example it seemed unnecessary, but only because you trimmed a key line that comes after the super().__init__() call:

class ElectricCar(Car):
    def __init__(self, make, model, year):
        super().__init__(make, model, year)
        self.battery = Battery()

This makes it necessary to redefine the __init__ in your subclass ElectricCar, as Car doesn't have the attribute battery.

Note that for an immutable superclass, however, you don't need to pass in any arguments:

class Foo(int):
    def __init__(self, value):
        self.value = value
        super().__init__()

It would just take your subclass's arguments as-is at the point of __new__(). In this case, if you did manually pass any argument to super().__init__(), the interpreter would complain that the superclass's __init__ method doesn't take any arguments. You can quickly see why this seems useless, but here's a related question on properly inheriting from str/int (if necessary).
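When you actually need to parametrize an immutable base such as int or str, the usual technique (sketched here with an invented Inches class, not from the question) is to override __new__, because the value is fixed before __init__ ever runs:

```python
class Inches(int):
    def __new__(cls, value, unit='in'):
        # The int value must be chosen here; __init__ is too late
        # to change an immutable instance.
        self = super().__new__(cls, value)
        self.unit = unit  # extra attributes are fine on a subclass
        return self
```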

What does 'super' do in Python? - difference between super().__init__() and explicit superclass __init__()

The benefits of super() in single-inheritance are minimal -- mostly, you don't have to hard-code the name of the base class into every method that uses its parent methods.

However, it's almost impossible to use multiple inheritance without super(). This includes common idioms like mixins, interfaces, abstract classes, etc. It also extends to code that later extends yours: if somebody later wanted to write a class that extended Child and a mixin, their code would not work properly.
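A toy illustration (invented names): the mixin only works because every cooperating class calls super(), so the whole MRO gets a turn:

```python
calls = []

class Mixin:
    def __init__(self, **kwargs):
        calls.append('Mixin')
        super().__init__(**kwargs)  # continue along the MRO

class Child:
    def __init__(self, **kwargs):
        calls.append('Child')
        super().__init__(**kwargs)

class Combined(Mixin, Child):
    pass

Combined()  # runs Mixin.__init__, then Child.__init__, then object.__init__
```

If Mixin had hard-coded a base class name instead of calling super(), Child's initializer would silently be skipped.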

__init__ not called when subclassing dict and something else

Your misunderstanding is clear from this comment:

I don't override __init__ in LockableDict, so it should generate an __init__ making automatic calls to base classes' __init__s, shouldn't it?

No!

First, nothing gets automatically generated; the method resolution happens at call time.

And at call time, Python won't call every base class's __init__, it will only call the first one it finds.*

This is why super exists. If you don't explicitly call super in an override, no base classes or even sibling classes will get their implementations called. And dict.__init__ doesn't call its super.

Neither does your Lockable.__init__. So, this means that reversing the order makes sure Lockable.__init__ gets called… but dict.__init__ now doesn't get called.


So, what do you actually want to do? Well, Python's MRO algorithm is designed to allow maximum flexibility for hierarchies of cooperating classes, with any non-cooperating classes thrown in at the end. So, you could do this:

class Lockable(object):

    def __init__(self):
        super().__init__()  # super(Lockable, self).__init__() for 2.x
        self._lock = None

    def is_locked(self):
        return self._lock is not None

class LockableDict(Lockable, dict):
    pass

Also notice that this allows you to pass initializer arguments along the chain through all your cooperating classes, and then just call the unfriendly class's __init__ with the arguments you know it needs.

In a few cases, you have to put an unfriendly class first.** In that case, you have to explicitly call around them:

class LockableDict(dict, Lockable):
    def __init__(self):
        dict.__init__(self)
        Lockable.__init__(self)

But fortunately, this doesn't come up very often.


* In standard method resolution order, which is slightly more complicated than you'd expect. The same MRO rule is applied when you call super to find the "next" class—which may be a base class, or a sibling, or even a sibling of a subclass. You may find the Wikipedia article on C3 linearization more approachable.

** Especially for built-in classes, because they're special in a few ways that aren't worth getting into here. Then again, many built-in classes don't actually do anything in their __init__, and instead do all the initialization inside the __new__ constructor.
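The linearized order isn't mysterious in practice: every class exposes it as __mro__, which you can inspect directly (toy diamond hierarchy):

```python
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

# The order super() traverses, as computed by C3 linearization:
print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
```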

When inheriting directly from `object`, should I call super().__init__()?

Should classes that inherit directly from object call super().__init__()?

You seem to be looking for a simple "yes or no" answer to this question, but unfortunately the answer is "it depends". Furthermore, when deciding whether you should call super().__init__(), it is somewhat irrelevant whether or not a class inherits directly from object. What is invariant is that, if object.__init__ is called, it should be called without arguments, since object.__init__ does not accept arguments.

Practically, in cooperative inheritance situations, this means you must ensure that all arguments are consumed before object.__init__ gets invoked. It does not mean you should try to avoid object.__init__ being invoked. Here is an example of consuming args before invoking super: the response and request contexts are popped out of the mutable mapping kwargs.
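An equivalent sketch with invented class names: each cooperating __init__ pops the keyword it owns, so object.__init__ finally runs with nothing left over.

```python
class RequestAware:
    def __init__(self, **kwargs):
        # Consume ('pop') the argument this class owns so it never
        # reaches object.__init__, which accepts no arguments.
        self.request = kwargs.pop('request', None)
        super().__init__(**kwargs)

class ResponseAware:
    def __init__(self, **kwargs):
        self.response = kwargs.pop('response', None)
        super().__init__(**kwargs)

class Handler(RequestAware, ResponseAware):
    pass
```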

I mentioned earlier that whether or not a class inherits directly from object is a red herring1. But I didn't mention yet what should motivate this design decision: You should call super init [read: super anymethod] if you want the MRO to continue to be searched for other initializers [read: other anymethods]. You should not invoke super if you want to indicate the MRO search should be stopped here.

Why does object.__init__ exist at all, if it doesn't do anything? Because it does do something: ensures it was called without arguments. The presence of arguments likely indicates a bug2. object also serves the purpose of stopping the chain of super calls - somebody has to not call super, otherwise we recurse infinitely. You can stop it explicitly yourself, earlier, by not invoking super. If you don't, object will serve as the final link and stop the chain for you.
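A quick demonstration of that check (exactly which of object.__new__ / object.__init__ raises, and the wording of the message, is a CPython detail that has varied between versions):

```python
class Plain:
    pass  # uses object's __new__ and __init__ unchanged

try:
    Plain(42)  # nothing in the class consumes this argument...
except TypeError as exc:
    print('rejected:', exc)  # ...so object reports it as a likely bug
```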

Class MRO is determined at class creation time, which is generally when the class statement executes, i.e. when the module is imported. However, note that the use of super involves many chances for runtime branching. You have to consider:

  • Which arguments a method of the super is called with (i.e. which arguments you want to forward along the MRO)
  • Whether or not super is called at all (sometimes you want to intentionally break the chain)
  • Which arguments, if any, the super itself is created with (there is an advanced use case described below)
  • Whether to call a proxied method before or after the code in your own method (put super first if you need to access some state the proxied methods set up; put super last if you're setting up some state that the proxied methods rely on being there already; in some cases you even want to put super in the middle somewhere!)
  • The question asks mostly about __init__, but don't forget that super can be used with any other method, too

In rare circumstances, you might conditionally invoke a super call. You might check whether your super() instance has this or that attribute, and base some logic around the result. Or, you might invoke super(OtherClass, self) to explicitly "step over" a link and manually traverse the MRO for this section. Yes, if the default behaviour is not what you wanted, you can hijack the MRO! What all these diabolical ideas have in common is an understanding of the C3 linearization algorithm, how Python builds an MRO, and how super itself uses the MRO. Python's implementation was more or less lifted from another programming language, where super was named next-method. Honestly, super is a super-bad name in Python: it causes a common misconception amongst beginners that you're always invoking "up" to one of the parent classes. I wish they had chosen a better name.
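A toy example of that "stepping over" (invented A/B/C hierarchy): passing an explicit start class tells super where to begin the MRO search, so here B's implementation is skipped entirely:

```python
class A:
    def ping(self):
        return 'A'

class B(A):
    def ping(self):
        return 'B'

class C(B):
    def ping(self):
        # Start the MRO search *after* B instead of after C,
        # deliberately skipping B.ping.
        return super(B, self).ping()
```

C().ping() returns 'A', even though B sits between C and A in the MRO.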

When defining an inheritance hierarchy, the interpreter can not know whether you wanted to reuse some other classes existing functionality or to replace it with an alternate implementation, or something else. Either decision could be a valid and practical design. If there was a hard and fast rule about when and how super should be invoked, it would not be left to the programmer to choose - the language would take the decision out of your hands and just do the right thing automatically. I hope that sufficiently explains that invoking super in __init__ is not a simple yes/no question.

If yes, how would you correctly initialize SuperFoo?

(Source for Foo, SuperFoo etc in this revision of the question)

For the purposes of answering this part, I will assume the __init__ methods shown in the MCVE actually need to do some initialization (perhaps you could add placeholder comments in the question's MCVE code to that effect). Don't define an __init__ at all if the only thing you do is call super with the same arguments; there's no point. Don't define an __init__ that's just pass, unless you intentionally mean to halt the MRO traversal there (in which case a comment is certainly warranted!).

Firstly, before we discuss SuperFoo, let me say that NoSuperFoo looks like an incomplete or bad design. How would you pass the foo argument to Foo's init? The foo init value of 3 was hardcoded. It might be OK to hardcode (or otherwise automatically determine) foo's init value, but then you should probably be doing composition, not inheritance.

As for SuperFoo, it inherits SuperCls and Foo. SuperCls looks intended for inheritance, Foo does not. That means you may have some work to do, as pointed out in super harmful. One way forward, as discussed in Raymond's blog, is writing adapters.

class FooAdapter:
    def __init__(self, **kwargs):
        foo_arg = kwargs.pop('foo')
        # can also use kwargs['foo'] if you want to leave the responsibility to remove 'foo' to someone else
        # can also use kwargs.pop('foo', 'foo-default') if you want to make this an optional argument
        # can also use kwargs.get('foo', 'foo-default') if you want both of the above
        self._the_foo_instance = Foo(foo_arg)
        super().__init__(**kwargs)

    # add any methods, wrappers, or attribute access you need

    @property
    def foo(self):
        # or however you choose to expose Foo functionality via the adapter
        return self._the_foo_instance.foo

Note that FooAdapter has a Foo, not FooAdapter is a Foo. This is not the only possible design choice. However, if you are inheriting like class FooParent(Foo), then you're implying a FooParent is a Foo, and can be used in any place where a Foo would otherwise be - it's often easier to avoid violations of LSP by using composition. SuperCls should also cooperate by allowing **kwargs:

class SuperCls:
    def __init__(self, **kwargs):
        # some other init code here
        super().__init__(**kwargs)

Maybe SuperCls is also out of your control and you have to adapt it too, so be it. The point is, this is a way to re-use code, by adjusting the interfaces so that the signatures are matching. Assuming everyone is cooperating well and consuming what they need, eventually super().__init__(**kwargs) will proxy to object.__init__(**{}).

Since 99% of classes I've seen don't use **kwargs in their constructor, does that mean 99% of python classes are implemented incorrectly?

No, because YAGNI. Do 99% of classes need to immediately support 100% general dependency injection, with all the bells and whistles, before they are useful? Are they broken if they don't? As an example, consider the OrderedCounter recipe given in the collections docs. Counter.__init__ accepts *args and **kwargs, but doesn't proxy them in the super init call. If you wanted to use one of those arguments, well, tough luck; you've got to override __init__ and intercept them. OrderedDict isn't defined cooperatively at all, really: some parent calls are hardcoded to dict, and the __init__ of anything next in line isn't invoked, so any MRO traversal would be stopped in its tracks there. If you accidentally defined it as OrderedCounter(OrderedDict, Counter) instead of OrderedCounter(Counter, OrderedDict), the metaclass would still be able to create a consistent MRO, but the class just wouldn't work at all as an ordered counter.

In spite of all these shortcomings, the OrderedCounter recipe works as advertised, because the MRO is traversed as designed for the intended use case. So, you don't even need to do cooperative inheritance 100% correctly in order to implement dependency injection. The moral of the story is that perfection is the enemy of progress (or: practicality beats purity). If you want to cram MyWhateverClass into any crazy inheritance tree you can dream up, go ahead, but it is up to you to write the necessary scaffolding to allow that. As usual, Python will not prevent you from implementing it in whatever hacky way is good enough to work.
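For reference, the recipe under discussion is tiny; this is the version from the collections documentation:

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'

    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
```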

1You're always inheriting from object, whether you wrote it in the class declaration or not. Many open source codebases will inherit from object explicitly anyway in order to be cross-compatible with 2.7 runtimes.

2This point is explained in greater detail, along with the subtle relationship between __new__ and __init__, in CPython sources here.

Understanding __init_subclass__

__init_subclass__ and __set_name__ are orthogonal mechanisms; they're not tied to each other, just described in the same PEP. Both are features that previously needed a full-featured metaclass. PEP 487 addresses two of the most common uses of metaclasses:

  • how to let the parent know when it is being subclassed (__init_subclass__)
  • how to let a descriptor class know the name of the property it is used for (__set_name__)

As PEP 487 says:

While there are many possible ways to use a metaclass, the vast majority of use cases falls into just three categories: some initialization code running after class creation, the initialization of descriptors and keeping the order in which class attributes were defined.

The first two categories can easily be achieved by having simple hooks into the class creation:

  • An __init_subclass__ hook that initializes all subclasses of a given class.
  • A __set_name__ hook that, upon class creation, is called on all the attributes (descriptors) defined in the class.

The third category is the topic of another PEP, PEP 520.

Notice also, that while __init_subclass__ is a replacement for using a metaclass in this class's inheritance tree, __set_name__ in a descriptor class is a replacement for using a metaclass for the class that has an instance of the descriptor as an attribute.
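A compact sketch of both hooks side by side (PluginBase and Named are invented names):

```python
class PluginBase:
    registry = []

    def __init_subclass__(cls, **kwargs):
        # The parent is notified of each new subclass; no metaclass needed.
        super().__init_subclass__(**kwargs)
        PluginBase.registry.append(cls.__name__)

class Named:
    def __set_name__(self, owner, name):
        # The descriptor learns which attribute it was assigned to.
        self.name = name

class PluginA(PluginBase):
    field = Named()
```

After these class statements run, PluginBase.registry contains 'PluginA', and PluginA.field.name is 'field', all without writing a metaclass.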

Python (and Python C API): __new__ versus __init__

The difference mainly arises with mutable vs immutable types.

__new__ accepts a type as the first argument, and (usually) returns a new instance of that type. Thus it is suitable for use with both mutable and immutable types.

__init__ accepts an instance as the first argument and modifies the attributes of that instance. This is inappropriate for an immutable type, as it would allow instances to be modified after creation by calling obj.__init__(*args).

Compare the behaviour of tuple and list:

>>> x = (1, 2)
>>> x
(1, 2)
>>> x.__init__([3, 4])
>>> x # tuple.__init__ does nothing
(1, 2)
>>> y = [1, 2]
>>> y
[1, 2]
>>> y.__init__([3, 4])
>>> y # list.__init__ reinitialises the object
[3, 4]

As to why they're separate (aside from simple historical reasons): __new__ methods require a bunch of boilerplate to get right (the initial object creation, and then remembering to return the object at the end). __init__ methods, by contrast, are dead simple, since you just set whatever attributes you need to set.

Aside from __init__ methods being easier to write, and the mutable vs immutable distinction noted above, the separation can also be exploited to make calling the parent class __init__ in subclasses optional by setting up any absolutely required instance invariants in __new__. This is generally a dubious practice though - it's usually clearer to just call the parent class __init__ methods as necessary.
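A sketch of that dubious-but-possible pattern, with hypothetical names: the required invariant is established in __new__, so a subclass that skips the parent __init__ still gets a usable instance:

```python
class Node:
    def __new__(cls, *args, **kwargs):
        # The required invariant lives in __new__, so even a subclass
        # that never calls Node.__init__ gets a valid instance.
        self = super().__new__(cls)
        self.children = []
        return self

    def __init__(self, name=''):
        self.name = name

class SloppyNode(Node):
    def __init__(self, name):
        # Deliberately skips super().__init__(), yet .children exists.
        self.label = name
```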

Inheritance and Overriding __init__ in python

The book is a bit dated with respect to subclass-superclass calling. It's also a little dated with respect to subclassing built-in classes.

It looks like this nowadays:

class FileInfo(dict):
    """store file metadata"""
    def __init__(self, filename=None):
        super(FileInfo, self).__init__()
        self["name"] = filename

Note the following:

  1. We can directly subclass built-in classes, like dict, list, tuple, etc.

  2. The super function handles tracking down this class's superclasses and calling functions in them appropriately.


