Why is Python 3.x's super() magic?
The new magic super() behaviour was added to avoid violating the D.R.Y. (Don't Repeat Yourself) principle; see PEP 3135. Having to explicitly name the class by referencing it as a global is also prone to the same rebinding issues you discovered with super() itself:
class Foo(Bar):
    def baz(self):
        return super(Foo, self).baz() + 42

Spam = Foo
Foo = something_else()

Spam().baz()  # liable to blow up
The same applies to using class decorators where the decorator returns a new object, which rebinds the class name:
@class_decorator_returning_new_class
class Foo(Bar):
    def baz(self):
        # Now `Foo` is a *different class*
        return super(Foo, self).baz() + 42
The magic super() __class__ cell sidesteps these issues nicely by giving you access to the original class object.
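As a rough illustration of the decorator case, the sketch below compares the zero-argument form with the explicit form. The decorator `replace_class` and the classes `Bar`, `Foo` and `Explicit` are hypothetical names made up for this example; the only assumption is a decorator that returns a new subclass and so rebinds the class name.

```python
class Bar:
    def baz(self):
        return 1


def replace_class(cls):
    """Hypothetical decorator: returns a *new* class, rebinding the name."""
    class Wrapper(cls):
        pass
    return Wrapper


@replace_class
class Foo(Bar):
    def baz(self):
        # Zero-argument super() reads the __class__ cell, which still holds
        # the original class, not the Wrapper now bound to the name `Foo`.
        return super().baz() + 42


@replace_class
class Explicit(Bar):
    def baz(self):
        # `Explicit` now names the Wrapper subclass, so the lookup starts
        # after Wrapper in the MRO and lands on this same method again.
        return super(Explicit, self).baz() + 42


result = Foo().baz()

try:
    Explicit().baz()
except RecursionError:
    explicit_failed = True
else:
    explicit_failed = False

print(result, explicit_failed)
```

The zero-argument call keeps working after the rebinding, while the explicit form recurses until the interpreter gives up.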
The PEP was kicked off by Guido, who initially envisioned super becoming a keyword, and the idea of using a cell to look up the current class was also his. Certainly, the idea to make it a keyword was part of the first draft of the PEP.
However, it was in fact Guido himself who then stepped away from the keyword idea as 'too magical', proposing the current implementation instead. He anticipated that using a different name for super() could be a problem:
My patch uses an intermediate solution: it assumes you need __class__ whenever you use a variable named 'super'. Thus, if you (globally) rename super to supper and use supper but not super, it won't work without arguments (but it will still work if you pass it either __class__ or the actual class object); if you have an unrelated variable named super, things will work but the method will use the slightly slower call path used for cell variables.
So, in the end, it was Guido himself who proclaimed that using a super keyword did not feel right, and that providing a magic __class__ cell was an acceptable compromise.
I agree that the magic, implicit behaviour of the implementation is somewhat surprising, but super() is one of the most misapplied functions in the language. Just take a look at all the misapplied super(type(self), self) or super(self.__class__, self) invocations found on the Internet; if any of that code is ever called from a derived class, you end up with an infinite recursion exception. At the very least the simplified super() call, without arguments, avoids that problem.
As for the renamed super_: just reference __class__ in your method as well and it'll work again. The cell is created if you reference either the super or __class__ names in your method:
>>> super_ = super
>>> class A(object):
...     def x(self):
...         print("No flipping")
...
>>> class B(A):
...     def x(self):
...         __class__  # just referencing it is enough
...         super_().x()
...
>>> B().x()
No flipping
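One way to see when the cell is created is to look at the free variables of the compiled methods. This is a small sketch, assuming CPython; the class `A` and its method names are made up for illustration.

```python
class A:
    def uses_super(self):
        # Mentioning the name `super` makes the compiler add the cell.
        return super().__repr__()

    def uses_class(self):
        # Mentioning `__class__` directly has the same effect.
        return __class__

    def uses_neither(self):
        # No reference to either name, so no cell is created.
        return repr(self)


# co_freevars lists the closure variables each function expects.
freevars = {
    name: getattr(A, name).__code__.co_freevars
    for name in ("uses_super", "uses_class", "uses_neither")
}

for name, names in freevars.items():
    print(name, names)
```

Only the first two methods list `__class__` among their free variables, which is why a renamed `super_` needs the extra bare reference.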
Python super() arguments: why not super(obj)?
The two-argument form is only needed in Python 2. The reason is that self.__class__ always refers to the "leaf" class in the inheritance tree -- that is, the most specific class of the object -- but when you call super you need to tell it which implementation is currently being invoked, so it can invoke the next one in the inheritance tree.
Suppose you have:
class A(object):
    def foo(self):
        pass

class B(A):
    def foo(self):
        super(self.__class__, self).foo()

class C(B):
    def foo(self):
        super(self.__class__, self).foo()

c = C()
Note that c.__class__ is C, always. Now think about what happens if you call c.foo().
When you call super(self.__class__, self) in a method of C, it will be like calling super(C, self), which means "call the version of this method inherited by C". That will call B.foo, which is fine. But when you call super(self.__class__, self) from B, it's still like calling super(C, self), because it's the same self, so self.__class__ is still C. The result is that the call in B will again call B.foo and an infinite recursion occurs.
Of course, what you really want is to be able to call super(classThatDefinedTheImplementationThatIsCurrentlyExecuting, self), and that is effectively what the Python 3 super() does.
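The failure mode can be reproduced in a few lines. This is a minimal sketch in Python 3 with made-up class names; it shows that the `self.__class__` form only works as long as nobody subclasses further, and that the zero-argument form does not have that limitation.

```python
class A:
    def foo(self):
        return ["A"]


class B(A):
    def foo(self):
        # Broken: self.__class__ is the leaf class, not necessarily B.
        return ["B"] + super(self.__class__, self).foo()


class C(B):
    pass


class FixedB(A):
    def foo(self):
        # Zero-argument super() always starts after the defining class.
        return ["B"] + super().foo()


class FixedC(FixedB):
    def foo(self):
        return ["C"] + super().foo()


works_on_b = B().foo()            # self.__class__ happens to be B here

try:
    C().foo()                     # super(C, self).foo is B.foo again
except RecursionError:
    recursed = True
else:
    recursed = False

fixed = FixedC().foo()
print(works_on_b, recursed, fixed)
```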
In Python 3, you can just do super().foo() and it does the right thing. It's not clear to me what you mean about super(self) being a shortcut. In Python 2, it doesn't work for the reason I described above. In Python 3, it would be a "longcut" because you can just use plain super() instead.
The super(type) and super(type1, type2) uses might still be needed occasionally in Python 3, but those were always more esoteric usages for unusual situations.
Why does a classmethod's super need a second argument?
super() returns a descriptor, and needs two items:
- A starting point from which to search the class hierarchy.
- The argument to bind the returned methods.
For the two-argument (and implicit zero-argument *) case the second argument is used to bind to, but if you do not pass in a second argument, super() cannot invoke the descriptor protocol to bind the returned functions, classmethods, properties or other descriptors. classmethods are still descriptors and are bound; they bind to a class and not an instance, but super() does not know how the descriptor will use the context to which you bind.
super() should not and cannot know that you are looking up a class method instead of a regular method; class methods only differ from regular methods because their .__get__() method acts differently.
Why are class methods bound? Because when you subclass Foo but do not override .hello(), calling Bar.hello() invokes the Foo.__dict__['hello'] function, binds it to Bar and your first argument to hello(cls) will be that subclass, not Foo.
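The binding step can be made visible by calling the descriptor protocol by hand. The following is a small sketch; `Foo` and `Bar` mirror the names used above, but their bodies are assumptions made up for this example.

```python
class Foo:
    @classmethod
    def hello(cls):
        return cls.__name__


class Bar(Foo):
    pass  # does not override hello()


# The raw classmethod object stored in the class namespace.
raw = Foo.__dict__["hello"]

# Binding by hand: the descriptor receives the class it was looked up on.
bound_to_bar = raw.__get__(None, Bar)

print(isinstance(raw, classmethod))
print(bound_to_bar(), Bar.hello(), Foo.hello())
```

Looking `hello` up through `Bar` performs the same `__get__` call implicitly, which is why `cls` is the subclass rather than `Foo`.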
Without a second argument, super() returns an unbound object that can manually be bound later on. You can do the binding yourself using the .__get__() method provided by the super() instance:
class Bar(Foo):
    @classmethod
    def hello(cls):
        print('hello, bar')  # print() call works on both Python 2 and 3
        super(Bar).__get__(cls, None).hello()
super().__get__() on an instance without a context effectively returns a new super() instance with the context set. On an instance with a context .__get__() just returns self; it is already bound.
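A self-contained variant of the manual binding might look like the sketch below. The body of `Foo.hello` is an assumption added so the example runs on its own, and the methods return strings instead of printing so the result is easy to inspect.

```python
class Foo:
    @classmethod
    def hello(cls):
        return "hello from Foo, cls=" + cls.__name__


class Bar(Foo):
    @classmethod
    def hello(cls):
        unbound = super(Bar)                 # one argument: no context yet
        bound = unbound.__get__(cls, None)   # bind it to the class by hand
        # __self__ is None on the unbound object and the class once bound.
        return unbound.__self__, bound.__self__, bound.hello()


unbound_self, bound_self, message = Bar.hello()
print(unbound_self, bound_self, message)
```

In everyday Python 3 code the zero-argument `super().hello()` does the same job; the manual form is mostly useful for seeing what the descriptor protocol contributes.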
* In Python 3, calling super() without arguments from inside a bound method will use the calling frame to discover, implicitly, what the type and bound object are, so you no longer have to explicitly pass in the type and object arguments in that case. Python 3 actually adds an implicit __class__ closure variable to methods for this purpose. See PEP 3135 and Why is Python 3.x's super() magic?
Does any magic happen when I call `super(some_cls)`?
In both cases, super(A) gives an unbound super object. When you call __init__() on that, it's being called with no arguments. When super.__init__ is called with no arguments, it tries to infer the arguments at run time: (from typeobject.c line 7434, latest source)
static int
super_init(PyObject *self, PyObject *args, PyObject *kwds)
{
    superobject *su = (superobject *)self;
    PyTypeObject *type = NULL;
    PyObject *obj = NULL;
    PyTypeObject *obj_type = NULL;

    if (!_PyArg_NoKeywords("super", kwds))
        return -1;
    if (!PyArg_ParseTuple(args, "|O!O:super", &PyType_Type, &type, &obj))
        return -1;

    if (type == NULL) {
        /* Call super(), without args -- fill in from __class__
           and first local variable on the stack. */
A few lines later: (ibid, line 7465)
        f = PyThreadState_GET()->frame;
        ...
        co = f->f_code;
        ...
        if (co->co_argcount == 0) {
            PyErr_SetString(PyExc_RuntimeError,
                            "super(): no arguments");
            return -1;
        }
When you call super(A), this inferring behavior is bypassed because type is not None. When you then call __init__() on the unbound super (because it isn't bound, this __init__ call isn't proxied), the type argument is None and super.__init__ attempts to infer it. Inside the class definition, the self argument is present and is used for this purpose. Outside, no arguments are available, so the exception is raised.
In other words, super(A) is not behaving differently depending on where it is called; it's super.__init__() that's behaving differently, and that's exactly what the documentation suggests.
How is super() in Python 3 implemented?
How is super() implemented? Here's the code for Python 3.3:
/* Cooperative 'super' */

typedef struct {
    PyObject_HEAD
    PyTypeObject *type;
    PyObject *obj;
    PyTypeObject *obj_type;
} superobject;

static PyMemberDef super_members[] = {
    {"__thisclass__", T_OBJECT, offsetof(superobject, type), READONLY,
     "the class invoking super()"},
    {"__self__", T_OBJECT, offsetof(superobject, obj), READONLY,
     "the instance invoking super(); may be None"},
    {"__self_class__", T_OBJECT, offsetof(superobject, obj_type), READONLY,
     "the type of the instance invoking super(); may be None"},
    {0}
};

static void
super_dealloc(PyObject *self)
{
    superobject *su = (superobject *)self;

    _PyObject_GC_UNTRACK(self);
    Py_XDECREF(su->obj);
    Py_XDECREF(su->type);
    Py_XDECREF(su->obj_type);
    Py_TYPE(self)->tp_free(self);
}

static PyObject *
super_repr(PyObject *self)
{
    superobject *su = (superobject *)self;

    if (su->obj_type)
        return PyUnicode_FromFormat(
            "<super: <class '%s'>, <%s object>>",
            su->type ? su->type->tp_name : "NULL",
            su->obj_type->tp_name);
    else
        return PyUnicode_FromFormat(
            "<super: <class '%s'>, NULL>",
            su->type ? su->type->tp_name : "NULL");
}

static PyObject *
super_getattro(PyObject *self, PyObject *name)
{
    superobject *su = (superobject *)self;
    int skip = su->obj_type == NULL;

    if (!skip) {
        /* We want __class__ to return the class of the super object
           (i.e. super, or a subclass), not the class of su->obj. */
        skip = (PyUnicode_Check(name) &&
            PyUnicode_GET_LENGTH(name) == 9 &&
            PyUnicode_CompareWithASCIIString(name, "__class__") == 0);
    }

    if (!skip) {
        PyObject *mro, *res, *tmp, *dict;
        PyTypeObject *starttype;
        descrgetfunc f;
        Py_ssize_t i, n;

        starttype = su->obj_type;
        mro = starttype->tp_mro;

        if (mro == NULL)
            n = 0;
        else {
            assert(PyTuple_Check(mro));
            n = PyTuple_GET_SIZE(mro);
        }
        for (i = 0; i < n; i++) {
            if ((PyObject *)(su->type) == PyTuple_GET_ITEM(mro, i))
                break;
        }
        i++;
        res = NULL;
        /* keep a strong reference to mro because starttype->tp_mro can be
           replaced during PyDict_GetItem(dict, name) */
        Py_INCREF(mro);
        for (; i < n; i++) {
            tmp = PyTuple_GET_ITEM(mro, i);
            if (PyType_Check(tmp))
                dict = ((PyTypeObject *)tmp)->tp_dict;
            else
                continue;
            res = PyDict_GetItem(dict, name);
            if (res != NULL) {
                Py_INCREF(res);
                f = Py_TYPE(res)->tp_descr_get;
                if (f != NULL) {
                    tmp = f(res,
                        /* Only pass 'obj' param if
                           this is instance-mode super
                           (See SF ID #743627)
                        */
                        (su->obj == (PyObject *)
                                    su->obj_type
                            ? (PyObject *)NULL
                            : su->obj),
                        (PyObject *)starttype);
                    Py_DECREF(res);
                    res = tmp;
                }
                Py_DECREF(mro);
                return res;
            }
        }
        Py_DECREF(mro);
    }
    return PyObject_GenericGetAttr(self, name);
}

static PyTypeObject *
supercheck(PyTypeObject *type, PyObject *obj)
{
    /* Check that a super() call makes sense. Return a type object.

       obj can be a class, or an instance of one:

       - If it is a class, it must be a subclass of 'type'. This case is
         used for class methods; the return value is obj.

       - If it is an instance, it must be an instance of 'type'. This is
         the normal case; the return value is obj.__class__.

       But... when obj is an instance, we want to allow for the case where
       Py_TYPE(obj) is not a subclass of type, but obj.__class__ is!
       This will allow using super() with a proxy for obj.
    */

    /* Check for first bullet above (special case) */
    if (PyType_Check(obj) && PyType_IsSubtype((PyTypeObject *)obj, type)) {
        Py_INCREF(obj);
        return (PyTypeObject *)obj;
    }

    /* Normal case */
    if (PyType_IsSubtype(Py_TYPE(obj), type)) {
        Py_INCREF(Py_TYPE(obj));
        return Py_TYPE(obj);
    }
    else {
        /* Try the slow way */
        PyObject *class_attr;

        class_attr = _PyObject_GetAttrId(obj, &PyId___class__);
        if (class_attr != NULL &&
            PyType_Check(class_attr) &&
            (PyTypeObject *)class_attr != Py_TYPE(obj))
        {
            int ok = PyType_IsSubtype(
                (PyTypeObject *)class_attr, type);
            if (ok)
                return (PyTypeObject *)class_attr;
        }

        if (class_attr == NULL)
            PyErr_Clear();
        else
            Py_DECREF(class_attr);
    }

    PyErr_SetString(PyExc_TypeError,
                    "super(type, obj): "
                    "obj must be an instance or subtype of type");
    return NULL;
}

static PyObject *
super_descr_get(PyObject *self, PyObject *obj, PyObject *type)
{
    superobject *su = (superobject *)self;
    superobject *newobj;

    if (obj == NULL || obj == Py_None || su->obj != NULL) {
        /* Not binding to an object, or already bound */
        Py_INCREF(self);
        return self;
    }
    if (Py_TYPE(su) != &PySuper_Type)
        /* If su is an instance of a (strict) subclass of super,
           call its type */
        return PyObject_CallFunctionObjArgs((PyObject *)Py_TYPE(su),
                                            su->type, obj, NULL);
    else {
        /* Inline the common case */
        PyTypeObject *obj_type = supercheck(su->type, obj);
        if (obj_type == NULL)
            return NULL;
        newobj = (superobject *)PySuper_Type.tp_new(&PySuper_Type,
                                                    NULL, NULL);
        if (newobj == NULL)
            return NULL;
        Py_INCREF(su->type);
        Py_INCREF(obj);
        newobj->type = su->type;
        newobj->obj = obj;
        newobj->obj_type = obj_type;
        return (PyObject *)newobj;
    }
}

static int
super_init(PyObject *self, PyObject *args, PyObject *kwds)
{
    superobject *su = (superobject *)self;
    PyTypeObject *type = NULL;
    PyObject *obj = NULL;
    PyTypeObject *obj_type = NULL;

    if (!_PyArg_NoKeywords("super", kwds))
        return -1;
    if (!PyArg_ParseTuple(args, "|O!O:super", &PyType_Type, &type, &obj))
        return -1;

    if (type == NULL) {
        /* Call super(), without args -- fill in from __class__
           and first local variable on the stack. */
        PyFrameObject *f = PyThreadState_GET()->frame;
        PyCodeObject *co = f->f_code;
        Py_ssize_t i, n;
        if (co == NULL) {
            PyErr_SetString(PyExc_SystemError,
                            "super(): no code object");
            return -1;
        }
        if (co->co_argcount == 0) {
            PyErr_SetString(PyExc_SystemError,
                            "super(): no arguments");
            return -1;
        }
        obj = f->f_localsplus[0];
        if (obj == NULL) {
            PyErr_SetString(PyExc_SystemError,
                            "super(): arg[0] deleted");
            return -1;
        }
        if (co->co_freevars == NULL)
            n = 0;
        else {
            assert(PyTuple_Check(co->co_freevars));
            n = PyTuple_GET_SIZE(co->co_freevars);
        }
        for (i = 0; i < n; i++) {
            PyObject *name = PyTuple_GET_ITEM(co->co_freevars, i);
            assert(PyUnicode_Check(name));
            if (!PyUnicode_CompareWithASCIIString(name,
                                                  "__class__")) {
                Py_ssize_t index = co->co_nlocals +
                    PyTuple_GET_SIZE(co->co_cellvars) + i;
                PyObject *cell = f->f_localsplus[index];
                if (cell == NULL || !PyCell_Check(cell)) {
                    PyErr_SetString(PyExc_SystemError,
                      "super(): bad __class__ cell");
                    return -1;
                }
                type = (PyTypeObject *) PyCell_GET(cell);
                if (type == NULL) {
                    PyErr_SetString(PyExc_SystemError,
                      "super(): empty __class__ cell");
                    return -1;
                }
                if (!PyType_Check(type)) {
                    PyErr_Format(PyExc_SystemError,
                      "super(): __class__ is not a type (%s)",
                      Py_TYPE(type)->tp_name);
                    return -1;
                }
                break;
            }
        }
        if (type == NULL) {
            PyErr_SetString(PyExc_SystemError,
                            "super(): __class__ cell not found");
            return -1;
        }
    }

    if (obj == Py_None)
        obj = NULL;
    if (obj != NULL) {
        obj_type = supercheck(type, obj);
        if (obj_type == NULL)
            return -1;
        Py_INCREF(obj);
    }
    Py_INCREF(type);
    su->type = type;
    su->obj = obj;
    su->obj_type = obj_type;
    return 0;
}

PyDoc_STRVAR(super_doc,
"super() -> same as super(__class__, <first argument>)\n"
"super(type) -> unbound super object\n"
"super(type, obj) -> bound super object; requires isinstance(obj, type)\n"
"super(type, type2) -> bound super object; requires issubclass(type2, type)\n"
"Typical use to call a cooperative superclass method:\n"
"class C(B):\n"
"    def meth(self, arg):\n"
"        super().meth(arg)\n"
"This works for class methods too:\n"
"class C(B):\n"
"    @classmethod\n"
"    def cmeth(cls, arg):\n"
"        super().cmeth(arg)\n");

static int
super_traverse(PyObject *self, visitproc visit, void *arg)
{
    superobject *su = (superobject *)self;

    Py_VISIT(su->obj);
    Py_VISIT(su->type);
    Py_VISIT(su->obj_type);

    return 0;
}

PyTypeObject PySuper_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "super",                                    /* tp_name */
    sizeof(superobject),                        /* tp_basicsize */
    0,                                          /* tp_itemsize */
    /* methods */
    super_dealloc,                              /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    super_repr,                                 /* tp_repr */
    0,                                          /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    0,                                          /* tp_hash */
    0,                                          /* tp_call */
    0,                                          /* tp_str */
    super_getattro,                             /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
        Py_TPFLAGS_BASETYPE,                    /* tp_flags */
    super_doc,                                  /* tp_doc */
    super_traverse,                             /* tp_traverse */
    0,                                          /* tp_clear */
    0,                                          /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    0,                                          /* tp_methods */
    super_members,                              /* tp_members */
    0,                                          /* tp_getset */
    0,                                          /* tp_base */
    0,                                          /* tp_dict */
    super_descr_get,                            /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    super_init,                                 /* tp_init */
    PyType_GenericAlloc,                        /* tp_alloc */
    PyType_GenericNew,                          /* tp_new */
    PyObject_GC_Del,                            /* tp_free */
};
You can see that at some point super_init checks type == NULL and then raises the error that you see. It is not normal to have NULLs around, so there's probably a bug somewhere in super (and note that super already had bugs in previous releases). At least, I'd have thought that the cases in which SystemError is raised should be triggered only by some "internal" failure of the interpreter or some other C code, and not from Python code.
Also, this did not happen only to you; you can find a post in which this behaviour is considered a bug.
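The three read-only members declared in super_members above are visible from Python, which gives a quick way to check what a given super object is bound to. This is a small sketch with made-up class names (`Base`, `Child`, `GrandChild`).

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        proxy = super()
        # __thisclass__: the class the MRO search starts after
        # __self__: the object that found methods are bound to
        # __self_class__: the type whose MRO is searched
        return (proxy.__thisclass__, proxy.__self__,
                proxy.__self_class__, proxy.greet())


class GrandChild(Child):
    pass


g = GrandChild()
this_class, bound_obj, self_class, greeting = g.greet()
print(this_class.__name__, self_class.__name__, greeting)
```

Here `__thisclass__` stays `Child` even though the instance is a `GrandChild`, which is the distinction that `super(self.__class__, self)` loses.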
Why do you need to call super class inside constructor?
Your class inherits from beam.DoFn. Presumably that class needs to set up some things in its __init__ method, or it won't work properly. Thus, if you override __init__, you need to call the parent class's __init__ or your instance may not function as intended.
I'd note that your current super call is actually subtly buggy. It's not appropriate to use self.__class__ as the first argument to super. You either need to write out the name of the current class explicitly, or not pass any arguments at all (the no-argument form of super is only valid in Python 3). Using self.__class__ might work for now, but it will break if you subclass PublishFn any further, and override __init__ again in the grandchild class.
an example about C3
First of all, the form super() in Python 3 is really the same thing as super(<CurrentClass>, self), where the Python compiler provides enough information for super() to determine what the correct class to use is. So in E.foo(), super().foo() can be read as super(E, self).foo().
To understand what is going on, you need to look at the class.__mro__ attribute:
This attribute is a tuple of classes that are considered when looking for base classes during method resolution.
It is this tuple that shows you what the C3 Method Resolution Order is for any given class hierarchy. For your class E, that order is:
>>> E.__mro__
(<class '__main__.E'>, <class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
>>> for cls in E.__mro__: # print out just the names, for easier readability.
...     print(cls.__name__)
...
E
D
B
C
A
object
The super() object bases everything off that ordered sequence of classes. The call super(SomeClass, self).foo() results in the following series of steps:
- The super() object retrieves the type(self).__mro__ tuple.
- super() locates the index for the SomeClass class in that tuple.
- Accessing the foo attribute on the super() object triggers a search for a class that has a foo attribute on the MRO, starting at the next index after the SomeClass index.
- If the attribute found this way is a descriptor object, super() binds it to self. Functions are descriptors; binding produces a bound method, and this is how Python passes in the self reference when you call a method.
Expressed as simplified Python code that ignores edge cases and other uses for super(), that would look like:
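class Super:
    def __init__(self, type_, obj_or_type):
        # An instance has no __mro__ attribute of its own; use its class's MRO.
        cls = obj_or_type if isinstance(obj_or_type, type) else type(obj_or_type)
        self.mro = cls.__mro__
        self.idx = self.mro.index(type_) + 1
        self.obj_or_type = obj_or_type

    def __getattr__(self, name):
        # Search only the classes that come after type_ in the MRO.
        for cls in self.mro[self.idx:]:
            attrs = vars(cls)
            if name in attrs:
                result = attrs[name]
                if hasattr(result, '__get__'):
                    # Simplified: binds as for an instance, not for a class.
                    result = result.__get__(self.obj_or_type, type(self.obj_or_type))
                return result
        raise AttributeError(name)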
Combining those two pieces of information, you can see what happens when you call e.foo():
- print('foo in E') is executed, resulting in foo in E
- super().foo() is executed, effectively the same thing as super(E, self).foo().
  - The MRO is searched, starting at the next index past E, so at D (no foo attribute), moving on to B (no foo attribute), then C (attribute found). C.foo is returned, bound to self.
  - C.foo(self) is called, resulting in foo fo C
- super(B, self).foo() is executed.
  - The MRO is searched, starting at the next index past B, so at C (attribute found). C.foo is returned, bound to self.
  - C.foo(self) is called, resulting in foo fo C
- super(C, self).foo() is executed.
  - The MRO is searched, starting at the next index past C, so at A (attribute found). A.foo is returned, bound to self.
  - A.foo(self) is called, resulting in foo of A
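The original class definitions are not shown here, so the hierarchy below is a reconstruction: an assumption chosen to reproduce the MRO and the printed lines quoted above, not necessarily the code from the question. It can be used to trace the steps yourself.

```python
import contextlib
import io


class A:
    def foo(self):
        print('foo of A')


class B(A):
    pass            # no foo attribute


class C(A):
    def foo(self):
        print('foo fo C')   # spelling kept as quoted in the walkthrough


class D(B, C):
    pass            # no foo attribute


class E(D):
    def foo(self):
        print('foo in E')
        super().foo()           # same as super(E, self).foo()
        super(B, self).foo()    # search starts after B in the MRO
        super(C, self).foo()    # search starts after C in the MRO


mro_names = [cls.__name__ for cls in E.__mro__]

# Capture the printed lines so they can be inspected as a list.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    E().foo()
output_lines = buffer.getvalue().splitlines()

print(mro_names)
print(output_lines)
```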