Subclassing Tuple with Multiple __init__ Arguments

Subclassing tuple with multiple __init__ arguments

Because tuples are immutable, you have to override __new__ instead:

From the Python docs:

object.__new__(cls[, ...])

Called to create a new instance of class cls. __new__() is a static method (special-cased so you need not declare it as such) that takes the class of which an instance was requested as its first argument. The remaining arguments are those passed to the object constructor expression (the call to the class). The return value of __new__() should be the new object instance (usually an instance of cls).

Typical implementations create a new instance of the class by invoking the superclass’s __new__() method using super(currentclass, cls).__new__(cls[, ...]) with appropriate arguments and then modifying the newly-created instance as necessary before returning it.

If __new__() returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

If __new__() does not return an instance of cls, then the new instance’s __init__() method will not be invoked.

__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.
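As a rough sketch of what this means for a tuple subclass that takes several constructor arguments (the class name Pair and its two fields are hypothetical, chosen only for illustration; Python 3 syntax is assumed):

```python
class Pair(tuple):
    """Hypothetical immutable pair built from two separate arguments."""

    def __new__(cls, first, second):
        # tuple.__new__ accepts the class plus a single iterable, so the
        # separate arguments are packed into one tuple here.
        return super().__new__(cls, (first, second))

    @property
    def first(self):
        return self[0]

    @property
    def second(self):
        return self[1]


p = Pair("a", "b")
print(p.first, p.second)
```

No __init__ is defined here; by the time __init__ would run, the tuple contents are already fixed, so everything that shapes the contents has to happen in __new__.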

Understanding Python super() with __init__() methods

super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.

Note that the syntax changed in Python 3.0: you can just say super().__init__() instead of super(ChildB, self).__init__() which IMO is quite a bit nicer. The standard docs also refer to a guide to using super() which is quite explanatory.
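To make the comparison concrete, here is a small sketch of the two spellings side by side. ChildB is the name used above; Base, ChildA and the log attribute are assumptions added for illustration.

```python
class Base:
    def __init__(self):
        self.log = ["Base"]


class ChildA(Base):
    def __init__(self):
        Base.__init__(self)  # names the base class explicitly
        self.log.append("ChildA")


class ChildB(Base):
    def __init__(self):
        super().__init__()  # Python 3 zero-argument form
        self.log.append("ChildB")


print(ChildA().log)
print(ChildB().log)
```

With single inheritance both spellings behave the same; the difference only shows up once several base classes are involved and the method resolution order matters.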

How is `tuple.__init__` different from `super().__init__` in a subclass of tuple?

tuple.__init__ resolves to object.__init__, because tuple defines no __init__ of its own and simply inherits the one from object. The first parameter of object.__init__ is self, and the method itself does nothing. So when you call tuple.__init__(contents), contents is interpreted as self and no exception is raised. However, it probably doesn't do what you think it does, because tuple.__new__ is responsible for setting up a new tuple instance.

super().__init__ also resolves to object.__init__, but it is already bound to the current self as its first argument. So when you pass contents to it, contents is treated as an additional argument, which object.__init__ does not accept, and a TypeError is raised.
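A short sketch of both calls, assuming Python 3; the class names Unbound and Bound are made up for this illustration and are not from the original question:

```python
# tuple does not define __init__, so the lookup lands on object's.
same_function = tuple.__init__ is object.__init__


class Unbound(tuple):
    def __init__(self, contents):
        # Unbound call: `contents` fills the `self` slot of object.__init__,
        # which then does nothing at all.
        tuple.__init__(contents)


class Bound(tuple):
    def __init__(self, contents):
        # Bound call: `self` is already supplied, so `contents` is an
        # extra argument that object.__init__ rejects.
        super().__init__(contents)


quiet = Unbound([1, 2])  # no error; tuple.__new__ already stored the items

try:
    Bound([1, 2])
    bound_raised = False
except TypeError:
    bound_raised = True

print(same_function, quiet, bound_raised)
```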

Inheritance and Overriding __init__ in python

The book is a bit dated with respect to subclass-superclass calling. It's also a little dated with respect to subclassing built-in classes.

It looks like this nowadays:

class FileInfo(dict):
    """store file metadata"""
    def __init__(self, filename=None):
        super(FileInfo, self).__init__()
        self["name"] = filename

Note the following:

We can directly subclass built-in classes, like dict, list, tuple, etc.
The super function handles tracking down this class's superclasses and calling functions in them appropriately.

How to overload __init__ method based on argument type?

A much neater way to get 'alternate constructors' is to use classmethods. For instance:

>>> class MyData:
...     def __init__(self, data):
...         "Initialize MyData from a sequence"
...         self.data = data
...     
...     @classmethod
...     def fromfilename(cls, filename):
...         "Initialize MyData from a file"
...         with open(filename) as f:
...             data = f.readlines()
...         return cls(data)
...     
...     @classmethod
...     def fromdict(cls, datadict):
...         "Initialize MyData from a dict's items"
...         return cls(list(datadict.items()))
... 
>>> MyData([1, 2, 3]).data
[1, 2, 3]
>>> MyData.fromfilename("/tmp/foobar").data
['foo\n', 'bar\n', 'baz\n']
>>> MyData.fromdict({"spam": "ham"}).data
[('spam', 'ham')]

The reason it's neater is that there is no doubt about what type is expected, and you aren't forced to guess at what the caller intended for you to do with the datatype it gave you. The problem with isinstance(x, basestring) (or isinstance(x, str) in Python 3) is that there is no way for the caller to tell you, for instance, that even though the type is not a basestring, you should treat it as a string (and not as another sequence). And perhaps the caller would like to use the same type for different purposes, sometimes as a single item, and sometimes as a sequence of items. Being explicit takes all doubt away and leads to more robust and clearer code.

What is a clean pythonic way to implement multiple constructors?

Actually None is much better for "magic" values:

class Cheese():
    def __init__(self, num_holes = None):
        if num_holes is None:
            ...
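A runnable sketch of the None-as-sentinel idea follows. The random default is an assumption standing in for whatever the elided branch above is meant to do.

```python
import random


class Cheese:
    def __init__(self, num_holes=None):
        if num_holes is None:
            # Assumed default behaviour: pick a random number of holes.
            num_holes = random.randint(0, 100)
        self.num_holes = num_holes


solid = Cheese(0)   # 0 is a real value here, not a "magic" one
surprise = Cheese() # no argument given, so the default branch runs
print(solid.num_holes)
```

Because the check is `is None` rather than a truthiness test, a caller can still ask for zero holes explicitly.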

Now if you want complete freedom of adding more parameters:

class Cheese():
    def __init__(self, *args, **kwargs):
        #args -- tuple of anonymous arguments
        #kwargs -- dictionary of named arguments
        self.num_holes = kwargs.get('num_holes',random_holes())

To better explain the concept of *args and **kwargs (you can actually change these names):

def f(*args, **kwargs):
    print('args: ', args, ' kwargs: ', kwargs)

>>> f('a')
args:  ('a',)  kwargs:  {}
>>> f(ar='a')
args:  ()  kwargs:  {'ar': 'a'}
>>> f(1,2,param=3)
args:  (1, 2)  kwargs:  {'param': 3}

http://docs.python.org/reference/expressions.html#calls

Why isn't my class initialized by def __int__ or def _init_? Why do I get a "takes no arguments" TypeError, or an AttributeError?

What do the exception messages mean, and how do they relate to the problem?

As one might guess, a TypeError is an Error that has to do with the Type of something. In this case, the meaning is a bit strained: Python also uses this error type for function calls where the arguments (the things you put in between () in order to call a function, class constructor or other "callable") cannot be properly assigned to the parameters (the things you put between () when writing a function using the def syntax).

In the examples where a TypeError occurs, the class constructor for Example does not take arguments. Why? Because it is using the base object constructor, which does not take arguments. That is just following the normal rules of inheritance: there is no __init__ defined locally, so the one from the superclass - in this case, object - is used.

Similarly, an AttributeError is an Error that has to do with the Attributes of something. This is quite straightforward: the instance of Example doesn't have any .attribute attribute, because the constructor (which, again, comes from object due to the typo) did not set one.
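Both errors can be reproduced with a small sketch. The Example class below is a reconstruction for illustration, since the original question's code is not shown here; only the names Example and attribute come from the text.

```python
class Example:
    def _init_(self, value):  # typo: one underscore per side, never called automatically
        self.attribute = value


try:
    Example("data")  # object's constructor is used, and it takes no arguments
except TypeError as exc:
    type_error = exc

instance = Example()  # succeeds, but nothing was initialized

try:
    instance.attribute
except AttributeError as exc:
    attribute_error = exc

print(type(type_error).__name__, type(attribute_error).__name__)
```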

Why didn't a problem occur earlier, for example, with the class definition itself?

Because the method with a wrongly typed name is still syntactically valid. Only syntax errors (reported as SyntaxError; yes, it's an exception, and yes, there are valid uses for it in real programs) can be caught before the code runs. Python does not assign any special meaning to methods named _init_ (with one underscore on each side), so it does not care what the parameters are. While __int__ is used for converting instances of the class to integer, and shouldn't have any parameters besides self, it is still syntactically valid.

Your IDE might be able to warn you about an __int__ method that takes suspicious parameters (i.e., anything besides self). However, a) that doesn't completely solve the problem (see below), and b) the IDE might have helped you get it wrong in the first place (by making a bad autocomplete suggestion).

The _init_ typo seems to be much less common nowadays. My guess is that people used to do this after reading example code out of books with poor typesetting.

How else might the problem manifest?

In the case where an instance is successfully created (but not properly initialized), any kind of problem could potentially happen later (depending on why proper initialization was needed). For example:

BOMB_IS_SET = True
class DefusalExpert():
    def __int__(self):
        global BOMB_IS_SET
        BOMB_IS_SET = False
    def congratulate(self):
        global BOMB_IS_SET
        if BOMB_IS_SET:
            raise RuntimeError("everything blew up, gg")
        else:
            print("hooray!")

If you intend for the class to be convertible to integer and also wrote __int__ deliberately, the last one will take precedence:

class LoneliestNumber:
    def __int__(self):
        return 1
    def __int__(self): # was supposed to be __init__
        self.two = "can be as bad"

>>> int(LoneliestNumber())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __int__ returned non-int (type NoneType)

(Note that __int__ will not be used implicitly to convert instances of the class to an index for a list or tuple. That's done by __index__.)
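A minimal sketch of that distinction, using two hypothetical classes (OnlyInt and WithIndex are names invented for this example):

```python
class OnlyInt:
    def __int__(self):
        return 1


class WithIndex:
    def __index__(self):
        return 1


letters = ["a", "b", "c"]

picked = letters[WithIndex()]  # __index__ is consulted for subscripting

try:
    letters[OnlyInt()]  # __int__ alone is not enough for indexing
    only_int_failed = False
except TypeError:
    only_int_failed = True

print(picked, only_int_failed)
```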

How might I guard against the problem in the future?

There is no magic bullet. I find it helps a little to have the convention of always putting __init__ (and/or __new__) as the first method in a class, if the class needs one. However, there is no substitute for proofreading, or for training.

Python overriding __init__

In Python, two methods are called when an object is created: __new__ and __init__. Like many classes implemented in C, random.Random uses __new__ to initialize itself (see random_new). You have to override it and call it with the appropriate parameters:

import random

class MyRand(random.Random):
    def __new__(cls, myvar1, myvar2, x=None):
        return random.Random.__new__(cls, x)

    def __init__(self, myvar1, myvar2, x=None):
        # ( ... my code ...)
        pass  # placeholder so the elided body still parses

What is the difference between __init__ and init__ in python class?

__init__ is the hook used to initialize your instance (it is called automatically when you create an instance).

init__ is just an ordinary method with a wonky name.

You need to show us your code; if something is broken when you have a method named __init__ you made a mistake there. Renaming it to init__ just means it won't be called automatically, thus not triggering your coding mistake.

In the comment you refer to, the author did use __init__ in his comment but forgot to escape the leading underscores, and they were interpreted as code to start bold text instead:

__this is bold__

becomes this is bold. Note that the trailing __ on __main__ also is lost in that comment.

In your updated code, you are trying to override the __init__ method of a (subclass) of tuple, which is a special case. By renaming the __init__ method to init__ you created a different method and did not run into this common problem at all.

See Subclassing Python tuple with multiple __init__ arguments for more detail on why this is a problem; you have to create a __new__ method instead. __new__ is the factory method that creates instances; __init__ then initializes the data. But that doesn't work on immutable types such as namedtuple or tuple, since you are not supposed to change the instance after the factory created it.

In this specific case, you do not need an __init__ or a __new__ method at all because the namedtuple subclass Point already takes care of the initialization of the x, y and z attributes. By renaming __init__ to init__ you made things work, but you do end up with a pointless init__ method that you'll never use. Just delete it.
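For illustration, a namedtuple subclass along these lines needs neither method. This is only a sketch: the field names x, y and z come from the text above, while the squared_length helper is a hypothetical addition.

```python
from collections import namedtuple


class Point(namedtuple("Point", ["x", "y", "z"])):
    __slots__ = ()  # keep instances as lightweight as plain tuples

    def squared_length(self):  # hypothetical extra behaviour
        return self.x ** 2 + self.y ** 2 + self.z ** 2


p = Point(1, 2, 3)
print(p.x, p.y, p.z, p.squared_length())
```

The generated namedtuple base already provides a __new__ that assigns x, y and z, so the subclass only has to add behaviour.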
