Subclassing int in Python
int is immutable, so you can't modify it after it is created. Use __new__ instead:
    class TestClass(int):
        def __new__(cls, *args, **kwargs):
            # The value has to be chosen here; once the int exists it cannot change.
            return super(TestClass, cls).__new__(cls, 5)

    print(TestClass())  # 5
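To make the difference between the two hooks concrete, here is a minimal sketch (Python 3). The class names InitOnly and Clamped are made up for illustration and are not part of the original answer.

```python
class InitOnly(int):
    def __init__(self, value=0):
        # Too late: int.__new__ has already built the number from `value`.
        # Nothing done here can change what this object is worth.
        super().__init__()


class Clamped(int):
    """Hypothetical example: an int clamped to the range 0..10."""

    def __new__(cls, value=0):
        # The value is decided here, at construction time.
        return super().__new__(cls, max(0, min(10, int(value))))


a = InitOnly(42)
b = Clamped(42)
print(int(a))  # 42
print(int(b))  # 10
```

The object returned by __new__ is the finished int; __init__ only ever sees it after the fact.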
python: subclass of subclass of int
Comments have supplied info that has helped answer "Why the difference?" and "What's the best way?"
First: Why the difference?
In the original definitions of uint16 and Offset16, the __new__ method uses super(cls, cls). As @juanpa.arrivillaga pointed out, when Offset16.__new__ is called, it leads to uint16.__new__ calling itself recursively. Having Offset16.__new__ use super(uint16, cls) instead makes the lookup skip uint16.__new__ altogether, so the recursion never starts.
Some additional explanation may help to understand:
The cls argument passed into Offset16.__new__ is the Offset16 class itself. So when the implementation of the method refers to cls, that is a reference to Offset16. So,
    return super(cls, cls).__new__(cls, val)
is equivalent in that case to
    return super(Offset16, Offset16).__new__(Offset16, val)
Now we might think of super as returning the base class, but its semantics when arguments are provided are more subtle: super is resolving a reference to a method, and the arguments affect how that resolution happens. If no arguments are provided, super().__new__ is looked up starting from the class that follows the one in which the call is written; in a simple hierarchy that is the immediate superclass. When arguments are provided, they control the search. In particular, for super(type1, type2), the MRO (method resolution order) of type2 is searched for an occurrence of type1, and the method lookup starts at the class following type1 in that sequence. (This is explained in the documentation of super, though the wording could be clearer.)
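The lookup rule can be checked on a toy hierarchy. This is only a sketch; Base, Middle and Leaf are hypothetical classes standing in for int, uint16 and Offset16, and a classmethod is used so that super(type1, type2) can be called with two classes, as in the question.

```python
class Base:
    @classmethod
    def who(cls):
        return "Base.who called with " + cls.__name__


class Middle(Base):
    @classmethod
    def who(cls):
        return "Middle.who called with " + cls.__name__


class Leaf(Middle):
    @classmethod
    def who(cls):
        return "Leaf.who called with " + cls.__name__


print([c.__name__ for c in Leaf.__mro__])  # ['Leaf', 'Middle', 'Base', 'object']

# Search Leaf's MRO, starting just after Leaf.
print(super(Leaf, Leaf).who())    # Middle.who called with Leaf

# Search Leaf's MRO, starting just after Middle.
print(super(Middle, Leaf).who())  # Base.who called with Leaf
```

In both calls the method still receives Leaf as cls; only the starting point of the search moves.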
The MRO for Offset16 is (Offset16, uint16, int, object). Therefore
    return super(Offset16, Offset16).__new__(Offset16, val)
resolves to
    return uint16.__new__(Offset16, val)
When uint16.__new__ is called in this way, the class argument passed to it is Offset16, not uint16. As a result, when its implementation has
    return super(cls, cls).__new__(cls, val)
that once again will resolve to
    return uint16.__new__(Offset16, val)
This is why we end up with an infinite loop.
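The loop is easy to reproduce. The question's class bodies are not shown here, so the definitions below are an assumed minimal version that keeps only the super(cls, cls) call described above.

```python
class uint16(int):
    def __new__(cls, val):
        # Fragile: `cls` is whatever class is being instantiated,
        # not necessarily uint16.
        return super(cls, cls).__new__(cls, val)


class Offset16(uint16):
    def __new__(cls, val):
        return super(cls, cls).__new__(cls, val)


print(uint16(5))  # 5, because super(uint16, uint16) finds int.__new__

recursed = False
try:
    Offset16(5)  # uint16.__new__ keeps resolving to itself
except RecursionError:
    recursed = True

print(recursed)  # True
```

uint16 on its own looks fine, which is why the problem only shows up once a second level of subclassing is added.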
But in the changed definition of Offset16,
    class Offset16(uint16):
        def __new__(cls, val):
            return super(uint16, cls).__new__(cls, val)
the last line is equivalent to
    return super(uint16, Offset16).__new__(Offset16, val)
and per the MRO for Offset16 and the semantics for super mentioned above, that resolves to
    return int.__new__(Offset16, val)
That explains why the changed definition results in a different behaviour.
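One consequence worth seeing: because the call resolves to int.__new__, anything uint16.__new__ does is skipped. The sketch below assumes, purely for illustration, that uint16 performs a range check; the question's real uint16 may do something else.

```python
class uint16(int):
    def __new__(cls, val):
        # Assumed behaviour, not taken from the question.
        if not 0 <= val <= 0xFFFF:
            raise ValueError("value out of range for uint16")
        return super().__new__(cls, val)


class Offset16(uint16):
    def __new__(cls, val):
        # The search starts after uint16, so this is int.__new__.
        return super(uint16, cls).__new__(cls, val)


x = Offset16(70000)  # no ValueError: uint16.__new__ never ran
print(type(x).__name__, int(x))  # Offset16 70000
```

So the changed definition avoids the recursion, but only by going around the parent class.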
Second: What's the best way to do this?
Different alternatives were provided in comments that might fit different situations.
@juanpa.arrivillaga suggested (assuming Python 3) simply using super() without arguments. For the approach that was being taken in the question, this makes sense. The reason for passing arguments to super would be to manipulate the MRO search. In this simple class hierarchy, that's not needed.
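A sketch of the zero-argument form, again with an assumed range check in uint16 so there is something visible for the subclass to inherit:

```python
class uint16(int):
    def __new__(cls, val):
        # Assumed behaviour, for illustration only.
        if not 0 <= val <= 0xFFFF:
            raise ValueError("value out of range for uint16")
        return super().__new__(cls, val)  # int.__new__


class Offset16(uint16):
    def __new__(cls, val):
        return super().__new__(cls, val)  # uint16.__new__


x = Offset16(24)
print(type(x).__name__, int(x))  # Offset16 24
print(isinstance(x, uint16))     # True

try:
    Offset16(70000)
except ValueError as exc:
    print("rejected:", exc)      # the uint16 check still applies
```

Each super() is tied to the class it is written in, so adding further subclass levels does not change what it resolves to.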
@Jason Yang suggested referring directly to the specific superclass rather than using super. For instance:
    class Offset16(uint16):
        def __new__(cls, val):
            return uint16.__new__(cls, val)
That is perfectly fine for this simple situation. But it might not be the best for other scenarios with more complex class relationships. Note, for instance, that uint16 is duplicated in the above. If the subclass had several methods that wrapped (rather than replaced) the superclass method, there would be many duplicate references, and making changes to the class hierarchy would result in hard-to-analyze bugs. Avoiding such problems is one of the intended benefits of using super.
Finally, @Adam.Er8 suggested simply using
    Offset16 = uint16
That's very simple, indeed. The one caveat is that Offset16 is truly no more than an alias for uint16; it's not a separate class. For example:
    >>> Offset16 = uint16
    >>> x = Offset16(24)
    >>> type(x)
    <class 'uint16'>
So, this may be fine so long as there's never a need in the app to have an actual type distinction.
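If a type distinction is needed but no extra behaviour, an empty subclass is about as short as the alias. A small comparison, using a stand-in uint16 with no body (an assumption, since the question's class is not shown) and the hypothetical name Alias16 for the alias:

```python
class uint16(int):
    pass  # stand-in for the question's class


Alias16 = uint16  # just another name for the same class


class Offset16(uint16):
    pass  # a real, separate type; __new__ is inherited


print(Alias16 is uint16)                 # True
print(type(Alias16(24)).__name__)        # uint16
print(type(Offset16(24)).__name__)       # Offset16
print(isinstance(uint16(24), Offset16))  # False
print(isinstance(Offset16(24), uint16))  # True
```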
Subclassing int - unexpected behaviour with range
Don't subclass int and then override all of the methods. If you do that, the base class will think you have one value, and the subclass will think you have a different value. Instead, subclass numbers.Integral and implement all of the abstract methods. Then you can be sure your implementation is the only game in town.
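A full numbers.Integral implementation is too long to show here, but the abstract base class will tell you what is still missing. In this sketch, Incomplete is a hypothetical class that deliberately implements only __int__:

```python
import numbers


class Incomplete(numbers.Integral):
    """Hypothetical class that defines just one of the required methods."""

    def __int__(self):
        return 0


# Everything that still has to be written before the class can be used.
missing = sorted(Incomplete.__abstractmethods__)
print(len(missing), "abstract methods still missing, for example", missing[:3])

try:
    Incomplete()
except TypeError as exc:
    print("TypeError:", exc)

# The built-in int is already registered as an Integral.
print(isinstance(5, numbers.Integral))  # True
```

Python refuses to instantiate the class until every abstract method is defined, so there is no inherited int value left over that could disagree with yours.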
How to subclass int and make it mutable
Is it possible to subclass int and make it mutable?
Sort of. You can add all the mutable parts you want, but you can't touch the int parts, so the degree of mutability you can add won't help you.
Instead, don't use an int subclass. Use a regular object that stores an int. If you want to be able to pass it to struct.pack like an int, implement an __index__ method to define how to interpret your object as an int:
    class IntLike(object):  # not IntLike(int):
        def __init__(self, value=0):
            self.value = value
        def __index__(self):
            return self.value
        ...
You can implement additional methods like __or__ for | and __ior__ for in-place, mutative |=. Don't try to push too hard for complete interoperability with ints, though; for example, don't try to make your objects usable as dict keys. They're mutable, after all.
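Here is one way the pieces could fit together (Python 3). The method bodies and the choice to set __hash__ to None are my own assumptions about what the elided part might look like, not the original author's code.

```python
import operator
import struct


class IntLike:
    def __init__(self, value=0):
        self.value = value

    def __index__(self):
        # Lets struct.pack, hex(), bin() and friends treat this as an int.
        return self.value

    def __or__(self, other):
        # `a | b` returns a new object and leaves `a` alone.
        return IntLike(self.value | operator.index(other))

    def __ior__(self, other):
        # `a |= b` mutates in place and must return self.
        self.value |= operator.index(other)
        return self

    # Assumed design choice: mutable objects are made explicitly unhashable,
    # so they cannot be used as dict keys by accident.
    __hash__ = None


flags = IntLike(0b0001)
alias = flags            # second name for the same object
flags |= 0b0100          # in-place: visible through `alias` too
print(alias.value)       # 5
print(struct.pack("<H", flags))  # b'\x05\x00'
print((flags | 0b1000).value)    # 13
print(flags.value)               # 5, unchanged by the plain |
```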
If it's really important to you that your class is an int subclass, you're going to have to sacrifice the c.sixth_property = True syntax you want. You'll have to pick an alternative like c = c.with_sixth_property(True), and implement things non-mutatively.
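A sketch of the non-mutating style. The class name Flags and the mapping of "sixth property" to bit 5 are assumptions made for the example; the question's actual bit layout is not shown.

```python
class Flags(int):
    SIXTH = 1 << 5  # assumed bit position, for illustration only

    @property
    def sixth_property(self):
        return bool(int(self) & self.SIXTH)

    def with_sixth_property(self, on):
        # Build and return a new value instead of changing this one.
        raw = int(self)
        raw = raw | self.SIXTH if on else raw & ~self.SIXTH
        return type(self)(raw)


c = Flags(0)
c2 = c.with_sixth_property(True)
print(int(c), c.sixth_property)    # 0 False
print(int(c2), c2.sixth_property)  # 32 True
print(type(c2).__name__)           # Flags

try:
    c.sixth_property = True        # read-only property: assignment is refused
except AttributeError:
    print("assignment not supported")
```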
Automatic counter as a subclass of integer?
If you really, really, really need to mangle an immutable and built-in type, then you can create a kind-of "pointer" to it:
    class AutomaticCounter(int):
        def __new__(cls, *args, **kwargs):
            # create a new instance of int()
            self = super().__new__(cls, *args, **kwargs)
            # create a member "ptr" and storing a ref to the instance
            self.ptr = self
            # return the normal instance
            return self

        def __str__(self):
            # first, create a copy via int()
            # which "decays" from your subclass to an ordinary int()
            # then stringify it to obtain the normal __str__() value
            value = str(int(self.ptr))
            # afterwards, store a new instance of your subclass
            # that is incremented by 1
            self.ptr = AutomaticCounter(self.ptr + 1)
            return value

    n = AutomaticCounter(0)
    print(n)  # 0
    print(n)  # 1
    print(n)  # 2

        # to increment first and print later, use this __str__() instead:
        def __str__(self):
            self.ptr = AutomaticCounter(self.ptr + 1)
            return str(int(self.ptr))
This, however, doesn't make the type mutable per se. If you do print(f"{self=}") at the beginning of __str__() you'll see the instance is unchanged, so you effectively have a size of 2x int() (+ some trash) for your object and you access the real instance via self.ptr.
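You can check this from the outside as well. The snippet repeats the class so that it runs on its own:

```python
class AutomaticCounter(int):
    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls, *args, **kwargs)
        self.ptr = self
        return self

    def __str__(self):
        value = str(int(self.ptr))
        self.ptr = AutomaticCounter(self.ptr + 1)
        return value


n = AutomaticCounter(0)
shown = [str(n) for _ in range(3)]
print(shown)       # ['0', '1', '2']
print(int(n))      # 0, the int part of n never moved
print(int(n.ptr))  # 3, the counter lives in the attribute
print(n + 10)      # 10, arithmetic still uses the original 0
print(n.ptr is n)  # False, ptr now refers to a different object
```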
It wouldn't work with self alone, as self is merely a reference to the instance (the one created via __new__()) that is passed to the instance's methods as the first argument. So with something like this:
    def func(instance, ...):
        instance = <something else>
the assignment would, as mentioned by Daniel, simply bind a new value to the local variable named instance (self is just a quasi-standard name for the reference), which doesn't change the instance at all. The next closest solution is therefore a pointer, and since you'd like to manipulate it the way you described, I "hid" it in a custom member called ptr.
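A small runnable version of that point. Box is a hypothetical class used only to contrast rebinding the name with mutating an attribute:

```python
def rebind(instance):
    instance = 99  # only the local name now points somewhere else
    return instance


class Box:
    def __init__(self, value):
        self.value = value

    def try_rebind(self):
        # Same thing inside a method: `self` is just a local name.
        self = Box(self.value + 1)

    def mutate(self):
        # Changing an attribute does affect the object the caller holds.
        self.value += 1


n = 5
rebind(n)
print(n)        # 5

b = Box(0)
b.try_rebind()
print(b.value)  # 0
b.mutate()
print(b.value)  # 1
```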
As pointed out by user2357112, there is a desynchronization caused by the instance being immutable. So if you choose the self.ptr hack, you'll need to update the magic methods (__*__()) as well; the example below updates __add__(). Notice the int() calls: they convert the value to a plain int() to prevent recursion.
    class AutomaticCounter(int):
        def __new__(cls, *args, **kwargs):
            self = super().__new__(cls, *args, **kwargs)
            self.ptr = self
            return self

        def __str__(self):
            value = int(self.ptr)
            self.ptr = AutomaticCounter(int(self.ptr) + 1)
            return str(value)

        def __add__(self, other):
            value = other
            if hasattr(other, "ptr"):
                value = int(other.ptr)
            self.ptr = AutomaticCounter(int(self.ptr) + value)
            return int(self.ptr)

        def __rmul__(self, other):
            # [1, 2, 3] * your_object
            return other * int(self.ptr)

    n = AutomaticCounter(0)
    print(n)    # 0
    print(n)    # 1
    print(n)    # 2
    print(n+n)  # 6
However, anything that pulls the raw value or accesses it through the C API will most likely give wrong results. Reverse operations are the main case, for example when the other operand is an immutable built-in: you can't reliably patch the built-in's magic methods so that the fix applies in all modules and all scopes.
Example:
    # will work fine because it's your class
    a <operator> b -> a.__operator__(b)
vs
    # will break everything because it's using the raw value, not self.ptr hack
    b <operator> a -> b.__operator__(a)
with the exception of list.__mul__(), for some reason. When I find the code line in CPython, I'll add it here.
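Both cases can be observed with a cut-down copy of the class (only __add__ and __rmul__ are kept). My reading of the list case, which I have not confirmed against the CPython source, is that the interpreter tries the right operand's __rmul__ before falling back to list repetition.

```python
class AutomaticCounter(int):
    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls, *args, **kwargs)
        self.ptr = self
        return self

    def __add__(self, other):
        value = int(other.ptr) if hasattr(other, "ptr") else other
        self.ptr = AutomaticCounter(int(self.ptr) + value)
        return int(self.ptr)

    def __rmul__(self, other):
        return other * int(self.ptr)


n = AutomaticCounter(0)
print(n + 3)           # 3, our __add__ ran and moved ptr to 3
print(10 + n)          # 10, built-in addition saw the raw value 0
print([1] * n)         # [1, 1, 1], our __rmul__ was used
print(list(range(n)))  # [], range() reads the raw value 0
```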
Or, a more sane solution would be to create a custom and mutable object, create a member in it and manipulate that. Then return it, stringified, in __str__:
    class AutomaticCounter:  # a plain object, not AutomaticCounter(int)
        def __init__(self, start=0):
            self.item = start

        def __str__(self):
            self.item += 1
            return str(self.item)