How to Add Custom Methods/Attributes to Built-In Python Types

Can I add custom methods/attributes to built-in Python types?

You can't add a method directly to the original type. However, you can subclass the type and then substitute the subclass in the built-in/global namespace, which achieves most of the desired effect. Unfortunately, objects created by literal syntax will still be of the vanilla type and won't have your new methods/attributes.

Here's what it looks like:

# Built-in namespace
import __builtin__

# Extended subclass
class mystr(str):
    def first_last(self):
        if self:
            return self[0] + self[-1]
        else:
            return ''

# Substitute the original str with the subclass on the built-in namespace
__builtin__.str = mystr

print str(1234).first_last()
print str(0).first_last()
print str('').first_last()
print '0'.first_last()

output = """
14
00

Traceback (most recent call last):
  File "strp.py", line 16, in <module>
    print '0'.first_last()
AttributeError: 'str' object has no attribute 'first_last'
"""

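The listing above is Python 2 (the __builtin__ module and print statements). A rough Python 3 equivalent might look like the sketch below. The name MyStr and the try/finally restore are additions for illustration, not part of the original answer. It also shows a side effect worth knowing about: while the substitution is active, isinstance("0", str) is false, because the name str now refers to the subclass.

```python
import builtins

class MyStr(builtins.str):
    """str subclass with one extra method (hypothetical name)."""

    def first_last(self):
        # First and last character, or '' for an empty string.
        return self[0] + self[-1] if self else ""

original_str = builtins.str
builtins.str = MyStr  # every later lookup of the name str now finds MyStr
try:
    from_int = str(1234).first_last()
    from_empty = str("").first_last()
    literal_has_method = hasattr("0", "first_last")
    literal_is_str = isinstance("0", str)
finally:
    builtins.str = original_str  # always undo the global substitution

print(from_int)            # 14
print(repr(from_empty))    # ''
print(literal_has_method)  # False
print(literal_is_str)      # False while the name str was rebound
```

Because other code in the same process may rely on isinstance checks against str, a substitution like this is probably best kept short-lived, as the try/finally here does.
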
How do I add my own custom attributes to existing built-in Python types? Like a string?

In short, you can't. The Pythonic approach is to subclass str and work from there.
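
As a quick check of that claim, the sketch below (Python 3) tries both forms of direct modification and then falls back to a subclass. The names Text, shout and note are made up for illustration.

```python
# Attempt 1: add a method to the built-in type itself.
try:
    str.shout = lambda self: self.upper() + "!"
    type_error = None
except TypeError as exc:
    type_error = exc  # CPython refuses to set attributes on built-in types

# Attempt 2: add an attribute to a single str instance.
s = "hello"
try:
    s.note = "greeting"
    instance_error = None
except AttributeError as exc:
    instance_error = exc  # plain str objects have no per-instance __dict__

# What works: a subclass, which can carry both methods and attributes.
class Text(str):
    def shout(self):
        return self.upper() + "!"

t = Text("hello")
t.note = "greeting"  # allowed: subclass instances get a __dict__

print(type(type_error).__name__)      # TypeError
print(type(instance_error).__name__)  # AttributeError
print(t.shout(), t.note)              # HELLO! greeting
```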

Add custom method to string object

You can't, because the built-in types are implemented in C. What you can do is subclass the type:

class string(str):
    def sayHello(self):
        print(self, "is saying 'hello'")

Test:

>>> x = string("test")
>>> x
'test'
>>> x.sayHello()
test is saying 'hello'

You could also shadow the name str with class str(str):, but that still doesn't give the literal "test" your new method, because a literal always creates an instance of the built-in str.

>>> x = "hello"
>>> x.sayHello()
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    x.sayHello()
AttributeError: 'str' object has no attribute 'sayHello'
>>> x = str("hello")
>>> x.sayHello()
hello is saying 'hello'
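
To make the literal limitation concrete, here is a small Python 3 sketch. The class name Greeter and its string-returning say_hello method are assumptions for illustration. It also shows a related wrinkle: methods inherited from str hand back plain str objects, so the extra method is gone after a call such as upper() unless the result is wrapped again.

```python
class Greeter(str):
    def say_hello(self):
        return f"{self} is saying 'hello'"

literal = "hello"            # literal syntax always builds the built-in str
wrapped = Greeter(literal)   # explicit construction is required

print(type(literal).__name__)           # str
print(hasattr(literal, "say_hello"))    # False
print(wrapped.say_hello())              # hello is saying 'hello'

shouted = wrapped.upper()               # inherited method returns plain str
print(type(shouted).__name__)           # str
print(Greeter(shouted).say_hello())     # HELLO is saying 'hello'
```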

Creating custom list attributes

You cannot add methods or attributes to any of the built-in objects. This is by design.

Instead, you can create your own list type that is derived from the built-in one:

class MyList(list):
    def even(self):
        return [x for x in self if x % 2 == 0]

Demo:

>>> class MyList(list):
...     def even(self):
...         return [x for x in self if x % 2 == 0]
...
>>> MyList([1,2,3,4,5]).even()
[2, 4]

For more information, see Classes in the documentation, specifically the section on Inheritance.
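
Since the question asks about attributes as well as methods, note that a subclass instance can also hold arbitrary attributes, which a plain list cannot. A minimal Python 3 sketch follows; TaggedList and label are hypothetical names, and having even() return the subclass rather than a plain list is a design choice made here, not something the answer above prescribes.

```python
class TaggedList(list):
    """list subclass that carries an extra label attribute."""

    def __init__(self, iterable=(), label=None):
        super().__init__(iterable)
        self.label = label

    def even(self):
        # Return the same subclass so the label survives and calls can chain.
        return TaggedList((x for x in self if x % 2 == 0), label=self.label)

plain = [1, 2, 3]
try:
    plain.label = "numbers"
    plain_error = None
except AttributeError as exc:
    plain_error = exc  # built-in list instances reject new attributes

tagged = TaggedList([1, 2, 3, 4, 5], label="numbers")
evens = tagged.even()
print(evens, evens.label)          # [2, 4] numbers
print(type(tagged[:2]).__name__)   # list (slicing returns a plain list)
```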

Adding a method to an existing object instance

In Python, there is a difference between functions and bound methods.

>>> def foo():
...     print "foo"
...
>>> class A:
...     def bar( self ):
...         print "bar"
...
>>> a = A()
>>> foo
<function foo at 0x00A98D70>
>>> a.bar
<bound method A.bar of <__main__.A instance at 0x00A9BC88>>
>>>

Bound methods have been "bound" (how descriptive) to an instance, and that instance will be passed as the first argument whenever the method is called.

Callables that are attributes of a class (as opposed to an instance) are still unbound, though, so you can modify the class definition whenever you want:

>>> def fooFighters( self ):
...     print "fooFighters"
...
>>> A.fooFighters = fooFighters
>>> a2 = A()
>>> a2.fooFighters
<bound method A.fooFighters of <__main__.A instance at 0x00A9BEB8>>
>>> a2.fooFighters()
fooFighters

Previously defined instances are updated as well (as long as they haven't overridden the attribute themselves):

>>> a.fooFighters()
fooFighters

The problem comes when you want to attach a method to a single instance:

>>> def barFighters( self ):
...     print "barFighters"
...
>>> a.barFighters = barFighters
>>> a.barFighters()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: barFighters() takes exactly 1 argument (0 given)

The function is not automatically bound when it's attached directly to an instance:

>>> a.barFighters
<function barFighters at 0x00A98EF0>

To bind it, we can use the MethodType function in the types module:

>>> import types
>>> a.barFighters = types.MethodType( barFighters, a )
>>> a.barFighters
<bound method ?.barFighters of <__main__.A instance at 0x00A9BC88>>
>>> a.barFighters()
barFighters

This time other instances of the class have not been affected:

>>> a2.barFighters()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: A instance has no attribute 'barFighters'
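
The session above is from Python 2 (print statements, old-style class reprs). The same idea in Python 3 can be sketched as follows. The snake_case names are adapted for illustration, and the functions return strings instead of printing so the results are easy to check.

```python
import types

class A:
    def bar(self):
        return "bar"

def foo_fighters(self):
    return "fooFighters"

def bar_fighters(self):
    return "barFighters"

a, a2, a3 = A(), A(), A()

# Assigning to the class: every instance, old or new, sees a bound method.
A.foo_fighters = foo_fighters
print(a.foo_fighters())             # fooFighters

# Assigning a plain function to one instance does not bind it.
a.bar_fighters = bar_fighters
try:
    a.bar_fighters()
    unbound_error = None
except TypeError as exc:
    unbound_error = exc             # self was never passed

# types.MethodType binds the function to that one instance.
a.bar_fighters = types.MethodType(bar_fighters, a)
print(a.bar_fighters())             # barFighters
print(hasattr(a2, "bar_fighters"))  # False: other instances are unaffected

# Equivalent binding through the descriptor protocol that functions implement.
a3.bar_fighters = bar_fighters.__get__(a3, A)
print(a3.bar_fighters())            # barFighters
```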

More information can be found by reading about descriptors and metaclass programming.

Set a custom object attribute

The arguments passed to setCustAttr are exactly the arguments you would pass to setattr.

def setCustAttr(self, name, value):
    setattr(self, name, value)

Why would you want a wrapper around setattr? You might try to perform some validation:

def setCustAttr(self, name, value):
    if name not in ['bar', 'baz']:
        raise ValueError("Custom attribute must be 'bar' or 'baz'")
    if name == 'bar' and value < 0:
        raise ValueError("'bar' attribute must be non-negative")
    if name == 'baz' and value % 2:
        raise ValueError("'baz' attribute must be even")

    setattr(self, name, value)

However, this doesn't prevent the user of your class from ignoring your setCustAttr method and assigning directly to the object:

g = MyClass()
g.bar = -5 # Negative bar!
g.baz = 3 # Odd baz!
g.quux = 2 # Non-bar/baz attribute!

Python has deep magic for providing more control over how attributes are set on an object (see __slots__, __{get,set}attr__, __getattribute__, properties, etc.), but generally it isn't used merely to prevent assignments like the ones shown above. The Python way is to just document how an instance of your class should be used, and trust the user to abide by your instructions. (And if they don't, caveat emptor.)
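
For completeness, here is roughly what enforcing those rules would look like with one of the hooks mentioned above, __setattr__. This is a sketch that reuses the rules from setCustAttr, not a recommendation. The _allowed tuple is an assumption, and so is the choice of AttributeError for unknown names.

```python
class MyClass:
    """Rejects any attribute other than 'bar' and 'baz', and validates both."""

    _allowed = ("bar", "baz")

    def __setattr__(self, name, value):
        if name not in self._allowed:
            raise AttributeError(f"Custom attribute must be one of {self._allowed}")
        if name == "bar" and value < 0:
            raise ValueError("'bar' attribute must be non-negative")
        if name == "baz" and value % 2:
            raise ValueError("'baz' attribute must be even")
        # Delegate to the default implementation to actually store the value.
        super().__setattr__(name, value)

g = MyClass()
g.bar = 5
g.baz = 4

rejected = []
for name, value in [("bar", -5), ("baz", 3), ("quux", 2)]:
    try:
        setattr(g, name, value)  # same code path as plain attribute assignment
    except (ValueError, AttributeError) as exc:
        rejected.append((name, type(exc).__name__))

print(g.bar, g.baz)  # 5 4
print(rejected)
```

A determined caller can still bypass the check with object.__setattr__(g, 'bar', -5), which is part of why validation like this is usually treated as optional.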

Custom list class in Python 3 with __get__ and __set__ attributes

While you can make your MyList class follow the descriptor protocol (which is what the __get__ and __set__ methods are for), you probably don't want to. That's because, to be useful, a descriptor must be placed as an attribute of a class, not as an attribute of an instance. The properties in your Foo class create a separate instance of MyList for each instance of Foo. That wouldn't work if the list were defined on the Foo class directly.

That's not to say that custom descriptors can't be useful. The property you're using in your Foo class is a descriptor. If you wanted to, you could write your own MyListAttr descriptor that does the same thing.

class MyListAttr(object):
    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name): # this is used in Python 3.6+
        self.name = "_" + name

    def find_name(self, cls): # this is used on earlier versions that don't support __set_name__
        for name in dir(cls):
            if getattr(cls, name) is self:
                self.name = "_" + name
                return
        raise TypeError()

    def __get__(self, obj, owner):
        if obj is None:
            return self
        if self.name is None:
            self.find_name(owner)
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if self.name is None:
            self.find_name(type(obj))
        setattr(obj, self.name, MyList(value))

class Foo(object):
    mylist = MyListAttr() # create the descriptor as a class variable

    def __init__(self, data=None):
        if data is None:
            data = []

        self.mylist = data # this invokes the __set__ method of the descriptor!

The MyListAttr class is more complicated than it otherwise might be because I try to have the descriptor object find its own name. That's not easy to figure out in older versions of Python. Starting with Python 3.6, it's much easier (because the __set_name__ method will be called on the descriptor when it is assigned as a class variable). A lot of the code in the class could be removed if you only needed to support Python 3.6 and later (you wouldn't need find_name or any of the code that calls it in __get__ and __set__).
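
A sketch of that simplified version, assuming Python 3.6 or later only. MyList comes from the original question and is not shown on this page, so a trivial stand-in subclass of list is used here.

```python
class MyList(list):
    """Stand-in for the question's MyList class (assumption)."""

class MyListAttr:
    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.name = "_" + name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor itself
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        # Every assignment is converted to a fresh MyList.
        setattr(obj, self.name, MyList(value))

class Foo:
    mylist = MyListAttr()  # the descriptor lives on the class

    def __init__(self, data=None):
        self.mylist = [] if data is None else data  # goes through __set__

f = Foo([1, 2, 3])
f.mylist = (4, 5)
print(type(f.mylist).__name__, f.mylist)  # MyList [4, 5]
print(f.__dict__)                         # {'_mylist': [4, 5]}
```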

It might not seem worth writing a long descriptor class like MyListAttr to do what you were able to do with less code using a property. That's probably correct if you only have one place you want to use the descriptor. But if you may have many classes (or many attributes within a single class) where you want the same special behavior, you will benefit from packing the behavior into a descriptor rather than writing a lot of very similar property getter and setter methods.

You might not have noticed, but I also made a change to the Foo class that is not directly related to the descriptor use. The change is to the default value for data. Using a mutable object like a list as a default argument is usually a very bad idea, as that same object will be shared by all calls to the function without an argument (so all Foo instances not initialized with data would share the same list). It's better to use a sentinel value (like None) and replace the sentinel with what you really want (a new empty list in this case). You probably should fix this issue in your MyList.__init__ method too.
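
The pitfall is easy to reproduce. In this sketch, Shared and Separate are hypothetical classes that differ only in how the default is written.

```python
class Shared:
    def __init__(self, data=[]):      # one list object, created when def runs
        self.data = data

class Separate:
    def __init__(self, data=None):    # sentinel, replaced on every call
        self.data = [] if data is None else data

s1, s2 = Shared(), Shared()
s1.data.append("oops")
print(s2.data)        # ['oops']  (both instances share the default list)

p1, p2 = Separate(), Separate()
p1.data.append("fine")
print(p2.data)        # []
```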


