How Dangerous Is Setting self.__class__ to Something Else

How dangerous is setting self.__class__ to something else?

Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:

  • It's likely to be confusing to someone reading or debugging your code.
  • You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
  • The differences between 2.x and 3.x are significant enough that it may be painful to port.
  • There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
  • If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
  • Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
  • If you use __new__, things will not work the way you naively expected.
  • If the classes have different metaclasses, things will get even more confusing.
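The __init__ point is easy to demonstrate. Here is a minimal sketch, assuming Python 3 and two made-up classes (Base and Derived are illustrative names, not from the question): reassigning __class__ changes where methods are looked up, but the new class's __init__ never runs.

```python
class Base:
    def __init__(self):
        self.name = "base"


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.extra = []  # only set when Derived.__init__ actually runs

    def describe(self):
        return "%s with %d extras" % (self.name, len(self.extra))


obj = Base()
obj.__class__ = Derived  # method lookup now goes through Derived...

try:
    obj.describe()
    failed = False
except AttributeError:
    failed = True  # ...but Derived.__init__ never ran, so `extra` is missing
```

The object passes isinstance(obj, Derived), yet it is missing state that every "real" Derived instance has, which is exactly the kind of bug that is hard to track down later.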

Meanwhile, in many cases where you'd think this is necessary, there are better options:

  • Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
  • Use __new__ or other mechanisms to hook the construction.
  • Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.

As a very common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of doing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
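A rough sketch of that member-plus-forwarding idea, assuming Python 3. The names Creature, WalkBehavior and FlyBehavior are invented for illustration; only the self.member = OtherSubclass(self) pattern comes from the answer above.

```python
class WalkBehavior:
    def __init__(self, owner):
        self.owner = owner

    def move(self):
        return "%s walks" % self.owner.name


class FlyBehavior:
    def __init__(self, owner):
        self.owner = owner

    def move(self):
        return "%s flies" % self.owner.name


class Creature:
    def __init__(self, name):
        self.name = name
        self.member = WalkBehavior(self)

    def __getattr__(self, attr):
        # Called only when normal lookup fails; forward to the behavior object.
        # The guard avoids infinite recursion if `member` is not set yet.
        if attr == "member":
            raise AttributeError(attr)
        return getattr(self.member, attr)


c = Creature("Bird")
first = c.move()            # forwarded to WalkBehavior.move
c.member = FlyBehavior(c)   # swap behavior without touching c.__class__
second = c.move()           # now forwarded to FlyBehavior.move
```

The Creature instance keeps its class, its __init__-built state, and its identity; only the delegate changes.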

Is it safe to replace a self object by another object of the same type in a method?

It is unlikely that replacing the 'self' variable will accomplish anything that couldn't be accomplished just as well by storing the result of func(self) in a different variable. 'self' is effectively a local variable, defined only for the duration of the method call, used to pass in the instance of the class which is being operated upon. Replacing self will not replace references to the original instance held by other objects, nor will it create a lasting reference to the new instance which was assigned to it.
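To make that concrete, here is a small sketch with a hypothetical Box class (not from the question). Rebinding self inside a method only changes the local name, while mutating attributes through self affects the object everyone else sees.

```python
class Box:
    def __init__(self, value):
        self.value = value

    def try_replace(self):
        # Rebinds the local name `self` only; the caller's object is untouched.
        self = Box(self.value * 2)
        return self

    def update_in_place(self):
        # Mutates the instance that every other reference also points to.
        self.value *= 2


b = Box(1)
other = b.try_replace()     # a brand-new Box; `b` still has value 1
value_after_replace = b.value
b.update_in_place()         # this one really changes `b`
value_after_update = b.value
```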

What is the purpose of checking self.__class__?

self.__class__ is a reference to the type of the current instance.

For instances of abstract1, that'd be the abstract1 class itself, which is what you don't want with an abstract class. Abstract classes are only meant to be subclassed, not to create instances directly:

>>> abstract1()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __init__
NotImplementedError: Interfaces can't be instantiated

For an instance of a subclass of abstract1, self.__class__ would be a reference to the specific subclass:

>>> class Foo(abstract1): pass
...
>>> f = Foo()
>>> f.__class__
<class '__main__.Foo'>
>>> f.__class__ is Foo
True

Throwing an exception here is like using an assert statement elsewhere in your code; it protects you from making silly mistakes.

Note that the pythonic way to test for the type of an instance is to use the type() function instead, together with an identity test with the is operator:

class abstract1(object):
    def __init__(self):
        if type(self) is abstract1:
            raise NotImplementedError("Interfaces can't be instantiated")

type() should be preferred over self.__class__ because the latter can be shadowed by a class attribute.

There is little point in using an equality test here because, for custom classes, __eq__ is basically implemented as an identity test anyway.

Python also includes a standard library to define abstract base classes, called abc. It lets you mark methods and properties as abstract and will refuse to create instances of any subclass that has not yet re-defined those names.
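For comparison, here is a minimal sketch of the abc approach, assuming Python 3.4 or later. Shape and Square are made-up names used only for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Subclasses must override this."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


try:
    Shape()                 # abstract method not overridden
    instantiated = True
except TypeError:
    instantiated = False    # abc refuses to create the instance

square_area = Square(3).area()
```

Unlike the hand-written check in __init__, this also rejects any subclass that forgets to override area().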

How to avoid explicit 'self' in Python?

In Java terms: Python doesn't have member functions. All class functions are static, and are called with a reference to the actual class instance as the first argument when invoked as a member function.

This means that when your code has a class MyClass and you build an instance m = MyClass(), calling m.do_something() will be executed as MyClass.do_something(m).
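A quick sketch of that equivalence, assuming Python 3. The body of do_something is invented here just so there is something to observe.

```python
class MyClass:
    def __init__(self):
        self.calls = 0

    def do_something(self):
        self.calls += 1
        return self.calls


m = MyClass()
a = m.do_something()          # instance passed implicitly as `self`
b = MyClass.do_something(m)   # same function, instance passed explicitly
```

Both calls run the same function on the same object, so the counter goes up twice.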

Also note that this first argument can technically be called anything you want, but the convention is to use self, and you should stick to that convention if you want others (including your future self) to be able to easily read your code.

The result is there's never any confusion over what's a member and what's not, even without the full class definition visible. This leads to useful properties, such as: you can't add members which accidentally shadow non-members and thereby break code.

One extreme example: you can write a class without any knowledge of what base classes it might have, and always know whether you are accessing a member or not:

class A(some_function()):
    def f(self):
        self.member = 42
        self.method()

That's the complete code! (some_function returns the type used as a base.)

Another, where the methods of a class are dynamically composed:

class B(object):
    pass

print(B())
# <__main__.B object at 0xb7e4082c>

def B_init(self):
    self.answer = 42
def B_str(self):
    return "<The answer is %s.>" % self.answer
# notice these functions require no knowledge of the actual class
# how hard are they to read and realize that "members" are used?

B.__init__ = B_init
B.__str__ = B_str

print(B())
# <The answer is 42.>

Remember, both of these examples are extreme and you won't see them every day, nor am I suggesting you should often write code like this, but they do clearly show aspects of self being explicitly required.

Python class methods: when is self not needed

What is self?

In Python, every normal method is forced to accept a parameter commonly named self. This is an instance of class - an object. This is how Python methods interact with a class's state.

You are allowed to rename this parameter to whatever you please, but it will always have the same value:

>>> class Class:
        def method(foo):
            print(foo)


>>> cls = Class()
>>> cls.method()
<__main__.Class object at 0x03E41D90>
>>>

But then why does my example work?

However, what you are probably confused about is how this code works differently:

>>> class Class:
        def method(foo):
            print(foo)

        methods = {'method': method}

        def __init__(self):
            self.run = self.methods['method']


>>> cls = Class()
>>> cls.run(3)
3
>>>

This is because of the distinction between bound and unbound methods in Python.

When we do this in __init__():

self.run = self.methods['method']

We are referring to the unbound method method. That means that our reference to method is not bound to any specific instance of Class, and thus Python will not force method to accept an object instance, because it does not have one to give.

The above code would be the same as doing this:

>>> class Class:
        def method(foo):
            print(foo)


>>> Class.method(3)
3
>>>

In both examples, we are calling the method method of the class object Class, and not of an instance of Class.

We can further see this distinction by examining the repr for a bound and unbound method:

>>> class Class:
        def method(foo):
            print(foo)


>>> Class.method
<function Class.method at 0x03E43D68>
>>> cls = Class()
>>> cls.method
<bound method Class.method of <__main__.Class object at 0x03BD2FB0>>
>>>

As you can see, in the first example when we do Class.method, Python shows: <function Class.method at 0x03E43D68>. I've lied to you a little bit. When we access a method through the class rather than through an instance, Python 3 treats it as a plain function. So method is simply a function that is not bound to any instance of Class.

However in the second example, when we create an instance of Class, and then access the method object of it, we see printed: <bound method Class.method of <__main__.Class object at 0x03BD2FB0>>.

The key part to notice is bound method Class.method. That means method is bound to cls, a specific instance of Class.
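The same distinction can be checked in code rather than by reading reprs. This is a small sketch assuming Python 3; it reuses the Class example from above, with return instead of print so the results can be inspected.

```python
class Class:
    def method(foo):
        return foo

    methods = {'method': method}


obj = Class()
bound = obj.method               # attribute lookup on the instance binds it
plain = obj.methods['method']    # dictionary lookup returns the raw function

bound_target = bound.__self__    # the instance the method is bound to
same_function = bound.__func__ is plain   # both wrap the same function object
```

So bound() receives obj automatically, while plain(3) receives only what you pass it. That is why cls.run(3) printed 3 in the earlier example.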

General remarks

As @jonshapre mentioned, writing code like in your example leads to confusion (as shown by this question) and to bugs. It would be a better idea if you simply defined nonLinearBipolarStep() outside of Activation, and referenced it from inside Activation.activation_functions:

def nonLinearBipolarStep(self,x,string=None):
    if not string: return (-1 if x<0 else 1 )
    else: return ('-' if x<0 else '1')

class Activation:

    activation_functions = {
        'bipolar': nonLinearBipolarStep,
    }

    ...

I guess a more specific question would be: what should I pay attention to on that code in order to become evident that ag.run(x) would be a call to an unbound function?

If you'd still like to let nonLinearBipolarStep be unbound, then I recommend simply being careful. If you think your method would make for the cleanest code then go for it, but make sure you know what you are doing and the behavior your code will have.

If you still wanted to make it clear to users of your class that ag.run() would be static, you could document it in a docstring somewhere, but that is something the user really shouldn't have to be concerned with at all.

Python: self vs type(self) and the proper use of class variables

After speaking with others offline (and per @wwii's comment on one of the answers here), it turns out the best way to do this without embedding the class name explicitly is to use self.__class__.attribute.

(While some people out there use type(self).attribute it causes other problems.)
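A minimal sketch of the self.__class__.attribute form, assuming Python 3. Tracker and SubTracker are hypothetical names, not from the question.

```python
class Tracker:
    count = 0  # class variable shared by all Tracker instances

    def register(self):
        # Update the class variable without hard-coding the name `Tracker`.
        self.__class__.count += 1


class SubTracker(Tracker):
    pass


Tracker().register()
Tracker().register()
SubTracker().register()   # reads the inherited 2, then stores 3 on SubTracker
```

One thing to watch with this form: because the lookup goes through the instance's actual class, a subclass ends up with its own count attribute after the first increment, while Tracker.count stays where it was. Whether that is what you want depends on the design.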

Understanding Python super() with __init__() methods

super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.

Note that the syntax changed in Python 3.0: you can just say super().__init__() instead of super(ChildB, self).__init__() which IMO is quite a bit nicer. The standard docs also refer to a guide to using super() which is quite explanatory.
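To illustrate the multiple-inheritance case, here is a short sketch using the Python 3 syntax. The class names and the log attribute are invented for illustration.

```python
class Base:
    def __init__(self):
        self.log = ["Base"]


class Left(Base):
    def __init__(self):
        super().__init__()        # next class in the MRO, not necessarily Base
        self.log.append("Left")


class Right(Base):
    def __init__(self):
        super().__init__()
        self.log.append("Right")


class Child(Left, Right):
    def __init__(self):
        super().__init__()
        self.log.append("Child")


order = Child().log
mro_names = [klass.__name__ for klass in Child.__mro__]
```

Each __init__ runs exactly once, in the order given by Child.__mro__. Had Left and Right called Base.__init__(self) explicitly instead, Base.__init__ would run twice and Right's initializer would be skipped when constructing a Child.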

Purpose of return self python

Returning self from a method simply means that your method returns a reference to the instance object on which it was called. This can sometimes be seen in object-oriented APIs that are designed as a fluent interface, which encourages method cascading. So, for example,

>>> class Counter(object):
...     def __init__(self, start=1):
...         self.val = start
...     def increment(self):
...         self.val += 1
...         return self
...     def decrement(self):
...         self.val -= 1
...         return self
...
>>> c = Counter()

Now we can use method cascading:

>>> c.increment().increment().decrement()
<__main__.Counter object at 0x1020c1390>

Notice, the last call to decrement() returned <__main__.Counter object at 0x1020c1390>, which is self.
Now:

>>> c.val
2
>>>

Notice, you cannot do this if you did not return self:

>>> class Counter(object):
...     def __init__(self, start=1):
...         self.val = start
...     def increment(self):
...         self.val += 1
...         # implicitly return `None`
...     def decrement(self):
...         self.val -= 1
...         # implicitly return `None`
...
>>> c = Counter()
>>> c.increment().increment()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'increment'
>>> c
<__main__.Counter object at 0x1020c15f8>
>>> c.val
2
>>>

Notice, not everyone is a fan of the "method cascading" design. Python built-ins do not tend to do this. Take list, for example:

>>> x = list()
>>> x.append(1).append(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'append'
>>>

The one place you do often see this is when your class implements the iterator protocol, where iter on an iterator returns self by convention, as suggested by the docs:

Having seen the mechanics behind the iterator protocol, it is easy to
add iterator behavior to your classes. Define an __iter__() method
which returns an object with a __next__() method. If the class
defines __next__(), then __iter__() can just return self:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

Notice, this in effect makes your iterator only useful for a single pass (as it should be to properly follow the iterator protocol):

>>> x = [1, 2, 3, 4]
>>> it = iter(x)
>>> list(it)
[1, 2, 3, 4]
>>> list(it)
[]
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

