Elegant Ways to Support Equivalence ("Equality") in Python Classes

Elegant ways to support equivalence ( equality ) in Python classes

Consider this simple problem:

class Number:

    def __init__(self, number):
        self.number = number

n1 = Number(1)
n2 = Number(1)

n1 == n2 # False -- oops

So, Python by default uses the object identifiers for comparison operations:

id(n1) # 140400634555856
id(n2) # 140400634555920

Overriding the __eq__ function seems to solve the problem:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return False

n1 == n2 # True
n1 != n2 # True in Python 2 -- oops, False in Python 3

In Python 2, always remember to override the __ne__ function as well, as the documentation states:

There are no implied relationships among the comparison operators. The
truth of x==y does not imply that x!=y is false. Accordingly, when
defining __eq__(), one should also define __ne__() so that the
operators will behave as expected.

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    return not self.__eq__(other)

n1 == n2 # True
n1 != n2 # False

In Python 3, this is no longer necessary, as the documentation states:

By default, __ne__() delegates to __eq__() and inverts the result
unless it is NotImplemented. There are no other implied
relationships among the comparison operators, for example, the truth
of (x<y or x==y) does not imply x<=y.

But that does not solve all our problems. Let’s add a subclass:

class SubNumber(Number):
    pass

n3 = SubNumber(1)

n1 == n3 # False for classic-style classes -- oops, True for new-style classes
n3 == n1 # True
n1 != n3 # True for classic-style classes -- oops, False for new-style classes
n3 != n1 # False

Note: Python 2 has two kinds of classes:

classic-style (or old-style) classes, that do not inherit from object and that are declared as class A:, class A(): or class A(B): where B is a classic-style class;
new-style classes, that do inherit from object and that are declared as class A(object) or class A(B): where B is a new-style class. Python 3 has only new-style classes that are declared as class A:, class A(object): or class A(B):.

For classic-style classes, a comparison operation always calls the method of the first operand, while for new-style classes, it always calls the method of the subclass operand, regardless of the order of the operands.

So here, if Number is a classic-style class:

n1 == n3 calls n1.__eq__;
n3 == n1 calls n3.__eq__;
n1 != n3 calls n1.__ne__;
n3 != n1 calls n3.__ne__.

And if Number is a new-style class:

both n1 == n3 and n3 == n1 call n3.__eq__;
both n1 != n3 and n3 != n1 call n3.__ne__.

To fix the non-commutativity issue of the == and != operators for Python 2 classic-style classes, the __eq__ and __ne__ methods should return the NotImplemented value when an operand type is not supported. The documentation defines the NotImplemented value as:

Numeric methods and rich comparison methods may return this value if
they do not implement the operation for the operands provided. (The
interpreter will then try the reflected operation, or some other
fallback, depending on the operator.) Its truth value is true.

In this case the operator delegates the comparison operation to the reflected method of the other operand. The documentation defines reflected methods as:

There are no swapped-argument versions of these methods (to be used
when the left argument does not support the operation but the right
argument does); rather, __lt__() and __gt__() are each other’s
reflection, __le__() and __ge__() are each other’s reflection, and
__eq__() and __ne__() are their own reflection.

The result looks like this:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return NotImplemented

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    x = self.__eq__(other)
    if x is NotImplemented:
        return NotImplemented
    return not x

Returning the NotImplemented value instead of False is the right thing to do even for new-style classes if commutativity of the == and != operators is desired when the operands are of unrelated types (no inheritance).

Are we there yet? Not quite. How many unique numbers do we have?

len(set([n1, n2, n3])) # 3 -- oops

Sets use the hashes of objects, and by default Python returns the hash of the identifier of the object. Let’s try to override it:

def __hash__(self):
    """Overrides the default implementation"""
    return hash(tuple(sorted(self.__dict__.items())))

len(set([n1, n2, n3])) # 1

The end result looks like this (I added some assertions at the end for validation):

class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __ne__(self, other):
        """Overrides the default implementation (unnecessary in Python 3)"""
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented

    def __hash__(self):
        """Overrides the default implementation"""
        return hash(tuple(sorted(self.__dict__.items())))

class SubNumber(Number):
    pass

n1 = Number(1)
n2 = Number(1)
n3 = SubNumber(1)
n4 = SubNumber(4)

assert n1 == n2
assert n2 == n1
assert not n1 != n2
assert not n2 != n1

assert n1 == n3
assert n3 == n1
assert not n1 != n3
assert not n3 != n1

assert not n1 == n4
assert not n4 == n1
assert n1 != n4
assert n4 != n1

assert len(set([n1, n2, n3, ])) == 1
assert len(set([n1, n2, n3, n4])) == 2

Is there a difference between == and is ?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.

>>> a = [1, 2, 3]
>>> b = a
>>> b is a 
True
>>> b == a
True

# Make a new copy of list `a` via the slice operator, 
# and assign it to variable `b`
>>> b = a[:] 
>>> b is a
False
>>> b == a
True

In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:

>>> 1000 is 10**3
False
>>> 1000 == 10**3
True

The same holds true for string literals:

>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True

Please see this question as well.

Should ne be implemented as the negation of eq?

Yes, that's perfectly fine. In fact, the documentation urges you to define __ne__ when you define __eq__:

There are no implied relationships
among the comparison operators. The
truth of x==y does not imply that x!=y
is false. Accordingly, when defining
__eq__(), one should also define __ne__() so that the operators will behave as expected.

In a lot of cases (such as this one), it will be as simple as negating the result of __eq__, but not always.

How is eq handled in Python and in what order?

The a == b expression invokes A.__eq__, since it exists. Its code includes self.value == other. Since int's don't know how to compare themselves to B's, Python tries invoking B.__eq__ to see if it knows how to compare itself to an int.

If you amend your code to show what values are being compared:

class A(object):
    def __eq__(self, other):
        print("A __eq__ called: %r == %r ?" % (self, other))
        return self.value == other
class B(object):
    def __eq__(self, other):
        print("B __eq__ called: %r == %r ?" % (self, other))
        return self.value == other

a = A()
a.value = 3
b = B()
b.value = 4
a == b

it will print:

A __eq__ called: <__main__.A object at 0x013BA070> == <__main__.B object at 0x013BA090> ?
B __eq__ called: <__main__.B object at 0x013BA090> == 3 ?

What are the differences between type() and isinstance()?

To summarize the contents of other (already good!) answers, isinstance caters for inheritance (an instance of a derived class is an instance of a base class, too), while checking for equality of type does not (it demands identity of types and rejects instances of subtypes, AKA subclasses).

Normally, in Python, you want your code to support inheritance, of course (since inheritance is so handy, it would be bad to stop code using yours from using it!), so isinstance is less bad than checking identity of types because it seamlessly supports inheritance.

It's not that isinstance is good, mind you—it's just less bad than checking equality of types. The normal, Pythonic, preferred solution is almost invariably "duck typing": try using the argument as if it was of a certain desired type, do it in a try/except statement catching all exceptions that could arise if the argument was not in fact of that type (or any other type nicely duck-mimicking it;-), and in the except clause, try something else (using the argument "as if" it was of some other type).

basestring is, however, quite a special case—a builtin type that exists only to let you use isinstance (both str and unicode subclass basestring). Strings are sequences (you could loop over them, index them, slice them, ...), but you generally want to treat them as "scalar" types—it's somewhat incovenient (but a reasonably frequent use case) to treat all kinds of strings (and maybe other scalar types, i.e., ones you can't loop on) one way, all containers (lists, sets, dicts, ...) in another way, and basestring plus isinstance helps you do that—the overall structure of this idiom is something like:

if isinstance(x, basestring)
  return treatasscalar(x)
try:
  return treatasiter(iter(x))
except TypeError:
  return treatasscalar(x)

You could say that basestring is an Abstract Base Class ("ABC")—it offers no concrete functionality to subclasses, but rather exists as a "marker", mainly for use with isinstance. The concept is obviously a growing one in Python, since PEP 3119, which introduces a generalization of it, was accepted and has been implemented starting with Python 2.6 and 3.0.

The PEP makes it clear that, while ABCs can often substitute for duck typing, there is generally no big pressure to do that (see here). ABCs as implemented in recent Python versions do however offer extra goodies: isinstance (and issubclass) can now mean more than just "[an instance of] a derived class" (in particular, any class can be "registered" with an ABC so that it will show as a subclass, and its instances as instances of the ABC); and ABCs can also offer extra convenience to actual subclasses in a very natural way via Template Method design pattern applications (see here and here [[part II]] for more on the TM DP, in general and specifically in Python, independent of ABCs).

For the underlying mechanics of ABC support as offered in Python 2.6, see here; for their 3.1 version, very similar, see here. In both versions, standard library module collections (that's the 3.1 version—for the very similar 2.6 version, see here) offers several useful ABCs.

For the purpose of this answer, the key thing to retain about ABCs (beyond an arguably more natural placement for TM DP functionality, compared to the classic Python alternative of mixin classes such as UserDict.DictMixin) is that they make isinstance (and issubclass) much more attractive and pervasive (in Python 2.6 and going forward) than they used to be (in 2.5 and before), and therefore, by contrast, make checking type equality an even worse practice in recent Python versions than it already used to be.

How to make same instance with same arguments but different for different arguments in python

There is no reason to think about singletons here and no reason to do anything with __new__. If you want two instances to be considered equal based on some condition, then you need to define __eq__.

def __eq__(self, other):
    return isinstance(other, A) and self.fn == other.fn

(Note, fn is usually used as a holder for functions; you should think of another attribute name.)

Elegant Ways to Support Equivalence ("Equality") in Python Classes