How Is __eq__ Handled in Python and in What Order

How is __eq__ handled in Python and in what order?

The a == b expression invokes A.__eq__, since it exists. Its code includes self.value == other. Since ints don't know how to compare themselves to B instances, Python tries invoking B.__eq__ to see if it knows how to compare itself to an int.

If you amend your code to show what values are being compared:

class A(object):
    def __eq__(self, other):
        print("A __eq__ called: %r == %r ?" % (self, other))
        return self.value == other

class B(object):
    def __eq__(self, other):
        print("B __eq__ called: %r == %r ?" % (self, other))
        return self.value == other

a = A()
a.value = 3
b = B()
b.value = 4
a == b

it will print:

A __eq__ called: <__main__.A object at 0x013BA070> == <__main__.B object at 0x013BA090> ?
B __eq__ called: <__main__.B object at 0x013BA090> == 3 ?
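
The fallback can also be observed directly. A minimal sketch, assuming Python 3 and a stripped-down B that takes its value in a constructor (a convenience that is not in the code above): calling int's __eq__ by hand returns NotImplemented, while the == operator goes on to try the reflected B.__eq__.

```python
class B:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Compares the wrapped int against whatever is passed in.
        return self.value == other


b = B(3)

# int.__eq__ does not recognise B, so it declines to answer.
direct = (3).__eq__(b)

# The == operator sees NotImplemented and falls back to b.__eq__(3).
via_operator = (3 == b)

print(direct)        # NotImplemented
print(via_operator)  # True
```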

In what order should Python’s list.__contains__ invoke __eq__?

Per the language reference:

For container types such as list, tuple, set, frozenset, dict, or
collections.deque, the expression x in y is equivalent to any(x is
e or x == e for e in y).
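
The identity check in that equivalence matters in practice. As a small illustration, assuming IEEE 754 NaN semantics (a NaN is not equal to itself), a NaN object is still found in a list that holds that same object, because x is e succeeds before x == e is consulted. The helper name contains is made up for this sketch.

```python
def contains(x, y):
    # Spelled-out form of `x in y` from the language reference.
    return any(x is e or x == e for e in y)


nan = float("nan")

print(nan == nan)             # False
print(nan in [nan])           # True: same object, identity check wins
print(contains(nan, [nan]))   # True
print(float("nan") in [nan])  # False: different object, and NaN != NaN
```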

The other examples in the same section show the same ordering for the equality test. This suggests that the comparison should be item_maybe_in_list.__eq__(item_actually_in_list), in which case this could be considered a bug in PyPy. Additionally, CPython is the reference implementation, so in any discrepancy that version wins!

That said, you should raise it with that community to see how they feel about it.

How should I implement __eq__ and __hash__ if I want order independent equality

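hash() only works on hashable objects, and the built-in mutable containers, set included, are not hashable, which is why the error is raised. An immutable frozenset can be used instead: it is hashable and, like a set, ignores element order. A tuple is hashable too, but it preserves order, so it would have to be built from sorted elements to be order independent.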
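
A minimal sketch of the frozenset approach, assuming a hypothetical UnorderedPair class (the original question's class is not shown here, so the names are illustrative only):

```python
class UnorderedPair:
    """Hypothetical pair whose equality ignores element order."""

    def __init__(self, first, second):
        # frozenset is immutable and hashable, and it ignores order.
        # Note that it also collapses duplicates, so (1, 1) becomes {1}.
        self._items = frozenset((first, second))

    def __eq__(self, other):
        if not isinstance(other, UnorderedPair):
            return NotImplemented
        return self._items == other._items

    def __hash__(self):
        # Equal pairs hold equal frozensets, so their hashes match.
        return hash(self._items)


p = UnorderedPair(1, 2)
q = UnorderedPair(2, 1)

# A plain set cannot be hashed, which is the error described above.
try:
    hash({1, 2})
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

print(p == q)              # True
print(hash(p) == hash(q))  # True
print(set_is_hashable)     # False
```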

Elegant ways to support equivalence (equality) in Python classes

Consider this simple problem:

class Number:

    def __init__(self, number):
        self.number = number

n1 = Number(1)
n2 = Number(1)

n1 == n2 # False -- oops

So, Python by default uses the object identifiers for comparison operations:

id(n1) # 140400634555856
id(n2) # 140400634555920

Overriding the __eq__ function seems to solve the problem:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return False

n1 == n2 # True
n1 != n2 # True in Python 2 -- oops, False in Python 3

In Python 2, always remember to override the __ne__ function as well, as the documentation states:

There are no implied relationships among the comparison operators. The
truth of x==y does not imply that x!=y is false. Accordingly, when
defining __eq__(), one should also define __ne__() so that the
operators will behave as expected.

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    return not self.__eq__(other)

n1 == n2 # True
n1 != n2 # False

In Python 3, this is no longer necessary, as the documentation states:

By default, __ne__() delegates to __eq__() and inverts the result
unless it is NotImplemented. There are no other implied
relationships among the comparison operators, for example, the truth
of (x<y or x==y) does not imply x<=y.

But that does not solve all our problems. Let’s add a subclass:

class SubNumber(Number):
    pass

n3 = SubNumber(1)

n1 == n3 # False for classic-style classes -- oops, True for new-style classes
n3 == n1 # True
n1 != n3 # True for classic-style classes -- oops, False for new-style classes
n3 != n1 # False

Note: Python 2 has two kinds of classes:

  • classic-style (or old-style) classes, that do not inherit from object and that are declared as class A:, class A(): or class A(B): where B is a classic-style class;

  • new-style classes, that do inherit from object and that are declared as class A(object) or class A(B): where B is a new-style class. Python 3 has only new-style classes that are declared as class A:, class A(object): or class A(B):.

For classic-style classes, a comparison operation always calls the method of the first operand, while for new-style classes, it always calls the method of the subclass operand, regardless of the order of the operands.

So here, if Number is a classic-style class:

  • n1 == n3 calls n1.__eq__;
  • n3 == n1 calls n3.__eq__;
  • n1 != n3 calls n1.__ne__;
  • n3 != n1 calls n3.__ne__.

And if Number is a new-style class:

  • both n1 == n3 and n3 == n1 call n3.__eq__;
  • both n1 != n3 and n3 != n1 call n3.__ne__.

To fix the non-commutativity issue of the == and != operators for Python 2 classic-style classes, the __eq__ and __ne__ methods should return the NotImplemented value when an operand type is not supported. The documentation defines the NotImplemented value as:

Numeric methods and rich comparison methods may return this value if
they do not implement the operation for the operands provided. (The
interpreter will then try the reflected operation, or some other
fallback, depending on the operator.) Its truth value is true.

In this case the operator delegates the comparison operation to the reflected method of the other operand. The documentation defines reflected methods as:

There are no swapped-argument versions of these methods (to be used
when the left argument does not support the operation but the right
argument does); rather, __lt__() and __gt__() are each other’s
reflection, __le__() and __ge__() are each other’s reflection, and
__eq__() and __ne__() are their own reflection.

The result looks like this:

def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return NotImplemented

def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    x = self.__eq__(other)
    if x is NotImplemented:
        return NotImplemented
    return not x

Returning the NotImplemented value instead of False is the right thing to do even for new-style classes if commutativity of the == and != operators is desired when the operands are of unrelated types (no inheritance).
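
To make the commutativity point concrete, here is a small sketch for Python 3. The three class names are invented for illustration: FalseNumber answers False for unknown types, DeferringNumber returns NotImplemented, and Wrapper is an unrelated class that knows how to compare itself to both.

```python
class FalseNumber:
    """Gives a definitive False for types it does not know."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, FalseNumber):
            return self.number == other.number
        return False


class DeferringNumber:
    """Returns NotImplemented so the other operand gets asked."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, DeferringNumber):
            return self.number == other.number
        return NotImplemented


class Wrapper:
    """Unrelated class that can compare itself to both number types."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, (FalseNumber, DeferringNumber, Wrapper)):
            return self.number == other.number
        return NotImplemented


w = Wrapper(1)

print(w == FalseNumber(1))      # True
print(FalseNumber(1) == w)      # False: the two orders disagree
print(w == DeferringNumber(1))  # True
print(DeferringNumber(1) == w)  # True: Wrapper.__eq__ is tried as fallback
```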

Are we there yet? Not quite. How many unique numbers do we have?

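len(set([n1, n2, n3])) # 3 -- oops (Python 2; Python 3 raises TypeError: unhashable type)

Sets use the hashes of objects, and by default Python derives the hash from the identifier of the object. In Python 3, a class that defines __eq__ without defining __hash__ has its __hash__ set to None, so its instances cannot be put in a set at all. Either way, let’s override it: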

def __hash__(self):
    """Overrides the default implementation"""
    return hash(tuple(sorted(self.__dict__.items())))

len(set([n1, n2, n3])) # 1
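
One caveat with hashing the instance __dict__ is worth keeping in mind: the hash then changes whenever an attribute changes. The sketch below (Python 3, reusing the Number class with the methods defined so far) shows an object that becomes unreachable by lookup after being mutated while it sits in a set. It illustrates the general rule that hashed objects should not change in ways that affect their hash; it is not a fix.

```python
class Number:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __hash__(self):
        return hash(tuple(sorted(self.__dict__.items())))


n = Number(1)
numbers = {n}

found_before = n in numbers  # looked up with the hash it was stored under

n.number = 2                 # mutating the attribute changes the hash

found_after = n in numbers   # looked up with the new hash
still_stored = any(n is e for e in numbers)  # the object is still in there

print(found_before, found_after, still_stored)
```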

The end result looks like this (I added some assertions at the end for validation):

class Number:

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __ne__(self, other):
        """Overrides the default implementation (unnecessary in Python 3)"""
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented

    def __hash__(self):
        """Overrides the default implementation"""
        return hash(tuple(sorted(self.__dict__.items())))

class SubNumber(Number):
    pass

n1 = Number(1)
n2 = Number(1)
n3 = SubNumber(1)
n4 = SubNumber(4)

assert n1 == n2
assert n2 == n1
assert not n1 != n2
assert not n2 != n1

assert n1 == n3
assert n3 == n1
assert not n1 != n3
assert not n3 != n1

assert not n1 == n4
assert not n4 == n1
assert n1 != n4
assert n4 != n1

assert len(set([n1, n2, n3, ])) == 1
assert len(set([n1, n2, n3, n4])) == 2

Python's a == b calls b.__eq__(a), for a subclass with no override

Here's the code that implements the described logic:

Python 2.7:

/* Macro to get the tp_richcompare field of a type if defined */
#define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
                         ? (t)->tp_richcompare : NULL)

...

static PyObject *
try_rich_compare(PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;

    if (v->ob_type != w->ob_type &&
        PyType_IsSubtype(w->ob_type, v->ob_type) &&
        (f = RICHCOMPARE(w->ob_type)) != NULL) {
        res = (*f)(w, v, _Py_SwappedOp[op]); // We're executing this
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = RICHCOMPARE(v->ob_type)) != NULL) {
        res = (*f)(v, w, op); // Instead of this.
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = RICHCOMPARE(w->ob_type)) != NULL) {
        return (*f)(w, v, _Py_SwappedOp[op]);
    }
    res = Py_NotImplemented;
    Py_INCREF(res);
    return res;
}

Python 3.x:

/* Perform a rich comparison, raising TypeError when the requested comparison
   operator is not supported. */
static PyObject *
do_richcompare(PyObject *v, PyObject *w, int op)
{
    richcmpfunc f;
    PyObject *res;
    int checked_reverse_op = 0;

    if (v->ob_type != w->ob_type &&
        PyType_IsSubtype(w->ob_type, v->ob_type) &&
        (f = w->ob_type->tp_richcompare) != NULL) {
        checked_reverse_op = 1;
        res = (*f)(w, v, _Py_SwappedOp[op]); // We're executing this
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if ((f = v->ob_type->tp_richcompare) != NULL) {
        res = (*f)(v, w, op); // Instead of this.
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }
    if (!checked_reverse_op && (f = w->ob_type->tp_richcompare) != NULL) {
        res = (*f)(w, v, _Py_SwappedOp[op]);
        if (res != Py_NotImplemented)
            return res;
        Py_DECREF(res);
    }

The two versions are similar, except that the Python 2.7 version uses a RICHCOMPARE macro that checks PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) before reading tp_richcompare, whereas the Python 3 version tests ob_type->tp_richcompare != NULL directly.

In both versions, the first if block evaluates to true. The specific piece that one would perhaps expect to be false, going by the description in the docs, is (f = w->ob_type->tp_richcompare) != NULL for Python 3, or PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) for Python 2.7. However, the docs say that tp_richcompare is inherited by child classes:

richcmpfunc PyTypeObject.tp_richcompare

An optional pointer to the rich comparison function...

This field is inherited by subtypes together with tp_compare and tp_hash...

With the 2.x version, PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) will also evaluate to true, because the Py_TPFLAGS_HAVE_RICHCOMPARE flag is true if tp_richcompare, tp_clear, and tp_traverse are set, and all of those are inherited from the parent.

So, even though B doesn't provide its own rich comparison method, it still returns a non-NULL value because its parent class provides it. As others have stated, this seems to be a doc bug; the child class doesn't actually need to override the __eq__ method of the parent, it just needs to provide one, even via inheritance.
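
The effect of the first if block can be reproduced from Python. This is a minimal sketch, assuming CPython 3 (other implementations may order the calls differently) and hypothetical class names Base and Derived: the subclass does not override __eq__, yet the inherited method is invoked with the right-hand operand as self.

```python
calls = []


class Base:
    def __eq__(self, other):
        # Record which operand ended up as `self`.
        calls.append(type(self).__name__)
        return isinstance(other, Base)


class Derived(Base):
    pass  # no __eq__ override, the method is only inherited


result = (Base() == Derived())

print(result)  # True
print(calls)   # ['Derived']: the right-hand operand was asked first
```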

Why/When in Python does `x==y` call `y.__eq__(x)`?

You're missing a key exception to the usual behaviour: when the right-hand operand is an instance of a subclass of the class of the left-hand operand, the special method for the right-hand operand is called first.

See the documentation at:

http://docs.python.org/reference/datamodel.html#coercion-rules

and in particular, the following two paragraphs:

For objects x and y, first x.__op__(y) is tried. If this is not
implemented or returns NotImplemented, y.__rop__(x) is tried. If this
is also not implemented or returns NotImplemented, a TypeError
exception is raised. But see the following exception:

Exception to the previous item: if the left operand is an instance of
a built-in type or a new-style class, and the right operand is an
instance of a proper subclass of that type or class and overrides the
base’s __rop__() method, the right operand’s __rop__() method is tried
before the left operand’s __op__() method.
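
For arithmetic operators, the "overrides" condition in that exception can be observed directly. A minimal sketch, assuming CPython 3 and hypothetical class names; each method returns its own name so the dispatch order is visible:

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"


class Plain(Base):
    pass  # inherits __radd__ unchanged


class Overriding(Base):
    def __radd__(self, other):
        return "Overriding.__radd__"


# No override on the right: the left operand's __add__ runs first.
print(Base() + Plain())       # Base.__add__

# Overridden __radd__ on the right: the subclass gets priority.
print(Base() + Overriding())  # Overriding.__radd__
```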

Interaction between __hash__ and __eq__ in Python

You can easily figure this out by testing a few combinations. Consider these two types:

class AlwaysEqualConstantHash:
    def __eq__(self, other):
        print('AlwaysEqualConstantHash eq')
        return True
    def __hash__(self):
        print('AlwaysEqualConstantHash hash')
        return 4

class NeverEqualConstantHash:
    def __eq__(self, other):
        print('NeverEqualConstantHash eq')
        return False
    def __hash__(self):
        print('NeverEqualConstantHash hash')
        return 4

Now let’s put this inside a dictionary and see what happens:

>>> d = {}
>>> d[AlwaysEqualConstantHash()] = 'a'
AlwaysEqualConstantHash hash
>>> d[AlwaysEqualConstantHash()]
AlwaysEqualConstantHash hash
AlwaysEqualConstantHash eq
'a'
>>> d[AlwaysEqualConstantHash()] = 'b'
AlwaysEqualConstantHash hash
AlwaysEqualConstantHash eq
>>> d
{<__main__.AlwaysEqualConstantHash object at 0x00000083E8174A90>: 'b'}

As you can see, the hash is used all the time to address the element in the dictionary. And as soon as there is an element with the same hash inside the dictionary, the equality comparison is also made to figure out whether the existing element is equal to the new one. So since all our new AlwaysEqualConstantHash objects are equal to one another, they all can be used as the same key in the dictionary.

>>> d = {}
>>> d[NeverEqualConstantHash()] = 'a'
NeverEqualConstantHash hash
>>> d[NeverEqualConstantHash()]
NeverEqualConstantHash hash
NeverEqualConstantHash eq
Traceback (most recent call last):
  File "<pyshell#56>", line 1, in <module>
    d[NeverEqualConstantHash()]
KeyError: <__main__.NeverEqualConstantHash object at 0x00000083E8186BA8>
>>> d[NeverEqualConstantHash()] = 'b'
NeverEqualConstantHash hash
NeverEqualConstantHash eq
>>> d
{<__main__.NeverEqualConstantHash object at 0x00000083E8186F60>: 'a', <__main__.NeverEqualConstantHash object at 0x00000083E8186FD0>: 'b'}

For the NeverEqualConstantHash this is very different. The hash is also used all the time but since a new object is never equal to another, we cannot retrieve the existing objects that way.

>>> x = NeverEqualConstantHash()
>>> d[x] = 'foo'
NeverEqualConstantHash hash
NeverEqualConstantHash eq
NeverEqualConstantHash eq
>>> d[x]
NeverEqualConstantHash hash
NeverEqualConstantHash eq
NeverEqualConstantHash eq
'foo'

If we use the exact same key though, we can still retrieve the element since it won’t need to compare to itself using __eq__. We also see how the __eq__ is being called for every existing element with the same hash in order to check whether this new object is equal or not to another.

So yeah, the hash is used to quickly sort the element into the dictionary, and the hash must be equal for elements that are considered equal. __eq__ is only called when the hash matches that of an existing element, to make sure that both objects really refer to the same element.
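
The rule that equal elements must have equal hashes can be checked with a counterexample. In this sketch (Python 3, with invented class names), BrokenKey compares by name but hashes by identity, so an equal key is never found, because the dictionary only calls __eq__ on entries whose hash matches. ConsistentKey hashes the same data it compares and behaves as expected.

```python
class BrokenKey:
    """Equal by name but hashed by identity: violates the rule."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, BrokenKey):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        return id(self)


class ConsistentKey:
    """Equal by name and hashed by name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, ConsistentKey):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)


stored = BrokenKey("a")
probe = BrokenKey("a")
broken = {stored: 1}

consistent = {ConsistentKey("a"): 1}

print(stored == probe)                   # True
print(probe in broken)                   # False: hashes differ, __eq__ is never asked
print(ConsistentKey("a") in consistent)  # True
```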

Python - __eq__ method not being called

Brandon's answer is informative, but incorrect. There are actually two problems, one with the recipe relying on _CaptureEq being written as an old-style class (so it won't work properly if you try it on Python 3 with a hash-based container), and one with your own Foo.__eq__ definition claiming definitively that the two objects are not equal when it should be saying "I don't know, ask the other object if we're equal".

The recipe problem is trivial to fix: just define __hash__ on the comparison wrapper class:

class _CaptureEq:
    'Object wrapper that remembers "other" for successful equality tests.'
    def __init__(self, obj):
        self.obj = obj
        self.match = obj
    # If running on Python 3, this will be a new-style class, and
    # new-style classes must delegate hash explicitly in order to populate
    # the underlying special method slot correctly.
    # On Python 2, it will be an old-style class, so the explicit delegation
    # isn't needed (__getattr__ will cover it), but it also won't do any harm.
    def __hash__(self):
        return hash(self.obj)
    def __eq__(self, other):
        result = (self.obj == other)
        if result:
            self.match = other
        return result
    def __getattr__(self, name): # support anything else needed by __contains__
        return getattr(self.obj, name)

The problem with your own __eq__ definition is also easy to fix: return NotImplemented when appropriate so you aren't claiming to provide a definitive answer for comparisons with unknown objects:

class Foo(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
    def __key(self):
        return (self.a, self.b, self.c)
    def __eq__(self, other):
        if not isinstance(other, Foo):
            # Don't recognise "other", so let *it* decide if we're equal
            return NotImplemented
        return self.__key() == other.__key()
    def __hash__(self):
        return hash(self.__key())

With those two fixes, you will find that Raymond's get_equivalent recipe works exactly as it should:

>>> from capture_eq import *
>>> bar_1 = Bar(1,2,3,4,5)
>>> bar_2 = Bar(1,2,3,10,11)
>>> summary = set((bar_1,))
>>> assert(bar_1 == bar_2)
>>> bar_equiv = get_equivalent(summary, bar_2)
>>> bar_equiv.d
4
>>> bar_equiv.e
5

Update: Clarified that the explicit __hash__ override is only needed in order to correctly handle the Python 3 case.

What is list.__eq__(self, other) supposed to do?

The class in question seems to be a sub-class of list, with some additional attributes like name, start, etc., and by calling list.__eq__(self, other), you explicitly call the __eq__ method of list (instead of the one defined in the subclass) to compare the two objects. This will likely compare the content of the two lists after their other attributes have been checked for equality.

Usually, cls.method(obj, *args) is equivalent to obj.method(*args), if obj is an instance of cls, but in this case, just calling self.__eq__(other) (or self == other) would call the same __eq__ method again, resulting in an infinite recursion.

Since you said that you are coming from Java: provided that this class is a subclass of list, calling list.__eq__(self, other) is similar to calling super.equals(other).
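
Since the class from the question is not shown, here is a minimal sketch with a hypothetical NamedList (a list subclass carrying a single extra name attribute) to illustrate the pattern described above:

```python
class NamedList(list):
    """Hypothetical list subclass with one extra attribute."""

    def __init__(self, name, items=()):
        super().__init__(items)
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, NamedList):
            return NotImplemented
        # list.__eq__ compares the contents only. Writing self == other
        # here would call this method again and recurse forever.
        # super().__eq__(other) would be an equivalent spelling.
        return self.name == other.name and list.__eq__(self, other)


a = NamedList("a", [1, 2])
same = NamedList("a", [1, 2])
other_name = NamedList("b", [1, 2])
other_items = NamedList("a", [1, 3])

print(a == same)         # True
print(a == other_name)   # False: contents match, names differ
print(a == other_items)  # False: names match, contents differ
```

Note that list also defines its own __ne__, so a real class along these lines would probably want to override __ne__ as well.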


