Is It Better to Use "Is" or "==" for Number Comparison in Python

Is it better to use is or == for number comparison in Python?

Use ==.

Sometimes, on some python implementations, by coincidence, integers from -5 to 256 will work with is (in CPython implementations for instance). But don't rely on this or use it in real programs.

Is there a difference between == and is?

is will return True if two variables point to the same object (in memory), == if the objects referred to by the variables are equal.

>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True

# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True

In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:

>>> 1000 is 10**3
False
>>> 1000 == 10**3
True

The same holds true for string literals:

>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> "aa" is intern(x*2)
True

Please see this question as well.

compare a number to one of more numbers using OR in python

| is bitwise or operator in python. The logical OR is done using built-in or in python.

s = '936100'

if (int(s[0:4]) == 9361) or (int(s[0:4]) == 9363):
print('true')
else:
print('false')

# true

9361 | 9363 = 9363

Python None comparison: should I use is or ==?

Summary:

Use is when you want to check against an object's identity (e.g. checking to see if var is None). Use == when you want to check equality (e.g. Is var equal to 3?).

Explanation:

You can have custom classes where my_var == None will return True

e.g:

class Negator(object):
def __eq__(self,other):
return not other

thing = Negator()
print thing == None #True
print thing is None #False

is checks for object identity. There is only 1 object None, so when you do my_var is None, you're checking whether they actually are the same object (not just equivalent objects)

In other words, == is a check for equivalence (which is defined from object to object) whereas is checks for object identity:

lst = [1,2,3]
lst == lst[:] # This is True since the lists are "equivalent"
lst is lst[:] # This is False since they're actually different objects

Why does comparing number as a string in Python doesn't work as intended?

In Python, the > and < operators are implemented using a class's underlying __cmp__ magic function, the logic of which differs by class.

The logic for numbers is to compare the signed value, e.g. 12.6 >10 is true, and -4 > -33.02 is true, in the normal mathematical sense.

The logic for strings is to compare each character, left to right, based on the ordinal value of each character, e.g. 'apple' > 'banana' is false because ord('a') is 97 and ord('b') is 98, so 97 > 98 is false. Likewise '8 > '15' is true because ord('8') is 56, and and ord('1') is 49, so 56 > 49 is true.

Python doesn't convert strings-of-digits to numbers before comparing.

Why is it faster to compare strings that match than strings that do not?

Combining my comment and the comment by @khelwood:

TL;DR:

When analysing the bytecode for the two comparisons, it reveals the 'time' and 'time' strings are assigned to the same object. Therefore, an up-front identity check (at C-level) is the reason for the increased comparison speed.

The reason for the same object assignment is that, as an implementation detail, CPython interns strings which contain only 'name characters' (i.e. alpha and underscore characters). This enables the object's identity check.


Bytecode:

import dis

In [24]: dis.dis("'time'=='time'")
1 0 LOAD_CONST 0 ('time') # <-- same object (0)
2 LOAD_CONST 0 ('time') # <-- same object (0)
4 COMPARE_OP 2 (==)
6 RETURN_VALUE

In [25]: dis.dis("'time'=='1234'")
1 0 LOAD_CONST 0 ('time') # <-- different object (0)
2 LOAD_CONST 1 ('1234') # <-- different object (1)
4 COMPARE_OP 2 (==)
6 RETURN_VALUE

Assignment Timing:

The 'speed-up' can also be seen in using assignment for the time tests. The assignment (and compare) of two variables to the same string, is faster than the assignment (and compare) of two variables to different strings. Further supporting the hypothesis the underlying logic is performing an object comparison. This is confirmed in the next section.

In [26]: timeit.timeit("x='time'; y='time'; x==y", number=1000000)
Out[26]: 0.0745926329982467

In [27]: timeit.timeit("x='time'; y='1234'; x==y", number=1000000)
Out[27]: 0.10328884399496019

Python source code:

As helpfully provided by @mkrieger1 and @Masklinn in their comments, the source code for unicodeobject.c performs a pointer comparison first and if True, returns immediately.

int
_PyUnicode_Equal(PyObject *str1, PyObject *str2)
{
assert(PyUnicode_CheckExact(str1));
assert(PyUnicode_CheckExact(str2));
if (str1 == str2) { // <-- Here
return 1;
}
if (PyUnicode_READY(str1) || PyUnicode_READY(str2)) {
return -1;
}
return unicode_compare_eq(str1, str2);
}

Appendix:

  • Reference answer nicely illustrating how to read the disassembled bytecode output. Courtesy of @Delgan
  • Reference answer which nicely describes CPython's string interning. Coutresy of @ShadowRanger

Why does comparing strings using either '==' or 'is' sometimes produce a different result?

is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:

>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False

So, no wonder they're not the same, right?

In other words: a is b is the equivalent of id(a) == id(b)



Related Topics



Leave a reply



Submit