How do chained comparisons in Python actually work?
You can simply let Python tell you what bytecode is produced with the dis
module:
>>> import dis
>>> def f(): return 1 < input("Value:") < 10
...
>>> dis.dis(f)
1 0 LOAD_CONST 1 (1)
3 LOAD_GLOBAL 0 (input)
6 LOAD_CONST 2 ('Value:')
9 CALL_FUNCTION 1
12 DUP_TOP
13 ROT_THREE
14 COMPARE_OP 0 (<)
17 JUMP_IF_FALSE_OR_POP 27
20 LOAD_CONST 3 (10)
23 COMPARE_OP 0 (<)
26 RETURN_VALUE
>> 27 ROT_TWO
28 POP_TOP
29 RETURN_VALUE
Python uses a stack; the CALL_FUNCTION
bytecode uses items on the stack (the input
global and the 'Value:'
string) to call a function with one argument, to replace those two items on the stack with the result of the function call. Before the function call, the the constant 1
was loaded on the stack.
So by the time input
was called the stack looks like:
input_result
1
and DUP_TOP
duplicates the top value, before rotating the top three stack values to arrive at:
1
input_result
input_result
and a COMPARE_OP
to test the top two items with <
, replacing the top two items with the result.
If that result was False
the JUMP_IF_FALSE_OR_POP
bytecode jumps over to 27, which rotates the False
on top with the remaining input_result
to clear that out with a POP_TOP
, to then return the remaining False
top value as the result.
If the result True
however, that value is popped of the stack by the JUMP_IF_FALSE_OR_POP
bytecode and in it's place the 10
value is loaded on top and we get:
10
input_result
and another comparison is made and returned instead.
In summary, essentially Python then does this:
stack_1 = stack_2 = input('Value:')
if 1 < stack_1:
result = False
else:
result = stack_2 < 10
with the stack_*
values cleared again.
The stack, then, holds the unnamed intermediate result to compare
What does it mean that Python comparison operators chain/group left to right?
Grouping (this is what non-comparison operators do):
a + b + c means (a + b) + c
Chaining (this is what comparison operators do):
a < b < c means (a < b) and (b < c)
Grouping left to right (this is the way things are grouped):
5 - 2 - 1 means (5 - 2) - 1 == 2
as opposed to grouping right to left (this would produce a different result):
5 - (2 - 1) == 4
Chaining left to right
Chaining is left to right, so in a < b < c
, the expression a < b
is evaluated before b < c
, and if a < b
is falsey, b < c
is not evaluated.
(2 < 1 < f())
gives the value False
without calling the function f
, because 2 < 1
evaluates to false, so the second comparison does not need to be performed.
f() > 1 > g()
calls f()
in order to evaluate the first comparison, and depending on the result, it might or might not need to evaluate the second condition, which requires calling g()
.
NB. Each operand is evaluated at most once. So in the expression 1 < f() < 2
, the function f()
is only called once, and the value it gives is used in both comparisons (if necessary).
https://en.wikipedia.org/wiki/Short-circuit_evaluation
Precedence of chained comparisons?
Python has an Abstract Syntax Tree module to show you what's happening:
import ast
t = ast.parse('not a < b < c')
print(ast.dump(t))
It gives (cleaned up a bit):
[Expr(value=UnaryOp(
op=Not(),
operand=Compare(
left=Name(id='a'),
ops=[Lt(), Lt()],
comparators=[Name(id='b'), Name(id='c')]
)
))]
And indeed, the documentation says that not
has lower precedence than <
.
Why are chained operator expressions slower than their expanded equivalent?
In CPython's stack-based bytecode execution engine, saving an extra reference to b
for the chained comparison isn't free. It's at the "seriously, don't worry about it" level of cheap, but it's not literally free, and you're comparing it to the slightly cheaper operation of loading a local variable.
The COMPARE_OP
opcode removes the objects it's comparing from the stack, so for the chained comparison, Python has to create another reference to b
(DUP_TOP
) and shove it two places down in the stack (ROT_THREE
) to get it out of the way.
In a <= b and b <= c
, instead of the above reference shuffling, Python just copies another reference to b
out of the stack frame's fastlocals
array. This involves less pointer shuffling and one less trip around the bytecode evaluation loop, so it's slightly cheaper.
Can you simplify chained comparisons with not equal or equal in Python?
this is functionnaly equivalent, but when this:
10 < x < 40
is nice to read, mixing different operators to use chained comparisons isn't the best choice.
Is that really the same? let's disassemble to find out:
def f1(x,y):
if y > x and x != -1:
return 0
def f2(x,y):
if y > x != -1:
return 0
import dis
print("function 1")
dis.dis(f1)
print("function 2")
dis.dis(f2)
result:
function 1
2 0 LOAD_FAST 1 (y)
3 LOAD_FAST 0 (x)
6 COMPARE_OP 4 (>)
9 POP_JUMP_IF_FALSE 28
12 LOAD_FAST 0 (x)
15 LOAD_CONST 3 (-1)
18 COMPARE_OP 3 (!=)
21 POP_JUMP_IF_FALSE 28
3 24 LOAD_CONST 2 (0)
27 RETURN_VALUE
>> 28 LOAD_CONST 0 (None)
31 RETURN_VALUE
function 2
6 0 LOAD_FAST 1 (y)
3 LOAD_FAST 0 (x)
6 DUP_TOP
7 ROT_THREE
8 COMPARE_OP 4 (>)
11 JUMP_IF_FALSE_OR_POP 23
14 LOAD_CONST 3 (-1)
17 COMPARE_OP 3 (!=)
20 JUMP_FORWARD 2 (to 25)
>> 23 ROT_TWO
24 POP_TOP
>> 25 POP_JUMP_IF_FALSE 32
7 28 LOAD_CONST 2 (0)
31 RETURN_VALUE
>> 32 LOAD_CONST 0 (None)
35 RETURN_VALUE
>>>
Surprisingly they aren't the same, and the chained version has more instructions.
Not sure what's going on here (some took some more time to explain it better: How do chained comparisons in Python actually work?), but really I'd stick to the and
version which shortcuts and is so much readable (think of future maintainers too...).
That said, one interesting thing about chained comparisons would be if the central argument is computed/takes a long time to compute/has a side effect in the computation and you don't want to store it in a variable:
if y > super_long_computation(x) != -1:
In that case, the central argument is only evaluated once. In the case of and
, you'd have to store it beforehand.
Simplify Chained Comparison
In Python you can "chain" comparison operations which just means they are "and"ed together. In your case, it'd be like this:
if start <= x <= end:
Reference: https://docs.python.org/3/reference/expressions.html#comparisons
Custom chained comparisons
Python allows expressions like
x > y > z
, which, according to the docs, is equivalent to(x > y) and (y > z)
excepty
is only evaluated once.
According to this, low > high > low
will be equivalent to (low > high) and (high > low)
.
>>> x = low > high # CompareList([False])
>>> y = high > low # CompareList([True])
>>> x and y
CompareList([True])
More from the documentation on x and y:
x and y
: ifx
is false, thenx
, elsey
In the above case:
>>> x is False
False
>>> x if x is False else y # x and y
CompareList([True])
so when you do x and y
it returns the y
which is CompareList([True])
.
How does three operands comparison work in Python under the hood?
The comparison grammar is not that interesting here, it simply lets you append multiple comparators to an operator:
comparison ::= or_expr ( comp_operator or_expr )*
comp_operator ::= "<" | ">" | "==" | ">=" | "<=" | "!="
| "is" ["not"] | ["not"] "in"
So lets ask the Python parser directly, by using the ast
module (which simply asks the Python compiler itself to only return the Abstract Syntax Tree):
>>> import ast
>>> ast.dump(ast.parse('a > b > c', mode='eval'))
"Expression(body=Compare(left=Name(id='a', ctx=Load()), ops=[Gt(), Gt()], comparators=[Name(id='b', ctx=Load()), Name(id='c', ctx=Load())]))"
So there is just a single Compare
node, with multiple operators and comparators:
Compare(
left=Name(id='a'),
ops=[Gt(), Gt()],
comparators=[Name(id='b'), Name(id='c')])
(I omitted the Expression
and ctx
parts).
This lets the interpreter evaluate comparators as needed (e.g. if a < b
is false, the remaining comparators don't have to be considered).
The resulting bytecode uses conditional jumps to skip remaining comparisons:
>>> import dis
>>> dis.dis(compile('a > b > c', '', 'eval'))
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 DUP_TOP
6 ROT_THREE
8 COMPARE_OP 4 (>)
10 JUMP_IF_FALSE_OR_POP 18
12 LOAD_NAME 2 (c)
14 COMPARE_OP 4 (>)
16 RETURN_VALUE
>> 18 ROT_TWO
20 POP_TOP
22 RETURN_VALUE
Why 5 = -15 == 5 = 1 != 20 is False?
Python's operator precedence is documented here. Contrary to what you claim, all of those comparison operators have the same precedence.
Normally, one looks to associativity to handle ties in precedence. But the comparison operators are special. Python supports chaining. This means that
x relop1 y relop2 z
is short for
x relop1 y and y relop2 z
(Except y
is only evaluated once.)
So,
5 <= -15 == 5 >= 1 != 20
= ( 5 <= -15 ) and ( -15 == 5 ) and ( 5 >= 1 ) and ( 1 != 20 )
= False and False and True and True
= False
Related Topics
Restricting the Value in Tkinter Entry Widget
Numpy.Where() Detailed, Step-By-Step Explanation/Examples
Python Daemon and Systemd Service
How to Remove Leading and Trailing Zeros in a String? Python
How to "Zip Sort" Parallel Numpy Arrays
How to Print a Dictionary's Key
Cannot Display an Image in Tkinter
How to Tell a Python Script to Use a Particular Version
Python Integer Incrementing with ++
Differencebetween a Pandas Series and a Single-Column Dataframe
How to Generate Random Numbers That Are Different