Python: Why Does ("Hello" Is "Hello") Evaluate as True

Python: Why does ("hello" is "hello") evaluate as True?

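CPython (like Java, .NET, and many C and C++ compilers) uses string pooling / interning for identical string literals. The compiler realises that "hello" has the same value as "hello", so it stores a single object and both literals refer to the same location in memory. This is an optimization of the implementation, not something the language guarantees.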

Another goodie: "hell" + "o" is "hello" ==> True, because CPython folds the constant expression "hell" + "o" into "hello" at compile time.
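A minimal sketch of the difference between equal values and pooled objects, assuming CPython; the variable names are illustrative only. Strings built at run time are not pooled automatically, so the sketch relies on == for values and only trusts identity after an explicit sys.intern call.

```python
import sys

# Build "hello" at run time so the compiler cannot fold it into a literal.
parts = ["hel", "lo"]
built = "".join(parts)
literal = "hello"

# Equality compares values and is always reliable.
print(built == literal)  # True

# sys.intern returns the single pooled copy of a string value, so two
# interned strings with equal values are guaranteed to be the same object.
print(sys.intern(built) is sys.intern(literal))  # True
```

The practical rule is the same either way: compare strings with ==, and treat any True from is as an accident of the implementation.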

Why does a == x or y or z always evaluate to True? How can I compare a to all of those?

In many cases, Python looks and behaves like natural English, but this is one case where that abstraction fails. People can use context clues to determine that "Jon" and "Inbar" are objects joined to the verb "equals", but the Python interpreter is more literal minded.

if name == "Kevin" or "Jon" or "Inbar":

is logically equivalent to:

if (name == "Kevin") or ("Jon") or ("Inbar"):

Which, for user Bob, is equivalent to:

if (False) or ("Jon") or ("Inbar"):

The or operator chooses the first operand that is "truthy", i.e. which would satisfy an if condition (or the last one, if none of them are "truthy"):

if "Jon":

Since "Jon" is truthy, the if block executes. That is what causes "Access granted" to be printed regardless of the name given.

All of this reasoning also applies to the expression if "Kevin" or "Jon" or "Inbar" == name. The first value, "Kevin", is truthy, so the if block executes.
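The reasoning above can be checked directly by looking at the value the expression produces rather than only at which branch runs. This is a small sketch; the variable names result and fallback are illustrative.

```python
name = "Bob"

# == binds tighter than `or`, so this is (name == "Kevin") or "Jon" or "Inbar".
result = name == "Kevin" or "Jon" or "Inbar"
print(repr(result))  # 'Jon'

# When no operand is truthy, `or` returns the last operand.
fallback = 0 or "" or None
print(repr(fallback))  # None
```

Note that the result is the string "Jon", not True: or returns one of its operands unchanged.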


There are two common ways to properly construct this conditional.

  1. Use multiple == operators to explicitly check against each value:

    if name == "Kevin" or name == "Jon" or name == "Inbar":

  2. Compose a collection of valid values (a set, a list or a tuple for example), and use the in operator to test for membership:

    if name in {"Kevin", "Jon", "Inbar"}:

In general, the second of the two should be preferred, as it is easier to read and also faster:

>>> import timeit
>>> timeit.timeit('name == "Kevin" or name == "Jon" or name == "Inbar"',
...               setup="name='Inbar'")
0.4247764749999945
>>> timeit.timeit('name in {"Kevin", "Jon", "Inbar"}', setup="name='Inbar'")
0.18493307199999265
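In a real program the membership test is often wrapped in a small helper with the valid values kept in one place. The sketch below shows one way to do that; ALLOWED_NAMES and is_allowed are hypothetical names chosen for this illustration.

```python
# Hypothetical names, used only for this illustration.
ALLOWED_NAMES = frozenset({"Kevin", "Jon", "Inbar"})


def is_allowed(name):
    """Return True if name is one of the allowed names."""
    return name in ALLOWED_NAMES


print(is_allowed("Inbar"))  # True
print(is_allowed("Bob"))    # False
```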

For those who want proof that if a == b or c or d or e: ... is indeed parsed like this, the built-in ast module provides an answer:

>>> import ast
>>> ast.parse("a == b or c or d or e", "<string>", "eval")
<ast.Expression object at 0x7f929c898220>
>>> print(ast.dump(_, indent=4))
Expression(
    body=BoolOp(
        op=Or(),
        values=[
            Compare(
                left=Name(id='a', ctx=Load()),
                ops=[
                    Eq()],
                comparators=[
                    Name(id='b', ctx=Load())]),
            Name(id='c', ctx=Load()),
            Name(id='d', ctx=Load()),
            Name(id='e', ctx=Load())]))

As one can see, it's the boolean operator or applied to four sub-expressions: comparison a == b; and simple expressions c, d, and e.
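The same check can be done programmatically instead of reading the dump by eye. This is a short sketch using only the standard ast module; the variable names are illustrative.

```python
import ast

tree = ast.parse("a == b or c or d or e", mode="eval")
bool_op = tree.body

# The top-level node is a single `or` with four operands.
print(type(bool_op).__name__, type(bool_op.op).__name__)  # BoolOp Or

# Only the first operand is a comparison; the rest are bare names.
kinds = [type(value).__name__ for value in bool_op.values]
print(kinds)  # ['Compare', 'Name', 'Name', 'Name']
```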

Why does "hello" evaluate as true in a boolean condition?

C doesn't really have boolean (true or false) values (C99 does, but my notes below still apply).

What C interprets as false is anything that compares equal to 0; everything else is true.

So:

if (0) {} else {printf("0 is false\n");}
if (NULL) {} else {printf("NULL is false\n");}
if (0.0) {} else {printf("0.0 is false\n");}

A string literal is interpreted as a pointer, and it points to real characters (it is not a null pointer), so it is true:

if (1) {printf("1 is true\n");} else {}
if (-1) {printf("-1 is true\n");} else {}
if ("hello") {printf("\"hello\" is true\n");} else {}
if (3.14159) {printf("3.14159 is true\n");} else {}

Interestingly, an empty string, the string "0" and the character '0' are all true:

if ("") {printf("\"\" is true\n");} else {}
if ("0") {printf("\"0\" is true\n");} else {}
if ('0') {printf("'0' is true\n");} else {}

The NUL character (not NULL which is a pointer) has int value 0 and is false

if ('\0') {} else {printf("'\\0' is false\n");}

What happens when you have a real boolean construct is that the compiler emits code to convert that to 0 or 1

if (a > b) /* whatever */;
// if a is greater than b, the compiler generated code will be something like
if (1) /* whatever */;
// otherwise, if a <= b, the generated code would look like
if (0) /* whatever */;
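For comparison with the C rules above, here is a short Python sketch that runs roughly the same values through bool(). The list of values is chosen for illustration. The main difference is that Python tests the value of a string rather than a pointer, so an empty string is false, while any non-empty string (including "0" and a string holding only the NUL character) is true.

```python
values = [0, 0.0, None, "", "0", "\0", -1, 3.14159, "hello"]

# Pair each value with its truth value as seen by `if`.
results = [(value, bool(value)) for value in values]

for value, truth in results:
    print(repr(value), "is", "true" if truth else "false")
```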

Is there a difference between == and is?

is returns True if two variables point to the same object (in memory); == returns True if the objects referred to by the variables are equal.

>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
>>> b == a
True

# Make a new copy of list `a` via the slice operator,
# and assign it to variable `b`
>>> b = a[:]
>>> b is a
False
>>> b == a
True
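One practical consequence of identity, sketched below with illustrative variable names: a change made through one name is visible through every other name bound to the same object, but not through a copy.

```python
a = [1, 2, 3]
alias = a      # a second name for the same list object
copy = a[:]    # a new list with equal contents

alias.append(4)  # mutates the one object shared by `a` and `alias`

print(a)           # [1, 2, 3, 4]
print(alias is a)  # True
print(copy)        # [1, 2, 3]
print(copy == a)   # False
```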

In your case, the second test only works because Python caches small integer objects, which is an implementation detail. For larger integers, this does not work:

>>> 1000 is 10**3
False
>>> 1000 == 10**3
True

The same holds true for string literals:

>>> "a" is "a"
True
>>> "aa" is "a" * 2
True
>>> x = "a"
>>> "aa" is x * 2
False
>>> import sys  # in Python 3, intern() lives in the sys module
>>> "aa" is sys.intern(x * 2)
True


Logical Expressions: Why does str1 in str2 or str2 not in str1 return True for my print statement?

Why does your function return True?

The if statement if string1 in string2 or string2 not in string1 is made of 3 parts:

  1. string1 in string2
  2. or
  3. string2 not in string1

And you have:

string1 = 'hello world'
string2 = 'world hello'

  • Part 1 (string1 in string2):

    It evaluates to False, because 'hello world' isn't in 'world hello'

  • Part 3 (string2 not in string1):

    It evaluates to True, because 'world hello' is indeed not present in 'hello world'

  • Part 2 (or):

    The or will give you:

    • True if at least one of the expressions evaluates to True
    • False if all the expressions evaluate to False

So you get True

But if you had used and, you would have gotten False
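The three parts can be evaluated one at a time to confirm this. The sketch below uses the two strings from the question; part1 and part3 are illustrative names.

```python
string1 = 'hello world'
string2 = 'world hello'

part1 = string1 in string2       # is 'hello world' inside 'world hello'?
part3 = string2 not in string1   # is 'world hello' absent from 'hello world'?

print(part1)            # False
print(part3)            # True
print(part1 or part3)   # True
print(part1 and part3)  # False
```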

If you are ever in doubt, try a few prints like these:

# or:
print(True or True) # True
print(True or False) # True
print(False or False) # False

# and:
print(True and True) # True
print(True and False) # False
print(False and False) # False

Answering your comment:

No, 'hello world' isn't in 'world hello'.
So, what is in 'world hello'?

  • 'world', 'hello', ' ' (space) and '' (the empty string).
  • And all other possible substrings (single characters and runs of consecutive characters in the source
    string, so for example 'h', 'he', 'hel', 'ld h', etc.). And note that 'world', 'hello', ' ' and '' are in fact substrings too :-)

So, all of these evaluate to True:

# string2 = 'world hello'
'world' in string2
'hello' in string2
' ' in string2
'' in string2
'h' in string2
'e' in string2
'llo' in string2
'ld h' in string2
# etc.

In computer science, a string is a sequence of characters.
Each contiguous sub-sequence is a substring.

So now you should have a better understanding of what a string and a substring are, and you can search for more information online if you're interested.

So, what does the in expression do?
When working with strings, the in expression tells you whether the string on its left is a substring of the string on its right.

To conclude, the sequence of characters 'hello world' is not in the sequence of characters 'world hello'.
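To make the idea of a substring concrete, the sketch below enumerates every contiguous slice of 'world hello' and checks each one with in. The helper name substrings is hypothetical, written for this illustration.

```python
def substrings(text):
    """Return every contiguous slice of text, including the empty string."""
    n = len(text)
    return {text[i:j] for i in range(n + 1) for j in range(i, n + 1)}


string1 = 'hello world'
string2 = 'world hello'
subs = substrings(string2)

# Every contiguous slice of string2 passes the `in` test.
print(all(sub in string2 for sub in subs))  # True

# The empty string and 'ld h' are among them; 'hello world' is not.
print('' in subs, 'ld h' in subs)  # True True
print(string1 in subs)             # False
```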

'or' and 'and' evaluation in Python 3.x

print returns None (which is falsy), so Python has to evaluate the second operand of or, but not of and, to get the answer.

  • return print('Hello') or print('Hello again')
  • Hello is printed, print returns None
  • return None or print('Hello again')
  • or returns its first truthy operand, or its last operand if none is truthy. If the first one is truthy, there's no need to evaluate the second one. This isn't the case here
  • Hello again is printed
  • return None or None
  • both operands are falsy, so or returns the last one and the function returns None.

 

  • return print('Hello') and print('Hello again')
  • Hello is printed, print returns None
  • return None and print('Hello again')
  • and returns its first falsy operand, or its last operand if all are truthy. The first one is falsy, so there's no need to evaluate the second one.
  • return None
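The two walkthroughs above can be reproduced without relying on console output. In this sketch, say is a hypothetical stand-in for print that records each call and returns None, so both the result and the order of evaluation can be inspected.

```python
calls = []


def say(message):
    """Stand-in for print: record the message and return None."""
    calls.append(message)
    return None


# `or`: the first operand is falsy, so the second is evaluated too.
result_or = say("Hello") or say("Hello again")
calls_or = list(calls)

# `and`: the first operand is falsy, so the second is never evaluated.
calls.clear()
result_and = say("Hello") and say("Hello again")
calls_and = list(calls)

print(result_or, calls_or)    # None ['Hello', 'Hello again']
print(result_and, calls_and)  # None ['Hello']
```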

Why does Python evaluate strings/numbers as True in if statements yet myNumber == True returns False?

You're testing different things here.

The if just checks whether the bool of the expression (see also "Truth value testing") is True, not whether the value itself is equal or identical to True.

So what is actually tested by the if is:

>>> bool(5) == True
True
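A short sketch that puts the two tests side by side; my_number and branch_taken are illustrative names.

```python
my_number = 5

# Comparing the value itself: 5 is not equal to True (which equals 1).
print(my_number == True)        # False

# What `if` effectively tests: the truth value of the expression.
print(bool(my_number) == True)  # True

branch_taken = False
if my_number:
    branch_taken = True
print(branch_taken)             # True
```

In everyday code, write if my_number: and avoid comparing against True altogether.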

Is False == 0 and True == 1 an implementation detail or is it guaranteed by the language?

In Python 2.x this is not guaranteed as it is possible for True and False to be reassigned. However, even if this happens, boolean True and boolean False are still properly returned for comparisons.

In Python 3.x True and False are keywords and will always be equal to 1 and 0.

Under normal circumstances in Python 2, and always in Python 3:

The False object is of type bool, which is a subclass of int:

    object
       |
      int
       |
      bool

That is the only reason why ['zero', 'one'][False] works in your example. It would not work with an object which is not a subclass of int, because list indexing only works with integers, or with objects that define an __index__ method (thanks mark-dickinson).
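These points can be verified in a few lines. The sketch below assumes Python 3; Position is a hypothetical class, written only to show that __index__ is what list indexing looks for.

```python
# bool is a subclass of int, so booleans behave like 0 and 1.
is_subclass = issubclass(bool, int)
total = True + True
picked = ['zero', 'one'][False]
print(is_subclass, total, picked)  # True 2 zero


class Position:
    """Hypothetical non-int class that supports indexing via __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


via_index = ['zero', 'one'][Position(1)]
print(via_index)  # one

# A float is neither an int nor an object with __index__, so it is rejected.
try:
    ['zero', 'one'][1.0]
    float_rejected = False
except TypeError:
    float_rejected = True
print(float_rejected)  # True
```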

Edit:

It is true of the current Python version, and of Python 3. The docs for Python 2 and the docs for Python 3 both say:

There are two types of integers: [...] Integers (int) [...] Booleans (bool)

and in the boolean subsection:

Booleans: These represent the truth values False and True [...] Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings "False" or "True" are returned, respectively.

There is also, for Python 2:

In numeric contexts (for example when used as the argument to an arithmetic operator), they [False and True] behave like the integers 0 and 1, respectively.

So booleans are explicitly considered as integers in Python 2 and 3.

So you're safe until Python 4 comes along. ;-)


