Why Is True Returned When Checking If an Empty String Is in Another

Why is True returned when checking if an empty string is in another?

From the documentation:

For the Unicode and string types, x in y is true if and only if x is a substring of y. An equivalent test is y.find(x) != -1. Note, x and y need not be the same type; consequently, u'ab' in 'abc' will return True. Empty strings are always considered to be a substring of any other string, so "" in "abc" will return True.

From looking at your print call, you're using 2.x.
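The quoted rule is easy to check. A minimal sketch, written so that it also runs on Python 3 (the helper name `is_substring` is made up for illustration):

```python
def is_substring(x, y):
    """Mirror the documented equivalence: x in y  <=>  y.find(x) != -1."""
    return y.find(x) != -1

samples = [("", "abc"), ("ab", "abc"), ("abd", "abc"), ("", "")]
for x, y in samples:
    # Both tests should agree for every pair, including the empty string.
    assert (x in y) == is_substring(x, y)

print("" in "abc")      # True
print("abc".find(""))   # 0
print("" in "")         # True
```

The empty string is found at index 0, so `find` never returns -1 for it.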

To go deeper, look at the bytecode:

>>> import dis
>>> def answer():
...     '' in 'lolsome'
...
>>> dis.dis(answer)
  2           0 LOAD_CONST               1 ('')
              3 LOAD_CONST               2 ('lolsome')
              6 COMPARE_OP               6 (in)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

COMPARE_OP is where the boolean operation takes place. Looking at the interpreter source (ceval.c) for this opcode shows where the comparison happens:

    TARGET(COMPARE_OP)
    {
        w = POP();
        v = TOP();
        if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
            /* INLINE: cmp(int, int) */
            register long a, b;
            register int res;
            a = PyInt_AS_LONG(v);
            b = PyInt_AS_LONG(w);
            switch (oparg) {
            case PyCmp_LT: res = a <  b; break;
            case PyCmp_LE: res = a <= b; break;
            case PyCmp_EQ: res = a == b; break;
            case PyCmp_NE: res = a != b; break;
            case PyCmp_GT: res = a >  b; break;
            case PyCmp_GE: res = a >= b; break;
            case PyCmp_IS: res = v == w; break;
            case PyCmp_IS_NOT: res = v != w; break;
            default: goto slow_compare;
            }
            x = res ? Py_True : Py_False;
            Py_INCREF(x);
        }
        else {
          slow_compare:
            x = cmp_outcome(oparg, v, w);
        }
        Py_DECREF(v);
        Py_DECREF(w);
        SET_TOP(x);
        if (x == NULL) break;
        PREDICT(POP_JUMP_IF_FALSE);
        PREDICT(POP_JUMP_IF_TRUE);
        DISPATCH();
    }

cmp_outcome is defined in the same file, where the next clue is easy to find:

res = PySequence_Contains(w, v);

which is in abstract.c:

int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    if (PyType_HasFeature(seq->ob_type, Py_TPFLAGS_HAVE_SEQUENCE_IN)) {
        PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
        if (sqm != NULL && sqm->sq_contains != NULL)
            return (*sqm->sq_contains)(seq, ob);
    }
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}

and to come up for air from the source, we find this next function in the documentation:

objobjproc PySequenceMethods.sq_contains

This function may be used by PySequence_Contains() and has the same signature. This slot may be left to NULL, in this case PySequence_Contains() simply traverses the sequence until it finds a match.

and further down in the same documentation:

int PySequence_Contains(PyObject *o, PyObject *value)

Determine if o contains value. If an item in o is equal to value, return 1, otherwise return 0. On error, return -1. This is equivalent to the Python expression value in o.

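Since '' is a valid (non-NULL) string object, the test is handed to the string type's sq_contains slot, which performs a substring search. An empty substring matches at index 0, so the sequence 'lolsome' is considered to contain it.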
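The two paths through PySequence_Contains can be observed from Python. A rough sketch, assuming CPython 3 and two hypothetical classes: one defines `__contains__` (which fills the `sq_contains` slot), the other only iterates, so the interpreter falls back to a linear search:

```python
class WithContains:
    """Defines __contains__, so `in` calls it directly."""
    def __init__(self, text):
        self.text = text
        self.iterated = False

    def __contains__(self, item):
        return item in self.text  # delegate to str's substring search

    def __iter__(self):
        self.iterated = True
        return iter(self.text)


class IterOnly:
    """No __contains__: `in` walks the items and compares each with ==."""
    def __init__(self, text):
        self.text = text

    def __iter__(self):
        return iter(self.text)


a = WithContains("lolsome")
b = IterOnly("lolsome")

print("" in a)      # True: substring semantics via __contains__
print(a.iterated)   # False: __iter__ was never needed
print("" in b)      # False: no single character equals ''
print("l" in b)     # True: found by iteration
```

The fallback path compares element by element, which is why only the string type's own containment check treats the empty string as present.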

How does checking if the empty string is in a word evaluate to true in python?

Because the in operator tests membership, and by that notion the empty set is a subset of every other set. For strings, membership means "is a substring of", so the empty string is contained in every string; the in operator was deliberately designed to be consistent in that regard.

5.9 Comparisons

The operators in and not in test for collection membership. x in s evaluates to true if x is a member of the collection s, and false otherwise. x not in s returns the negation of x in s. The collection membership test has traditionally been bound to sequences; an object is a member of a collection if the collection is a sequence and contains an element equal to that object. However, it make sense for many other object types to support membership tests without being a sequence. In particular, dictionaries (for keys) and sets support membership testing.

For the list and tuple types, x in y is true if and only if there exists an index i such that x == y[i] is true.

For the Unicode and string types, x in y is true if and only if x is a substring of y. An equivalent test is y.find(x) != -1. Note, x and y need not be the same type; consequently, u'ab' in 'abc' will return True. Empty strings are always considered to be a substring of any other string, so "" in "abc" will return True.
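The difference between the list rule and the string rule quoted above shows up directly with the empty string. A short sketch in Python 3 syntax (the variable names are arbitrary):

```python
letters = ["a", "b", "c"]
word = "abc"

# Lists and tuples: element equality, so '' is not a member.
print("" in letters)              # False
# Strings: substring test, so '' always matches.
print("" in word)                 # True
# Multi-character substrings work for strings but not for a list of characters.
print("ab" in word)               # True
print("ab" in letters)            # False
# The set analogy: the empty set is a subset of every set...
print(set() <= {"a", "b"})        # True
# ...but it is not an element of that set.
print(frozenset() in {"a", "b"})  # False
```

So the string behaviour lines up with the subset relation rather than with element membership.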

Why does finding empty string return 0?

As the documentation for string.find(s, sub[, start[, end]]) says:

Return the lowest index in s where the substring sub is found such that sub is wholly contained in s[start:end].

Conceptually, the interpreter searches for your substring starting at the start index of the string and moving toward the end index.

At each index it compares your substring against a slice of the same length:

your_string[i:i+len(substring)]
# where `i` is the index

In your case the substring has length 0, so the very first slice, at index 0, is the empty string and matches. For example:

>>> your_string = 'abcd'
>>> your_string[0:0]
''

Hence, when you call str.find with an empty string, the result is 0:

>>> your_string.find('')
0

# Because your_string[0:0] == ''
#                     ^ this being the index returned
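The scan described above, as a simplified sketch. `naive_find` is a hypothetical helper, not how CPython actually implements `str.find` (the real implementation uses a faster search and also handles negative indices), but it gives the same results for these inputs:

```python
def naive_find(s, sub, start=0, end=None):
    """Return the lowest index i >= start where s[i:i+len(sub)] == sub, else -1."""
    if end is None:
        end = len(s)
    # The last candidate index must still leave room for all of `sub`.
    for i in range(start, end - len(sub) + 1):
        if s[i:i + len(sub)] == sub:
            return i
    return -1

print(naive_find("abcd", "cd"))   # 2
print(naive_find("abcd", ""))     # 0, because "abcd"[0:0] == ""
print(naive_find("abcd", "", 2))  # 2, the first index at or after start
print(naive_find("abcd", "", 5))  # -1, start is past the end of the string
```

With a zero-length substring the first slice tried always matches, so the result is simply the start index, as long as that index lies within the string.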

not in identity operator not working when checking empty string for certain characters

An empty string is present in any string. Therefore your condition, difficulty not in 'EMH' will evaluate to False when difficulty equals ''; so the while loop's body won't be executed.

In [24]: '' not in 'EMH'
Out[24]: False

In [33]: '' in 'EMH'
Out[33]: True

A better approach might be to convert the string 'EMH' to a list via list('EMH'), so that input such as 'EM' or 'EH', or an empty string, is not accepted as valid and can neither end the loop early nor prevent it from starting in the first place.

Also, as @Blckknght suggested, a better alternative is to use a default value of None for difficulty.

In [3]: difficulty = None

In [4]: while difficulty not in list('EMH'):
   ...:     print('Enter difficulty: E - Easy, M - Medium, H - Hard')
   ...:     difficulty = input().upper()
   ...:
Enter difficulty: E - Easy, M - Medium, H - Hard
A
Enter difficulty: E - Easy, M - Medium, H - Hard
B
Enter difficulty: E - Easy, M - Medium, H - Hard
C
Enter difficulty: E - Easy, M - Medium, H - Hard
EM
Enter difficulty: E - Easy, M - Medium, H - Hard
E

In [5]:
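To compare the two tests without typing at a prompt, the check can be pulled into a small function. A sketch, where `is_valid_difficulty` and `VALID` are hypothetical names and the tuple stands in for list('EMH'):

```python
VALID = ("E", "M", "H")  # same items as list('EMH')

def is_valid_difficulty(value):
    """True only when value is exactly one of the allowed letters."""
    return value in VALID

for candidate in ["", "EM", "A", "E"]:
    # Substring test against 'EMH' versus element test against VALID.
    print(repr(candidate), candidate in "EMH", is_valid_difficulty(candidate))
# '' True False
# 'EM' True False
# 'A' False False
# 'E' True True

print(is_valid_difficulty(None))  # False, so None works as a starting value
```

Only the element test rejects both the empty string and multi-letter input such as 'EM'.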

Why a function checking if a string is empty always returns true?

Simple problem actually. Change:

if (strTemp != '')

to

if ($strTemp != '')

Arguably you may also want to change it to:

if ($strTemp !== '')

since != '' returns false (that is, treats the value as empty) if you pass null, false, or, before PHP 8, numeric 0, because of PHP's automatic type conversion.

You should not use the built-in empty() function for this; see comments and the PHP type comparison tables.

Why does contains() method find empty string in non-empty string in Java

An empty string occurs in every string. Specifically, a contiguous subset of the string must match the empty string. Any empty subset is contiguous and any string has an empty string as such a subset.

Returns true if and only if this string contains the specified sequence of char values.

An empty set of char values exists in any string, at the beginning, end, and between characters. Out of anything, you can extract nothing. From a physical piece of string/yarn I can say that a zero-length portion exists within it.

If contains returns true, there is some substring() invocation that yields the string being searched for. For the empty string, "aaa".substring(1, 1) returns "".
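The same picture, sketched in Python to match the earlier examples on this page rather than in Java: every index from 0 to the length of the string is a position where the empty string matches.

```python
import re

text = "aaa"
# Collect the start index of every zero-length match.
positions = [m.start() for m in re.finditer("", text)]
print(positions)         # [0, 1, 2, 3]
print(text.count(""))    # 4, i.e. len(text) + 1
print(repr(text[1:1]))   # '', the slice analogue of "aaa".substring(1, 1)
```

That is one match at the beginning, one at the end, and one between each pair of characters.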


