How to Check If All Items in a List Are There in Another List

How to check if all of the following items are in a list?

Operators like <= in Python are generally not overriden to mean something significantly different than "less than or equal to". It's unusual for the standard library does this--it smells like legacy API to me.

Use the equivalent and more clearly-named method, set.issubset. Note that you don't need to convert the argument to a set; it'll do that for you if needed.

set(['a', 'b']).issubset(['a', 'b', 'c'])

How do I check if a list contains all the elements of another list in Python? It's not working

I think you want all:

list1 = ['Gryffindor', 'Ravenclaw', 'Hufflepuff', 'Slytherin']
list2 = ['Gryffindor', 'Ravenclaw']

checkif = all(item in list1 for item in list2)

Also, you need to swap list1 and list1 to get the result you describe in the print line.

How to check if all items in a list are there in another list?

When number of occurrences doesn't matter, you can still use the subset functionality, by creating a set on the fly:

>>> list1 = ['a', 'c', 'c']
>>> list2 = ['x', 'b', 'a', 'x', 'c', 'y', 'c']
>>> set(list1).issubset(list2)
True

If you need to check if each element shows up at least as many times in the second list as in the first list, you can make use of the Counter type and define your own subset relation:

>>> from collections import Counter
>>> def counterSubset(list1, list2):
c1, c2 = Counter(list1), Counter(list2)
for k, n in c1.items():
if n > c2[k]:
return False
return True

>>> counterSubset(list1, list2)
True
>>> counterSubset(list1 + ['a'], list2)
False
>>> counterSubset(list1 + ['z'], list2)
False

If you already have counters (which might be a useful alternative to store your data anyway), you can also just write this as a single line:

>>> all(n <= c2[k] for k, n in c1.items())
True

Checking if List contains all items from another list

You can use sets

t1 = [ (1,2), (3,4), (5,6), (7,8), (9,10), (11,12) ]
t2 = [ (3,4), (11,12) ]
set(t2).issubset(t1)
# returns true

# or equivalent use '<=' so
set(t2) <= set(t1)
# returns true

Check if all items in list are one of the items in another list

One way is using sets:

set(User_input).issubset(List_Permitted_Characters)

If this is all you are using List_Permitted_Characters for, you should store is as a set anyway, since the sequential information is irrelevant.

How do I efficiently find which elements of a list are in another list?

If you want to use a vector approach you can also use Numpy isin. It's not the fastest method, as demonstrated by oda's excellent post, but it's definitely an alternative to consider.

import numpy as np

list_1 = [0,0,1,2,0,0]
list_2 = [1,2,3,4,5,6]

a1 = np.array(list_1)
a2 = np.array(list_2)

np.isin(a1, a2)
# array([False, False, True, True, False, False])

Using any() and all() to check if a list contains one set of values or another

Generally speaking:

all and any are functions that take some iterable and return True, if

  • in the case of all, no values in the iterable are falsy;
  • in the case of any, at least one value is truthy.

A value x is falsy iff bool(x) == False.
A value x is truthy iff bool(x) == True.

Any non-boolean elements in the iterable are perfectly acceptable — bool(x) maps, or coerces, any x according to these rules:

  • 0, 0.0, None, [], (), [], set(), and other empty collections are mapped to False
  • all other values are mapped to True.

The docstring for bool uses the terms 'true'/'false' for 'truthy'/'falsy', and True/False for the concrete boolean values.


In your specific code samples:

You’ve slightly misunderstood how these functions work. The following does something completely different from what you thought:

if any(foobars) == big_foobar:

...because any(foobars) would first be evaluated to either True or False, and then that boolean value would be compared to big_foobar, which generally always gives you False (unless big_foobar coincidentally happened to be the same boolean value).

Note: the iterable can be a list, but it can also be a generator or a generator expression (≈ lazily evaluated/generated list), or any other iterator.

What you want instead is:

if any(x == big_foobar for x in foobars):

which basically first constructs an iterable that yields a sequence of booleans—for each item in foobars, it compares the item to the value held by big_foobar, and (lazily) emits the resulting boolean into the resulting sequence of booleans:

tmp = (x == big_foobar for x in foobars)

then any walks over all items in tmp and returns True as soon as it finds the first truthy element. It's as if you did the following:

In [1]: foobars = ['big', 'small', 'medium', 'nice', 'ugly']                                        

In [2]: big_foobar = 'big'

In [3]: any(['big' == big_foobar, 'small' == big_foobar, 'medium' == big_foobar, 'nice' == big_foobar, 'ugly' == big_foobar])
Out[3]: True

Note: As DSM pointed out, any(x == y for x in xs) is equivalent to y in xs but the latter is more readable, quicker to write and runs faster.

Some examples:

In [1]: any(x > 5 for x in range(4))
Out[1]: False

In [2]: all(isinstance(x, int) for x in range(10))
Out[2]: True

In [3]: any(x == 'Erik' for x in ['Erik', 'John', 'Jane', 'Jim'])
Out[3]: True

In [4]: all([True, True, True, False, True])
Out[4]: False

See also: http://docs.python.org/2/library/functions.html#all

How to check if all elements of a list match a condition?

The best answer here is to use all(), which is the builtin for this situation. We combine this with a generator expression to produce the result you want cleanly and efficiently. For example:

>>> items = [[1, 2, 0], [1, 2, 0], [1, 2, 0]]
>>> all(flag == 0 for (_, _, flag) in items)
True
>>> items = [[1, 2, 0], [1, 2, 1], [1, 2, 0]]
>>> all(flag == 0 for (_, _, flag) in items)
False

Note that all(flag == 0 for (_, _, flag) in items) is directly equivalent to all(item[2] == 0 for item in items), it's just a little nicer to read in this case.

And, for the filter example, a list comprehension (of course, you could use a generator expression where appropriate):

>>> [x for x in items if x[2] == 0]
[[1, 2, 0], [1, 2, 0]]

If you want to check at least one element is 0, the better option is to use any() which is more readable:

>>> any(flag == 0 for (_, _, flag) in items)
True

Python find elements in one list that are not in the other

TL;DR:

SOLUTION (1)

import numpy as np
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`

SOLUTION (2) You want a sorted list

def setdiff_sorted(array1,array2,assume_unique=False):
ans = np.setdiff1d(array1,array2,assume_unique).tolist()
if assume_unique:
return sorted(ans)
return ans
main_list = setdiff_sorted(list_2,list_1)



EXPLANATIONS:

(1) You can use NumPy's setdiff1d (array1,array2,assume_unique=False).

assume_unique asks the user IF the arrays ARE ALREADY UNIQUE.
If False, then the unique elements are determined first.

If True, the function will assume that the elements are already unique AND function will skip determining the unique elements.

This yields the unique values in array1 that are not in array2. assume_unique is False by default.

If you are concerned with the unique elements (based on the response of Chinny84), then simply use (where assume_unique=False => the default value):

import numpy as np
list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["a", "f", "c", "m"]
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`


(2)
For those who want answers to be sorted, I've made a custom function:

import numpy as np
def setdiff_sorted(array1,array2,assume_unique=False):
ans = np.setdiff1d(array1,array2,assume_unique).tolist()
if assume_unique:
return sorted(ans)
return ans

To get the answer, run:

main_list = setdiff_sorted(list_2,list_1)

SIDE NOTES:

(a) Solution 2 (custom function setdiff_sorted) returns a list (compared to an array in solution 1).

(b) If you aren't sure if the elements are unique, just use the default setting of NumPy's setdiff1d in both solutions A and B. What can be an example of a complication? See note (c).

(c) Things will be different if either of the two lists is not unique.

Say list_2 is not unique: list2 = ["a", "f", "c", "m", "m"]. Keep list1 as is: list_1 = ["a", "b", "c", "d", "e"]
Setting the default value of assume_unique yields ["f", "m"] (in both solutions). HOWEVER, if you set assume_unique=True, both solutions give ["f", "m", "m"]. Why? This is because the user ASSUMED that the elements are unique). Hence, IT IS BETTER TO KEEP assume_unique to its default value. Note that both answers are sorted.

pythonnumpy



Related Topics



Leave a reply



Submit