Python: Find in list
As for your first question: "if item is in my_list:
" is perfectly fine and should work if item
equals one of the elements inside my_list
. The item must exactly match an item in the list. For instance, "abc"
and "ABC"
do not match. Floating point values in particular may suffer from inaccuracy. For instance, 1 - 1/3 != 2/3
.
As for your second question: There's actually several possible ways if "finding" things in lists.
Checking if something is inside
This is the use case you describe: Checking whether something is inside a list or not. As you know, you can use the in
operator for that:
3 in [1, 2, 3] # => True
Filtering a collection
That is, finding all elements in a sequence that meet a certain condition. You can use list comprehension or generator expressions for that:
matches = [x for x in lst if fulfills_some_condition(x)]
matches = (x for x in lst if x > 6)
The latter will return a generator which you can imagine as a sort of lazy list that will only be built as soon as you iterate through it. By the way, the first one is exactly equivalent to
matches = filter(fulfills_some_condition, lst)
in Python 2. Here you can see higher-order functions at work. In Python 3, filter
doesn't return a list, but a generator-like object.
Finding the first occurrence
If you only want the first thing that matches a condition (but you don't know what it is yet), it's fine to use a for loop (possibly using the else
clause as well, which is not really well-known). You can also use
next(x for x in lst if ...)
which will return the first match or raise a StopIteration
if none is found. Alternatively, you can use
next((x for x in lst if ...), [default value])
Finding the location of an item
For lists, there's also the index
method that can sometimes be useful if you want to know where a certain element is in the list:
[1,2,3].index(2) # => 1
[1,2,3].index(4) # => ValueError
However, note that if you have duplicates, .index
always returns the lowest index:......
[1,2,3,2].index(2) # => 1
If there are duplicates and you want all the indexes then you can use enumerate()
instead:
[i for i,x in enumerate([1,2,3,2]) if x==2] # => [1, 3]
Finding the index of an item in a list
>>> ["foo", "bar", "baz"].index("bar")
1
Reference: Data Structures > More on Lists
Caveats follow
Note that while this is perhaps the cleanest way to answer the question as asked, index
is a rather weak component of the list
API, and I can't remember the last time I used it in anger. It's been pointed out to me in the comments that because this answer is heavily referenced, it should be made more complete. Some caveats about list.index
follow. It is probably worth initially taking a look at the documentation for it:
list.index(x[, start[, end]])
Return zero-based index in the list of the first item whose value is equal to x. Raises a
ValueError
if there is no such item.The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.
Linear time-complexity in list length
An index
call checks every element of the list in order, until it finds a match. If your list is long, and you don't know roughly where in the list it occurs, this search could become a bottleneck. In that case, you should consider a different data structure. Note that if you know roughly where to find the match, you can give index
a hint. For instance, in this snippet, l.index(999_999, 999_990, 1_000_000)
is roughly five orders of magnitude faster than straight l.index(999_999)
, because the former only has to search 10 entries, while the latter searches a million:
>>> import timeit
>>> timeit.timeit('l.index(999_999)', setup='l = list(range(0, 1_000_000))', number=1000)
9.356267921015387
>>> timeit.timeit('l.index(999_999, 999_990, 1_000_000)', setup='l = list(range(0, 1_000_000))', number=1000)
0.0004404920036904514
Only returns the index of the first match to its argument
A call to index
searches through the list in order until it finds a match, and stops there. If you expect to need indices of more matches, you should use a list comprehension, or generator expression.
>>> [1, 1].index(1)
0
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> next(g)
0
>>> next(g)
2
Most places where I once would have used index
, I now use a list comprehension or generator expression because they're more generalizable. So if you're considering reaching for index
, take a look at these excellent Python features.
Throws if element not present in list
A call to index
results in a ValueError
if the item's not present.
>>> [1, 1].index(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 2 is not in list
If the item might not be present in the list, you should either
- Check for it first with
item in my_list
(clean, readable approach), or - Wrap the
index
call in atry/except
block which catchesValueError
(probably faster, at least when the list to search is long, and the item is usually present.)
Python: finding an element in a list
The best way is probably to use the list method .index.
For the objects in the list, you can do something like:
def __eq__(self, other):
return self.Value == other.Value
with any special processing you need.
You can also use a for/in statement with enumerate(arr)
Example of finding the index of an item that has value > 100.
for index, item in enumerate(arr):
if item > 100:
return index, item
Source
How to find the python list item that start with
You have several options, but most obvious are:
Using list comprehension with a condition:
result = [i for i in some_list if i.startswith('GFS01_')]
Using filter
(which returns iterator)
result = filter(lambda x: x.startswith('GFS01_'), some_list)
How do I get the number of elements in a list (length of a list) in Python?
The len()
function can be used with several different types in Python - both built-in types and library types. For example:
>>> len([1, 2, 3])
3
How to find all occurrences of an element in a list
You can use a list comprehension with enumerate
:
indices = [i for i, x in enumerate(my_list) if x == "whatever"]
The iterator enumerate(my_list)
yields pairs (index, item)
for each item in the list. Using i, x
as loop variable target unpacks these pairs into the index i
and the list item x
. We filter down to all x
that match our criterion, and select the indices i
of these elements.
Python find elements in one list that are not in the other
TL;DR:
SOLUTION (1)
import numpy as np
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`
SOLUTION (2) You want a sorted list
def setdiff_sorted(array1,array2,assume_unique=False):
ans = np.setdiff1d(array1,array2,assume_unique).tolist()
if assume_unique:
return sorted(ans)
return ans
main_list = setdiff_sorted(list_2,list_1)
EXPLANATIONS:
(1) You can use NumPy's setdiff1d
(array1
,array2
,assume_unique
=False
).
assume_unique
asks the user IF the arrays ARE ALREADY UNIQUE.
If False
, then the unique elements are determined first.
If True
, the function will assume that the elements are already unique AND function will skip determining the unique elements.
This yields the unique values in array1
that are not in array2
. assume_unique
is False
by default.
If you are concerned with the unique elements (based on the response of Chinny84), then simply use (where assume_unique=False
=> the default value):
import numpy as np
list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["a", "f", "c", "m"]
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`
(2)
For those who want answers to be sorted, I've made a custom function:
import numpy as np
def setdiff_sorted(array1,array2,assume_unique=False):
ans = np.setdiff1d(array1,array2,assume_unique).tolist()
if assume_unique:
return sorted(ans)
return ans
To get the answer, run:
main_list = setdiff_sorted(list_2,list_1)
SIDE NOTES:
(a) Solution 2 (custom function setdiff_sorted
) returns a list (compared to an array in solution 1).
(b) If you aren't sure if the elements are unique, just use the default setting of NumPy's setdiff1d
in both solutions A and B. What can be an example of a complication? See note (c).
(c) Things will be different if either of the two lists is not unique.
Say list_2
is not unique: list2 = ["a", "f", "c", "m", "m"]
. Keep list1
as is: list_1 = ["a", "b", "c", "d", "e"]
Setting the default value of assume_unique
yields ["f", "m"]
(in both solutions). HOWEVER, if you set assume_unique=True
, both solutions give ["f", "m", "m"]
. Why? This is because the user ASSUMED that the elements are unique). Hence, IT IS BETTER TO KEEP assume_unique
to its default value. Note that both answers are sorted.
pythonnumpy
Find object in list that has attribute equal to some value (that meets any condition)
next((x for x in test_list if x.value == value), None)
This gets the first item from the list that matches the condition, and returns None
if no item matches. It's my preferred single-expression form.
However,
for x in test_list:
if x.value == value:
print("i found it!")
break
The naive loop-break version, is perfectly Pythonic -- it's concise, clear, and efficient. To make it match the behavior of the one-liner:
for x in test_list:
if x.value == value:
print("i found it!")
break
else:
x = None
This will assign None
to x
if you don't break
out of the loop.
How do I efficiently find which elements of a list are in another list?
If you want to use a vector approach you can also use Numpy isin. It's not the fastest method, as demonstrated by oda's excellent post, but it's definitely an alternative to consider.
import numpy as np
list_1 = [0,0,1,2,0,0]
list_2 = [1,2,3,4,5,6]
a1 = np.array(list_1)
a2 = np.array(list_2)
np.isin(a1, a2)
# array([False, False, True, True, False, False])
Related Topics
Differencebetween Dict.Items() and Dict.Iteritems() in Python2
How to Create a Constant in Python
What Is the Fastest Way to Send 100,000 Http Requests in Python
How to Profile Memory Usage in Python
Http Requests and JSON Parsing in Python
Find Full Path of the Python Interpreter
How to Make a Timezone Aware Datetime Object
Matplotlib Different Size Subplots
Why Can't I Use a List as a Dict Key in Python
How to Redirect Output with Subprocess in Python
How to Get the Ascii Value of a Character
Pythonic Way to Print List Items
How to Run Functions in Parallel
What's the Correct Way to Convert Bytes to a Hex String in Python 3
Dump a Numpy Array into a CSV File
Python Dictionary: Are Keys() and Values() Always the Same Order