Fastest Way to Search a List in Python

Fastest way to search a list in python

Also note that the list of values I'll have won't have duplicate data and I don't actually care about the order it's in; I just need to be able to check for the existence of a value.

Don't use a list; use a set() instead. It has exactly the properties you want, including a blazing fast membership test (x in s).

I've seen speedups of 20x and higher in places (mostly heavy number crunching) where one list was changed for a set.
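A minimal sketch of that advice, assuming the values are hashable (strings, numbers, tuples). The names values and lookup are made up for illustration:

```python
# Hypothetical data: unique values whose order does not matter.
values = ["apple", "banana", "cherry", "date"]

# Build the set once, then reuse it for every membership check.
lookup = set(values)

found = "cherry" in lookup       # hash lookup, no scan over the data
missing = "zucchini" in lookup

print(found, missing)  # True False
```

The set only pays off if it is built once and queried many times; rebuilding it for every check throws the advantage away.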

What's the fastest way to locate a list element within a list in python?

No. Without iterating you cannot find it, unless the list is already sorted. You can improve your code like this, with enumerate and a list comprehension.

[index for index, item in enumerate(thelist) if item[0] == "332"]

This will give the indices of all the elements whose first element is "332".

If you know that "332" occurs only once, you can stop at the first match:

def getIndex():
    for index, item in enumerate(thelist):
        if item[0] == "332":
            return index  # falls through and returns None if nothing matches
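If the same list is searched many times, one option is to pay for a single pass up front and build a dictionary from key to position, so later lookups avoid the scan. This is only a sketch: it assumes the first element of each item is hashable, thelist below is sample data, and build_index is a hypothetical helper name.

```python
def build_index(rows):
    """Map the first element of each row to the row's position."""
    index = {}
    for position, row in enumerate(rows):
        # setdefault keeps the first position if a key repeats
        index.setdefault(row[0], position)
    return index

thelist = [["101", "a"], ["332", "b"], ["550", "c"]]  # sample data
positions = build_index(thelist)

print(positions.get("332"))  # 1
print(positions.get("999"))  # None
```

The index has to be rebuilt (or updated) whenever the list changes, so this only helps when lookups clearly outnumber modifications.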

What is the fastest way to find an item in a list in python?

NumPy's searchsorted is the usual tool for these cases, provided the data is sorted:

import numpy as np

np.searchsorted([1,2,8,9], 5) # Your case
> 2

np.searchsorted([1,2,8,9], (-1, 2, 100)) #Other cases
> array([0, 1, 4])

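For a value that is missing, the returned index is the insertion point, that is, the position of its nearest neighbor on the right. If you need the nearest neighbor on the left instead, subtract 1 from the result (after checking that it is not already 0).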
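Since searchsorted returns an insertion point rather than a yes/no answer, a membership test needs one extra comparison. A small sketch, assuming the array is already sorted; contains is a hypothetical helper name:

```python
import numpy as np

sorted_values = np.array([1, 2, 8, 9])  # must already be sorted

def contains(arr, value):
    """Return True if value is present in the sorted array arr."""
    i = np.searchsorted(arr, value)  # leftmost insertion point
    # i == len(arr) means value is larger than every element
    return bool(i < len(arr) and arr[i] == value)

# side="left" (the default) and side="right" differ only for present values
left = np.searchsorted(sorted_values, 2, side="left")
right = np.searchsorted(sorted_values, 2, side="right")

print(contains(sorted_values, 8), contains(sorted_values, 5))  # True False
print(int(left), int(right))  # 1 2
```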

Most efficient way for a lookup/search in a huge list (python)

Don't create a list, create a set. It does lookups in constant time on average.

If you don't want the memory overhead of a set then keep a sorted list and search through it with the bisect module.

from bisect import bisect_left

def bi_contains(lst, item):
    """ efficient `item in lst` for sorted lists """
    # if item is larger than the last it's not in the list, but the bisect would
    # find `len(lst)` as the index to insert, so check that first. Else, if the
    # item is in the list then it has to be at index bisect_left(lst, item).
    # bool(lst) guards against an IndexError on an empty list.
    return bool(lst) and (item <= lst[-1]) and (lst[bisect_left(lst, item)] == item)
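The bisect approach only works while the list stays sorted. One way to maintain that, sketched here under the assumption that items arrive one at a time, is bisect.insort; sorted_contains is a hypothetical name for the lookup helper:

```python
from bisect import bisect_left, insort

def sorted_contains(sorted_items, item):
    """Binary-search membership test for an already sorted list."""
    i = bisect_left(sorted_items, item)
    return i < len(sorted_items) and sorted_items[i] == item

items = []
for value in [42, 7, 19, 3]:
    insort(items, value)  # insert while keeping the list sorted

print(items)  # [3, 7, 19, 42]
print(sorted_contains(items, 19), sorted_contains(items, 20))  # True False
```

Each insort call finds the position in O(log n) but still shifts elements, so the insertion itself is O(n). For bulk loading, appending everything and calling sort() once is usually the better choice.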

Fastest way to check if a list is present in a list of lists

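Convert the inner lists to tuples so they can go into a set (lists are not hashable), then use a list comprehension. The example below finds every sublist of a that is not in b.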

Ex:

a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]  
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]

setA = set(map(tuple, a))
setB = set(map(tuple, b))

print([i for i in setA if i not in setB])

Output:

[(1, 2, 3, 4, 5, 6)]
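The same tuple conversion answers the narrower question of whether one particular list is present. A sketch with sample data; the variable names are illustrative only:

```python
list_of_lists = [[5, 9, 25, 31, 33, 36], [7, 8, 9, 10, 11, 12]]
candidate = [7, 8, 9, 10, 11, 12]

# Lists are unhashable, so store each inner list as a tuple.
seen = set(map(tuple, list_of_lists))

candidate_present = tuple(candidate) in seen
other_present = tuple([1, 2, 3]) in seen

print(candidate_present, other_present)  # True False
```

For a single check, candidate in list_of_lists works without any conversion; the set is worth building when many candidates are tested against the same list of lists.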

Python: Fastest way to find all elements in one large list but not in another

I really like set analysis, where you can do:

set(list2) - set(list1)

Putting list items in a set removes all duplicates & ordering. Set operations allow us to remove a set of items from another set, just with the - operator.

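If the lists are enormous, numpy may be a bit faster. The result comes back as a sorted array of unique values.

import numpy as np
np.setdiff1d(list2, list1)  # values in list2 that are not in list1, same direction as above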
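Both set difference and setdiff1d drop duplicates and the original order. If either matters, one possible variant keeps the list comprehension and uses a set only for the lookups. A sketch with sample data:

```python
list1 = ["a", "b", "c", "d", "e"]  # sample data
list2 = ["f", "a", "m", "c", "f"]

exclude = set(list1)  # built once, so each "not in" check is a hash lookup
only_in_list2 = [x for x in list2 if x not in exclude]

print(only_in_list2)  # ['f', 'm', 'f']
```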

Python searching a large list speed

Two things that might provide some small help:

1) Use the approach in this SO answer to read through your large file as efficiently as possible.

2) Change your code from

for x in headwordList:
    m = SequenceMatcher(None, y.lower(), x)

to

yLower = y.lower()
for x in headwordList:
    m = SequenceMatcher(None, yLower, x)

You're converting each sentence to lowercase 650,000 times. There's no need for that.
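A related saving, going by the difflib documentation: SequenceMatcher caches its analysis of the second sequence, so when one string is compared against many, it can be set once with set_seq2() while set_seq1() changes on each iteration. The following is a sketch with made-up sample data, not the original poster's code:

```python
from difflib import SequenceMatcher

headword_list = ["hello", "help", "world"]  # sample data
y = "Hello"

matcher = SequenceMatcher(None)
matcher.set_seq2(y.lower())  # lowercased once; seq2 analysis is cached

best_word, best_ratio = None, -1.0
for x in headword_list:
    matcher.set_seq1(x)  # only the first sequence changes per iteration
    ratio = matcher.ratio()
    if ratio > best_ratio:
        best_word, best_ratio = x, ratio

print(best_word, best_ratio)  # hello 1.0
```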

Searching a sorted list?

Python:

import bisect

def find_in_sorted_list(elem, sorted_list):
    'Locate the leftmost value exactly equal to elem; return -1 if absent'
    # https://docs.python.org/3/library/bisect.html
    i = bisect.bisect_left(sorted_list, elem)
    if i != len(sorted_list) and sorted_list[i] == elem:
        return i
    return -1

def in_sorted_list(elem, sorted_list):
    i = bisect.bisect_left(sorted_list, elem)
    return i != len(sorted_list) and sorted_list[i] == elem

L = ["aaa", "bcd", "hello", "world", "zzz"]
print(find_in_sorted_list("hello", L)) # 2
print(find_in_sorted_list("hellu", L)) # -1
print(in_sorted_list("hellu", L)) # False

Fastest way to check if a value exists in a list

7 in a

Clearest and fastest way to do it.

You can also consider using a set, but constructing that set from your list may take more time than the faster membership tests will save. The only way to be certain is to benchmark. It also depends on which operations you need.
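A rough way to run that benchmark with the standard timeit module is sketched below. The data size, the target value, and the repeat counts are arbitrary assumptions, so substitute your own workload before drawing conclusions:

```python
import timeit

a = list(range(100_000))  # sample data; adjust to your workload
target = 99_999           # worst case for a list scan

# Time 100 membership checks against the list.
list_time = timeit.timeit(lambda: target in a, number=100)

# Time 100 membership checks against a prebuilt set.
s = set(a)
set_time = timeit.timeit(lambda: target in s, number=100)

# Average cost of building the set once.
build_time = timeit.timeit(lambda: set(a), number=10) / 10

print(f"list lookup x100: {list_time:.6f}s")
print(f"set lookup x100:  {set_time:.6f}s")
print(f"one set build:    {build_time:.6f}s")
```

Comparing build_time with the per-lookup saving gives an estimate of how many lookups are needed before the conversion pays for itself.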


