Check for Presence of a Sliced List in Python

Check for presence of a sliced list in Python

If you are sure that your inputs will only contain the single digits 0 and 1 then you can convert to strings:

def sublistExists(list1, list2):
return ''.join(map(str, list2)) in ''.join(map(str, list1))

This creates two strings so it is not the most efficient solution but since it takes advantage of the optimized string searching algorithm in Python it's probably good enough for most purposes.

If efficiency is very important you can look at the Boyer-Moore string searching algorithm, adapted to work on lists.

A naive search has O(n*m) worst case but can be suitable if you cannot use the converting to string trick and you don't need to worry about performance.

Check for presence of a sublist in list with order kept, and allow wildcards

One way is to use all when checking against sublists and an if that skips asterisks:

def my_func(a_list, sub_list):
n = len(sub_list)

# list-level comparison is now via element-wise
return any(all(sub_item == chunk_item
for sub_item, chunk_item in zip(sub_list, a_list[i:i+n])
if not np.isnan(sub_item)) # "is_not_asterisk" condition
for i in range(len(a_list)-n+1))

where I used not np.isnan(...) as the asterisk condition as mentioned in the question; but it could be many things: e.g., if asterisk is literally "*" in sublists, then the condition there is changed to if sub_item != "*".

sample with np.nan as asterisk:

for a_list in my_lists:
if my_func(a_list, [np.nan, np.nan, 0, 0, 0, 1, np.nan]):
print(a_list)

gives

[1, 1, 1, 1, 0, 2, 1, 2, 1, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 1, 0]
[1, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 0]

all returns True if the iterable is empty so if an all-asterisk sub-list is passed, it will return True for all candidates (as long as their length permits, which any will handle because any with empty iterable is False!).

How to check subsequence exists in a list?

x_in_y() can be implemented by slicing the original list and comparing the slices to the input list:

def x_in_y(query, base):
try:
l = len(query)
except TypeError:
l = 1
query = type(base)((query,))

for i in range(len(base)):
if base[i:i+l] == query:
return True
return False

Change range to xrange if you are using Python2.

How to test if a list contains another list as a contiguous subsequence?

Here is my version:

def contains(small, big):
for i in xrange(len(big)-len(small)+1):
for j in xrange(len(small)):
if big[i+j] != small[j]:
break
else:
return i, i+len(small)
return False

It returns a tuple of (start, end+1) since I think that is more pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists so should be reasonably efficient.

One point of interest for newbies is that it uses the else clause on the for statement - this is not something I use very often but can be invaluable in situations like this.

This is identical to finding substrings in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.

Note: If you are using Python3, change xrange to range.

How to check if first list appear in the same order somewhere in the second list

Try this:

def sublist(list1, list2):
n = len(list1)
return any((list1 == list2[i:i + n]) for i in range(len(list2) - n + 1))

print sublist([2, 3], [1, 2, 3, 4, 5])
print sublist([1, 3], [1, 2, 3, 4, 5])
print sublist([1, 2, 3], [1, 2, 1, 2, 3, 4])

We need to know the length of the sublist, that's what n is for.

Then the any() will stop after the first match in the iterable. The part inside the any() is a generator, we're comparing the sublist (list1) with each of the sublists in list2, starting from the beginning up to the end.

i will range from 0 up to the length of the second list, but you subtract out the length of the first list and then add 1, so you don't go over the end of the index.

Fastest way to check if a value exists in a list

7 in a

Clearest and fastest way to do it.

You can also consider using a set, but constructing that set from your list may take more time than faster membership testing will save. The only way to be certain is to benchmark well. (this also depends on what operations you require)

Python check next three elements in list

It looks like you're searching a sequence in a list.

You can just compare parts of the list with the sequence.

def find_sequence_in_list(list_to_check, values):
for i in range (len(list_to_check) - len(values) + 1):
#print(list_to_check[i:i + len(values)])
if list_to_check[i:i + len(values)] == values:
return True

return False

values = [1, 2, 3]
data1 = [1, 1, 2, 3, 1]
data2 = [1, 1, 4, 3, 1, 2, 1]

print(find_sequence_in_list(data1, values))
print(find_sequence_in_list(data2, values))

Uncomment the print to see what's happening.

Fastest way to determine if an ordered sublist is in a large lists of lists?

Unpack the item in the n-sized subsequence into n variables. Then write a list comprehension to filter the list doing a check for a, b or b, a in the sub-list.e.g.

a, b = sublst

def checklst(lst, a, b):
l = len(lst)
start = 0
while True:
try:
a_index = lst.index(a, start)
except ValueError:
return False
try:
return a_index > -1 and lst[a_index+1] == b
except IndexError:
try:
return a_index > -1 and lst[a_index-1] == b
except IndexError:
start = a_index + 1
if start == l:
return False
continue # keep looking at the next a

%timeit found = [l for l in lst if checklst(l, a, b)]
1.88 s ± 31.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit found = [x for x in lst if (a in x and not check_if_list_is_sublist(x, [a,b]))]
22.1 s ± 1.67 s per loop (mean ± std. dev. of 7 runs, 1 loop each)


Related Topics



Leave a reply



Submit