How to Check If a String Contains an Element from a List in Python

How to check if a string contains an element from a list in Python

Use a generator together with any, which short-circuits on the first True:

if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)

EDIT: I see this answer has been accepted by the OP. Although it may be a "good enough" solution to that particular problem, and it is a good general way to check whether any string in a list occurs in another string, keep in mind that this is all it does. It does not care WHERE the match occurs, e.g. at the end of the string. If that matters, as is often the case with URLs, look at the answer by @Wladimir Palant, or you risk getting false positives.
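When the position of the match matters, as in the file-extension case, str.endswith accepts a tuple of suffixes. The following is a minimal sketch that contrasts the two checks; the sample values for url_string, tricky_url and extensionsToCheck are assumptions, since the original snippet does not define them.

```python
# Hypothetical sample data; the original snippet does not define these names.
extensionsToCheck = ['.pdf', '.doc', '.xls']
url_string = 'http://example.com/files/report.pdf'
tricky_url = 'http://example.com/.pdf/index.html'

# Substring test: true wherever the extension appears in the URL.
anywhere = any(ext in tricky_url for ext in extensionsToCheck)

# Suffix test: str.endswith accepts a tuple of candidate suffixes.
at_end = tricky_url.endswith(tuple(extensionsToCheck))

print(anywhere, at_end)                               # True False
print(url_string.endswith(tuple(extensionsToCheck)))  # True
```

The substring test reports a false positive for tricky_url, while the suffix test does not.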

How to check if a list (string) contains another list (string) considering order

I believe that this answer should work if you just don't remove things from the sublist that aren't in the test list. So, for the first method provided there:

def contains(testList, subList):
    shared = [x for x in testList if x in subList]
    return shared == subList

You could also convert the sublist to a list so that the function works with non-list inputs, such as strings:

def contains(testList, subList):
    shared = [x for x in testList if x in subList]
    return shared == list(subList)
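Note that the comparison above returns False when testList repeats an element of subList; for example, contains([1, 2, 1], [1, 2]) is False. If "considering order" means that the sublist must appear as a contiguous run, a sliding-window check is one alternative. This is only a sketch under that assumption, and the name contains_contiguous is hypothetical.

```python
def contains_contiguous(test_list, sub_list):
    """Return True if sub_list occurs in test_list as a contiguous run."""
    seq = list(test_list)
    sub = list(sub_list)
    n = len(sub)
    # Compare every window of length n against the sublist.
    return any(seq[i:i + n] == sub for i in range(len(seq) - n + 1))

print(contains_contiguous([1, 2, 1, 3], [2, 1]))  # True
print(contains_contiguous([1, 2, 1, 3], [2, 3]))  # False
print(contains_contiguous("hello", "ell"))        # True
```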

Fastest way to check if a string contains a string from a list

For this, I'd suggest first tokenizing the string with RegexpTokenizer to remove all special characters, and then using sets to find the intersection:

from nltk.tokenize import RegexpTokenizer
test_string = "Hello! This is a test. I love to eat apples."

tokenizer = RegexpTokenizer(r'\w+')
test_set = set(tokenizer.tokenize(test_string))
# {'Hello', 'I', 'This', 'a', 'apples', 'eat', 'is', 'love', 'test', 'to'}

Having tokenized the string and constructed a set, find the intersection with the words you are looking for:

set(['apples', 'oranges', 'bananas']) & test_set
# {'apples'}
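If installing NLTK is not an option, the standard-library re module can do a similar tokenization. A sketch, assuming the same \w+ pattern is acceptable for your input; the name words_to_find is hypothetical.

```python
import re

test_string = "Hello! This is a test. I love to eat apples."
words_to_find = {'apples', 'oranges', 'bananas'}  # hypothetical name

# re.findall(r'\w+', ...) extracts runs of word characters from the string.
test_set = set(re.findall(r'\w+', test_string))

print(words_to_find & test_set)  # {'apples'}
```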

Find if a string contains a phrase from a list in python

Does this solve the issue?

my_phrases = ['Hello world', 'apple', 'orange', 'red car']

my_string = 'I am driving a red car'

for phrase in my_phrases:
    if phrase in my_string:
        print(phrase)

How to check if a string is a substring of items in a list of strings

To check for the presence of 'abc' in any string in the list:

xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']

if any("abc" in s for s in xs):
    ...

To get all the items containing 'abc':

matching = [s for s in xs if "abc" in s]
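If only the first matching item is needed, next with a generator expression stops at the first hit instead of building the whole list. A small sketch; the names first_match and no_match are hypothetical.

```python
xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']

# next() stops at the first hit; the second argument is the value
# returned when no item matches.
first_match = next((s for s in xs if "abc" in s), None)
no_match = next((s for s in xs if "zzz" in s), None)

print(first_match)  # abc-123
print(no_match)     # None
```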

How to check if any item in a list contains a string in Python?

Note that this might well be an NLP problem, but my solutions are not NLP-based.

If you want to check which members of your list are in the string, it is pretty straightforward:

[i for i in item_list if i in String_text]
... ['Manpower Outsourcing']

This will keep only the list members that were in the string, but note that it will only keep "exact matches".

If this output is not suitable for your purpose, there are several other ways to check.

Mark members that were in the string as 1 and the others as 0:

[1 if i in String_text else 0 for i in item_list]
... [0, 1, 0, 0, 0, 0, 0, 0]

Or, if you would like to know what fraction of each member's words appear in the string, I recommend splitting the members into words:

item_list2 = [i.split(" ") for i in item_list]
[sum([1 if i in String_text else 0 for i in x])/len(x) for x in item_list2]
... [1.0, 1.0, 0.0, 0.0, 0.25, 0.0, 0.0, 0.6666666666666666]

You will notice that the last approach gives a different output from the previous ones: the first member, "Manpower Service", appears in the string only as the separate words "Manpower" and "Service", so it scores 1.0 here even though it is not an exact match. You can choose the solution that suits your purpose.
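Since the original item_list and String_text are not shown, here is a self-contained sketch of the three approaches on made-up data. The values below are assumptions chosen only for illustration.

```python
# Hypothetical data; the original item_list and String_text are not shown.
item_list = ["red car", "blue bike", "green bus"]
String_text = "I saw a red car and a green van"

# 1. Keep only members that occur verbatim in the string.
exact = [i for i in item_list if i in String_text]

# 2. Flag each member with 1 (present) or 0 (absent).
flags = [1 if i in String_text else 0 for i in item_list]

# 3. Fraction of each member's words that occur somewhere in the string.
item_list2 = [i.split(" ") for i in item_list]
ratios = [sum(word in String_text for word in words) / len(words)
          for words in item_list2]

print(exact)   # ['red car']
print(flags)   # [1, 0, 0]
print(ratios)  # [1.0, 0.0, 0.5]
```

All three are plain substring tests, so a word such as "car" would also match inside "cards".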

Again, note that this might be an NLP problem, and my solutions are just naive string matching.

Check if string contains any elements from list

>>> item_list = ["Non-Tradable Ubersaw", "Screamin' Eagle", "Non-Craftable Spy-cicle"]
>>> not_allowed = {"Non-Tradable", "Non-Craftable"}

You can use a list comprehension with any to check whether any of the disallowed substrings occur in the current element:

>>> filtered = [i for i in item_list if not any(stop in i for stop in not_allowed)]
>>> filtered
["Screamin' Eagle"]
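When the set of disallowed substrings is large, one alternative is to compile them into a single regular expression. This is a sketch of that swapped-in technique, not the method used above; it assumes not_allowed is non-empty, because an empty pattern would match every item.

```python
import re

item_list = ["Non-Tradable Ubersaw", "Screamin' Eagle", "Non-Craftable Spy-cicle"]
not_allowed = {"Non-Tradable", "Non-Craftable"}

# re.escape guards against characters that are special in regular expressions.
# Assumes not_allowed is non-empty.
pattern = re.compile("|".join(re.escape(s) for s in not_allowed))

filtered = [i for i in item_list if not pattern.search(i)]
print(filtered)  # ["Screamin' Eagle"]
```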

Check if string in list contain all the elements from another list

Try this:

list1 = ['banana.two', 'apple.three', 'raspberry.six']
list2 = ['two', 'three']

def check(strings, substrings):
    for substring in substrings:
        if not (any(substring in string for string in strings)):
            return False
    return True

print(check(list1, list2))
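The same logic can be written more compactly by nesting any inside all. A sketch with the same inputs; the name check_all is hypothetical.

```python
list1 = ['banana.two', 'apple.three', 'raspberry.six']
list2 = ['two', 'three']

def check_all(strings, substrings):
    """Return True if every substring occurs in at least one of the strings."""
    return all(any(sub in s for s in strings) for sub in substrings)

print(check_all(list1, list2))            # True
print(check_all(list1, ['two', 'nine']))  # False
```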

How to check for the presence of a substring in a list of strings

You have a numpy array, not a list.

Anyway, considering a list (this would also work on a numpy array):

my_lst = ['hello', 'this', 'is', 'a', 'testello']

query = 'ello'
out = [query in e for e in my_lst]

# [True, False, False, False, True]

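For a numpy array:

import numpy as np

my_array = np.array(['hello', 'this', 'is', 'a', 'testello'])

# find returns -1 when the substring is absent and 0 for a match at the
# very start, so compare with >= 0 rather than > 0
out = np.char.find(my_array, query) >= 0
# array([ True, False, False, False,  True])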

Check if a string contains the list elements

If no overlap is allowed, this problem becomes much harder than it looks at first.
As far as I can tell, no other answer is correct (see test cases at the end).

Recursion is needed because, if a substring appears more than once, using one occurrence instead of another could prevent other substrings from being found.

This answer uses two functions. The first one finds every occurrence of a substring in a string and returns an iterator of strings in which that occurrence has been replaced by a character which shouldn't appear in any substring.

The second function recursively checks if there's any way to find all the numbers in the string:

def find_each_and_replace_by(string, substring, separator='x'):
    """
    list(find_each_and_replace_by('8989', '89', 'x'))
    # ['x89', '89x']
    list(find_each_and_replace_by('9999', '99', 'x'))
    # ['x99', '9x9', '99x']
    list(find_each_and_replace_by('9999', '89', 'x'))
    # []
    """
    index = 0
    while True:
        index = string.find(substring, index)
        if index == -1:
            return
        yield string[:index] + separator + string[index + len(substring):]
        index += 1

def contains_all_without_overlap(string, numbers):
    """
    contains_all_without_overlap("45892190", [89, 90])
    # True
    contains_all_without_overlap("45892190", [89, 90, 4521])
    # False
    """
    if len(numbers) == 0:
        return True
    substrings = [str(number) for number in numbers]
    substring = substrings.pop()
    return any(contains_all_without_overlap(shorter_string, substrings)
               for shorter_string in find_each_and_replace_by(string, substring, 'x'))

Here are the test cases:

tests = [
    ("45892190", [89, 90], True),
    ("8990189290", [89, 90, 8990], True),
    ("123451234", [1234, 2345], True),
    ("123451234", [2345, 1234], True),
    ("123451234", [1234, 2346], False),
    ("123451234", [2346, 1234], False),
    ("45892190", [89, 90, 4521], False),
    ("890", [89, 90], False),
    ("8989", [89, 90], False),
    ("8989", [12, 34], False)
]

for string, numbers, should in tests:
    result = contains_all_without_overlap(string, numbers)
    if result == should:
        print("Correct answer for %-12r and %-14r (%s)" % (string, numbers, result))
    else:
        print("ERROR : %r and %r should return %r, not %r" %
              (string, numbers, should, result))

And the corresponding output:

Correct answer for '45892190'   and [89, 90]       (True)
Correct answer for '8990189290' and [89, 90, 8990] (True)
Correct answer for '123451234'  and [1234, 2345]   (True)
Correct answer for '123451234'  and [2345, 1234]   (True)
Correct answer for '123451234'  and [1234, 2346]   (False)
Correct answer for '123451234'  and [2346, 1234]   (False)
Correct answer for '45892190'   and [89, 90, 4521] (False)
Correct answer for '890'        and [89, 90]       (False)
Correct answer for '8989'       and [89, 90]       (False)
Correct answer for '8989'       and [12, 34]       (False)

