How to Check If a String Is a Substring of Items in a List of Strings

To check for the presence of 'abc' in any string in the list:

xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']

if any("abc" in s for s in xs):
    ...

To get all the items containing 'abc':

matching = [s for s in xs if "abc" in s]
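If only the first matching item is needed, a sketch along these lines stops scanning at the first hit. The name `first_match` and the `None` fallback for the no-match case are choices made here for illustration:

```python
xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']

# next() pulls one item from the generator and stops there;
# the second argument is returned when nothing matches.
first_match = next((s for s in xs if "abc" in s), None)
print(first_match)  # abc-123
```

Because the generator is lazy, the remaining items are never inspected once a match has been found.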

How to check for the presence of a substring in a list of strings

You have a numpy array, not a list.

Anyway, considering a list (this would also work on a numpy array):

my_lst = ['hello', 'this', 'is', 'a', 'testello']

query = 'ello'
out = [query in e for e in my_lst]

# [True, False, False, False, True]

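For a NumPy array:

import numpy as np

my_array = np.array(['hello', 'this', 'is', 'a', 'testello'])

# np.char.find returns the index of the first match, or -1 if there is none,
# so compare with >= 0 (a match at index 0 would fail a > 0 test)
out = np.char.find(my_array, query) >= 0
# array([ True, False, False, False,  True])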

Finding a substring within a list in Python

print([s for s in mylist if sub in s])

If you want them separated by newlines:

print("\n".join(s for s in mylist if sub in s))

Full example, with case insensitivity:

mylist = ['abc123', 'def456', 'ghi789', 'ABC987', 'aBc654']
sub = 'abc'

print("\n".join(s for s in mylist if sub.lower() in s.lower()))
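For text beyond plain ASCII, `str.casefold()` is a more aggressive alternative to `lower()` for caseless comparison. The following is a minimal sketch of the same filter using it; `needle` and `matching_ci` are names introduced here, not part of the original answer:

```python
mylist = ['abc123', 'def456', 'ghi789', 'ABC987', 'aBc654']
sub = 'abc'

# Fold the search term once instead of on every iteration
needle = sub.casefold()
matching_ci = [s for s in mylist if needle in s.casefold()]
print("\n".join(matching_ci))
```

For this sample data the result is the same as with `lower()`. The two differ only for characters such as the German "ß", which `casefold()` maps to "ss".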

How to check if a string contains an element from a list in Python

Use a generator together with any, which short-circuits on the first True:

if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)

EDIT: This answer has been accepted by the OP. It may be good enough for that particular problem, and it is a good general way to check whether any string in a list occurs in another string, but keep in mind that this is all it does. It does not care WHERE the string is found, e.g. at the end of the string. If that matters, as is often the case with URLs, look at the answer by @Wladimir Palant, or you risk false positives.
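When only the end of the string matters, one option is `str.endswith`, which accepts a tuple of suffixes. This is a rough sketch, not the approach from the answer referenced above. The extension list and URLs are made-up sample values:

```python
extensions_to_check = ['.pdf', '.doc', '.xls']  # hypothetical sample values
url_string = 'http://example.com/files/report.pdf'

# str.endswith accepts a tuple of suffixes and is True if any of them matches
has_extension = url_string.endswith(tuple(extensions_to_check))

# A plain substring test also fires when the text appears mid-string
misleading_url = 'http://example.com/.pdf/index.html'
substring_hit = any(ext in misleading_url for ext in extensions_to_check)
suffix_hit = misleading_url.endswith(tuple(extensions_to_check))

print(has_extension, substring_hit, suffix_hit)  # True True False
```

A real URL may carry a query string or fragment after the path, so in practice the path would likely need to be isolated first.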

Check if substring is in a list of strings?

You can import any from builtins in case the name any was rebound to something else:

>>> from builtins import any as b_any
>>> lst = ['yellow', 'orange', 'red']
>>> word = "or"
>>> b_any(word in x for x in lst)
True

Note that in Python 2 this module is named __builtin__.

Check if string in list contain all the elements from another list

Try this:

list1 = ['banana.two', 'apple.three', 'raspberry.six']
list2 = ['two', 'three']


def check(strings, substrings):
    for substring in substrings:
        if not any(substring in string for string in strings):
            return False
    return True


print(check(list1, list2))
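The same check can be written more compactly by nesting `any` inside `all`. This is a sketch of an equivalent formulation; `check_all` is a name chosen here to avoid clashing with the function above:

```python
list1 = ['banana.two', 'apple.three', 'raspberry.six']
list2 = ['two', 'three']


def check_all(strings, substrings):
    # True only if every substring occurs in at least one of the strings
    return all(any(sub in s for s in strings) for sub in substrings)


print(check_all(list1, list2))             # True
print(check_all(list1, ['two', 'seven']))  # False
```

Both `all` and `any` short-circuit, so the scan stops at the first substring that has no match.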

Python: How to check a string for substrings from a list?

Try this test:

any(substring in string for substring in substring_list)

It will return True if any of the substrings in substring_list is contained in string.

Note that there is a Python analogue of Marc Gravell's answer in the linked question:

from itertools import imap
any(imap(string.__contains__, substring_list))

In Python 3, you can use map directly instead:

any(map(string.__contains__, substring_list))

The version above that uses a generator expression is probably clearer, though.

Fastest way to check whether a string is a substring in a list of strings

You could look into turning your list of names into one regular expression. Take for example this tiny list of names:

names = ['AARON',
         'ABDUL',
         'ABE',
         'ABEL',
         'ABRAHAM',
         'ABRAM',
         'ADALBERTO',
         'ADAM',
         'ADAN',
         'ADOLFO',
         'ADOLPH',
         'ADRIAN',
         ]

That could be represented with the following regular expression:

\b(?:AARON|ABDUL|ABE|ABEL|ABRAHAM|ABRAM|ADALBERTO|ADAM|ADAN|ADOLFO|ADOLPH|ADRIAN)\b
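A flat alternation like this can be generated from the list with `re.escape` and `str.join`. The snippet below is only a sketch on a shortened version of the list, and `flat_pattern` is a name introduced for illustration:

```python
import re

names = ['AARON', 'ABDUL', 'ABE', 'ABEL', 'ABRAHAM', 'ABRAM']

# Escape each name in case it contains regex metacharacters, then join with |
flat_pattern = re.compile(
    r'\b(?:' + '|'.join(re.escape(n) for n in names) + r')\b'
)

print(flat_pattern.pattern)
print(flat_pattern.search('Where did ABRAHAM get the mustard from?').group(0))  # ABRAHAM
print(flat_pattern.search('Is ABEX a name?'))  # None
```

The trailing `\b` is what keeps `ABE` from matching inside `ABEX`.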

But that would not be very efficient. A regular expression that is built like a tree will work better:

\b(?:A(?:B(?:E(?:|L)|RA(?:M|HAM)|DUL)|D(?:A(?:M|N|LBERTO)|OL(?:FO|PH)|RIAN)|ARON))\b

You could then automate the production of this regular expression -- possibly by first creating a dict-tree structure from the list of names, and then translating that tree into a regular expression. For the above example, that intermediate tree would look like this:

{
    'A': {
        'A': {
            'R': {
                'O': {
                    'N': {
                        '': {}
                    }
                }
            }
        },
        'B': {
            'D': {
                'U': {
                    'L': {
                        '': {}
                    }
                }
            },
            'E': {
                '': {},
                'L': {
                    '': {}
                }
            },
            ... etc

... which could optionally be simplified to this:

{
    'A': {
        'ARON': {
            '': {}
        },
        'B': {
            'DUL': {
                '': {}
            },
            'E': {
                '': {},
                'L': {
                    '': {}
                }
            },
            'RA': {
                'HAM': {
                    '': {}
                },
                'M': {
                    '': {}
                }
            }
        },

        ... etc

Here is the suggested code to do this:

import re

def addToTree(tree, name):
    if len(name) == 0:
        # Mark the end of a name, so that a name which is a prefix of one
        # already in the tree (e.g. 'ABE' added after 'ABEL') is still recorded
        tree[''] = {}
        return
    if name[0] in tree.keys():
        addToTree(tree[name[0]], name[1:])
    else:
        for letter in name:
            tree[letter] = {}
            tree = tree[letter]
        tree[''] = {}

# Optional improvement of the tree: it combines several consecutive letters into
# one key if there are no alternatives possible
def simplifyTree(tree):
    repeat = True
    while repeat:
        repeat = False
        for key, subtree in list(tree.items()):
            if key != '' and len(subtree) == 1 and '' not in subtree.keys():
                for letter, subsubtree in subtree.items():
                    tree[key + letter] = subsubtree
                del tree[key]
                repeat = True
    for key, subtree in tree.items():
        if key != '':
            simplifyTree(subtree)

def treeToRegExp(tree):
    regexp = [re.escape(key) + treeToRegExp(subtree) for key, subtree in tree.items()]
    regexp = '|'.join(regexp)
    return '' if regexp == '' else '(?:' + regexp + ')'

def listToRegExp(names):
    tree = {}
    for name in names:
        addToTree(tree, name[:])
    simplifyTree(tree)
    return re.compile(r'\b' + treeToRegExp(tree) + r'\b', re.I)

# Demo
names = ['AARON',
         'ABDUL',
         'ABE',
         'ABEL',
         'ABRAHAM',
         'ABRAM',
         'ADALBERTO',
         'ADAM',
         'ADAN',
         'ADOLFO',
         'ADOLPH',
         'ADRIAN',
         ]

fields = [
    'This is Aaron speaking',
    'Is Abex a name?',
    'Where did Abraham get the mustard from?'
]

regexp = listToRegExp(names)
# get the search result for each field, and link it with the index of the field
results = [[i, regexp.search(field)] for i, field in enumerate(fields)]
# remove non-matches from the results
results = [[i, match.group(0)] for [i, match] in results if match]
# print results
print(results)

Determine if List of Strings contains substring of all Strings in other list

I am not sure there is a way to do that without iterating over a as many times as b.size. If you want only one pass over a, you have to check every element of b for each element of a, so you end up iterating over b a.size times. In that scenario you also need to keep track of which items in b have already matched, so that you do not check them again. That might be worse than just iterating over a, since you can only do it by removing matched items from the list (or from a copy that you use instead of b), or by keeping another list of matches and comparing it with the original b.

So I think you are on the right track with your code, but there are some issues. For example, it never references b, only hardcoded strings. Handling every element of b that way results in quite a big function if b has more than 2 elements, and it does not work at all if you don't already know the values.

This code does the same thing as the one you put above, but it uses the actual elements of b rather than hardcoded strings that match b. (It visits each element of b once and, for each one, scans a until the first match, so a is partially traversed up to b.size times.)

return b.all { bItem ->
    a.any { it.contains(bItem) }
}

How can I check if a string has a substring from a List?

I would recommend iterating over the entire list. Thankfully, you can use an enhanced for loop:

for (String listItem : myArrayList) {
    if (myString.contains(listItem)) {
        // do something.
    }
}

EDIT To the best of my knowledge, you have to iterate the list somehow. Think about it, how will you know which elements are contained in the list without going through it?

EDIT 2

The only way I can see to make the iteration quick is the approach above, with an early exit. The method below returns as soon as it finds a match, without searching any further. The return false statement goes after the loop, because if you have checked the entire list without finding a match, clearly there is none. Here is some more detailed code:

public boolean containsAKeyword(String myString, List<String> keywords) {
    for (String keyword : keywords) {
        if (myString.contains(keyword)) {
            return true;
        }
    }
    return false; // Never found match.
}

EDIT 3

If you're using Kotlin, you can do this with the any method:

val containsKeyword = myArrayList.any { it.contains("keyword") }

