Check List of Words in Another String

Check list of words in another string

if any(word in 'some one long two phrase three' for word in list_):
    print('found')  # list_ is your list of words

Python: how to determine if a list of words exist in a string

This function was found by Peter Gibson (below) to be the most performant of the answers here. It is good for datasets one may hold in memory (because it creates a list of words from the string to be searched and then a set of those words):

def words_in_string(word_list, a_string):
    return set(word_list).intersection(a_string.split())

Usage:

my_word_list = ['one', 'two', 'three']
a_string = 'one two three'
if words_in_string(my_word_list, a_string):
    print('One or more words found!')

Which prints One or more words found! to stdout.

It does return the actual words found:

for word in words_in_string(my_word_list, a_string):
    print(word)

Prints out (sets are unordered, so the order may vary):

three
two
one

For data so large you can't hold it in memory, the solution given in this answer would be very performant.
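One limitation of words_in_string above is that a_string.split() only splits on whitespace, so a token like "three." will not match "three". A minimal sketch of a punctuation-tolerant and case-insensitive variant follows; the name words_in_text and the choice of \w+ as the definition of a "word" are assumptions for illustration, not part of the original answer.

```python
import re

def words_in_text(word_list, text):
    """Return the words from word_list that occur as whole tokens in text."""
    # \w+ extracts runs of letters, digits and underscores, so trailing
    # punctuation such as "three!" no longer blocks a match.
    tokens = set(re.findall(r"\w+", text.lower()))
    return {word for word in word_list if word.lower() in tokens}

found = words_in_text(['one', 'two', 'three'], 'One, two... and three!')
print(sorted(found))  # ['one', 'three', 'two']
```

Like the original, this builds the full token set in memory, so it suits strings that fit comfortably in RAM.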

Check if multiple strings exist in another string

You can use any:

a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

if any(x in a_string for x in matches):
    print("At least one match found")

Similarly, to check whether all the strings from the list are found, use all instead of any.
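As a quick illustration of the difference, here is a small sketch using the same a_string; the lists required and optional are made up for the example.

```python
a_string = "A string is more than its parts!"

required = ["more", "parts"]               # both occur in a_string
optional = ["more", "wholesome", "milk"]   # only "more" occurs

has_all_required = all(x in a_string for x in required)
has_all_optional = all(x in a_string for x in optional)
has_any_optional = any(x in a_string for x in optional)

print(has_all_required)  # True
print(has_all_optional)  # False
print(has_any_optional)  # True
```

Both any and all short-circuit: any stops at the first match, all stops at the first miss.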

How to check if a word or group of words exist in given list of strings and how to extract that word?

You could use re.findall so there's no nested loop.

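import re

output = {}
# Build one alternation pattern from all the words and keep its findall method
find_words = re.compile('|'.join(list_of_words)).findall
# Each item in data['dict_sentences'] is a dict with a single value: the sentence
for i, (s,) in enumerate(map(dict.values, data['dict_sentences']), 1):
    words = find_words(s.lower())
    if words:
        output[f"sent{i}"] = words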


{'sent1': ['emmanuel college', 'churchill college'],
'sent2': ['emmanuel college'],
'sent3': ['holy trinity church']}

This can also be done with a dict comprehension using the walrus operator (Python 3.8+), although that may be a little overboard:

find_sent = re.compile('|'.join(list_of_words)).findall
iter_sent = enumerate(map(dict.values, data['dict_sentences']), 1)
output = {f"sent{i}": words for i, (s,) in iter_sent if (words := find_sent(s.lower()))}
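The snippets above assume that list_of_words and data already exist. Here is a self-contained sketch with made-up sample data (the 'text' key and the sentences are hypothetical). It also passes each phrase through re.escape, which is an addition to the original answer: it keeps characters such as '.' or '+' in a phrase from being interpreted as regex syntax.

```python
import re

# Hypothetical sample data, shaped like the structure the answer iterates over
list_of_words = ['emmanuel college', 'holy trinity church']
data = {'dict_sentences': [
    {'text': 'I visited Emmanuel College today.'},
    {'text': 'Nothing to see here.'},
    {'text': 'Holy Trinity Church is near Emmanuel College.'},
]}

# Escape each phrase so regex metacharacters are matched literally
pattern = re.compile('|'.join(re.escape(w) for w in list_of_words))

output = {}
for i, sentence in enumerate(data['dict_sentences'], 1):
    (s,) = sentence.values()          # each dict holds exactly one sentence
    words = pattern.findall(s.lower())
    if words:                         # skip sentences with no matches
        output[f"sent{i}"] = words

print(output)
# {'sent1': ['emmanuel college'], 'sent3': ['holy trinity church', 'emmanuel college']}
```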

How to check if a string contains an element from a list in Python

Use a generator together with any, which short-circuits on the first True:

if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)

EDIT: I see this answer has been accepted by the OP. Though it may be a "good enough" solution to their particular problem, and is a good general way to check whether any strings in a list are found in another string, keep in mind that this is all it does. It does not care WHERE the string is found, e.g. at the end of the string. If that matters, as is often the case with URLs, you should look at the answer by @Wladimir Palant, or you risk getting false positives.
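To illustrate the false-positive risk, here is one possible way to anchor the check to the end of the URL path. This is a sketch and not necessarily the approach taken in the answer referenced above; has_extension, extensions_to_check, and the example URLs are hypothetical.

```python
from urllib.parse import urlparse

extensions_to_check = ('.pdf', '.doc', '.xls')  # hypothetical list

def has_extension(url_string, extensions):
    """Return True if the URL's path ends with one of the extensions."""
    # Look only at the path, so query strings like ?download=1 are ignored
    path = urlparse(url_string).path.lower()
    # str.endswith accepts a tuple of suffixes
    return path.endswith(tuple(extensions))

good = 'http://example.com/report.pdf?download=1'
tricky = 'http://example.com/my.pdf.viewer/index.html'

print(any(ext in tricky for ext in extensions_to_check))  # True (false positive)
print(has_extension(tricky, extensions_to_check))         # False
print(has_extension(good, extensions_to_check))           # True
```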

Check whether words from a list are inside a string of another list Python

In general, I recommend putting more thought into naming variables. I like how you tried to print the story headings. The line if any(lijst in s for s in a) does not do what you think it does: you need to iterate over each word in a single h2 instead. The any function is just shorthand for the following:

def any(iterable):
    for element in iterable:
        if element:
            return True
    return False

In other words, you're trying to see if an entire list is in an h2 element, which will never be true. Here is an example fix.

import requests
from bs4 import BeautifulSoup

url = 'https://www.nytimes.com'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
h2s = soup.findAll("h2", class_="esl82me0")

for story_heading in h2s:
    print(story_heading.contents[0])

keywords = ["trump", "Trump", "Corona", "COVID", "virus", "Virus", "Coronavirus", "COVID-19"]
number = 0
run = 0

for h2 in h2s:
    headline = h2.text
    words_in_headline = headline.split(" ")
    for word in words_in_headline:
        if word in keywords:
            number += 1
print("\nTrump or the Corona virus have been mentioned", number, "times.")

Output

Trump or the Corona virus have been mentioned 7 times.

Python - Search for a match for a word in a list of words in a string

In Python you can easily check whether a string contains another by using the in operator.

Just check the string for each keyword, and remember to make the case the same.

You can use one line

[print(x) for x in keywords if x in text.upper()]

or multiple

for x in keywords:
    if x in text.upper():
        print(x)

In your case the following example will output PYTHON:

text     = "The best language in the word is $python at now"
keywords = ["PYTHON","PHP","JAVA","COBOL","CPP","VB","HTML"]

[print(x) for x in keywords if x in text.upper()] #PYTHON


edit

As Malo indicated, it might be better style to store the matches in a variable and print them afterwards.

text     = "The best language in the word is $python at now"
keywords = ["PYTHON","PHP","JAVA","COBOL","CPP","VB","HTML"]

matches = [x for x in keywords if x in text.upper()]

for x in matches: print(x) # PYTHON

Check if a word is in a string in Python

What is wrong with:

if word in mystring:
    print('success')
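One caveat worth knowing: the in operator tests for a substring, so 'cat' in 'concatenate' is True. If you need a whole-word match, a regex with word boundaries is one option. The sketch below is an illustration only; contains_word is a hypothetical helper name.

```python
import re

def contains_word(word, text):
    """Return True if word occurs in text as a whole word."""
    # \b matches at a word boundary; re.escape treats the word literally
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None

print('cat' in 'concatenate')               # True  (substring match)
print(contains_word('cat', 'concatenate'))  # False (not a whole word)
print(contains_word('cat', 'the cat sat'))  # True
```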

Check if string is valid based on list of words

I didn't see it at first, but there's a very similar way to do this without going one letter at a time. At each recursion, check whether you can remove an entire word from the front of the string, and then just keep going. In an initial test or two, it appears to run a good bit faster.

I think this is the first time I've used the count argument to str.replace.

def word_break(word_list, text):
    text = text.replace(' ', '')

    if text == '':
        return True

    return any(
        text.startswith(word)
        and word_break(word_list, text.replace(word, '', 1))
        for word in word_list
    )

If you're using Python 3.9+, you can replace text.replace(word, '', 1) with text.removeprefix(word).

I believe it's the same asymptotic complexity, but with a smaller constant (unless the words in your allowed list are all single characters, anyway).
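If the same suffix of the text can be reached through several different word combinations, the recursion above re-checks it each time. A possible variation, sketched here as an assumption rather than as part of the original answer, is to recurse on a start index and cache the results with functools.lru_cache. The name word_break_memo is hypothetical.

```python
from functools import lru_cache

def word_break_memo(word_list, text):
    """Return True if text (ignoring spaces) can be built from words in word_list."""
    text = text.replace(' ', '')
    # Drop empty strings, which would otherwise recurse without making progress
    words = tuple(word for word in word_list if word)

    @lru_cache(maxsize=None)
    def can_break(start):
        # Reaching the end means everything before it was consumed by words
        if start == len(text):
            return True
        # Try each word as a prefix of the remaining text, then solve the rest
        return any(
            text.startswith(word, start) and can_break(start + len(word))
            for word in words
        )

    return can_break(0)

print(word_break_memo(['apple', 'pen'], 'apple pen apple'))                  # True
print(word_break_memo(['cats', 'dog', 'sand', 'and', 'cat'], 'catsandog'))  # False
```

Each start position is evaluated at most once, at the cost of holding one cached boolean per position. Very long inputs can still hit Python's recursion limit.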

check whether a string contains two or more words from my list (python)

1) Split the string.

2) Check occurrence of keyword in text.

3) If count greater than or equal to 2 print the text

keywords = ['the', 'apple', 'fruit']
text = ['apple is a fruit', 'orange is fruit', 'the apple', 'the orange', 'the orange fruit']

for element in text:
    # Count the distinct keywords that appear among the element's words
    if len(set(keywords) & set(element.split())) >= 2:
        print(element)

Output:

apple is a fruit
the apple
the orange fruit

