Check list of words in another string
if any(word in 'some one long two phrase three' for word in list_):
Python: how to determine if a list of words exist in a string
This function was found by Peter Gibson (below) to be the most performant of the answers here. It is good for datasets one may hold in memory (because it creates a list of words from the string to be searched and then a set of those words):
def words_in_string(word_list, a_string):
    return set(word_list).intersection(a_string.split())
Usage:
my_word_list = ['one', 'two', 'three']
a_string = 'one two three'
if words_in_string(my_word_list, a_string):
    print('One or more words found!')
Which prints One or more words found! to stdout.
It does return the actual words found:
for word in words_in_string(my_word_list, a_string):
    print(word)
Prints out (the order may vary, since a set is unordered):
three
two
one
For data so large you can't hold it in memory, the solution given in this answer would be very performant.
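Note that the set-based function above and the any(word in ...) check shown at the top of the page do not behave identically: the former compares whole whitespace-separated tokens, while the latter looks for substrings. A minimal sketch of the difference, using a made-up sample string (the helper name has_substring is hypothetical):

```python
def words_in_string(word_list, a_string):
    # Whole-token match: the string is split on whitespace first.
    return set(word_list).intersection(a_string.split())


def has_substring(word_list, a_string):
    # Substring match: 'one' is found inside 'someone'.
    return any(word in a_string for word in word_list)


sample = 'someone took twofold care'  # made-up example text
words = ['one', 'two', 'three']

print(words_in_string(words, sample))  # set(): no token equals a listed word
print(has_substring(words, sample))    # True: 'one' occurs inside 'someone'
```

Which behaviour is wanted depends on the data; punctuation attached to a token (for example 'three.') would also prevent a whole-token match.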
Check if multiple strings exist in another string
You can use any():
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]
if any(x in a_string for x in matches):
    print("At least one match found")
Similarly, to check if all the strings from the list are found, use all() instead of any().
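As a small illustration of the difference between the two built-ins, here is a sketch that reuses the sample values from above and also collects the strings that actually matched (the variable names found_any, found_all and found are made up for this example):

```python
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

# any(): True as soon as one candidate is a substring of a_string.
found_any = any(x in a_string for x in matches)

# all(): True only if every candidate is a substring of a_string.
found_all = all(x in a_string for x in matches)

# Collect the candidates that actually occur, in list order.
found = [x for x in matches if x in a_string]

print(found_any, found_all, found)  # True False ['more']
```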
How to check if a word or group of words exist in given list of strings and how to extract that word?
You could use re.findall so there's no nested loop.
import re

output = {}
# One compiled alternation of all search phrases; findall returns every hit
find_words = re.compile('|'.join(list_of_words)).findall
for i, (s,) in enumerate(map(dict.values, data['dict_sentences']), 1):
    words = find_words(s.lower())
    if words:
        output[f"sent{i}"] = words
{'sent1': ['emmanuel college', 'churchill college'],
'sent2': ['emmanuel college'],
'sent3': ['holy trinity church']}
This can also be done in a dict comprehension using the walrus operator (Python 3.8+), although that may be a little overboard:
find_sent = re.compile('|'.join(list_of_words)).findall
iter_sent = enumerate(map(dict.values, data['dict_sentences']), 1)
output = {f"sent{i}": words for i, (s,) in iter_sent if (words := find_sent(s.lower()))}
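A pattern built with '|'.join(list_of_words) treats each entry as a regular expression and matches anywhere in the text, including inside longer words. The following is a sketch of a more defensive variant, assuming data['dict_sentences'] is a list of single-value dicts as the loop above implies; the sample data and the 'text' key are invented for illustration:

```python
import re

# Invented sample data in the shape the loop above appears to expect.
list_of_words = ['emmanuel college', 'holy trinity church', 'st. john']
data = {'dict_sentences': [
    {'text': 'Emmanuel College is near Holy Trinity Church.'},
    {'text': 'Nothing relevant here.'},
    {'text': 'St. John street, not stxjohn.'},
]}

# re.escape keeps characters such as '.' literal; the lookarounds
# reject matches that start or end in the middle of a longer word.
pattern = re.compile(
    r'(?<!\w)(?:' + '|'.join(map(re.escape, list_of_words)) + r')(?!\w)'
)

output = {}
for i, (s,) in enumerate(map(dict.values, data['dict_sentences']), 1):
    words = pattern.findall(s.lower())
    if words:
        output[f"sent{i}"] = words

print(output)
# {'sent1': ['emmanuel college', 'holy trinity church'], 'sent3': ['st. john']}
```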
How to check if a string contains an element from a list in Python
Use a generator together with any(), which short-circuits on the first True:
if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)
EDIT: I see this answer has been accepted by the OP. Though my solution may be a "good enough" solution to their particular problem, and is a good general way to check whether any strings in a list are found in another string, keep in mind that this is all it does. It does not care WHERE the string is found, e.g. at the end of the string. If this is important, as is often the case with URLs, you should look at the answer by @Wladimir Palant, or you risk getting false positives.
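If the position of the match matters, as the edit above warns, str.endswith accepts a tuple of suffixes. The following is only a sketch of that idea, not a reproduction of the answer referred to; the URLs, the extension list and the helper name has_listed_extension are made up, and urllib.parse is used so that a query string does not hide the extension:

```python
from urllib.parse import urlparse

extensionsToCheck = ['.pdf', '.doc', '.xls']  # made-up example values


def has_listed_extension(url_string, extensions):
    # Hypothetical helper: look only at the path component of the URL,
    # then test whether it ends with any of the given suffixes.
    path = urlparse(url_string).path.lower()
    return path.endswith(tuple(extensions))


print(has_listed_extension('http://example.com/report.pdf?page=2', extensionsToCheck))  # True
print(has_listed_extension('http://example.com/.pdf/index.html', extensionsToCheck))    # False
```

A plain substring test would report a match for the second URL as well, which is the kind of false positive the edit describes.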
Check whether words from a list are inside a string of another list Python
In general, I recommend putting more thought into naming variables. I like how you tried to print the story headings. The line if any(lijst in s for s in a) does not do what you think it should: you need to instead iterate over each word in a single h2. The any function is just shorthand for the following:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
In other words, you're trying to see if an entire list is in an h2 element, which will never be true. Here is an example fix.
import requests
from bs4 import BeautifulSoup

url = 'https://www.nytimes.com'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
h2s = soup.find_all("h2", class_="esl82me0")

# Print every story heading
for story_heading in h2s:
    print(story_heading.contents[0])

keywords = ["trump", "Trump", "Corona", "COVID", "virus", "Virus", "Coronavirus", "COVID-19"]
number = 0

# Count every headline word that exactly matches one of the keywords
for h2 in h2s:
    headline = h2.text
    words_in_headline = headline.split(" ")
    for word in words_in_headline:
        if word in keywords:
            number += 1

print("\nTrump or the Corona virus have been mentioned", number, "times.")
Output
Trump or the Corona virus have been mentioned 7 times.
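The loop above only counts exact, case-sensitive tokens, so a headline word with punctuation attached (for example "Trump's" or "Virus,") is missed. Here is a rough offline sketch of a more tolerant count, using invented headlines instead of a live request and a deliberately simple tokenizer:

```python
import re

# Invented headlines standing in for the scraped h2 texts.
headlines = [
    "Trump's Response to the Virus, Explained",
    "Markets Rally as COVID-19 Cases Fall",
    "A Quiet Day in Local Politics",
]

# Lowercased keywords, so one entry covers every capitalisation.
keywords = {"trump", "corona", "covid", "virus", "coronavirus", "covid-19"}

number = 0
for headline in headlines:
    # Tokens are runs of letters/digits, optionally joined by hyphens.
    for token in re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", headline.lower()):
        if token in keywords:
            number += 1

print(number)  # 3
```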
Python - Search for a match for a word in a list of words in a string
In Python you can easily check if a string contains another by using the in operator.
Just check the string for each keyword, and remember to make the case the same.
You can use one line
[print(x) for x in keywords if x in text.upper()]
or multiple
for x in keywords:
    if x in text.upper():
        print(x)
In your case the following example will output PYTHON:
text = "The best language in the word is $python at now"
keywords = ["PYTHON","PHP","JAVA","COBOL","CPP","VB","HTML"]
[print(x) for x in keywords if x in text.upper()] #PYTHON
Have a nice day.
Edit: As Malo indicated, it might be better style to assign the matches to a variable and then print them afterwards.
text = "The best language in the word is $python at now"
keywords = ["PYTHON","PHP","JAVA","COBOL","CPP","VB","HTML"]
matches = [x for x in keywords if x in text.upper()]
for x in matches: print(x) # PYTHON
Check if a word is in a string in Python
What is wrong with:
if word in mystring:
    print('success')
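One caveat with the plain in test is that it matches substrings, so 'cat' is found in 'concatenate'. If whole words are wanted, a regular expression with word boundaries is one option. This is a sketch only; the function name contains_word and the sample sentence are made up:

```python
import re


def contains_word(word, text):
    # \b marks a boundary between a word character and a non-word one;
    # re.escape keeps any special characters in `word` literal.
    return re.search(rf"\b{re.escape(word)}\b", text) is not None


mystring = "please concatenate the files"
print("cat" in mystring)                 # True  (substring match)
print(contains_word("cat", mystring))    # False (no standalone word)
print(contains_word("files", mystring))  # True
```

The \b anchors assume the search word itself starts and ends with a letter, digit or underscore.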
Check if string is valid based on list of words
I didn't see it at first, but there's a very similar way to do this without doing it one letter at a time. At each recursion, check if you can remove an entire word at a time off the front of the string, and then just keep going. In an initial test or two, it appears to run a good bit faster.
I think this is the first time I've used the count argument to str.replace.
def word_break(word_list, text):
    text = text.replace(' ', '')
    if text == '':
        return True
    return any(
        text.startswith(word)
        and word_break(word_list, text.replace(word, '', 1))
        for word in word_list
    )
If you're using Python 3.9+, you can replace text.replace(word, '', 1) with text.removeprefix(word).
I believe it's the same asymptotic complexity, but with a smaller constant (unless the words in your allowed list are all single characters, anyway).
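The recursion above can revisit the same remaining suffix many times when several words share prefixes. A possible refinement, sketched here under the assumption that word_list contains no empty strings, is to recurse on an index into the text and cache the results with functools.lru_cache (the inner name can_break is made up):

```python
from functools import lru_cache


def word_break(word_list, text):
    text = text.replace(' ', '')
    words = tuple(word_list)  # assumed to contain no empty strings

    @lru_cache(maxsize=None)
    def can_break(start):
        # True if text[start:] can be built entirely from the words.
        if start == len(text):
            return True
        return any(
            text.startswith(word, start) and can_break(start + len(word))
            for word in words
        )

    return can_break(0)


print(word_break(['apple', 'pen'], 'applepenapple'))                   # True
print(word_break(['cats', 'dog', 'sand', 'and', 'cat'], 'catsandog'))  # False
```

Each start index is evaluated at most once, at the cost of one cache entry per index; very long inputs may still hit Python's recursion limit.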
check whether a string contains two or more words from my list (python)
1) Split the string.
2) Check occurrence of keyword in text.
3) If count greater than or equal to 2 print the text
keywords = ['the', 'apple' , 'fruit']
text = ['apple is a fruit', 'orange is fruit', 'the apple', 'the orange', 'the orange fruit']
for element in text:
    if len(set(keywords) & set(element.split())) >= 2:
        print(element)
Output:
apple is a fruit
the apple
the orange fruit
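The same steps can be wrapped in a small function with the threshold as a parameter. This is a sketch only; the function name texts_with_keywords and the min_matches parameter are invented, and matching is on whitespace-separated tokens as in the code above:

```python
def texts_with_keywords(texts, keywords, min_matches=2):
    # Return the texts containing at least min_matches distinct keywords.
    keyword_set = set(keywords)
    return [t for t in texts if len(keyword_set & set(t.split())) >= min_matches]


keywords = ['the', 'apple', 'fruit']
text = ['apple is a fruit', 'orange is fruit', 'the apple',
        'the orange', 'the orange fruit']

print(texts_with_keywords(text, keywords))
# ['apple is a fruit', 'the apple', 'the orange fruit']
```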