Finding occurrences of a word in a string in Python 3
If you're going for efficiency:
import re
count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(word), input_string))
This doesn't need to create any intermediate lists (unlike split()) and thus will work efficiently for large input_string values.
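It also has the benefit of working correctly with punctuation: it will properly return 1 as the count for a word that sits next to punctuation, e.g. "dog" in the phrase "Mike saw a dog." (whereas an argumentless split() would not, because it leaves the period attached to the word). It uses the \b regex anchor, which matches at word boundaries (transitions between \w, i.e. [a-zA-Z0-9_] for ASCII text, and anything else, including the start and end of the string).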
If you need to worry about languages beyond the ASCII character set, note that in Python 3 str patterns are Unicode-aware by default, so \w and \b already treat non-ASCII letters as word characters; the re.UNICODE flag is redundant for str patterns, and re.LOCALE applies only to bytes patterns. You may still need to adjust the regex for text whose notion of a word differs from \w, but for many applications this would be an overcomplication.
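As a minimal sketch, the one-liner above can be wrapped in a small helper. The name count_word and its flags parameter are illustrative assumptions, not part of the original answer.

```python
import re

def count_word(word, text, flags=0):
    """Count whole-word occurrences of `word` in `text` (hypothetical helper)."""
    # re.escape keeps characters such as '.' or '+' in `word` literal.
    pattern = r'\b%s\b' % re.escape(word)
    # finditer yields matches lazily, so no intermediate list is built.
    return sum(1 for _ in re.finditer(pattern, text, flags))

phrase = "Mike saw a dog."
print(count_word("dog", phrase))                      # 1
print(phrase.split().count("dog"))                    # 0, because split() yields "dog."
print(count_word("dog", "Dog, dog!", re.IGNORECASE))  # 2
```

Note that \b only helps when the word itself starts and ends with a word character; a search term such as "C++" would need a different boundary rule.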
Finding all occurrences of a word in a string in Python 3
You can use re.findall and search for hell with zero or more word characters on either side:
>>> import re
>>> s = 'heller pond hell hellyi'
>>> re.findall(r'\w*hell\w*', s)
['heller', 'hell', 'hellyi']
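The same idea can be generalized to any fragment. This is only a sketch: words_containing is a hypothetical name, and re.escape is added on the assumption that the fragment should be matched literally.

```python
import re

def words_containing(fragment, text):
    """Return every run of word characters in `text` that contains `fragment`."""
    # Raw string so that \w reaches the regex engine unchanged.
    pattern = r'\w*%s\w*' % re.escape(fragment)
    return re.findall(pattern, text)

print(words_containing('hell', 'heller pond hell hellyi'))
# ['heller', 'hell', 'hellyi']
```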
Count the number of occurrences of a character in a string
str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.
>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
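To make the optional start and end arguments and the non-overlapping rule concrete, here is a short illustration that reuses the sentence above. The string 'aaaa' is an extra example chosen here, not taken from the original answer.

```python
sentence = 'Mary had a little lamb'

print(sentence.count('a'))         # 4, the whole string
print(sentence.count('a', 5))      # 3, counting only in sentence[5:]
print(sentence.count('a', 5, 10))  # 2, counting only in sentence[5:10] == 'had a'

# Matches never overlap: after 'aa' is found at index 0, the search resumes at index 2.
print('aaaa'.count('aa'))          # 2
```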
How to find all occurrences of a substring?
There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:
import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]
If you want to find overlapping matches, lookahead will do that:
[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]
If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:
search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]
re.finditer returns an iterator, so you could change the [] in the above to () to get a generator expression instead of a list, which will be more efficient if you're only iterating through the results once.
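If you would rather avoid regular expressions, one possible alternative is a loop around str.find. This is a sketch only; find_all and its overlapping parameter are hypothetical names, not a built-in API.

```python
def find_all(text, sub, overlapping=False):
    """Yield the start index of each occurrence of `sub` in `text`."""
    if not sub:
        # An empty substring would never advance the search position.
        raise ValueError("sub must be a non-empty string")
    # Step past the whole match for non-overlapping results, or by one character otherwise.
    step = 1 if overlapping else len(sub)
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + step)

print(list(find_all('test test test test', 'test')))  # [0, 5, 10, 15]
print(list(find_all('ttt', 'tt', overlapping=True)))  # [0, 1]
```

Because it is a generator, results are produced lazily, just as with re.finditer.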
Check if a word is in a string in Python
What is wrong with:
if word in mystring:
    print('success')
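One caveat worth illustrating: the in operator tests for a substring, not a whole word. The sketch below contrasts it with a word-boundary search; contains_word is a hypothetical helper, and the sample sentences are made up for illustration.

```python
import re

def contains_word(word, text):
    """Return True if `word` appears in `text` as a whole word."""
    return re.search(r'\b%s\b' % re.escape(word), text) is not None

print('it' in 'does that fit?')             # True, because 'it' is part of 'fit'
print(contains_word('it', 'does that fit?'))  # False
print(contains_word('it', 'does it fit?'))    # True
```

If substring matches are acceptable for your data, the plain in check is simpler and faster.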
How to count exact words in Python
That list syntax is off; here's a way to do it, though:
bad_chars = [';', ':', '!', '*', '?', '.']
res = {}
for word in ["it", "fit"]:
    res[word] = 0
    # Strip punctuation so that "fit?" compares equal to "fit"
    string = ''.join(filter(lambda i: i not in bad_chars, "does it fit?"))
    # Compare whole words, not substrings
    for i in string.split(" "):
        if word == i:
            res[word] += 1
print(res)
By using the in keyword, you were checking whether one string occurs inside another. In this case, it also occurs inside fit, so you were getting 2 occurrences of it. The code here compares whole words directly, after removing punctuation and special characters.
Output:
{'it': 1, 'fit': 1}
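A more compact variant of the same approach might strip punctuation with str.translate and tally the words once with collections.Counter. This is a sketch under the assumption that every character in string.punctuation should be dropped; count_exact_words is a hypothetical name.

```python
import string
from collections import Counter

def count_exact_words(text, targets):
    """Count whole-word occurrences of each target after removing punctuation."""
    # Translation table that deletes every ASCII punctuation character.
    table = str.maketrans('', '', string.punctuation)
    counts = Counter(text.translate(table).split())
    # Counter returns 0 for words that never occur.
    return {word: counts[word] for word in targets}

print(count_exact_words("does it fit?", ["it", "fit"]))
# {'it': 1, 'fit': 1}
```

Be aware that this also removes apostrophes and hyphens, so "don't" becomes "dont".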
How to count all occurrences of a word in a string using Python
According to the documentation, str.count() returns the number of non-overlapping occurrences of substring sub in the range [start, end]. To count overlapping occurrences as well, you can use a regular expression based on a positive lookahead:
>>> import re
>>> s = 'abdebobdfhbobob'
>>> len(re.findall(r'(?=bob)', s))
3
If you don't want to use regex, you can use a generator expression within the sum() function that iterates over all substrings of length 3 and counts those that are equal to 'bob':
>>> sum(s[i:i+3] == 'bob' for i in range(len(s)-2))
3
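The lookahead trick can be wrapped for an arbitrary substring. This is a rough sketch: count_overlapping is a hypothetical name, re.escape is added on the assumption that the substring should be matched literally, and a non-empty substring is assumed.

```python
import re

def count_overlapping(text, sub):
    """Count occurrences of non-empty `sub` in `text`, including overlapping ones."""
    # The lookahead matches an empty string at each position where `sub` starts,
    # so the scan advances one character at a time.
    return sum(1 for _ in re.finditer('(?=%s)' % re.escape(sub), text))

s = 'abdebobdfhbobob'
print(count_overlapping(s, 'bob'))  # 3
print(s.count('bob'))               # 2, since str.count skips the overlapping match
```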
How to find the count of a word in a string
If you want to find the count of an individual word, just use count (note that it counts substring occurrences, so "Hello" inside a longer word is counted too):
input_string.count("Hello")
Use collections.Counter and split() to tally up all the words:
from collections import Counter
words = input_string.split()
wordCount = Counter(words)
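Since split() leaves punctuation attached and is case-sensitive, a variation that normalizes the text first may be closer to what you want. The sketch below assumes that a word is a run of \w characters and that case should be ignored; word_counts and the sample text are illustrative only.

```python
import re
from collections import Counter

def word_counts(text):
    """Tally lowercase words, treating each run of word characters as one word."""
    return Counter(re.findall(r'\w+', text.lower()))

text = "Hello, hello world! Hello?"
counts = word_counts(text)
print(counts['hello'])        # 3
print(counts.most_common(1))  # [('hello', 3)]
print(text.count("Hello"))    # 2, case-sensitive substring count
```

With this definition, a contraction such as "don't" is split into "don" and "t", so adjust the pattern if that matters for your data.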