Finding Occurrences of a Word in a String in Python 3

If you're going for efficiency:

import re
count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(word), input_string))

This doesn't need to create any intermediate lists (unlike split()) and thus will work efficiently for large input_string values.

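It also has the benefit of working correctly with punctuation: it properly returns 1 as the count of "dog" in the phrase "Mike saw a dog." (whereas counting the tokens from an argumentless split() would return 0, because the token is "dog."). It uses the \b regex assertion, which matches at word boundaries (transitions between \w characters and anything else). In Python 3, \w on a str pattern matches Unicode word characters by default; it is limited to [a-zA-Z0-9_] only when the re.ASCII flag is set.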

If you need to worry about languages beyond the ASCII character set, note that str patterns in Python 3 already use Unicode-aware matching for \w and \b, so for many applications no adjustment is needed. For scripts where \w does not line up with word boundaries you may need to adjust the regex. The locale flag (re.LOCALE) applies only to bytes patterns.
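As a minimal sketch, the one-liner above can be wrapped in a small helper to see how it behaves with punctuation and with non-ASCII text. The name count_word and its flags parameter are assumptions made for illustration; they are not part of the re module.

```python
import re

def count_word(word, text, flags=0):
    """Count whole-word occurrences of `word` in `text` (hypothetical helper)."""
    pattern = r'\b%s\b' % re.escape(word)
    # finditer yields matches lazily, so no intermediate list is built
    return sum(1 for _ in re.finditer(pattern, text, flags))

print(count_word('dog', 'Mike saw a dog.'))            # 1
print('Mike saw a dog.'.split().count('dog'))          # 0, the token is 'dog.'
print(count_word('dog', 'dogs and a hotdog, no dog'))  # 1, only the last word

# Unicode-aware by default: the accented letter is a word character,
# so there is no boundary after 'caf'.
print(count_word('caf', 'caf\u00e9'))                  # 0
# With re.ASCII the accented letter is a non-word character.
print(count_word('caf', 'caf\u00e9', re.ASCII))        # 1
```

The last two lines show why the default Unicode behaviour is usually what you want: under re.ASCII, a word can be "found" inside a longer accented word.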

Finding all occurrences of a word in a string in Python 3

You can use re.findall and search for hell with zero or more word characters on either side:

>>> import re
>>> s = 'heller pond hell hellyi'
>>> re.findall(r'\w*hell\w*', s)
['heller', 'hell', 'hellyi']
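A hedged sketch of the same idea for an arbitrary fragment: the helper name words_containing is hypothetical, and re.escape is added so that a fragment containing regex metacharacters is matched literally. Note that the pattern also matches words where the fragment appears in the middle or at the end.

```python
import re

def words_containing(fragment, text):
    """Return every run of word characters that contains `fragment` (hypothetical helper)."""
    # \w* on both sides extends the match to the whole surrounding word
    pattern = r'\w*%s\w*' % re.escape(fragment)
    return re.findall(pattern, text)

print(words_containing('hell', 'heller pond hell hellyi shell'))
# ['heller', 'hell', 'hellyi', 'shell']
```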

Count the number of occurrences of a character in a string

str.count(sub[, start[, end]])

Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.

>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
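A short sketch of the optional start and end arguments and of what "non-overlapping" means in practice. The variable names are only illustrative.

```python
sentence = 'Mary had a little lamb'

# start/end behave like slice bounds: count within sentence[5:] and sentence[5:10]
from_index_5 = sentence.count('a', 5)       # 'had a little lamb'
in_slice = sentence.count('a', 5, 10)       # 'had a'
print(from_index_5, in_slice)               # 3 2

# Non-overlapping: after matching 'aa' at index 0, the search resumes at index 2
print('aaaa'.count('aa'))                   # 2
```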

How to find all occurrences of a substring?

There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:

import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]

If you want to find overlapping matches, lookahead will do that:

[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]

If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:

search = 'tt'
escaped = re.escape(search)  # so that regex metacharacters in search are matched literally
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (escaped, len(search)-1, escaped), 'ttt')]
#[1]

re.finditer returns an iterator, so you could change the [] in the above to () to get a generator expression instead of a list, which is more efficient if you only iterate through the results once.
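If you would rather avoid regular expressions, a loop over str.find gives the same start indices. This is a minimal sketch; find_all and its overlapping parameter are hypothetical names, not a built-in API.

```python
def find_all(text, sub, overlapping=False):
    """Yield the start index of each occurrence of `sub` in `text` (hypothetical helper)."""
    if not sub:
        return  # an empty substring would otherwise loop forever
    # Step past the whole match, or by one character to allow overlaps
    step = 1 if overlapping else len(sub)
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + step)

print(list(find_all('test test test test', 'test')))   # [0, 5, 10, 15]
print(list(find_all('ttt', 'tt', overlapping=True)))   # [0, 1]
print(list(find_all('ttt', 'tt')))                     # [0]
```

Because it is a generator function, results are produced lazily, as with re.finditer.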

Check if a word is in a string in Python

What is wrong with:

if word in mystring:
    print('success')
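One caveat is that in tests for a substring, not a whole word. The sketch below contrasts the two; contains_word is a hypothetical helper built on the \b word-boundary assertion.

```python
import re

def contains_word(word, text):
    """True if `word` appears in `text` as a whole word (hypothetical helper)."""
    return re.search(r'\b%s\b' % re.escape(word), text) is not None

mystring = 'does that fit?'
print('it' in mystring)                 # True, 'it' is a substring of 'fit'
print(contains_word('it', mystring))    # False, there is no standalone 'it'
print(contains_word('fit', mystring))   # True, the '?' does not get in the way
```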

How to count exact words in Python

That list syntax is off; here's a way to do it, though:

bad_chars = [';', ':', '!', '*', '?', '.']
res = {}
for word in ['it', 'fit']:
    res[word] = 0
    # drop punctuation so that 'fit?' becomes 'fit'
    string = ''.join(filter(lambda i: i not in bad_chars, 'does it fit?'))
    for i in string.split(' '):
        if word == i:
            res[word] += 1

print(res)

By using the in keyword you were checking whether one string is contained in another. In this case "it" also occurs inside "fit", so you were getting 2 occurrences of "it".

Here the words are compared directly, after removing punctuation and special characters.

Output:

{'it': 1, 'fit': 1}
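A more compact variant of the same approach, offered as a sketch: str.translate strips punctuation in one pass and collections.Counter tallies the tokens once, instead of re-scanning per word. The function name count_exact_words is hypothetical, and string.punctuation covers ASCII punctuation only (it also removes apostrophes, so "don't" becomes "dont").

```python
import string
from collections import Counter

def count_exact_words(text, targets):
    """Count whole-word occurrences of each target in `text` (hypothetical helper)."""
    # Delete every ASCII punctuation character
    cleaned = text.translate(str.maketrans('', '', string.punctuation))
    counts = Counter(cleaned.split())
    # A Counter returns 0 for missing keys, so absent words are reported as 0
    return {word: counts[word] for word in targets}

print(count_exact_words('does it fit?', ['it', 'fit']))   # {'it': 1, 'fit': 1}
```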

How to count all occurrences of a word in a string using python

According to the documentation, str.count() returns the number of non-overlapping occurrences of substring sub in the range [start, end]. To count overlapping occurrences as well, you can use a regular expression based on a positive lookahead:

>>> import re
>>> s = 'abdebobdfhbobob'
>>> len(re.findall(r'(?=bob)', s))
3

If you don't want to use regex, you can use a generator expression within the sum() function that iterates over all substrings of length 3 and counts those that are equal to 'bob':

>>> sum(s[i:i+3] == 'bob' for i in range(len(s)-2))
3
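The hard-coded length 3 can be generalised to any substring. In this sketch, count_overlapping is a hypothetical name; str.startswith with a start index avoids creating a slice for every position.

```python
def count_overlapping(text, sub):
    """Count occurrences of `sub` in `text`, including overlapping ones (hypothetical helper)."""
    # True counts as 1 when summed
    return sum(text.startswith(sub, i) for i in range(len(text) - len(sub) + 1))

s = 'abdebobdfhbobob'
print(count_overlapping(s, 'bob'))   # 3
print(s.count('bob'))                # 2, the last two occurrences overlap
```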

How to find the count of a word in a string

If you want to find the count of an individual word, just use count (note that it counts substring occurrences, so "Hello" is also counted inside "Hellos"):

input_string.count("Hello")

Use collections.Counter and split() to tally up all the words:

from collections import Counter

words = input_string.split()
wordCount = Counter(words)
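Both approaches have edge cases worth seeing side by side: count matches substrings, and split() leaves punctuation attached to the tokens. The sketch below shows these, plus one possible alternative that tokenises with \w+ and lowercases first. The sample text and the name word_counts are assumptions for illustration.

```python
import re
from collections import Counter

def word_counts(text):
    """Tally case-insensitive words, ignoring punctuation (hypothetical helper)."""
    return Counter(re.findall(r'\w+', text.lower()))

input_string = 'Hello, hello world. Hellos!'

print(input_string.count('Hello'))               # 2, includes the match inside 'Hellos'
print(Counter(input_string.split())['Hello'])    # 0, the token is 'Hello,'
print(word_counts(input_string)['hello'])        # 2
print(word_counts(input_string).most_common(1))  # [('hello', 2)]
```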

