Counting Repeated Characters in a String in Python

Counting repeated characters in a string in Python

My first idea was to do this:

chars = "abcdefghijklmnopqrstuvwxyz"
check_string = "i am checking this string to see how many times each character appears"

for char in chars:
    count = check_string.count(char)
    if count > 1:
        print(char, count)

This is not a good idea, however! It scans the string 26 times, once per letter, so it can do up to 26 times more work than an approach that reads the string once. You really should do this:

count = {}
for s in check_string:
    if s in count:
        count[s] += 1
    else:
        count[s] = 1

for key in count:
    if count[key] > 1:
        print(key, count[key])

This ensures that you only go through the string once, instead of 26 times.

Also, Alex's answer is a great one - I was not familiar with the collections module. I'll be using that in the future. His answer is more concise than mine is and technically superior. I recommend using his code over mine.
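For reference, here is a minimal sketch of what a collections-based version could look like. It is an assumption about the general approach (using collections.Counter), not a reproduction of the answer mentioned above, and the name repeated is purely illustrative. Note that it counts every character, including spaces.

```python
from collections import Counter

check_string = "i am checking this string to see how many times each character appears"

# Counter tallies every character in a single pass over the string.
counts = Counter(check_string)

# Keep only the characters that occur more than once
# ("repeated" is an illustrative name, not from the original answers).
repeated = {char: n for char, n in counts.items() if n > 1}

for char, n in repeated.items():
    print(char, n)
```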

Counting repeated characters in a string in a row Python

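One general solution might be to use re.findall with the pattern ((\S)\2{3,}), which matches any non-whitespace character followed by at least three more copies of itself, i.e. a run of four or more:

import re

myString = "I contain foooour O's in a row without any space"
matches = re.findall(r'((\S)\2{3,})', myString)
print(matches[0][0])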

This prints:

oooo
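
If you need every run rather than just the first, or a different minimum length, the same pattern can be wrapped in a small helper. This is only a sketch: find_runs and its min_length parameter are hypothetical names introduced here, and it assumes min_length is at least 1.

```python
import re

def find_runs(text, min_length=4):
    """Return every run of at least min_length identical non-space characters."""
    # (\S) captures one non-whitespace character; \2{n,} then requires
    # at least n further copies of that same character.
    pattern = re.compile(r'((\S)\2{%d,})' % (min_length - 1))
    return [match.group(1) for match in pattern.finditer(text)]

print(find_runs("I contain foooour O's in a row without any space"))
```

Passing min_length=2 would report any character that appears twice or more in a row.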

how to count repeated characters in a string in python

If you want to fix your function, here is a fixed variant:

def encode(message):
    result = []
    i = count = 0
    while i < len(message) - 1:
        count = 1
        while i + count < len(message) and message[i + count - 1] == message[i + count]:
            count += 1
        i += count
        result.append("{}{}".format(count, message[i - 1]))
    # The loop stops one short of the end, so a lone final character is added here.
    if i == len(message) - 1:
        result.append("1" + message[-1])
    return result

What's changed:

  1. The for loop is replaced with a while loop. Why? Because you need to jump over indexes inside the loop. range(0, len(message) - 1, 1) yields the indexes 0, 1, 2, ... and it doesn't matter what you do with the char variable inside the loop, it won't affect the next iteration. To be able to skip some indexes, I used a while loop with the index and count variables initialised up front ( i = count = 0 ).
  2. Changed the conditions of the inner while loop. Now there are two conditions:

    • message[i + count - 1] == message[i + count] - checks whether the next symbol is the same as the current one;
    • i + count < len(message) - prevents the inner loop from accessing an index out of range.
  3. The "main" index ( i ) is updated outside of the inner loop.
  4. if i == len(message) - 1: added a post-condition after the loop so the last character is not missed when it stands alone. Testing i rather than count also covers messages such as "aab", where the run before the last character is longer than one.
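
The run-jumping logic described in the list above can also be left to the standard library. The following is a sketch using itertools.groupby rather than the manual index handling; encode_runs is a hypothetical name, and the output format (count followed by the character) is assumed to match the function above.

```python
from itertools import groupby

def encode_runs(message):
    """Run-length encode message into a list such as ['2a', '1b']."""
    # groupby yields one (character, iterator over the run) pair
    # for each stretch of equal neighbouring characters.
    return ["{}{}".format(len(list(run)), char) for char, run in groupby(message)]

print(encode_runs("aabcccd"))  # ['2a', '1b', '3c', '1d']
```

Because groupby finds the run boundaries itself, no special case is needed for the last character or for an empty string.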

Find no of repeated characters in a string using one for loop with no variables

This one follows both rules (a single for loop and no extra variables):

x='ABCDEAB'
for i in x:
    try:
        if(i in x[x.index(i)+1:]):
            print(i,end=" ")
            x=x.replace(i,"",1)
    except ValueError:
        pass

how to count repeated characters in text file using python

Move your return to be outside of the for loop. It's currently only going through 1 iteration.
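
Since the original function is not shown here, the following is only a sketch of the pattern being described. The function name count_repeated_chars and the sample data are hypothetical; io.StringIO stands in for an open text file so the example runs on its own.

```python
import io

def count_repeated_chars(lines):
    """Count characters across all lines and return those seen more than once."""
    counts = {}
    for line in lines:
        for char in line.rstrip("\n"):
            counts[char] = counts.get(char, 0) + 1
        # A `return` placed here would exit after the first line only.
    # Returning after the loop lets every line contribute to the totals.
    return {char: n for char, n in counts.items() if n > 1}

# io.StringIO behaves like a file opened for reading (hypothetical sample data).
sample_file = io.StringIO("abc\nabd\n")
print(count_repeated_chars(sample_file))  # {'a': 2, 'b': 2}
```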

Count repeated characters in a string

In R, using rle and the raw conversion functions (charToRaw and rawToChar):

d <- data.frame(col1 = c("apples333", "summer13", "talk77", "Aa6668"))

foo <- function(x, p){
  r <- rle(charToRaw(tolower(x)))
  res <- max(r$lengths[ grepl(p, rawToChar(r$values, multiple = TRUE)) ])
  if(res == 1) res <- 0
  res
}

d$repLetter <- sapply(d$col1, foo, p = "[a-z]")
d$repNumber <- sapply(d$col1, foo, p = "[0-9]")

d
#        col1 repLetter repNumber
# 1 apples333         2         3
# 2  summer13         2         0
# 3    talk77         0         2
# 4    Aa6668         2         3

Count the number of occurrences of a character in a string

str.count(sub[, start[, end]])

Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.

>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
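
The optional start and end arguments and the "non-overlapping" wording are easier to see with a couple of extra calls. This is a small illustrative sketch reusing the sentence above:

```python
sentence = 'Mary had a little lamb'

# Count only from index 5 onwards: this skips "Mary " and its 'a'.
print(sentence.count('a', 5))      # 3

# start and end together behave like the slice sentence[5:10], i.e. "had a".
print(sentence.count('a', 5, 10))  # 2

# Matches never overlap: 'aaaa' contains two separate 'aa', not three.
print('aaaa'.count('aa'))          # 2
```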

Counting occurrence of all characters in string but only once if character is repeated

Use collections.Counter:

>>> from collections import Counter
>>> Counter('aaabbbccc')
Counter({'a': 3, 'b': 3, 'c': 3})

You can easily get the counts as a list in alphabetical order by indexing the counter with each letter of string.ascii_lowercase:

>>> import string
>>> c = Counter('aaabbbccc')
>>> [c[l] for l in string.ascii_lowercase]
[3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
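
The list comprehension above works because a Counter returns 0 for a key it has never seen instead of raising KeyError. A short sketch of that behaviour follows; the variable name present is illustrative only.

```python
from collections import Counter
import string

c = Counter('aaabbbccc')

# Unlike a plain dict, a Counter returns 0 for a missing key
# and does not add that key as a side effect.
print(c['z'])      # 0
print('z' in c)    # False

# Pair each letter with its count, keeping only the letters that occur.
present = [(letter, c[letter]) for letter in string.ascii_lowercase if c[letter]]
print(present)     # [('a', 3), ('b', 3), ('c', 3)]
```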

