Count Consecutive Characters

Count consecutive characters

A solution that uses only basic statements (a loop, comparisons, and string concatenation):

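word = "100011010"  # or: word = "1"
count = 1
length = ""
if len(word) > 1:
    for i in range(1, len(word)):
        if word[i-1] == word[i]:
            count += 1
        else:
            # the run ending at i-1 is complete: record it, start a new one
            length += word[i-1] + " repeats " + str(count) + ", "
            count = 1
    # record the final run
    length += "and " + word[i] + " repeats " + str(count)
elif len(word) == 1:
    # a single character is one run of length 1, so no "and" is needed
    length += word[0] + " repeats " + str(count)
print(length)

Output:

1 repeats 1, 0 repeats 3, 1 repeats 2, 0 repeats 1, 1 repeats 1, and 0 repeats 1

With word = "1", the output is:

1 repeats 1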

Find count of each consecutive characters

Regular expressions to the rescue?

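using System;
using System.Linq;
using System.Text.RegularExpressions;

var myString = "aaaabbccaa";

// (\w) captures one word character; \1* matches any further repeats of it
var pattern = @"(\w)\1*";
var regExp = new Regex(pattern);
var matches = regExp.Matches(myString);

// Cast<Match>() keeps this working on .NET Framework, where MatchCollection is non-generic
var tab = matches.Cast<Match>().Select(x => String.Format("{0}{1}", x.Value.First(), x.Value.Length));
var result = String.Join("", tab);  // "a4b2c2a2"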
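
For comparison, the same backreference pattern can be sketched in Python. The function name `encode_runs` is made up for this example; the pattern is the one used above, so non-word characters such as spaces are skipped rather than counted.

```python
import re

def encode_runs(text):
    """Return each run as <char><length>, e.g. 'aab' -> 'a2b1'."""
    # (\w) captures one word character; \1* matches further copies of it
    return "".join(
        m.group(1) + str(len(m.group(0)))
        for m in re.finditer(r"(\w)\1*", text)
    )

print(encode_runs("aaaabbccaa"))  # a4b2c2a2
```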

How to count consecutive characters?

A solution for those interested in the mechanics of a while loop:

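l = 'aaaabbBBccaazzZZZzzzertTTyyzaaaAA'
output = ''

index = 0
while index < len(l):
    count = 1
    output += l[index]          # first character of the current run
    # advance while the next character continues the run
    while index < len(l) - 1 and l[index] == l[index + 1]:
        count += 1
        index += 1
    output += str(count)
    index += 1                  # move on to the first character of the next run

print(output)  # a4b2B2c2a2z2Z3z3e1r1t1T2y2z1a3A2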

How to count instances of consecutive letters in a string in Python 3?

This is possible with itertools.groupby:

from itertools import groupby

x = 'EOOOEOEE'

res = sum(len(list(j)) > 1 for i, j in groupby(x) if i == 'O')  # 1
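
To see what that expression counts, it can help to list the runs first. A small sketch (the helper name `runs` is hypothetical) that turns the `groupby` output into (character, length) pairs:

```python
from itertools import groupby

def runs(text):
    """Return a list of (character, run_length) pairs in order."""
    return [(char, sum(1 for _ in group)) for char, group in groupby(text)]

pairs = runs('EOOOEOEE')
print(pairs)  # [('E', 1), ('O', 3), ('E', 1), ('O', 1), ('E', 2)]

# Only one run of 'O' is longer than a single character
print(sum(1 for char, n in pairs if char == 'O' and n > 1))  # 1
```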

How can I quickly count the maximum number of consecutive single characters in a string?

Remarkable speed improvements can be made with a dynamic regex. We store the longest match found so far in a variable, then search for that string followed by one or more additional characters. The idea is that we only need to look for runs longer than the one we already have.

I used a solution that looks like this:

sub hack {
    my $match = "";                        # original search string
    while ($string =~ /(${match}1+)/g) {   # search for $match plus 1 or more 1s
        $match = $1;                       # when found, change to new match
    }
    length $match;                         # return max length
}

I compared it to the original method described by the OP, with the following result:

use strict;
use warnings;
use Benchmark ':all';

my $string = '0100100101111011010010101101101110101011111111101010100100100001011101010100' x 10_000;

cmpthese(-1, {
    org  => sub { my $max = 0; while ($string =~ /(1+)/g) { my $len = length($1); if ($max < $len) { $max = $len } } },
    hack => sub { my $match = ""; while ($string =~ /(${match}1+)/g) { $match = $1; } length $match }
});

Output:

       Rate    org   hack
org  7.31/s     --   -99%
hack 1372/s 18669%     --

That seems astonishingly high, nearly 19,000% faster. It makes me think I've made a mistake, but I can't think what that would be. Maybe I am missing something in the regex engine internals, but if the numbers hold, this would be quite the improvement on the original solution.
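
The same idea can be sketched in Python for readers who want to experiment with it outside Perl. This is not the benchmarked code: `longest_run` is a name invented here, and it builds a quantified pattern (`1{n,}`) instead of interpolating the previous match, so its timings may well differ from the Perl results above.

```python
import re

def longest_run(text, char="1"):
    """Length of the longest run of the single character `char`."""
    best = 0
    pos = 0
    while True:
        # Only runs strictly longer than the best so far can match
        pattern = re.compile(re.escape(char) + "{%d,}" % (best + 1))
        m = pattern.search(text, pos)
        if m is None:
            return best
        best = len(m.group(0))
        pos = m.end()  # continue after the run just found

print(longest_run("0110111101"))  # 4
```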

Code that takes a string and recognizes the number of consecutive letters

Here is one way. You only need a single loop to do the encoding. In the code below that is the inner loop; the outer loop simply supplies test cases.

- assign the first character to the result
- set count to 1 for that character
- iterate until adjacent characters are different
- append the count if it is greater than 1, then append the new character
- reset count for the next run (the code sets it to 0 and the increment that follows brings it back to 1)

String[] data = { "uuuuuuhhhaaajqqq", 
        "hhhttrew","abbcccddddeeeeeffffffggggggg" };

for (String s : data) {
    String result = "" + s.charAt(0);
    int count = 1;
    for (int i = 1; i < s.length(); i++) {
        if (s.charAt(i - 1) != s.charAt(i)) {
            result += count <= 1 ? "" : count;
            result += s.charAt(i);
            count = 0;
        }
        count++;
        if (i == s.length() - 1) {
            result += count <= 1 ? "" : count;
        }
    }
    System.out.printf("%-15s <-- %s%n", result, s);
}

prints

u6h3a3jq3       <-- uuuuuuhhhaaajqqq
h3t2rew         <-- hhhttrew
ab2c3d4e5f6g7   <-- abbcccddddeeeeeffffffggggggg
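
The steps listed above translate almost line for line into other languages. A Python sketch, with a hypothetical function name and an added guard for the empty string that the Java loop does not have:

```python
def compress(text):
    """Encode runs as <char><count>, leaving out a count of 1."""
    if not text:
        return ""
    parts = [text[0]]
    count = 1
    for prev, cur in zip(text, text[1:]):
        if prev != cur:
            # run ended: emit its count (if > 1), then start the next run
            if count > 1:
                parts.append(str(count))
            parts.append(cur)
            count = 0
        count += 1
    # emit the count of the final run
    if count > 1:
        parts.append(str(count))
    return "".join(parts)

for s in ("uuuuuuhhhaaajqqq", "hhhttrew", "abbcccddddeeeeeffffffggggggg"):
    print(compress(s), "<--", s)
```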

In a comment (now deleted) you had enquired how to reverse the process. This is one way to do it.

- allocate a StringBuilder to hold the result
- initialize count and currentChar
- as the string is processed,
  - save a character to currentChar
  - then, while the next character is a digit, build the count
- if the count is still 0, the next character was not a digit, so set count to 1
- append currentChar to the buffer count times, then reset count to 0

String[] encoded =
        { "u6h3a3jq3", "h3t2rew", "ab2c3d4e5f6g7" };

for (String s : encoded) {
    
    StringBuilder sb = new StringBuilder();
    int count = 0;
    char currentChar = '\0';
    for (int i = 0; i < s.length();) {
        if (Character.isLetter(s.charAt(i))) {
            currentChar = s.charAt(i++);
        }
        while (i < s.length()
                && Character.isDigit(s.charAt(i))) {
            count = count * 10 + s.charAt(i++) - '0';
        }
        count = count == 0 ? 1 : count;
        sb.append(Character.toString(currentChar)
                .repeat(count));
        count = 0;
    }
    System.out.println(s + " --> " + sb);
}

prints

u6h3a3jq3 --> uuuuuuhhhaaajqqq
h3t2rew --> hhhttrew
ab2c3d4e5f6g7 --> abbcccddddeeeeeffffffggggggg
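
The decoding steps can also be sketched in Python, with a regular expression doing the digit scanning. `expand` is a hypothetical name, and this sketch assumes the original text contained only ASCII letters, so every digit in the encoded string belongs to a count.

```python
import re

def expand(encoded):
    """Reverse the <letter><count> encoding; a missing count means 1."""
    out = []
    # each token is one letter followed by an optional decimal count
    for letter, digits in re.findall(r"([A-Za-z])(\d*)", encoded):
        out.append(letter * (int(digits) if digits else 1))
    return "".join(out)

for s in ("u6h3a3jq3", "h3t2rew", "ab2c3d4e5f6g7"):
    print(s, "-->", expand(s))
```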

Python: Count the consecutive characters at the beginning of a string

If you strip the characters from the beginning, then you are left with a shorter string and can subtract its length from the original, giving you the number of characters removed.

return len(s) - len(s.lstrip(target))

Note: The code you showed will immediately return 0 if the first character does not match target. If you only want to know how many times the first character is repeated, you don't need target at all and can just use s[0].
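
Put together as a complete function, this might look like the sketch below. The name `count_leading` and the guard for an empty string are additions for this example. Note that `str.lstrip` treats its argument as a set of characters, so `target` should be a single character here.

```python
def count_leading(s, target=None):
    """Count how many times `target` repeats at the start of `s`.

    If `target` is omitted, the first character of `s` is used.
    """
    if not s:
        return 0
    if target is None:
        target = s[0]
    # whatever lstrip removed is exactly the leading run of `target`
    return len(s) - len(s.lstrip(target))

print(count_leading("aaabca", "a"))  # 3
print(count_leading("aaabca", "b"))  # 0
print(count_leading("zzzy"))         # 3
```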
