Why Are str.count('') and len(str) Giving Different Output

Why are str.count('') and len(str) giving different output?

str.count() counts non-overlapping occurrences of the substring:

Return the number of non-overlapping occurrences of substring sub.

There is exactly one such place where the substring '' occurs in the string '': right at the start. So the count should return 1.

Generally speaking, the empty string will match at all positions in a given string, including right at the start and end, so the count should always be the length plus 1:

>>> (' ' * 100).count('')
101

That's because an empty string is considered to exist at every position in a string. For a string of length 2, there are 3 empty strings: one at the start, one between the two characters, and one at the end.

So yes, the results are different and they are entirely correct.
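A quick check of the rule above, as a minimal sketch (the sample strings are arbitrary and chosen only for illustration):

```python
# For any string s, s.count('') should equal len(s) + 1:
# one empty match before each character, plus one at the very end.
samples = ['', 'a', 'hi', ' ' * 100]

for s in samples:
    assert s.count('') == len(s) + 1

print([s.count('') for s in samples])  # [1, 2, 3, 101]
```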

Count of '' is length of string + 1?

According to the other question, "Why are str.count('') and len(str) giving different output?", a Python string can be thought of as having an empty string before the first character, one between each pair of adjacent characters, and one after the last character. So 'hi' is effectively ''h''i''

How does the count() method work?

'' doesn't mean "any string"; it means no string (that is, the empty string, or the 0-length string). There are, strictly speaking, an infinite number of 0-length strings in a string, but in practice len(string) + 1 is returned: one for the position just before the first character, and one for the position after each character.

This scenario has been explicitly special-cased in count.h:

if (sub_len == 0)
    return (str_len < maxcount) ? str_len + 1 : maxcount;

When the search string is the empty string, len(string) + 1 is returned, unless that value would exceed maxcount, in which case maxcount is returned instead.
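The same rule appears to carry over to the optional start and end arguments of str.count(): the empty string is counted within the selected slice, so the result is the slice length plus 1. A small sketch, using an arbitrary sample string:

```python
s = 'hello'

# Whole string: len(s) + 1 empty matches.
whole = s.count('')        # 6

# Restricted to s[1:3] ('el'): two characters, so three empty matches.
part = s.count('', 1, 3)   # 3

# This agrees with counting on the slice directly.
assert part == s[1:3].count('') == len(s[1:3]) + 1
print(whole, part)
```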

Find string between two substrings

import re

s = 'asdf=5;iwantthis123jasd'
result = re.search('asdf=5;(.*)123jasd', s)
print(result.group(1))
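If both delimiters are plain literals, the same result can be obtained without a regular expression. The helper below is a hypothetical sketch (the name between is not from the original answer). It returns the text after the first occurrence of the left delimiter and before the next occurrence of the right delimiter, or None when either one is missing:

```python
def between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    start = text.find(left)
    if start == -1:
        return None
    start += len(left)              # skip past the left delimiter
    end = text.find(right, start)   # search only after the left delimiter
    if end == -1:
        return None
    return text[start:end]


s = 'asdf=5;iwantthis123jasd'
print(between(s, 'asdf=5;', '123jasd'))  # iwantthis
```

Note one difference in behavior: the (.*) group in the regular expression is greedy and extends to the last occurrence of the right delimiter, while this helper stops at the first one.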

How does count work when passing an empty string?

The reason it prints 12 is that there are empty strings in between every letter, and at both sides. Here is a diagram:

  All empty strings!
^h^e^l^l^o^ ^w^o^r^l^d^

It looks weird, but every ^ marks an empty string, and if you count them, there are 12.
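The diagram can be reproduced in code. As a small sketch: str.replace() with an empty search string inserts the replacement at every position where '' matches, which makes the matches visible:

```python
s = 'hello world'

# Replacing '' puts a marker at every empty-string match:
# before each character and once more at the end.
marked = s.replace('', '^')
print(marked)  # ^h^e^l^l^o^ ^w^o^r^l^d^

# The number of markers equals the count of empty strings.
assert marked.count('^') == s.count('') == len(s) + 1 == 12
```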

The reason you are getting the error is that a string behaves like an array of characters and is zero-indexed, meaning that the first element is at index 0, the second at index 1, and so on. Here is a diagram:

-------------------------------------
a | b | c | d | e | f | g | h | i | j
-------------------------------------
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
-------------------------------------

As you can see, the tenth element (j), is at index 9, so trying to get index 10 would result in an error.
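A minimal sketch of this, assuming the ten-character string from the diagram. In Python, indexing past the end of a string raises IndexError:

```python
s = 'abcdefghij'

print(len(s))  # 10
print(s[9])    # j  (tenth and last character)

raised = False
try:
    s[10]      # one past the last valid index, len(s) - 1
except IndexError as exc:
    raised = True
    print('IndexError:', exc)
```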

How do I get the number of elements in a list (length of a list) in Python?

The len() function can be used with several different types in Python - both built-in types and library types. For example:

>>> len([1, 2, 3])
3
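A few more cases, as a sketch. The Playlist class is hypothetical and only illustrates that len() works on any object that defines __len__:

```python
from collections import deque

print(len('hello'))              # 5  (str: number of characters)
print(len((1, 2)))               # 2  (tuple)
print(len({'a': 1, 'b': 2}))     # 2  (dict: number of keys)
print(len(deque([1, 2, 3, 4])))  # 4  (library type from collections)


class Playlist:
    """Hypothetical container; len() works because it defines __len__."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)


print(len(Playlist(['a', 'b', 'c'])))  # 3
```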

