Count Number of Occurrences of a Substring in a String

Occurrences of substring in a string

The last line was the problem. Because it ran unconditionally, lastIndex was advanced even after indexOf returned -1, so it was never -1 when the loop condition was checked and the loop never ended. Moving that line into the if block fixes it.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {

    lastIndex = str.indexOf(findStr, lastIndex);

    if (lastIndex != -1) {
        count++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);
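
For comparison, the same loop can be sketched in Python. This is an illustration rather than part of the original answer: the function name count_non_overlapping is hypothetical, str.find stands in for indexOf (it also returns -1 when nothing is found), and returning 0 for an empty search string is an assumption.

```python
def count_non_overlapping(text, sub):
    """Count non-overlapping occurrences of sub in text (hypothetical helper)."""
    if not sub:
        return 0  # assumption: an empty search string counts as zero matches
    count = 0
    index = text.find(sub)
    while index != -1:
        count += 1
        # Jump past the whole match so that matches cannot overlap.
        index = text.find(sub, index + len(sub))
    return count


print(count_non_overlapping("helloslkhellodjladfjhello", "hello"))  # prints 3
```

Because the search resumes after the end of each match, "ehehe" contains only one non-overlapping "ehe" under this approach.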

How to count number of occurrences of a substring inside a string in Python?

You can use str.count:

print("hellohel".count("hel"))
2

If you want to count overlapping occurrences, this may help:

def countOverlapping(string, item):
    count = 0
    for i in range(0, len(string)):
        if item in string[i:len(item)+i]:
            count += 1
    return count

print(countOverlapping("ehehe", "ehe"))

The output should be:

2

How does that work?

As @SomeDude mentioned, it uses what he calls a sliding window approach.

We take the length of the substring and, on each iteration, check whether the substring is in that "window" of the string:

is ehe in [ehe]he? yes, count += 1
is ehe in e[heh]e? no, pass
is ehe in eh[ehe]? yes, count += 1
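
The trace above can be reproduced with a small sketch. The helper name trace_windows is hypothetical, and unlike countOverlapping it stops at the last position where a full-width window still fits and compares with == instead of in; for a non-empty item the resulting count is the same.

```python
def trace_windows(string, item):
    """Return (window, matched) pairs for every full-width window (hypothetical helper)."""
    width = len(item)
    steps = []
    # Stop at the last position where a full-width window still fits.
    for i in range(len(string) - width + 1):
        window = string[i:i + width]
        steps.append((window, window == item))
    return steps


for window, matched in trace_windows("ehehe", "ehe"):
    print(window, "yes" if matched else "no")
# ehe yes
# heh no
# ehe yes
```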

Count occurrences of a substring in a list of strings

You can do this by using the sum built-in function. No need to use list.count as well:

>>> data = ["the foo is all fooed", "the bar is all barred", "foo is now a bar"]
>>> sum('foo' in s for s in data)
2
>>>

This code works because booleans can be treated as integers. Each time 'foo' appears in a string element, the expression evaluates to True, whose integer value is 1. Summing those 1s therefore yields the number of elements that contain 'foo'.

A perhaps more explicit but equivalent way to write the above code would be:

>>> sum(1 for s in data if 'foo' in s)
2
>>>
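
Both expressions count the list elements that contain 'foo', not the total number of occurrences. If the total is what you want, one possible sketch (an addition, not from the original answer) sums str.count over the elements; the variable names are illustrative.

```python
data = ["the foo is all fooed", "the bar is all barred", "foo is now a bar"]

# Number of elements that contain 'foo' at least once.
elements_with_foo = sum('foo' in s for s in data)

# Total number of non-overlapping occurrences across all elements.
total_foo = sum(s.count('foo') for s in data)

print(elements_with_foo)  # prints 2
print(total_foo)          # prints 3 ("foo" and "fooed" in the first element)
```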

Count the number of occurrences of a character in a string

str.count(sub[, start[, end]])

Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.

>>> sentence = 'Mary had a little lamb'
>>> sentence.count('a')
4
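
The optional start and end arguments behave like slice bounds, so end is excluded. A short sketch using the same sentence:

```python
sentence = 'Mary had a little lamb'

# 'a' occurs at indices 1, 6, 9 and 19.
print(sentence.count('a'))         # prints 4
print(sentence.count('a', 5))      # prints 3: searches sentence[5:]
print(sentence.count('a', 5, 10))  # prints 2: searches sentence[5:10]
print(sentence.count('a', 5, 9))   # prints 1: index 9 is excluded
```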

How to count sub-string occurrences?

Regex.Matches(input, "OU=").Count
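
Note that Regex.Matches treats its second argument as a regular expression pattern, so a search string containing metacharacters would need escaping (Regex.Escape in .NET). A rough Python counterpart, as a sketch: the helper name count_matches and the sample input are made up for illustration.

```python
import re


def count_matches(text, literal):
    """Count non-overlapping matches of a literal string using the re module."""
    # re.escape guards against regex metacharacters in the search string.
    return len(re.findall(re.escape(literal), text))


# Hypothetical input, made up for illustration.
dn = "CN=Jane,OU=Sales,OU=Europe,DC=example,DC=com"
print(count_matches(dn, "OU="))  # prints 2
```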

find the count of substring in string

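You could do something like:

#include <string.h>

int countString(const char *haystack, const char *needle){
    int count = 0;
    const char *tmp = haystack;
    if (*needle == '\0')
        return 0; /* an empty needle would match at every position */
    /* strstr returns NULL once there is no further match */
    while((tmp = strstr(tmp, needle)) != NULL)
    {
        count++;
        tmp++; /* resume one character later, so overlapping matches are counted too */
    }
    return count;
}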

That is, when you get a result, start searching again at the next position of the string.

strstr() doesn't only work starting from the beginning of a string but from any position.
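
The same "search again from the next position" idea can be sketched in Python with str.find, which accepts a start index much as strstr accepts a pointer into the string. The function name count_from_next_position is hypothetical. Since the search resumes only one character after each match, overlapping occurrences are counted.

```python
def count_from_next_position(text, sub):
    """Count occurrences of sub in text, overlapping ones included (hypothetical helper)."""
    if not sub:
        return 0  # assumption: an empty search string counts as zero matches
    count = 0
    index = text.find(sub)
    while index != -1:
        count += 1
        # Resume one character after the start of the previous match.
        index = text.find(sub, index + 1)
    return count


print(count_from_next_position("ehehe", "ehe"))  # prints 2
print(count_from_next_position("aaaa", "aa"))    # prints 3
```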

How to find the count of substring in java

What about:

String temp = s.replace(sub, "");
int occ = (s.length() - temp.length()) / sub.length();

Just remove every occurrence of the substring, then compare the string's length before and after the removal. Dividing that difference by the length of the substring gives the number of occurrences.
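
A minimal Python sketch of the same arithmetic, with a hypothetical function name. It assumes a non-empty substring (otherwise the division fails) and, because replace removes matches left to right, it yields the non-overlapping count.

```python
def count_by_removal(s, sub):
    """Count non-overlapping occurrences by comparing lengths (hypothetical helper)."""
    if not sub:
        raise ValueError("sub must be non-empty")  # avoids dividing by zero
    removed = s.replace(sub, "")
    # Every removed occurrence shortens the string by len(sub) characters.
    return (len(s) - len(removed)) // len(sub)


print(count_by_removal("helloslkhellodjladfjhello", "hello"))  # prints 3
print(count_by_removal("ehehe", "ehe"))                        # prints 1
```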

Count the number of occurrences of a substring in a string

Update given your comments below, if the white space is the same in both strings:

awk 'BEGIN{print gsub(ARGV[2],"",ARGV[1])}' "$STRING" "$SUB_STRING"

or if the white space is different as in your example where the STRING lines start with 9 blanks but SUB_STRING with 8:

$ awk 'BEGIN{gsub(/[[:space:]]+/,"[[:space:]]+",ARGV[2]); print gsub(ARGV[2],"",ARGV[1])}' "$STRING" "$SUB_STRING"

Original answer:

With GNU awk if your white-space matched between files and the search string doesn't contain RE metachars all you'd need is:

awk -v RS='^$' 'NR==FNR{str=$0; next} {print gsub(str,"")}' str file

or with any awk if your input also doesn't contain NUL chars:

awk -v RS='\0' 'NR==FNR{str=$0; next} {print gsub(str,"")}' str file

but for a full solution with explanations, read on:

With any POSIX awk in any shell on any UNIX box:

$ cat str
Bluetooth
Soft blocked: no
Hard blocked: no

$ awk '
    NR==FNR { str=(str=="" ? "" : str ORS) $0; next }
    { rec=(rec=="" ? "" : rec ORS) $0 }
    END {
        gsub(/[^[:space:]]/,"[&]",str)          # make sure each non-space char is treated as literal
        gsub(/[[:space:]]+/,"[[:space:]]+",str) # make sure space differences do not matter
        print gsub(str,"",rec)
    }
' str file
2

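With a non-POSIX awk like nawk, which may not support character classes, use an explicit set of whitespace characters such as a blank, \t and \n instead of [:space:]. If your search string can contain backslashes then we'd need to add 1 more gsub() to handle them.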
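
The two gsub() steps (make every non-space character literal, then let any run of whitespace match any other run) can be sketched in Python as follows. The function name and the sample data are hypothetical, re.escape replaces the bracket trick and also covers backslashes, and unlike the awk version this sketch ignores leading and trailing whitespace in the search string.

```python
import re


def count_ignoring_whitespace(text, needle):
    """Count matches of needle in text, treating any whitespace run as equivalent."""
    parts = needle.split()
    if not parts:
        return 0  # assumption: a blank search string counts as zero matches
    # Escape each word so it is matched literally, then allow any
    # whitespace run between consecutive words.
    pattern = r"\s+".join(re.escape(part) for part in parts)
    return len(re.findall(pattern, text))


# Hypothetical sample data: 8 blanks in the search string, 9 in the text.
needle = "Bluetooth\n        Soft blocked: no\n        Hard blocked: no"
text = (
    "0: hci0: Bluetooth\n"
    "         Soft blocked: no\n"
    "         Hard blocked: no\n"
    "1: phy0: Wireless LAN\n"
    "         Soft blocked: no\n"
    "         Hard blocked: no\n"
    "2: hci1: Bluetooth\n"
    "         Soft blocked: no\n"
    "         Hard blocked: no\n"
)
print(count_ignoring_whitespace(text, needle))  # prints 2
```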

Alternatively, with GNU awk for multi-char RS:

$ awk -v RS='^$' 'NR==FNR{gsub(/[^[:space:]]/,"[&]"); gsub(/[[:space:]]+/,"[[:space:]]+"); str=$0; next} {print gsub(str,"")}' str file
2

or with any awk if your input cannot contain NUL chars:

$ awk -v RS='\0' 'NR==FNR{gsub(/[^[:space:]]/,"[&]"); gsub(/[[:space:]]+/,"[[:space:]]+"); str=$0; next} {print gsub(str,"")}' str file
2

and on and on...


