Finding All Indexes of a Specified Character Within a String

Finding all indexes of a specified character within a string

A simple loop works well:

var str = "scissors";
var indices = [];
for(var i=0; i<str.length;i++) {
if (str[i] === "s") indices.push(i);
}

Now, you indicate that you want 1,4,5,8. This will give you 0, 3, 4, 7 since indexes are zero-based. So you could add one:

if (str[i] === "s") indices.push(i+1);

and now it will give you your expected result.

A fiddle can be see here.

I don't think looping through the whole is terribly efficient

As far as performance goes, I don't think this is something that you need to be gravely worried about until you start hitting problems.

Here is a jsPerf test comparing various answers. In Safari 5.1, the IndexOf performs the best. In Chrome 19, the for loop is the fastest.

Sample Image

Indexes of all occurrences of character in a string

This should print the list of positions without the -1 at the end that Peter Lawrey's solution has had.

int index = word.indexOf(guess);
while (index >= 0) {
System.out.println(index);
index = word.indexOf(guess, index + 1);
}

It can also be done as a for loop:

for (int index = word.indexOf(guess);
index >= 0;
index = word.indexOf(guess, index + 1))
{
System.out.println(index);
}

[Note: if guess can be longer than a single character, then it is possible, by analyzing the guess string, to loop through word faster than the above loops do. The benchmark for such an approach is the Boyer-Moore algorithm. However, the conditions that would favor using such an approach do not seem to be present.]

Get all occurrences of a substring in a very big string

Easy solution:

var str = "...";
var searchKeyword = "...";

var startingIndices = [];

var indexOccurence = str.indexOf(searchKeyword, 0);

while(indexOccurence >= 0) {
startingIndices.push(indexOccurence);

indexOccurence = str.indexOf(searchKeyword, indexOccurence + 1);
}

If you need something highly performant, you may look over specific text search/indexing algorithms like Aho–Corasick algorithm or Boyer–Moore string-search algorithm.

Really depends on your use case and if the text you're searching into is changing or is static and can be indexed beforehand for maximum performance.

How to find indices of all occurrences of one string in another in JavaScript?

var str = "I learned to play the Ukulele in Lebanon."
var regex = /le/gi, result, indices = [];
while ( (result = regex.exec(str)) ) {
indices.push(result.index);
}

UPDATE

I failed to spot in the original question that the search string needs to be a variable. I've written another version to deal with this case that uses indexOf, so you're back to where you started. As pointed out by Wrikken in the comments, to do this for the general case with regular expressions you would need to escape special regex characters, at which point I think the regex solution becomes more of a headache than it's worth.

function getIndicesOf(searchStr, str, caseSensitive) {    var searchStrLen = searchStr.length;    if (searchStrLen == 0) {        return [];    }    var startIndex = 0, index, indices = [];    if (!caseSensitive) {        str = str.toLowerCase();        searchStr = searchStr.toLowerCase();    }    while ((index = str.indexOf(searchStr, startIndex)) > -1) {        indices.push(index);        startIndex = index + searchStrLen;    }    return indices;}
var indices = getIndicesOf("le", "I learned to play the Ukulele in Lebanon.");
document.getElementById("output").innerHTML = indices + "";
<div id="output"></div>

How to find all occurrences of a substring?

There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:

import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]

If you want to find overlapping matches, lookahead will do that:

[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]

If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:

search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]

re.finditer returns a generator, so you could change the [] in the above to () to get a generator instead of a list which will be more efficient if you're only iterating through the results once.

Find and print the indexes of a substring in string that starts and ends with a specific character in python

Can you modify the above code to just check for the beginning and end each time you loop. I also added a not index1 and not index2 because you need to address what happens if find returns None.

start = 0
while start < len(seq):
index1 = seq.find('atgca', start)
if index1 == -1 or not index1:
break
start = index1
index2 = seq.find('atgca',start)
if not index2:
break
print index1, index2
start = index2 + 1

Now that you have the two indexes you could even print the portion of the string that lies between if you wanted.

Finding the Index of a character within a string

Try the string.find method.

s = "mouse"
animal_letter = s.find('s')
print animal_letter

It returns the 0-based index (0 is the first character of the string) or -1 if the pattern is not found.

>>> "hello".find('h')
0
>>> "hello".find('o')
4
>>> "hello".find("darkside")
-1


Related Topics



Leave a reply



Submit