Get the Index of a Pattern in a String Using Regex

Get the index of a pattern in a string using regex

Use Matcher:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void printMatches(String text, String regex) {
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(text);
    // Each call to find() moves to the next occurrence
    while (matcher.find()) {
        System.out.print("Start index: " + matcher.start()); // inclusive
        System.out.print(" End index: " + matcher.end());     // exclusive
        System.out.println(" Found: " + matcher.group());
    }
}

Get the index of a pattern in a string using regex with Kotlin

The MatchResult object has the range property:

The range of indices in the original string where match was captured.

MatchGroup has a range property, too.

A short demo showing the range of the first match of the whole word "long":

val strs = listOf("Long days become shorter", "winter lasts longer")
val pattern = """(?i)\blong\b""".toRegex()
strs.forEach { str ->
    val found = pattern.find(str)
    if (found != null) {
        // found is smart-cast to non-null here, so no safe calls are needed
        val m = found.value
        val idx = found.range
        println("'$m' found at indexes $idx in '$str'")
    }
}
// => 'Long' found at indexes 0..3 in 'Long days become shorter'

Is there a version of JavaScript's String.indexOf() that allows for regular expressions?

Combining a few of the approaches already mentioned (the indexOf is obviously rather simple), I think these are the functions that will do the trick:

function regexIndexOf(string, regex, startpos) {
    var indexOf = string.substring(startpos || 0).search(regex);
    return (indexOf >= 0) ? (indexOf + (startpos || 0)) : indexOf;
}

function regexLastIndexOf(string, regex, startpos) {
    // exec() only steps through the string when the regex has the global flag
    regex = (regex.global) ? regex : new RegExp(regex.source, "g" + (regex.ignoreCase ? "i" : "") + (regex.multiline ? "m" : ""));
    if (typeof (startpos) == "undefined") {
        startpos = string.length;
    } else if (startpos < 0) {
        startpos = 0;
    }
    var stringToWorkWith = string.substring(0, startpos + 1);
    var lastIndexOf = -1;
    var nextStop = 0;
    var result;
    while ((result = regex.exec(stringToWorkWith)) != null) {
        lastIndexOf = result.index;
        regex.lastIndex = ++nextStop;
    }
    return lastIndexOf;
}

UPDATE: Edited regexLastIndexOf() so that it seems to mimic lastIndexOf() now. Please let me know if it still fails and under what circumstances.

UPDATE: Passes all tests found in the comments on this page, and my own. Of course, that doesn't mean it's bulletproof. Any feedback appreciated.

Find all pattern indexes in string in C#

using System.Collections.Generic;
using System.Text.RegularExpressions;

string pattern = "##";
string sentence = "45##78$$#56$$J##K01UU";
IList<int> indices = new List<int>();
foreach (Match match in Regex.Matches(sentence, pattern))
{
    indices.Add(match.Index);
}

indices will contain 2 and 14.

How can I find start and end index of a substring matched by regex in Java?

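Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
    int start = matcher.start();   // index of the first matched character
    int end = matcher.end();       // index just past the last matched character
    String text = matcher.group(); // the matched substring
}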

Java: Find index of first Regex

As requested, a more complete solution:

/** @return index of pattern in s or -1, if not found */
public static int indexOf(Pattern pattern, String s) {
    Matcher matcher = pattern.matcher(s);
    return matcher.find() ? matcher.start() : -1;
}

Call:

int index = indexOf(Pattern.compile("(?<!a)bc"), "abc xbc"); // index == 5

How to find indexes of all non-matching characters with a JS regex?

Reason

The infinite loop is easy to explain: the regex has a g modifier, so each matching attempt starts after the end of the previous successful match, that is, at the lastIndex value.

See the exec documentation:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property.

However, your pattern can match an empty string, and an empty match leaves lastIndex where it was. Since you do not check whether the match index equals lastIndex, the regex never advances through the string.

Solution

Use a regex that matches any non-alphanumeric character, /[\W_]/g. Since it does not match empty strings, the lastIndex property of the RegExp object changes with each match and no infinite loop occurs.

JS demo:

let match, indexes = [];
let reg = /[\W_]/g;
let str = "1111-253-asdasdas";
while (match = reg.exec(str)) {
    indexes.push(match.index);
}
console.log(indexes);
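For comparison, a manual search loop in Python has the same pitfall: if the next search always restarts at the end of the previous match, an empty match never moves that position forward. The sketch below is only an illustration of the guard, not part of the original answer; the helper name match_indexes is hypothetical, and it assumes that stepping one character past an empty match is the behavior you want.

```python
import re

def match_indexes(pattern, text):
    """Collect the start index of every match using a manual search loop."""
    regex = re.compile(pattern)
    indexes = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        indexes.append(m.start())
        # An empty match has m.end() == m.start(); without this bump the
        # next search would start at the same position and loop forever.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return indexes

print(match_indexes(r"[\W_]", "1111-253-asdasdas"))  # [4, 8]
```

With a pattern that cannot match the empty string, such as [\W_], the guard never triggers and the loop behaves like the JS demo above.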

Find the indexes of all regex matches?

This is what you want:

re.finditer(pattern, string[, flags]) 

Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE pattern in string. The string is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of
another match.

You can then get the start and end positions from the MatchObjects.

e.g.

[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
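As a small worked sketch of the one-liner above, here it is wrapped in a function and run on a sample string. The function name find_spans and the sample input are illustrative assumptions, not part of the original answer.

```python
import re

def find_spans(pattern, text):
    """Return (start, end) pairs for every non-overlapping match; end is exclusive."""
    return [(m.start(0), m.end(0)) for m in re.finditer(pattern, text)]

print(find_spans(r"##", "45##78$$#56$$J##K01UU"))  # [(2, 4), (14, 16)]
```

Each pair can be used directly as a slice, so text[start:end] gives back the matched substring.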

Get the index of a regex match in Scala

The Match class contains the properties describing a particular regex match, including the position at which it starts.

Something like "foo".r.findFirstMatchIn(bar).map(_.start) should do what you ask.

But if you are really just looking for a substring, then bar.indexOf("foo") will be a lot faster.


