Find the Indexes of All Regex Matches

Find the indexes of all regex matches?

This is what you want, from the Python re module documentation:

re.finditer(pattern, string[, flags]) 

Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE pattern in string. The string is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of
another match.

You can then get the start and end positions from the MatchObjects.

e.g.

[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
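As a minimal runnable sketch of the one-liner above: the helper name match_spans and the sample string are made up for illustration, and m.span() is used as a shorthand for (m.start(0), m.end(0)).

```python
import re

def match_spans(pattern, text, flags=0):
    """Return (start, end) index pairs for every non-overlapping match."""
    # m.span() is equivalent to (m.start(0), m.end(0)).
    return [m.span() for m in re.finditer(pattern, text, flags)]

# Hypothetical sample input: runs of digits inside a short string.
spans = match_spans(r"\d+", "ab12cd345e6")
print(spans)  # [(2, 4), (6, 9), (10, 11)]
```

The end index is exclusive, so text[start:end] gives back the matched substring.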

Return positions of a regex match() in Javascript?

Here's what I came up with:

// Finds starting and ending positions of quoted text
// in double or single quotes with escape char support like \" \'
var str = "this is a \"quoted\" string as you can 'read'";
var patt = /'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/igm;
var match;
while ((match = patt.exec(str)) !== null) {
  // match.index is the start; patt.lastIndex is the position just past the end
  console.log(match.index + ' ' + patt.lastIndex);
}
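For comparison, here is a rough Python sketch of the same idea, not part of the original answer. It reuses the pattern from the JavaScript snippet; the names QUOTED and quoted_spans are hypothetical, and m.end() plays the role that patt.lastIndex plays in the loop above.

```python
import re

# The JavaScript pattern above, split across two raw strings so that
# each piece only contains one kind of quote character.
QUOTED = re.compile(
    r"'((?:\\.|[^'])*)'"    # single-quoted text, allowing \' escapes
    r'|"((?:\\.|[^"])*)"'   # double-quoted text, allowing \" escapes
)

def quoted_spans(text):
    """Return (start, end) pairs for each quoted substring in text."""
    return [m.span() for m in QUOTED.finditer(text)]

sample = 'this is a "quoted" string as you can \'read\''
print(quoted_spans(sample))  # [(10, 18), (37, 43)]
```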

How to find indexes of all non-matching characters with a JS regex?

Reason

The infinite loop is easy to explain: the regex has the g modifier, so it looks for multiple occurrences of the pattern, and each matching attempt starts where the previous successful match ended, that is, at the lastIndex value.

See exec documentation:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property

However, since your pattern can match an empty string, and the loop does not check whether the match index is equal to lastIndex, lastIndex never moves forward and the regex cannot advance in the string.

Solution

Use a regex that matches any non-alphanumeric char, /[\W_]/g. Since it never matches an empty string, the lastIndex property of the RegExp object changes with each match and no infinite loop occurs.

JS demo:

let match, indexes = [];
let reg = /[\W_]/g;
let str = "1111-253-asdasdas";
while (match = reg.exec(str)) {
  indexes.push(match.index);
}
console.log(indexes);
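The same lookup can be sketched in Python, shown here only for comparison with the JS demo. It assumes the same sample string; the x* pattern in the second half is a made-up example of a pattern that can match the empty string. re.finditer steps past zero-width matches on its own, so the loop terminates in either case.

```python
import re

text = "1111-253-asdasdas"

# Indexes of every character that is not a letter or a digit.
indexes = [m.start() for m in re.finditer(r"[\W_]", text)]
print(indexes)  # [4, 8]

# A pattern that can match the empty string does not hang finditer;
# it reports zero-width matches whose start equals their end.
empty = [m.span() for m in re.finditer(r"x*", "ab")]
print(empty)  # [(0, 0), (1, 1), (2, 2)]
```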

Find all matching regex patterns and index of the match in the string

exec() only returns a single match. To get all matches with a global regexp, you have to call it repeatedly, e.g.:

var match, indexes = [];
while (match = r.exec(value))
    indexes.push([match.index, match.index + match[0].length]);

Python Regex - How to Get Positions and Values of Matches

import re
p = re.compile("[a-z]")
for m in p.finditer('a1b2c3d4'):
    print(m.start(), m.group())

Python - Locating the position of a regex match in a string?

You could use .find("is"), which returns the position of the first "is" in the string,

or use .start() on the match object returned by re.search():

>>> re.search("is", String).start()
2

Note that this actually matches the "is" inside "This".

If you need to match the whole word, put \b before and after "is"; \b is the word boundary.

>>> re.search(r"\bis\b", String).start()
5

For more information, see the Python regular expression documentation.
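One thing to watch for: re.search returns None when nothing matches, so calling .start() directly on the result raises AttributeError. Here is a small sketch that guards against this. The helper name find_word and the sample sentence are hypothetical, and returning -1 is just a choice that mirrors str.find.

```python
import re

def find_word(word, text):
    """Return the start index of `word` as a whole word, or -1 if absent."""
    # re.escape keeps any special characters in `word` literal.
    m = re.search(r"\b" + re.escape(word) + r"\b", text)
    return m.start() if m else -1

sentence = "This is a sample sentence"  # hypothetical input

print(sentence.find("is"))         # 2, the "is" inside "This"
print(find_word("is", sentence))   # 5, the standalone word "is"
print(find_word("was", sentence))  # -1, no match
```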

Is there a function that returns index where RegEx match starts?

For multiple matches you can use code similar to this:

Regex rx = new Regex("as");
foreach (Match match in rx.Matches("as as as as"))
{
    int i = match.Index; // start index of this match
}

