Find the indexes of all regex matches?
This is what you want, from the documentation of Python's re module:
re.finditer(pattern, string[, flags])
Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE pattern in string. The string is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of
another match.
You can then get the start and end positions from the MatchObjects, e.g.:
[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
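As a minimal sketch of the list comprehension above, here it is run on a made-up pattern and input string (both are assumptions chosen for illustration, not taken from the original question):

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
pattern = r"\d+"
string = "ab12cd345"

# m.start(0) and m.end(0) bound the whole match; m.span() returns the same pair.
positions = [(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
print(positions)  # [(2, 4), (6, 9)]
```

Each end position is exclusive, so string[start:end] gives back the matched text.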
Return positions of a regex match() in Javascript?
Here's what I came up with:
// Finds starting and ending positions of quoted text
// in double or single quotes, with escape char support like \" \'
var str = "this is a \"quoted\" string as you can 'read'";
var patt = /'((?:\\.|[^'])*)'|"((?:\\.|[^"])*)"/igm;
var match; // declared explicitly to avoid an implicit global
while (match = patt.exec(str)) {
  console.log(match.index + ' ' + patt.lastIndex);
}
How to find indexes of all non-matching characters with a JS regex?
Reason
The infinite loop is easy to explain: the regex has a g modifier and thus tries to match multiple occurrences of the pattern, starting each matching attempt after the end of the previous successful match, that is, after the lastIndex value.
See the exec documentation:
If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property.
However, since your pattern matches an empty string and you do not check whether the match index equals lastIndex, the regex cannot advance through the string.
Solution
Use a regex that matches any non-alphanumeric char, /[\W_]/g. Since it does not match empty strings, the lastIndex property of the RegExp object will change with each match and no infinite loop will occur.
JS demo:
let match, indexes = [];
let reg = /[\W_]/g;
let str = "1111-253-asdasdas";
while (match = reg.exec(str)) {
  indexes.push(match.index);
}
console.log(indexes);
Find all matching regex patterns and index of the match in the string
exec() only returns a single match. To get all matches with a global regexp, you have to call it repeatedly, e.g.:
var match, indexes = [];
while (match = r.exec(value))
    indexes.push([match.index, match.index + match[0].length]);
Python Regex - How to Get Positions and Values of Matches
import re
p = re.compile("[a-z]")
for m in p.finditer('a1b2c3d4'):
    print(m.start(), m.group())
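If the positions and values are needed later rather than printed, the same loop can collect them. This is a small sketch reusing the pattern and input from the snippet above; the (start, end, text) tuple layout is an assumption for illustration:

```python
import re

p = re.compile("[a-z]")

# Collect (start, end, matched text) for each match instead of printing it.
matches = [(m.start(), m.end(), m.group()) for m in p.finditer("a1b2c3d4")]
print(matches)  # [(0, 1, 'a'), (2, 3, 'b'), (4, 5, 'c'), (6, 7, 'd')]
```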
Python - Locating the position of a regex match in a string?
You could use .find("is"), which returns the position of the first "is" in the string,
or use .start() on the match object returned by re.search():
>>> re.search("is", String).start()
2
This actually matches the "is" inside "This".
If you need to match the whole word, put \b before and after "is"; \b is the word boundary.
>>> re.search(r"\bis\b", String).start()
5
For more information, see the Python regular expression documentation.
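The approaches above can be put side by side in one runnable sketch. The sample text is an assumption (the original String is not shown here); it was chosen so the positions agree with the 2 and 5 reported above. The guard at the end is an addition: re.search returns None when nothing matches, so calling .start() on the result directly would raise an AttributeError.

```python
import re

# Assumed sample text; the original question's String is not shown here.
text = "This is a string"

print(text.find("is"))                     # 2, the "is" inside "This"
print(re.search("is", text).start())       # 2, same position via re
print(re.search(r"\bis\b", text).start())  # 5, the standalone word "is"

# re.search returns None when nothing matches, so guard before calling .start().
m = re.search(r"\bwas\b", text)
position = m.start() if m else -1          # -1 mirrors str.find's "not found" value
print(position)                            # -1
```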
Is there a function that returns index where RegEx match starts?
For multiple matches you can use code similar to this:
Regex rx = new Regex("as");
foreach (Match match in rx.Matches("as as as as"))
{
    int i = match.Index;
}