Python Regex - How to Get Positions and Values of Matches

import re

p = re.compile("[a-z]")
for m in p.finditer('a1b2c3d4'):
    # m.start() is the index of the match, m.group() is the matched text
    print(m.start(), m.group())

Python - Locating the position of a regex match in a string?

You could use .find("is"), which returns the position of the first "is" in the string (or -1 if it is not found).

Alternatively, use .start() on the match object returned by re.search:

>>> re.search("is", String).start()
2

Note that this actually matches the "is" inside "This".

If you need to match "is" as a whole word, put \b before and after it; \b is the word boundary.

>>> re.search(r"\bis\b", String).start()
5

For more information about Python regular expressions, see the documentation for the re module.
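The snippets above rely on a variable named String that is never shown. Here is a minimal self-contained sketch, assuming a sample text of "This is a test" (an assumption, chosen only because it is consistent with the positions 2 and 5 shown above):

```python
import re

# Hypothetical sample text; the original String variable is not shown.
text = "This is a test"

# str.find returns the index of the first occurrence, or -1 if absent.
plain_pos = text.find("is")

# re.search returns a match object (or None); start() gives the index.
regex_pos = re.search("is", text).start()

# \b restricts the match to the standalone word "is".
word_match = re.search(r"\bis\b", text)
word_pos = word_match.start() if word_match else None

print(plain_pos, regex_pos, word_pos)  # 2 2 5
```

Checking for None before calling .start() avoids an AttributeError when the pattern does not occur at all.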

Python Regex Finding a Match That Starts Inside Previous match

Since you have overlapping matches, you need to use a capturing group inside a lookahead: (?=(YOUR_EXPR))

import re

s= 'GATATATGCATATACTT'
t = r'(?=(ATAT))'

pattern = re.compile(t)

for i in pattern.finditer(s):
    print(i)

Output:

<re.Match object; span=(1, 1), match=''>
<re.Match object; span=(3, 3), match=''>
<re.Match object; span=(9, 9), match=''>

Or:

for i in pattern.finditer(s):
    print(i.start())

Output:

1
3
9

Or:

import re

s= 'GATATATGCATATACTT'
t = 'ATAT'

pattern = re.compile(f'(?=({t}))')

print([(i.start(), i.group(1)) for i in pattern.finditer(s)])

Output:

[(1, 'ATAT'), (3, 'ATAT'), (9, 'ATAT')]
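The same idea can be wrapped in a small helper. This is only a sketch; find_overlapping is a hypothetical name, not something from the re module:

```python
import re


def find_overlapping(expr, text):
    """Return (start, matched_text) for every match of expr, overlapping ones included."""
    # The lookahead consumes no characters, so the scan advances one position
    # at a time; the inner group captures what would have matched there.
    lookahead = re.compile(f"(?=({expr}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(text)]


hits = find_overlapping("ATAT", "GATATATGCATATACTT")
print(hits)  # [(1, 'ATAT'), (3, 'ATAT'), (9, 'ATAT')]
```

Reading the text from group(1) instead of slicing by a fixed length also works when the expression can match strings of different lengths.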

Python re find index position of first search match

print(theYear.search(myString).span())
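The one-liner above assumes theYear is a compiled pattern and myString is the text to search; neither is shown. A hedged sketch with hypothetical stand-ins for both, which also handles the case where nothing matches:

```python
import re

# Hypothetical stand-ins for the undefined theYear and myString.
the_year = re.compile(r"\b\d{4}\b")
my_string = "Released in 1999, remastered in 2012"

# search() stops at the first match, or returns None if there is none.
m = the_year.search(my_string)
first_span = m.span() if m else None    # (start, end) of the first match
first_index = m.start() if m else None  # just the starting index

print(first_index, first_span)  # 12 (12, 16)
```

span() returns both ends of the match; if only the index is needed, start() is enough.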

Pythonic way to find the last position in a string matching a negative regex

To me it seems that you just want the last position that matches a given pattern (in this case, the "not a number" pattern).

This is as pythonic as it gets:

import re

string = 'uiae1iuae200'
pattern = r'[^0-9]'

match = re.match(fr'.*({pattern})', string)
print(match.end(1) - 1 if match else None)

Output:

8

Or the exact same as a function and with more test cases:

import re


def last_match(pattern, string):
    # The greedy .* pushes the captured group to the last possible position
    match = re.match(fr'.*({pattern})', string)
    return match.end(1) - 1 if match else None


cases = [(r'[^0-9]', 'uiae1iuae200'), (r'[^0-9]', '123a'), (r'[^0-9]', '123'), (r'[^abc]', 'abcabc1abc'), (r'[^1]', '11eea11')]

for pattern, string in cases:
    print(f'{pattern}, {string}: {last_match(pattern, string)}')

Output:

[^0-9], uiae1iuae200: 8
[^0-9], 123a: 3
[^0-9], 123: None
[^abc], abcabc1abc: 6
[^1], 11eea11: 4

How to get a list of character positions in Python?

Try:

import re

text = 'abcdefa'
pattern = re.compile('a|c')
print([(m.group(), m.start()) for m in pattern.finditer(text)])
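If you would rather have one list of positions per character, the same finditer loop can fill a dictionary. This is a sketch using the same sample text; the positions name is just illustrative:

```python
import re
from collections import defaultdict

text = 'abcdefa'
pattern = re.compile('a|c')

# Map each matched character to the list of indexes where it occurs.
positions = defaultdict(list)
for m in pattern.finditer(text):
    positions[m.group()].append(m.start())

print(dict(positions))  # {'a': [0, 6], 'c': [2]}
```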

How to get the position of the last match of a regex?

Why not just use findall?

s.rfind(re.findall(pattern, s)[-1])
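One caveat with the rfind approach: it searches for the matched text as a plain substring, so it can land on a later occurrence that the regex itself would not match. It also raises IndexError when there are no matches, and findall returns group contents rather than whole matches if the pattern has capturing groups. A sketch of an alternative based on finditer (last_match_start is a hypothetical helper name, and the sample string is made up):

```python
import re


def last_match_start(pattern, s):
    """Return the start index of the last regex match in s, or None."""
    last = None
    for last in re.finditer(pattern, s):
        pass  # keep only the final match object
    return last.start() if last is not None else None


s = "This is his"
via_rfind = s.rfind(re.findall(r"\bis\b", s)[-1])
via_finditer = last_match_start(r"\bis\b", s)
print(via_rfind, via_finditer)  # 9 5
```

Here the only whole-word "is" starts at index 5, but rfind reports 9 because "his" also ends in "is".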

Find the indexes of all regex matches?

This is what you want (from the Python documentation for the re module):

re.finditer(pattern, string[, flags]) 

Return an iterator yielding MatchObject instances over all
non-overlapping matches for the RE pattern in string. The string is
scanned left-to-right, and matches are returned in the order found. Empty
matches are included in the result unless they touch the beginning of
another match.

You can then get the start and end positions from the MatchObjects.

e.g.

[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
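To make that concrete, here is a small sketch with a made-up pattern and input (both are assumptions for illustration only):

```python
import re

pattern = r"\d+"      # hypothetical pattern: runs of digits
string = "a12b345c6"  # hypothetical input

spans = [(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
print(spans)  # [(1, 3), (4, 7), (8, 9)]

# The end index is exclusive, so slicing with a span recovers the matched text.
pieces = [string[start:end] for start, end in spans]
print(pieces)  # ['12', '345', '6']
```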

Regex - Find matches with the same characters at specific positions

You can try using the following regex pattern:

.[VIFY][MLFY].*

This will match any first character, followed by a second and third character using the logic you want.

import re

mylist = ['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK']
r = re.compile(".[VIFY][MLFY].*")
# In Python 3, filter() returns a lazy iterator, so materialize it as a list
newlist = list(filter(r.match, mylist))
print(newlist)

Demo here:

Rextester

Note: I added the word BILL to your list in the demo to get something which passes the regex match.
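None of the four original strings satisfies the pattern, which is why the extra entry is needed to see a hit. A sketch of the same check with 'BILL' appended, written as a list comprehension (the matched name is just illustrative):

```python
import re

# 'BILL' is the extra entry mentioned in the note above.
mylist = ['GQPLWLEH', 'TLYSFFPK', 'TYGEIFEK', 'APYWLINK', 'BILL']
r = re.compile(r".[VIFY][MLFY]")

# match() anchors at the start of the string, so only the first three
# characters are checked; the rest of the word can be anything.
matched = [word for word in mylist if r.match(word)]
print(matched)  # ['BILL']
```

Since match() only requires the pattern to match at the beginning, the trailing .* from the original pattern is left out here; it makes no difference to which words pass.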


