Reversing a Regular Expression in Python

How to reverse a regex in Python?

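A good way (which should be faster than passing a lambda to sorted()) is to sort the matches by their matched text in descending order:

from operator import methodcaller

sorted(re.finditer(..., text), key=methodcaller('group'), reverse=True)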

Or you could turn the iterator into a list and reverse it:

for i in reversed(list(re.finditer('id (.+?) result (.+)', text))): 
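As a rough illustration of the difference between the two approaches, here is a minimal sketch. The sample text is made up for the example; only the pattern comes from the answer above. Reversing the list walks the matches from last to first, whereas sorting orders them by the matched string itself.

```python
import re
from operator import methodcaller

# Hypothetical sample text; the pattern is the one used above.
text = "id b result ok\nid c result failed\nid a result ok"
pattern = re.compile(r'id (.+?) result (.+)')

# Reverse document order: materialize the iterator, then walk it backwards.
in_reverse = [m.group(1) for m in reversed(list(pattern.finditer(text)))]

# Descending order of the matched text itself (not of position in the text).
by_text = [m.group() for m in sorted(pattern.finditer(text),
                                     key=methodcaller('group'),
                                     reverse=True)]

print(in_reverse)
print(by_text)
```

If the goal is simply "last match first", the reversed(list(...)) form is probably the one to reach for.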

How do I reverse regex substitution?

It seems that this works:

import re

tu = ('This is my first regex python example '
      'yahooa yahoouuee bbbiirdd',
      'bbbiirdd',
      'fookirooksooktook',
      'crrsciencezxxxxxscienceokjjsciencq')

# Group 1: a lowercase consonant; group 2: the same consonant repeated, if any.
reg = re.compile(r'([bcdfghj-np-tv-z])(\1?)')
# Undoes either form produced by Frepl.
dereg = re.compile('science([^aeiou])|([^aeiou])ook')

def Frepl(ma):
    g1, g2 = ma.groups()
    if g2:
        return 'science' + g2   # doubled consonant
    else:
        return g1 + 'ook'       # single consonant

def Fderepl(ma):
    g = ma.group(2)
    if g:
        return g                # consonant + 'ook' -> consonant
    else:
        return 2 * ma.group(1)  # 'science' + consonant -> doubled consonant

for strt in tu:
    resu = reg.sub(Frepl, strt)
    bakk = dereg.sub(Fderepl, resu)
    print('----------------------------------\n'
          'strt = %s\n' 'resu == %s\n'
          'bakk == %s\n' 'bakk == start : %s'
          % (strt, resu, bakk, bakk == strt))

Edit

First, I updated the code above by removing the re.I flag. With that flag, a sequence such as 'dD' was captured as a repeated letter, so it was transformed to 'scienceD' and then back to 'DD'.
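The re.I problem can be reproduced with a small sketch. The names reg_i and dereg_i are introduced here only for the illustration; they are the same patterns as above compiled with re.I, and the two replacement functions are compact rewrites of Frepl and Fderepl.

```python
import re

def Frepl(ma):
    g1, g2 = ma.groups()
    return 'science' + g2 if g2 else g1 + 'ook'

def Fderepl(ma):
    g = ma.group(2)
    return g if g else 2 * ma.group(1)

# Same patterns as above, compiled with re.I to reproduce the problem.
reg_i = re.compile(r'([bcdfghj-np-tv-z])(\1?)', re.I)
dereg_i = re.compile('science([^aeiou])|([^aeiou])ook', re.I)

strt = 'dD'
resu = reg_i.sub(Frepl, strt)      # under re.I the backreference \1 also matches 'D'
bakk = dereg_i.sub(Fderepl, resu)  # the letter is doubled using the case of the second one

# Without the flag, 'D' is left untouched and the round trip is exact.
reg = re.compile(r'([bcdfghj-np-tv-z])(\1?)')
dereg = re.compile('science([^aeiou])|([^aeiou])ook')
fixed = dereg.sub(Fderepl, reg.sub(Frepl, strt))

print(strt, resu, bakk, fixed)
```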

Secondly, I extended the code with a dictionary.

Instead of replacing a letter with letter + 'ook', the code now replaces each letter with the value stored for it in the dictionary.

For example, I chose to replace 'b' with 'BAR', 'c' with 'CORE', and so on. I wrote the dictionary values in uppercase to make the result easier to read, but they could be anything else.

The program is case-sensitive. I put only 'T', 'Y' and 'X' in the dictionary as uppercase keys, just as a trial.

import re

# Letter -> replacement word.
d = {'b': 'BAR', 'c': 'CORE', 'd': 'DEAD', 'f': 'FAN',
     'g': 'GO', 'h': 'HHH', 'j': 'JIU', 'k': 'KOAN',
     'l': 'LOW', 'm': 'MY', 'n': 'NERD', 'p': 'PI',
     'q': 'QIM', 'r': 'ROAR', 's': 'SING', 't': 'TIP',
     'v': 'VIEW', 'w': 'WAVE', 'x': 'XOR',
     'y': 'YEAR', 'z': 'ZOO',
     'T': 'tears', 'Y': 'yearling', 'X': 'xylophone'}

# Inverse mapping (replacement word -> letter), used to undo the substitution.
ded = dict((v, k) for k, v in d.items())
print(ded)

tu = ('This is my first regex python example '
      'Yahooa yahoouuee bbbiirdd',
      'bbbiirdd',
      'fookirooksooktook',
      'crrsciencezxxxxxXscienceokjjsciencq')

reg = re.compile(r'([bcdfghj-np-tv-zBCDFGHJ-NP-TV-Z])(\1?)')

# Alternation of all replacement words.
othergr = '|'.join(ded.keys())
dereg = re.compile('science([^aeiouAEIOU])|(%s)' % othergr)

def Frepl(ma, d=d):
    g1, g2 = ma.groups()
    if g2:
        return 'science' + g2   # doubled consonant
    else:
        return d[g1]            # single consonant: dictionary lookup

def Fderepl(ma, ded=ded):
    g = ma.group(2)
    if g:
        return ded[g]           # replacement word -> original letter
    else:
        return 2 * ma.group(1)  # 'science' + consonant -> doubled consonant

for strt in tu:
    resu = reg.sub(Frepl, strt)
    bakk = dereg.sub(Fderepl, resu)
    print('----------------------------------\n'
          'strt = %s\n' 'resu == %s\n'
          'bakk == %s\n' 'bakk == start : %s'
          % (strt, resu, bakk, bakk == strt))

Result (the printed ded dictionary is omitted):

----------------------------------
strt = This is my first regex python example Yahooa yahoouuee bbbiirdd
resu == tearsHHHiSING iSING MYYEAR FANiROARSINGTIP ROAReGOeXOR PIYEARTIPHHHoNERD eXORaMYPILOWe yearlingaHHHooa YEARaHHHoouuee sciencebBARiiROARscienced
bakk == This is my first regex python example Yahooa yahoouuee bbbiirdd
bakk == start : True
----------------------------------
strt = bbbiirdd
resu == sciencebBARiiROARscienced
bakk == bbbiirdd
bakk == start : True
----------------------------------
strt = fookirooksooktook
resu == FANooKOANiROARooKOANSINGooKOANTIPooKOAN
bakk == fookirooksooktook
bakk == start : True
----------------------------------
strt = crrsciencezxxxxxXscienceokjjsciencq
resu == COREsciencerSINGCOREieNERDCOREeZOOsciencexsciencexXORxylophoneSINGCOREieNERDCOREeoKOANsciencejSINGCOREieNERDCOREQIM
bakk == crrsciencezxxxxxXscienceokjjsciencq
bakk == start : True

Regular expression to reverse order of words in a string

You can use re.sub() for this:

>>> import re
>>> a="The big, fast bug ate the slower one. The quick, brown fox jumps over the lazy dog"
>>> re.sub(r'\s(\w*),\s+(\w*)\s',r' \2, \1 ',a)
'The fast, big bug ate the slower one. The brown, quick fox jumps over the lazy dog'

It only swaps pairs of words separated by a comma, leaving the rest of the string as it is.
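One possible variation, sketched here and not taken from the answer itself: because the pattern above requires whitespace on both sides, a pair at the very start or end of the string is left alone. Using \b word boundaries instead of \s, and \w+ instead of \w*, appears to handle those cases too. The helper name swap_comma_pairs is made up for the example.

```python
import re

def swap_comma_pairs(s):
    # \b anchors replace the surrounding \s so that pairs at the very start
    # or end of the string are swapped too; \w+ avoids matching empty "words".
    return re.sub(r'\b(\w+),\s+(\w+)\b', r'\2, \1', s)

a = "The big, fast bug ate the slower one. The quick, brown fox jumps over the lazy dog"
print(swap_comma_pairs(a))
print(swap_comma_pairs("big, fast"))
```

Note that this variant normalizes the whitespace after the comma to a single space.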

Reversing Python's re.escape

So is this really the only thing that works?

>>> re.sub(r'\\(.)', r'\1', re.escape(' '))
' '

Yes. The source for the re module contains no unescape() function, so you're definitely going to have to write one yourself.

Furthermore, the re.escape() function uses str.translate():

def escape(pattern):
    """
    Escape special characters in a string.
    """
    if isinstance(pattern, str):
        return pattern.translate(_special_chars_map)
    else:
        pattern = str(pattern, 'latin1')
        return pattern.translate(_special_chars_map).encode('latin1')

… which, while it can transform a single character into multiple characters (e.g. '[' becomes '\['), cannot perform the reverse of that operation.

Since there's no direct reversal of escape() available via str.translate(), a custom unescape() function using re.sub(), as described in your question, is the most straightforward solution.
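A minimal sketch of such a helper follows. The name unescape is hypothetical (it is not part of the re module), and the sketch assumes Python 3.7 or later, where re.escape() marks each escaped character with a single leading backslash. Older versions encoded the NUL character as \000, which this would not reverse.

```python
import re

def unescape(escaped):
    # Drop the backslash in front of every escaped character. re.DOTALL lets
    # '.' match a newline, which re.escape() also prefixes with a backslash.
    return re.sub(r'\\(.)', r'\1', escaped, flags=re.DOTALL)

# Round-trip check on a few sample strings, including backslashes and a newline.
samples = ['a.b*c', r'C:\temp\[x]', 'line one\nline two', ' ']
round_trips = [unescape(re.escape(s)) == s for s in samples]
print(round_trips)
```

The re.DOTALL flag is the one detail the one-liner in the question leaves out; without it, an escaped newline would keep its backslash.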

How to reverse punctuation marks using regular expression in python?

Try:

file_content = re.sub(r'^(.*)(؟)$', r'\2\1', file_content, flags=re.MULTILINE)
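For reference, Python's re module writes group references in the replacement string as \1, \2 (or \g<1>), not as $1, $2. Below is a small sketch of the substitution on made-up sample content; the variable name moved is only for the illustration.

```python
import re

# Hypothetical sample: only the first line ends with the Arabic question mark.
file_content = "كيف حالك؟\nno question here"

# \1 and \2 refer to the captured groups; re.MULTILINE makes ^ and $ match
# at the start and end of every line rather than of the whole string.
moved = re.sub(r'^(.*)(؟)$', r'\2\1', file_content, flags=re.MULTILINE)
print(moved)
```

Lines that do not end with the mark are left unchanged.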

