Re.sub not working for me
You are assigning the result of re.sub back to a variable, right? For example:
lines = re.sub(pattern, key[1], lines)
lines is a string, and strings are immutable in Python, so re.sub cannot change it in place; it creates a new string and returns it. If you don't assign the result to a name, it is lost.
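A minimal sketch of the difference, using a made-up string and pattern (neither comes from the question):

```python
import re

lines = "foo=1\nfoo=2\n"

# The return value is discarded here, so `lines` still holds the old text.
re.sub(r"foo", "bar", lines)
unchanged = lines

# Binding the result back to the name keeps the new string.
lines = re.sub(r"foo", "bar", lines)

print(repr(unchanged))  # 'foo=1\nfoo=2\n'
print(repr(lines))      # 'bar=1\nbar=2\n'
```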
re.sub Not working for me with python
You have to assign the result back to page:
page = re.sub("&",'',page)
re.sub not replacing the string
The symbols [ and ] have a special meaning in regular expressions, so you have to escape them:
>>> re.sub(r'as_Points\[0\]\.ub_X', '0x00', text)
'AFL_v_CalcOneIntAreas (%0x00%);\n'
For instance, [a-z] matches any lowercase letter. Square brackets denote «any one of the characters inside them», so [01] matches 0 or 1.
In your case the pattern 'as_Points[0].ub_X' is in fact equivalent to 'as_Points0.ub_X'.
Note that . has a special meaning too: it matches any single character (except a newline, by default), so you should escape it as well.
If you don't know if your expression contains characters you should escape, you can use re.escape:
>>> someExpression = "as_Points[0].ub_X"
>>> re.escape(someExpression)
'as_Points\\[0\\]\\.ub_X'
>>> re.sub(re.escape(someExpression), '0x00', text)
'AFL_v_CalcOneIntAreas (%0x00%);\n'
But if you don't need regular expression power, strings have the replace method:
text.replace('as_Points[0].ub_X','0x00')
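To see why the unescaped pattern fails, here is a small sketch. The value of text is an assumption reconstructed from the output shown above, and the second test string is made up for illustration:

```python
import re

# Assumed input, reconstructed from the output in the answer.
text = "AFL_v_CalcOneIntAreas (%as_Points[0].ub_X%);\n"
literal = "as_Points[0].ub_X"

# Unescaped: [0] is a character class and . matches any character,
# so the pattern means "as_Points0", then one character, then "ub_X".
unescaped_hits_literal = re.search(literal, text) is not None
unescaped_hits_other = re.search(literal, "as_Points0-ub_X") is not None

# Escaped: every metacharacter is matched literally.
escaped = re.sub(re.escape(literal), "0x00", text)

print(unescaped_hits_literal)  # False
print(unescaped_hits_other)    # True
print(repr(escaped))           # 'AFL_v_CalcOneIntAreas (%0x00%);\n'
```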
Python re.sub not working as expected
With that, you're only stripping off spaces following <br>. You can instead use a positive lookahead to remove every <br> that has another <br> immediately following:
re.sub(r'<br>(?=<br>)', '', _str)
To also handle whitespace between <br> tags, use:
re.sub(r'<br>(?=\s*<br>)', '', _str)
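A quick comparison of the two patterns on a made-up input (the sample string is an assumption, not taken from the question). Note that the lookahead does not consume anything, so the whitespace between the tags is left in place:

```python
import re

_str = "a<br><br> <br>b<br>c"

# Removes a <br> only when the next <br> follows with nothing in between.
adjacent_only = re.sub(r'<br>(?=<br>)', '', _str)

# Also removes a <br> when only whitespace separates it from the next one.
with_spaces = re.sub(r'<br>(?=\s*<br>)', '', _str)

print(adjacent_only)  # a<br> <br>b<br>c
print(with_spaces)    # a <br>b<br>c
```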
Why Does re.sub() Not Work in Python 3.6?
I would replace all this with a single call to str.translate, since you are only making single-character-to-single-character replacements.
You'll just need to define a single dict (that you can reuse for every call to str.translate) that maps each character's code point to its replacement. Characters that stay the same do not need to be added to the mapping.
replacements = {}
replacements.update(dict.fromkeys(range(0x2000, 0x2070), " "))
replacements[0x1680] = ' '
# etc
string = string.translate(replacements)
You can also use str.maketrans to construct an appropriate translation table from a char-to-char mapping.
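As a rough sketch of the str.maketrans variant: the three characters below are hypothetical picks for illustration, since the question's full replacement list is not shown here.

```python
# Hypothetical mapping; extend it with whatever characters you actually need.
table = str.maketrans({
    "\u1680": " ",   # OGHAM SPACE MARK
    "\u2003": " ",   # EM SPACE
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
})

string = "a\u1680b\u2003c\u2019d"
cleaned = string.translate(table)

print(cleaned)  # a b c'd
```

The table only has to be built once and can then be passed to every str.translate call.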
Why re.sub() adds not matched string by default in Python?
You seem to have a misunderstanding of what sub does. It substitutes the text matched by the regex. The regex r'(size:)\D+(\d+)\D+(\d+)\D+(\d+)' matches only part of your string, so ONLY THE MATCHING PART is substituted; the capture groups do not affect this.
What you can do (if you don't want to add .* at the beginning and the end) is use re.findall like this:
re.findall(
r'(size:)\D+(\d+)\D+(\d+)\D+(\d+)',
'START, size: 100Х200 x 50, END'
)
which will return [('size:', '100', '200', '50')]; you can then format it as you wish.
One way to do it as a one-liner, with no error handling, is:
'{1}x{2}x{3}'.format(
*re.findall(
r'(size:)\D+(\d+)\D+(\d+)\D+(\d+)',
'START, size: 100Х200 x 50, END')[0]
)
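If you do want error handling, one possible sketch uses re.search and checks for a failed match. The helper name format_size and the second test string are made up for illustration:

```python
import re

# Same pattern as above, minus the group around "size:" since it is not needed.
SIZE_RE = re.compile(r'size:\D+(\d+)\D+(\d+)\D+(\d+)')

def format_size(text):
    """Return the three numbers joined by 'x', or None if nothing matches."""
    m = SIZE_RE.search(text)
    if m is None:
        return None
    return 'x'.join(m.groups())

print(format_size('START, size: 100Х200 x 50, END'))  # 100x200x50
print(format_size('no dimensions here'))              # None
```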
Python re.sub() is not replacing every match
The site explains it well; hover over the pattern and use the explanation section.
(.)(.*?)\1 does not remove or match every double occurrence. It matches one character, followed by as few characters as possible, up to the point where that same character is encountered again.
So, for abbcabb, the "sandwiched" portion is bbc, between the two a's.
EDIT:
You can try something like this instead without regexes:
string = "abbcabb"
result = []
for i in string:
    if i not in result:
        result.append(i)
    else:
        result.remove(i)
print(''.join(result))
Note that this keeps each character at the position of its "last" odd occurrence, not its first.
To keep the "first" occurrence, you should use a counter as suggested in this answer. Just change the condition to check for odd counts, e.g. count[letter] % 2 == 1.
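One way to read the counter suggestion, sketched with collections.Counter (the function name odd_count_chars is made up, and this is an interpretation rather than the linked answer's code):

```python
from collections import Counter

def odd_count_chars(s):
    """Keep the first occurrence of each character whose total count is odd."""
    counts = Counter(s)
    seen = set()
    out = []
    for ch in s:
        if counts[ch] % 2 == 1 and ch not in seen:
            seen.add(ch)
            out.append(ch)
    return ''.join(out)

print(odd_count_chars("abbcabb"))  # c
print(odd_count_chars("abcbb"))    # abc
```

For "abcbb" the loop version above gives acb instead, because b ends up at the position of its third occurrence.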
Re.sub in python not working
While the other answer is technically absolutely correct, I don't think what is mentioned there is what you want.
Instead, you might want to work with a match object:
m = re.search(r'href="([\w:/.]+)"', s, re.I)
print(m.expand(r"url: \1"))
which results in
url: http://www.google.com
without the <A before and the ID="test">blah</A> behind.
If you want to do more of these replacements, you might even want to reuse the regex by compiling it:
r = re.compile(r'href="([\w:/.]+)"', re.I)
ex = lambda st: r.search(st).expand(r"url: \1")
print(ex('<A HREF="http://www.google.com" ID="test">blah</A>'))
print(ex('<A HREF="http://www.yahoo.com" ID="test">blah</A>'))
# and so on.
If, however, you indeed want to keep the HTML around it, you'll have to work with lookahead and lookbehind expressions:
re.sub(r'(?<=href=")([\w:/.]+)(?=")', "url: " + r'\1', s, flags=re.I)
# -> '<A HREF="url: http://www.google.com" ID="test">blah</A>'
or simply by repeating the omitted stuff:
re.sub(r'href="([\w:/.]+)"', r'href="url: \1"', s, flags=re.I)
# -> '<A href="url: http://www.google.com" ID="test">blah</A>'
python re.sub not replacing all the occurance of string
I would use re.findall here, rather than trying to do a replacement to remove the portions you don't want:
src = "http://www.google.com/#image-1CCCC| http://www.google.com/#image-1VVDD| http://www.google.com/#image-123| http://www.google.com/#image-123| http://www.google.com/#image-1CE005XG03"
matches = re.findall(r'https?://www\.\S+#([^|\s]+)', src)
output = '|'.join(matches)
print(output) # image-1CCCC|image-1VVDD|image-123|image-123|image-1CE005XG03
Note that if you want to be more specific and match only Google URLs, you may use the following pattern instead:
https?://www\.google\.\S+#([^|\s]+)
why does re.sub replaces none of the occurrences even there is already pattern, repl and string added
You need to assign the output of re.sub back to the original variable.
data = re.sub(r"\b{}\b".format(oldstring), newstring, data)
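If oldstring may contain regex metacharacters, it is probably worth passing it through re.escape before formatting it into the pattern. A small sketch with made-up values for data, oldstring and newstring:

```python
import re

data = "v1.0 v1x0 v1.01"
oldstring = "v1.0"
newstring = "v2"

# Unescaped: the . in oldstring matches any character, so "v1x0" is replaced too.
loose = re.sub(r"\b{}\b".format(oldstring), newstring, data)

# Escaped: only the literal text "v1.0" as a whole word is replaced.
strict = re.sub(r"\b{}\b".format(re.escape(oldstring)), newstring, data)

print(loose)   # v2 v2 v1.01
print(strict)  # v2 v1x0 v1.01
```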