Regex to remove commas before a number in python
Use a zero-width negative lookahead to make sure the commas to be removed are not followed by {space(s)}{digit} at the end of the string:
,(?!\s+\d$)
Example:
In [227]: text = '52A, XYZ Street, ABC District, 2'
In [228]: re.sub(r',(?!\s+\d$)', '', text)
Out[228]: '52A XYZ Street ABC District, 2'
Edit:
If you have more commas after the ,{space(s)}{digit} substring and want to keep them all, add a negative lookbehind to make sure the commas are not preceded by {space}{digit or [A-Z]}:
(?<!\s[\dA-Z]),(?!\s+\d,?)
Example:
In [229]: text = '52A, XYZ Street, ABC District, 2, M, Brown'
In [230]: re.sub(r'(?<!\s[\dA-Z]),(?!\s+\d,?)', '', text)
Out[230]: '52A XYZ Street ABC District, 2, M, Brown'
In [231]: text = '52A, XYZ Street, ABC District, 2'
In [232]: re.sub(r'(?<!\s[\dA-Z]),(?!\s+\d,?)', '', text)
Out[232]: '52A XYZ Street ABC District, 2'
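To see what the lookbehind contributes, the sketch below runs the same sample through the pattern with and without it. The variable names are illustrative only.

```python
import re

text = '52A, XYZ Street, ABC District, 2, M, Brown'

# Lookahead only: the commas after "2" and "M" are not followed by a digit,
# so they get removed as well.
lookahead_only = re.sub(r',(?!\s+\d,?)', '', text)

# With the lookbehind: commas preceded by {space}{digit or A-Z} are kept.
with_lookbehind = re.sub(r'(?<!\s[\dA-Z]),(?!\s+\d,?)', '', text)

print(lookahead_only)   # 52A XYZ Street ABC District, 2 M Brown
print(with_lookbehind)  # 52A XYZ Street ABC District, 2, M, Brown
```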
Remove comma only from number separators (regular expression grouping)
Using @uingtea's regex, but for a pandas DataFrame, you can do it this way:
import pandas as pd
import re
df = pd.DataFrame({'col':['Hello, world!', 'Warhammer 40,000', 'Codename 1,337']})
df['col'] = df['col'].apply(lambda x: re.sub(r'(\d+),(\d+)', r'\1\2', x))
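One caveat, sketched here with plain re rather than pandas: because (\d+),(\d+) consumes the digits on both sides of the comma, a number with several separators such as 1,234,567 is only partly cleaned in a single pass. A lookaround variant (an alternative, not the pattern from the answer above) does not consume the digits and so handles every separator. The third sample string is made up for illustration.

```python
import re

samples = ['Hello, world!', 'Warhammer 40,000', 'Population 1,234,567']

# Grouped pattern: the digits after the first comma are consumed by the match,
# so the second comma in 1,234,567 is never reached.
grouped = [re.sub(r'(\d+),(\d+)', r'\1\2', s) for s in samples]

# Lookaround pattern: only the comma itself is matched and removed.
lookaround = [re.sub(r'(?<=\d),(?=\d)', '', s) for s in samples]

print(grouped)     # ['Hello, world!', 'Warhammer 40000', 'Population 1234,567']
print(lookaround)  # ['Hello, world!', 'Warhammer 40000', 'Population 1234567']
```

With pandas, the same lookaround pattern should also work through df['col'].str.replace(r'(?<=\d),(?=\d)', '', regex=True).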
regex in Python to remove commas and spaces
You can use re.split to create a list, then filter out the empty strings:
import re
s='word1 , word2 , word3, '
r = re.split(r"[^a-zA-Z\d]+", s)
ans = ','.join([i for i in r if len(i) > 0])  # 'word1,word2,word3'
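Note that splitting on every non-alphanumeric character also breaks up items that contain spaces or punctuation. If that matters, a possible variant is to split only on commas with optional surrounding whitespace. This is a sketch; normalize_list is a hypothetical helper name.

```python
import re

def normalize_list(s):
    """Split on commas (with optional surrounding whitespace) and drop empty items."""
    parts = re.split(r'\s*,\s*', s.strip())
    return ','.join(p for p in parts if p)

print(normalize_list('word1 , word2 , word3, '))    # word1,word2,word3
print(normalize_list('New York , St. Louis,Rome'))  # New York,St. Louis,Rome
```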
Removing the . (full-stop) and , (commas) that occurs in-between numbers in python
You can use this pattern
(\d)[,.](\d)
Replace by \1\2
If there are numbers with multiple . or , you can use lookarounds:
(?<=\d)[,.](?=\d)
(?<=\d) - Match must be preceded by a digit
[,.] - Match , or .
(?=\d) - Match must be followed by a digit
Replace by empty string
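A short Python sketch of the difference between the two patterns, using a made-up sample string. The grouped pattern consumes the digit after each separator, so separators that are only one digit apart can be missed; the lookaround pattern does not have that problem.

```python
import re

s = '1,2,3 and 1.000,50'

# Capturing the neighbouring digits: "1,2" is replaced by "12", but the "2"
# is already consumed, so ",3" is left alone.
grouped = re.sub(r'(\d)[,.](\d)', r'\1\2', s)

# Lookarounds only assert the neighbouring digits, so every separator is removed.
lookaround = re.sub(r'(?<=\d)[,.](?=\d)', '', s)

print(grouped)     # 12,3 and 100050
print(lookaround)  # 123 and 100050
```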
Regex to remove commas from numbers under 10,000
To match numbers under 10,000, you could match a single digit before the comma instead of 2, and match 1-3 digits after the comma to also match 1,9 for example.
To prevent a partial match, you could assert whitespace boundaries.
(?<!\S)(?<d1>\d),(?<d2>\d{1,3})(?!\S)
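The (?<d1>...) named-group syntax is not accepted by Python's re module, which spells named groups as (?P<name>...). A minimal Python sketch of the same pattern, with a made-up sample string:

```python
import re

# Same pattern, with Python-style named groups.
pattern = re.compile(r'(?<!\S)(?P<d1>\d),(?P<d2>\d{1,3})(?!\S)')

text = 'Paid 1,234 of 12,345 and 9,99 total'

# \g<name> refers to a named group in the replacement string.
result = pattern.sub(r'\g<d1>\g<d2>', text)

print(result)  # Paid 1234 of 12,345 and 999 total
```

The two-digit prefix in 12,345 is left untouched because only a single digit is allowed before the comma and the whitespace boundary rules out a partial match.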
How can I remove commas while using regex.findall?
Try using the following regex pattern:
Balance: (\d{1,3}(?:,\d{3})*)
This will match only a comma-separated balance amount, and will not pick up on anything else. Sample script:
import re
txt = "Balance: 47,124, age, ... Balance: 1,234, age ... Balance: 123, age"
amounts = re.findall(r'Balance: (\d{1,3}(?:,\d{3})*)', txt)
amounts = [a.replace(',', '') for a in amounts]
print(amounts)
['47124', '1234', '123']
Here is how the regex pattern works:
\d{1,3} match an initial 1 to 3 digits
(?:,\d{3})* followed by `(,ddd)` zero or more times
So the pattern matches one to three digits (up to 999), and also allows them to be followed by one or more comma-separated thousands groups.
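If the amounts are needed as numbers, the captured strings can be converted after stripping the commas. The sketch below uses a made-up input and also shows one edge case: a malformed group such as 12,34 is captured only up to the last well-formed part.

```python
import re

pattern = re.compile(r'Balance: (\d{1,3}(?:,\d{3})*)')

txt = 'Balance: 47,124, age ... Balance: 1,234,567, age ... Balance: 12,34'

# Strip the thousands separators, then convert each amount to int.
values = [int(a.replace(',', '')) for a in pattern.findall(txt)]

print(values)  # [47124, 1234567, 12]
```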
Cleaning up commas in numbers w/ regular expressions in Python
I think what you're looking for is, assuming that commas will only appear in numbers, and that those entries will always be quoted:
import re
def remove_commas(mystring):
    return re.sub(r'"(\d+?),(\d+?)"', r'\1\2', mystring)
UPDATE:
Adding cdarke's comments below, the following should work for arbitrary-length numbers:
import re
def remove_commas_and_quotes(mystring):
    return re.sub(r'","|",|"', ',', re.sub(r'(?:(\d+?),)', r'\1', mystring))
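Another way to handle arbitrary-length numbers, under the same assumption that the numbers are always quoted, is to match the whole quoted number and strip its commas in a replacement function. This is a sketch of a different approach from the nested re.sub above; the function name and the sample line are hypothetical.

```python
import re

def strip_commas_in_quoted_numbers(line):
    """Replace each quoted run of digits and commas with the bare digits."""
    return re.sub(
        r'"(\d[\d,]*)"',
        lambda m: m.group(1).replace(',', ''),
        line,
    )

line = '"1,234",hello,"5,678,901","a,b"'
print(strip_commas_in_quoted_numbers(line))  # 1234,hello,5678901,"a,b"
```

Quoted fields that do not start with a digit, such as "a,b", are left as they are.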
replace a comma only if is between two numbers
You can use regex lookarounds to restrict the comma: (?<=\d),(?=\d). Use ?<= for a lookbehind and ?= for a lookahead. They are zero-length assertions and don't consume characters, so the text matched inside the lookarounds will not be removed:
import re
re.sub(r'(?<=\d),(?=\d)', '', '123,123 hello,word')
# '123123 hello,word'
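The "zero-length" part can be checked by looking at what the match object actually covers. A small sketch comparing the lookaround pattern with a pattern that consumes the neighbouring digits:

```python
import re

s = '123,123 hello,word'

# Lookarounds: the match is only the comma itself.
m_look = re.search(r'(?<=\d),(?=\d)', s)

# Plain \d on both sides: the digits are part of the match.
m_plain = re.search(r'\d,\d', s)

print(m_look.group(), m_look.span())    # , (3, 4)
print(m_plain.group(), m_plain.span())  # 3,1 (2, 5)
```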
How to find commas amongst any letters in strings and remove commas using regex?
import re

# Read the whole input file.
with open('INFILE.txt', 'r') as infile:
    contents = infile.read()

# Remove a comma only when it sits directly between two letters.
# r'\1\2' keeps both surrounding letters; r'\1' alone would also drop the second letter.
replaced = re.sub(r'([A-Za-z]),([A-Za-z])', r'\1\2', contents)
print(replaced)

# Write the cleaned text to a new file.
with open('OUTFILE.txt', 'w') as outfile:
    outfile.write(replaced)
Remove decimal points and commas using regex in python
Inside a character class, ^\d (that is, [^\d]) matches everything that is not a digit.
Instead, you should use (?<=\d)[,\.]. The lookbehind (?<=\d) ensures that there is a digit before the comma or the point.
import re
st = '19.000\n20,000\na.a,a'
print(re.sub(r'(?<=\d)[,\.]','',st))
>> 19000
20000
a.a,a