Regex to Remove Commas Before a Number in Python

Regex to remove commas before a number in Python

Use a zero-width negative lookahead to make sure the substrings to be replaced (commas here) are not followed by {space(s)}{digit} at the end of the string:

,(?!\s+\d$)

Example:

In [227]: text = '52A, XYZ Street, ABC District, 2'

In [228]: re.sub(r',(?!\s+\d$)', '', text)
Out[228]: '52A XYZ Street ABC District, 2'
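
The pattern above protects only a comma that is followed by a single trailing digit. As a minimal sketch, assuming the trailing number may have more than one digit (an assumption, not part of the original answer), \d can be widened to \d+. The helper name strip_commas_except_last_number and the sample address are hypothetical.

```python
import re

# Hypothetical helper: remove every comma except the one that precedes
# a trailing number of any length (assumption: the number ends the string).
TRAILING_NUMBER_COMMA = re.compile(r',(?!\s+\d+$)')

def strip_commas_except_last_number(text):
    return TRAILING_NUMBER_COMMA.sub('', text)

print(strip_commas_except_last_number('52A, XYZ Street, ABC District, 25'))
# 52A XYZ Street ABC District, 25
```

With the original \d$ form, the comma before 25 would be removed as well, because " 2" is not immediately followed by the end of the string.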

Edit:

If you have more commas after the ,{space(s)}{digit} substring, and want to keep them all, leverage a negative lookbehind to make sure the commas are not preceded by {space}{digit<or>[A-Z]}:

(?<!\s[\dA-Z]),(?!\s+\d,?)

Example:

In [229]: text = '52A, XYZ Street, ABC District, 2, M, Brown'

In [230]: re.sub(r'(?<!\s[\dA-Z]),(?!\s+\d,?)', '', text)
Out[230]: '52A XYZ Street ABC District, 2, M, Brown'

In [231]: text = '52A, XYZ Street, ABC District, 2'

In [232]: re.sub(r'(?<!\s[\dA-Z]),(?!\s+\d,?)', '', text)
Out[232]: '52A XYZ Street ABC District, 2'

Remove comma only from number separators (regular expression grouping)

Using @uingtea's regex on a pandas DataFrame, you can do it this way:

import pandas as pd
import re

df = pd.DataFrame({'col':['Hello, world!', 'Warhammer 40,000', 'Codename 1,337']})
df['col'] = df['col'].apply(lambda x: re.sub(r'(\d+),(\d+)', r'\1\2', x))
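
One thing to watch with the capturing-group pattern: it consumes the digits on both sides of the comma, so a number with several separators loses only its first comma in a single pass. The sketch below uses only the standard library to show this. The third sample string and the lookaround variant are additions for illustration, not part of the answer above.

```python
import re

samples = ['Hello, world!', 'Warhammer 40,000', 'Population 1,234,567']

# Capturing-group version from the answer above.
grouped = [re.sub(r'(\d+),(\d+)', r'\1\2', s) for s in samples]

# Lookaround variant (an alternative, not the original answer's pattern):
# the digits are asserted but not consumed, so every separator is matched.
lookaround = [re.sub(r'(?<=\d),(?=\d)', '', s) for s in samples]

print(grouped)     # ['Hello, world!', 'Warhammer 40000', 'Population 1234,567']
print(lookaround)  # ['Hello, world!', 'Warhammer 40000', 'Population 1234567']
```

On a DataFrame column, the lookaround pattern could presumably be passed to Series.str.replace with regex=True instead of using apply.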

regex in Python to remove commas and spaces

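You can use re.split to create a list and then filter out the empty strings:

import re
s = 'word1 , word2 , word3, '
# Split on runs of anything that is not a letter or digit.
r = re.split(r"[^a-zA-Z\d]+", s)
# Drop the empty strings left at the edges, then join with commas.
ans = ','.join([i for i in r if len(i) > 0])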
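
As a possible alternative, sketched here under the same assumption that words consist only of ASCII letters and digits, re.findall can collect the words directly, which avoids the filtering step:

```python
import re

s = 'word1 , word2 , word3, '

# Collect the runs of letters and digits directly instead of splitting
# and then filtering out the empty strings.
tokens = re.findall(r'[a-zA-Z\d]+', s)
ans = ','.join(tokens)
print(ans)  # word1,word2,word3
```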

Removing the . (full-stop) and , (commas) that occurs in-between numbers in python

You can use this pattern

(\d)[,.](\d)

Replace by \1\2

If a number contains several . or , separators, use lookarounds instead:

(?<=\d)[,.](?=\d)
  • (?<=\d) - Match must be preceded by a digit
  • [,.] - Match , or .
  • (?=\d) - Match must be followed by a digit

Replace by empty string
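
A short Python sketch of the difference between the two patterns. The sample strings are made up for illustration.

```python
import re

text = 'Total: 1,234.567,890 units, approx.'

# Capturing groups consume the digits around each separator, so adjacent
# separators that share a digit are skipped in a single pass.
print(re.sub(r'(\d)[,.](\d)', r'\1\2', '1,2,3'))   # 12,3

# Lookarounds assert the digits without consuming them.
print(re.sub(r'(?<=\d)[,.](?=\d)', '', '1,2,3'))   # 123
print(re.sub(r'(?<=\d)[,.](?=\d)', '', text))
# Total: 1234567890 units, approx.
```

The comma after "units" and the final full stop are kept because neither sits between two digits.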

Regex to remove commas from numbers under 10,000

To match numbers under 10,000, you could match a single digit before the comma instead of two, and match 1-3 digits after the comma so that 1,9, for example, is matched as well.

To prevent a partial match, you could assert whitespace boundaries.

(?<!\S)(?<d1>\d),(?<d2>\d{1,3})(?!\S)
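
The (?<name>...) group syntax above is not accepted by Python's re module, which spells named groups as (?P<name>...). A minimal sketch of the same pattern in Python, assuming the goal is to drop the comma from the matched numbers. The sample text is made up.

```python
import re

# Python spells named groups as (?P<name>...), not (?<name>...).
pattern = re.compile(r'(?<!\S)(?P<d1>\d),(?P<d2>\d{1,3})(?!\S)')

text = '1,234 and 12,345 and 9,999 and 1,9'

# \g<name> refers to a named group in the replacement string.
print(pattern.sub(r'\g<d1>\g<d2>', text))
# 1234 and 12,345 and 9999 and 19
```

12,345 is left alone because two digits precede its comma, so the whitespace boundary check fails.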

How can I remove commas while using regex.findall?

Try using the following regex pattern:

Balance: (\d{1,3}(?:,\d{3})*)

This will match only a comma-separated balance amount, and will not pick up on anything else. Sample script:

import re

txt = "Balance: 47,124, age, ... Balance: 1,234, age ... Balance: 123, age"
amounts = re.findall(r'Balance: (\d{1,3}(?:,\d{3})*)', txt)
amounts = [a.replace(',', '') for a in amounts]
print(amounts)

['47124', '1234', '123']

Here is how the regex pattern works:

\d{1,3}       match an initial 1 to 3 digits
(?:,\d{3})*   followed by `,ddd` zero or more times

So the pattern matches 1 to 999, and then allows these same values followed by zero or more comma-separated thousands groups.
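
If the amounts are needed as numbers rather than strings, the same pattern can feed an int conversion. This is a small sketch that reuses the sample text from above:

```python
import re

txt = "Balance: 47,124, age, ... Balance: 1,234, age ... Balance: 123, age"

# Same pattern as above; strip the commas and convert each amount to int.
amounts = [int(a.replace(',', ''))
           for a in re.findall(r'Balance: (\d{1,3}(?:,\d{3})*)', txt)]
print(amounts)  # [47124, 1234, 123]
```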

Cleaning up commas in numbers w/ regular expressions in Python

I think what you're looking for is, assuming that commas will only appear in numbers, and that those entries will always be quoted:

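import re

def remove_commas(mystring):
    # Handles a quoted number with a single comma, e.g. "1,234" -> 1234
    # (the surrounding quotes are dropped as well).
    return re.sub(r'"(\d+?),(\d+?)"', r'\1\2', mystring)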

UPDATE:
Adding cdarke's comments below, the following should work for arbitrary-length numbers:

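import re

def remove_commas_and_quotes(mystring):
    # Inner sub: drop any comma that directly follows a digit.
    # Outer sub: turn the quote delimiters into commas.
    return re.sub(r'","|",|"', ',', re.sub(r'(?:(\d+?),)', r'\1', mystring))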

Replace a comma only if it is between two numbers

You can use regex lookarounds to restrict the comma: (?<=\d),(?=\d). Use ?<= for a lookbehind and ?= for a lookahead. They are zero-length assertions and don't consume characters, so the digits matched inside the lookarounds are not removed:

import re

re.sub(r'(?<=\d),(?=\d)', '', '123,123 hello,word')
# '123123 hello,word'

How to find commas amongst any letters in strings and remove commas using regex?

import re

# Read the whole input file.
with open('INFILE.txt', 'r') as infile:
    contents = infile.read()

# Remove commas that sit directly between two letters.
# r'\1\2' keeps both letters; r'\1' alone would also drop the letter
# that follows the comma.
replaced = re.sub(r'([A-Za-z]),([A-Za-z])', r'\1\2', contents)
print(replaced)

# Write the result to a new file.
with open('OUTFILE.txt', 'w') as outfile:
    outfile.write(replaced)

Remove decimal points and commas using regex in python

[^\d] matches every character that is not a digit.

Instead, you should use (?<=\d)[,\.].

(?<=\d) ensures that there are digits before the comma or the point.

import re

st = '19.000\n20,000\na.a,a'

print(re.sub(r'(?<=\d)[,\.]', '', st))
# 19000
# 20000
# a.a,a
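
Note that the lookbehind alone also removes a comma or full stop that merely follows a digit, such as sentence punctuation. A sketch of the difference, assuming only separators between two digits should go. The sample string and the added lookahead are illustrative additions, not part of the answer above.

```python
import re

st = 'Rooms 19.000 and 20,000. Ends at 5.'

# Lookbehind only: also strips punctuation that merely follows a digit.
print(re.sub(r'(?<=\d)[,.]', '', st))
# Rooms 19000 and 20000 Ends at 5

# Lookbehind plus lookahead: only separators between two digits.
print(re.sub(r'(?<=\d)[,.](?=\d)', '', st))
# Rooms 19000 and 20000. Ends at 5.
```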

