Regular Expression to Find Money Value

You could use a positive lookahead, (?=...), with a word boundary \b after VND:

-?\d+(?:\.\d+)?(?=VND\b)

The pattern breaks down as follows:

  • -? Optional minus sign (to also allow a plus, use an optional character class, [+-]?)
  • \d+ Match one or more digits
  • (?:\.\d+)? An optional non-capturing group matching a dot and one or more digits
  • (?=VND\b) Positive lookahead asserting that what is directly on the right is VND followed by a word boundary
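
As a quick check of the pattern outside Java, here is a minimal Python sketch. The sample string and the helper name find_vnd_amounts are made up for illustration; only the regex itself comes from the answer.

```python
import re

# Pattern from the answer: optional minus, digits, optional decimals,
# and a lookahead requiring "VND" followed by a word boundary.
VND_AMOUNT = re.compile(r"-?\d+(?:\.\d+)?(?=VND\b)")


def find_vnd_amounts(text):
    """Return the numeric parts that are immediately followed by VND."""
    return VND_AMOUNT.findall(text)


# Hypothetical sample input.
sample = "Price: 1500VND, refund -20.5VND, ignored 300VNDX and 42 USD"
print(find_vnd_amounts(sample))  # ['1500', '-20.5']
```

Because VND sits inside the lookahead, it is checked but not included in the match, so only the number is returned.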

In Java, double the backslashes inside the string literal:

-?\\d+(?:\\.\\d+)?(?=VND\\b)

Regex to match dollar sign, money, decimals only

Just replace the space in your negated character class with a closing bracket:

In [37]: x = re.findall(r"\$[^\]]+", y)

In [38]: x
Out[38]: ['$1.19', '$5.29']
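
The transcript above does not show the input y. The following self-contained sketch assumes a hypothetical input in which each price is wrapped in square brackets, which is the situation the negated class [^\]] is meant for.

```python
import re

# Hypothetical input: prices wrapped in square brackets.
y = "apple [$1.19] bread [$5.29]"

# "\$" matches a literal dollar sign; "[^\]]+" then consumes everything
# up to (but not including) the next closing bracket.
x = re.findall(r"\$[^\]]+", y)
print(x)  # ['$1.19', '$5.29']
```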

A regex to get any price string

If you want to match a currency symbol together with the number that follows it, you can combine two expressions:

  • Currency symbol regex: \b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)\b|\$[Ub]|[\p{Sc}ƒ]
  • Number regex: (?<!\d)(?<!\d\.)(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)

The currency symbols are taken from World Currency Symbols. The 3-letter currency codes used in the pattern are only the most commonly used ones, but a comprehensive list can be compiled from the same data.

The combined pattern is:

(?:\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)|\$[Ub]|[\p{Sc}ƒ])\s?(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)

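It is created like this: (?:CUR_SYM_REGEX)\s?NUM_REGEX, with the lookbehinds in the number regex stripped from the pattern since the left-hand context is already defined. The trailing \b after the currency-code list is dropped as well, since a word boundary there would block a code that is directly followed by a digit, as in USD100.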
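
The composition can be sketched in Python. Note the assumptions: the standard re module does not support \p{Sc}, so this sketch replaces the full currency pattern with a deliberately tiny stand-in (four codes and four symbols). The third-party regex module does support \p{Sc} if the full pattern is needed. The names CUR_SYM_REGEX, NUM_REGEX and find_prices are illustrative.

```python
import re

# Reduced stand-in for the full currency-symbol pattern (assumption: only a
# few codes and symbols, because the stdlib re module has no \p{Sc}).
CUR_SYM_REGEX = r"\b(?:USD|EUR|GBP|VND)|[$€£¥]"

# Number pattern from the answer, without the leading lookbehinds.
NUM_REGEX = r"(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)"

# Combined as (?:CUR_SYM_REGEX)\s?NUM_REGEX
PRICE = re.compile(rf"(?:{CUR_SYM_REGEX})\s?{NUM_REGEX}")


def find_prices(text):
    """Return every currency-plus-number substring found in text."""
    return PRICE.findall(text)


print(find_prices("Pay USD 1,250.50 or €99 today, not 12 apples."))
# ['USD 1,250.50', '€99']
```

Wrapping the currency alternatives in (?:...) matters: without the group, the \s? and number part would attach only to the last alternative.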

How to extract money value and currency from narratives which have different format and type of currency?

strapply matches the pattern pat against the character strings in words (first argument) and passes the capture groups (the parenthesized portions of the pattern) as separate arguments to the function given as the third argument. The function may be written in formula notation, with its body on the right-hand side of the tilde. strapply returns the output of that function.

library(gsubfn)

pat <- "(USD|GBP|EUR|\\$) *([0-9.]+)"
currency <- strapply(words, pat, ~ sub("\\$", "USD", ..1), simplify = TRUE)
value <- strapply(words, pat, ~ as.numeric(..2), simplify = TRUE)
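
For readers who do not use R, roughly the same extraction can be sketched in Python with capture groups. The function name extract_money and the sample sentence are assumptions for illustration; the pattern is the one from the R code.

```python
import re

# Same pattern as the R example: a currency marker, optional spaces, a number.
PAT = re.compile(r"(USD|GBP|EUR|\$) *([0-9.]+)")


def extract_money(text):
    """Return (currency, value) pairs, mapping "$" to "USD"."""
    results = []
    for currency, number in PAT.findall(text):
        # Caveat: [0-9.]+ also accepts strings such as "1.2.3" or "12.",
        # and float() raises ValueError on the former.
        results.append(("USD" if currency == "$" else currency, float(number)))
    return results


# Hypothetical narrative text.
print(extract_money("Paid $12.50 upfront and GBP 300 later"))
# [('USD', 12.5), ('GBP', 300.0)]
```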

Regex for currency value

Your question has three parts, and to me it sounds like it is mostly about "learning how to fish", which is great.

A. The Regex You Want

Based on the comments, you are looking for this:

^R\$\d+(?:\.\d{3})*,\d{2}$

B. Explanation of the Regex

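This is a relatively simple regex, and for this you can read an automatically-generated explanation. Several sites do this. Here is one (it will display better on the original site). Note that the sample output appears to describe a comma-grouped pattern of the form \w\^\$(\d{1,3}(\,\d{3})*|(\d+))(\.\d{2})?$, not the regex from part A.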

NODE                     EXPLANATION
--------------------------------------------------------------------------------
\w                       word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
\^                       '^'
--------------------------------------------------------------------------------
\$                       '$'
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
\d{1,3}                  digits (0-9) (between 1 and 3 times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
(                        group and capture to \2 (0 or more times
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
\,                       ','
--------------------------------------------------------------------------------
\d{3}                    digits (0-9) (3 times)
--------------------------------------------------------------------------------
)*                       end of \2 (NOTE: because you are using a
                         quantifier on this capture, only the
                         LAST repetition of the captured pattern
                         will be stored in \2)
--------------------------------------------------------------------------------
|                        OR
--------------------------------------------------------------------------------
(                        group and capture to \3:
--------------------------------------------------------------------------------
\d+                      digits (0-9) (1 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
)                        end of \3
--------------------------------------------------------------------------------
)                        end of \1
--------------------------------------------------------------------------------
(                        group and capture to \4 (optional
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
\.                       '.'
--------------------------------------------------------------------------------
\d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
)?                       end of \4 (NOTE: because you are using a
                         quantifier on this capture, only the LAST
                         repetition of the captured pattern will be
                         stored in \4)
--------------------------------------------------------------------------------
$                        before an optional \n, and the end of the
                         string
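
Another way to make a regex self-explanatory is to write it in verbose mode with inline comments. The sketch below does this in Python for the pattern from part A, ^R\$\d+(?:\.\d{3})*,\d{2}$. The name BRL_PRICE and the sample values are assumptions for illustration.

```python
import re

# The pattern from part A, written with re.VERBOSE so that whitespace is
# ignored and each piece can carry its own comment.
BRL_PRICE = re.compile(
    r"""
    ^R\$            # literal "R$" at the start of the string
    \d+             # one or more digits
    (?:\.\d{3})*    # zero or more groups of a dot and three digits
    ,\d{2}          # a decimal comma and exactly two digits
    $               # end of the string
    """,
    re.VERBOSE,
)

# Hypothetical sample values.
for value in ["R$1.234,56", "R$10,00", "R$1.234.56", "1.234,56"]:
    print(value, bool(BRL_PRICE.match(value)))
```

Only the first two samples match: the third has no decimal comma and the fourth lacks the R$ prefix.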

C. How Can I Learn to Build a Regex Like That

Here are the resources I recommend.

  1. Books: Mastering Regular Expressions (3rd Ed), the Regex Cookbook

  2. Websites: regular-expressions.info, RexEgg, FAQ on SO

  3. Tools: RegexBuddy (commercial but outstanding debugger), regex101.com, Debuggex.com

Regular expression to validate money with minimum value

Reading the comments, I'm not sure regex is the best way forward, yet you seem determined. It seems that you are looking to validate a comma-separated number with a minimum value of 20,000, where every group after the first is exactly 3 digits long. I came up with:

^(?:[2-9]\d|[1-9]\d\d|[1-9],\d{3})(?:,\d{3})+$

  • ^ - Start string anchor.
  • (?: - Open 1st non-capture group.
    • [2-9]\d - A digit ranging from 2-9 followed by any digit.
    • | - Or.
    • [1-9]\d\d - A digit ranging from 1-9 followed by any two digits.
    • | - Or.
    • [1-9],\d{3} - A digit ranging from 1-9 followed by a comma and any three digits.
    • ) - Close 1st non-capture group.
  • (?: - Open 2nd non-capture group.
    • ,\d{3} - A comma followed by any three digits.
    • )+ - Close 2nd non-capture group and repeat at least once.
  • $ - End string anchor.
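
To see how this pattern behaves on concrete inputs, here is a small Python sketch. The helper name is_valid and the sample values are assumptions. The last sample shows a limitation worth knowing: none of the three alternatives allows a leading 1 followed by a second digit before the first comma, so 10,000,000 is rejected.

```python
import re

# Pattern from the answer, unchanged.
MIN_20K = re.compile(r"^(?:[2-9]\d|[1-9]\d\d|[1-9],\d{3})(?:,\d{3})+$")


def is_valid(value):
    """Return True if value matches the minimum-20,000 pattern."""
    return MIN_20K.match(value) is not None


# Hypothetical sample values.
for value in ["20,000", "19,999", "999,999", "1,000,000", "20000", "10,000,000"]:
    print(value, is_valid(value))
```

The first, third and fourth samples are accepted; the others are rejected.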

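Note that the pattern above rejects values from 10,000,000 to 19,999,999, because none of its three alternatives allows a leading 1 followed by a second digit before the first comma. As an alternative without this gap you could use lookaheads, e.g.: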

^(?=.{6,})(?!1.{5}$)[1-9]\d?\d?(?:,\d{3})+$

  • ^ - Start string anchor.
  • (?=.{6,}) - Positive lookahead for at least 6 characters.
  • (?!1.{5}$) - Negative lookahead for 1 followed by exactly 5 characters up to the end of the string.
  • [1-9]\d?\d? - A digit ranging from 1-9 followed by two optional digits (you could also write [1-9]\d{0,2}).
  • (?: - Open non-capture group.
    • ,\d{3} - A comma followed by any three digits.
    • )+ - Close non-capture group and repeat at least once.
  • $ - End string anchor.
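
One way to gain confidence in the lookahead version is to compare it with the plain numeric rule "value >= 20,000" on numbers formatted with thousands separators. The sketch below does that in Python for a handful of boundary values. The helper accepts and the probe list are assumptions for illustration, not an exhaustive proof.

```python
import re

# Lookahead pattern from the answer, unchanged.
MIN_20K_LOOKAHEAD = re.compile(r"^(?=.{6,})(?!1.{5}$)[1-9]\d?\d?(?:,\d{3})+$")


def accepts(n):
    """Format n with thousands separators and test it against the pattern."""
    return MIN_20K_LOOKAHEAD.match(f"{n:,}") is not None


# Probe values around each interesting boundary.
probes = [0, 999, 1_000, 9_999, 10_000, 19_999, 20_000, 99_999,
          100_000, 999_999, 1_000_000, 10_000_000, 19_999_999]

# Collect every probe where the regex disagrees with the numeric rule.
mismatches = [n for n in probes if accepts(n) != (n >= 20_000)]
print(mismatches)  # []
```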

Regular Expression for Currency

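If you want to disallow a zero value such as 0.00, you can use the pattern below. As written it requires comma grouping, so 1150.25 is rejected; to also allow numbers without a digit grouping symbol, replace (?:,\d{3})* with (?:,\d{3}|\d)*.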

 /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test(your_str)

Explanation:

  • ^ - start of string
  • (?!0+\.0+$) - negative lookahead that fails the match if the whole input is zero (only zeros on both sides of the dot, e.g. 0.00)
  • \d{1,3} - 1 to 3 digits
  • (?:,\d{3})* - zero or more sequences of a comma followed by 3 digits
  • \. - a literal dot
  • \d{2} - 2 digits (decimal part)
  • $ - end of string.

// Variant that also accepts ungrouped digits: (?:,\d{3}|\d)*
document.body.innerHTML = /^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$/.test("1,150.25"); // true
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$/.test("0.25"); // true
// Strict variant that requires comma grouping: (?:,\d{3})*
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("25"); // false
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("0.00"); // false
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("1150.25"); // false
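
The difference between the two variants used in the snippet is easier to see side by side. This Python sketch runs the same inputs through both; the names STRICT and LENIENT are assumptions for illustration.

```python
import re

# Requires comma grouping for values of 1,000 and above.
STRICT = re.compile(r"^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$")

# Also accepts digits without a grouping symbol.
LENIENT = re.compile(r"^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$")

for value in ["1,150.25", "1150.25", "0.25", "0.00", "25"]:
    print(value, bool(STRICT.match(value)), bool(LENIENT.match(value)))
```

Only 1150.25 is treated differently: the strict variant rejects it and the lenient one accepts it. Both reject 0.00 and 25.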

regex for money values in JavaScript

This should work:

isValid = str.search(/^\$?[\d,]+(\.\d*)?$/) >= 0;

A little more strict with comma placement (would reject 3,2.10, for example):

isValid = str.search(/^\$?\d+(,\d{3})*(\.\d*)?$/) >= 0;

To get a number out of it:

if (isValid) {
  var num = Number(str.replace(/[\$,]/g, ''));
  ...
}
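
The validate-then-convert steps can be sketched the same way in Python, using the stricter of the two patterns. The function name parse_money is an assumption for illustration, and float mirrors the JavaScript Number() call.

```python
import re

# Stricter pattern from the answer: commas must be followed by three digits.
MONEY = re.compile(r"^\$?\d+(,\d{3})*(\.\d*)?$")


def parse_money(text):
    """Return the amount as a float, or None if text is not a valid value."""
    if MONEY.match(text) is None:
        return None
    # Strip the dollar sign and the grouping commas before converting.
    return float(re.sub(r"[$,]", "", text))


print(parse_money("$1,234.50"))  # 1234.5
print(parse_money("3,2.10"))     # None
```

For real monetary arithmetic, decimal.Decimal is usually a safer target than float.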

