Regular expression to find money value
You could use a positive lookahead (?=...) and a word boundary \b after VND:
-?\d+(?:\.\d+)?(?=VND\b)
That would match:
-? - Optional minus sign (to also allow a plus, you could use an optional character class [+-]?)
\d+ - Match one or more digits
(?:\.\d+)? - An optional non-capturing group matching a dot and one or more digits
(?=VND\b) - Positive lookahead that asserts what is on the right is VND followed by a word boundary
In Java, with the backslashes doubled for the string literal:
-?\\d+(?:\\.\\d+)?(?=VND\\b)
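For comparison, here is a minimal Python sketch of the same lookahead pattern. The sample text is made up for illustration and is not from the question.

```python
import re

# Same pattern as above; a raw string avoids doubling the backslashes.
AMOUNT_BEFORE_VND = re.compile(r"-?\d+(?:\.\d+)?(?=VND\b)")

# Hypothetical sample text (an assumption, not taken from the question).
text = "Paid 12000VND, refund -3500.5VND, ignore 99VNDX and 42 USD"

# The lookahead checks for VND without consuming it, so only the numbers
# are returned. 99VNDX is skipped because \b fails between D and X.
amounts = AMOUNT_BEFORE_VND.findall(text)
print(amounts)  # ['12000', '-3500.5']
```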
Regex to match dollar sign, money, decimals only
Just replace the space within your negated character class with a closing bracket:
In [37]: x = re.findall(r"\$[^\]]+", y)
In [38]: x
Out[38]: ['$1.19', '$5.29']
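A self-contained version of that session might look like the sketch below. The sample string y is an assumption, since the original input is not shown here.

```python
import re

# Hypothetical input: the original string from the question is not shown.
y = "[$1.19] [$5.29]"

# \$ matches a literal dollar sign; [^\]]+ then consumes every character
# up to (but not including) the next closing bracket.
x = re.findall(r"\$[^\]]+", y)
print(x)  # ['$1.19', '$5.29']
```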
A regex to get any price string
If you want to match all currency symbols before a number with the number itself, you may combine the two expressions:
- Currency symbol regex:
\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)\b|\$[Ub]|[\p{Sc}ƒ]
- Number regex:
(?<!\d)(?<!\d\.)(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
The currency symbols are taken from World Currency Symbols. The 3-letter currency codes used in the pattern are only the most commonly used ones, but a comprehensive list can be compiled from the same data.
The answer is
(?:\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)|\$[Ub]|[\p{Sc}ƒ])\s?(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
It is created like this: (?:CUR_SYM_REGEX)\s?NUM_REGEX, with the lookbehinds in the number regex stripped from the pattern since the left-hand context is already defined by the currency symbol.
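The composition step can be sketched in Python as follows. This is only an illustration under stated assumptions: the currency alternatives are cut down to a small hypothetical subset, and the \p{Sc} class is replaced by a few explicit symbols because Python's built-in re module does not support \p{...} properties.

```python
import re

# Assumption: a deliberately small subset of the currency alternatives above.
# A few symbols are listed explicitly in place of \p{Sc}.
CUR_SYM_REGEX = r"\b(?:USD|GBP|EUR|VND)|R\$|[$€£¥]"

# Number regex from above with the two lookbehinds removed.
NUM_REGEX = r"(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)"

# (?:CUR_SYM_REGEX)\s?NUM_REGEX
PRICE = re.compile(rf"(?:{CUR_SYM_REGEX})\s?{NUM_REGEX}")

# Hypothetical sample sentence.
text = "Lunch was $12.50, the flight EUR 1,299 and the book £7.5."

# There are no capturing groups, so findall returns the whole matches.
prices = PRICE.findall(text)
print(prices)  # ['$12.50', 'EUR 1,299', '£7.5']
```

Wrapping the currency part in a non-capturing group keeps its top-level alternation from swallowing the number part.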
How to extract money value and currency from narratives which have different format and type of currency?
strapply extracts the matches to the capture groups (i.e. the parenthesized portions) of the pattern pat from the character strings words (first argument) and passes the capture groups as separate arguments to the function (third argument; the function may be expressed in formula notation, with the body of the function on the right-hand side of the tilde). It returns the output of the function.
library(gsubfn)
pat <- "(USD|GBP|EUR|\\$) *([0-9.]+)"
currency <- strapply(words, pat, ~ sub("\\$", "USD", ..1), simplify = TRUE)
value <- strapply(words, pat, ~ as.numeric(..2), simplify = TRUE)
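For readers who do not use R, roughly the same extraction can be sketched with Python's re module. This is not the gsubfn API, only a comparable approach, and the words list below is a made-up stand-in for the data in the question.

```python
import re

# Same pattern as pat above: group 1 is the currency, group 2 the amount.
PAT = re.compile(r"(USD|GBP|EUR|\$) *([0-9.]+)")

# Hypothetical narratives standing in for the `words` vector.
words = ["paid USD 100.50 today", "cost $20 in total", "EUR5 fee"]

currency = []
value = []
for text in words:
    for m in PAT.finditer(text):
        symbol, amount = m.groups()
        # Normalise the dollar sign to USD, as the sub() call does in R.
        currency.append("USD" if symbol == "$" else symbol)
        value.append(float(amount))

print(currency)  # ['USD', 'USD', 'EUR']
print(value)     # [100.5, 20.0, 5.0]
```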
Regex for currency value
Your question has three parts, and to me it sounds like it is mostly about "learning how to fish", which is great.
A. The Regex You Want
Based on the comments, you are looking for this:
^R\$\d+(?:\.\d{3})*,\d{2}$
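A quick way to try this pattern is a short Python check. The sample strings are assumptions chosen to show what is accepted and what is rejected.

```python
import re

# R$, one or more digits, optional groups of ".ddd", then a comma and
# exactly two decimal digits.
BRL = re.compile(r"^R\$\d+(?:\.\d{3})*,\d{2}$")

# Hypothetical inputs.
samples = ["R$1.234,56", "R$0,99", "R$1,5", "1.234,56", "R$12.34"]
results = {s: BRL.match(s) is not None for s in samples}

for s, ok in results.items():
    print(s, ok)
```

Only the first two samples match: R$1,5 has a single decimal digit, 1.234,56 lacks the R$ prefix, and R$12.34 uses a dot where the decimal comma is expected.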
B. Explanation of the Regex
This is a relatively simple regex, and for this you can read an automatically-generated explanation. Several sites do this. Here is one (it will display better on the original site).
NODE                     EXPLANATION
--------------------------------------------------------------------------------
\w                       word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
\^                       '^'
--------------------------------------------------------------------------------
\$                       '$'
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
  \d{1,3}                digits (0-9) (between 1 and 3 times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
  (                      group and capture to \2 (0 or more times
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
    \,                   ','
--------------------------------------------------------------------------------
    \d{3}                digits (0-9) (3 times)
--------------------------------------------------------------------------------
  )*                     end of \2 (NOTE: because you are using a
                         quantifier on this capture, only the
                         LAST repetition of the captured pattern
                         will be stored in \2)
--------------------------------------------------------------------------------
 |                       OR
--------------------------------------------------------------------------------
  (                      group and capture to \3:
--------------------------------------------------------------------------------
    \d+                  digits (0-9) (1 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
  )                      end of \3
--------------------------------------------------------------------------------
)                        end of \1
--------------------------------------------------------------------------------
(                        group and capture to \4 (optional
                         (matching the most amount possible)):
--------------------------------------------------------------------------------
  \.                     '.'
--------------------------------------------------------------------------------
  \d{2}                  digits (0-9) (2 times)
--------------------------------------------------------------------------------
)?                       end of \4 (NOTE: because you are using a
                         quantifier on this capture, only the LAST
                         repetition of the captured pattern will be
                         stored in \4)
--------------------------------------------------------------------------------
$                        before an optional \n, and the end of the
                         string
C. How Can I Learn to Build a Regex Like That
Here are the resources I recommend.
Books: Mastering Regular Expressions (3rd Ed), the Regex Cookbook
Websites: regular-expressions.info, RexEgg, FAQ on SO
Tools: RegexBuddy (commercial but outstanding debugger), regex101.com, Debuggex.com
Regular expression to validate money with minimum value
Reading the comments, I'm not sure regex is the best way forward, yet you seem determined. It seems that you are looking to validate a comma-separated string that needs to start at 20,000, where every part after the first is 3 digits long. I came up with:
^(?:[2-9]\d|[1-9]\d\d|[1-9]\d{0,2},\d{3})(?:,\d{3})+$
^ - Start string anchor.
(?: - Open 1st non-capture group.
[2-9]\d - A digit ranging from 2-9 followed by any digit.
| - Or.
[1-9]\d\d - A digit ranging from 1-9 followed by any two digits.
| - Or.
[1-9]\d{0,2},\d{3} - A digit ranging from 1-9, up to two more digits, a comma and any three digits (the optional digits are needed so that values such as 10,000,000 are not rejected).
) - Close 1st non-capture group.
(?: - Open 2nd non-capture group.
,\d{3} - A comma followed by any three digits.
)+ - Close 2nd non-capture group and repeat at least once.
$ - End string anchor.
As an alternative you could also use lookaheads, e.g.:
^(?=.{6,})(?!1.{5}$)[1-9]\d?\d?(?:,\d{3})+$
^ - Start string anchor.
(?=.{6,}) - Positive lookahead for 6 or more characters.
(?!1.{5}$) - Negative lookahead for 1 followed by 5 characters till end string.
[1-9]\d?\d? - A digit ranging from 1-9 followed by two optional digits (you could also write [1-9]\d{0,2}).
(?: - Open non-capture group.
,\d{3} - A comma followed by any three digits.
)+ - Close non-capture group and repeat at least once.
$ - End string anchor.
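The lookahead version can be sanity-checked against plain integer comparison. The sketch below formats integers with thousands separators and compares the regex verdict with n >= 20000. The helper name is_valid_amount is made up for this example.

```python
import re

# Lookahead-based pattern from above.
MIN_20K = re.compile(r"^(?=.{6,})(?!1.{5}$)[1-9]\d?\d?(?:,\d{3})+$")

def is_valid_amount(s: str) -> bool:
    """Hypothetical helper: True if s is a grouped amount of at least 20,000."""
    return MIN_20K.match(s) is not None

# f"{n:,}" renders 20000 as "20,000"; the regex should agree with n >= 20000.
candidates = list(range(0, 30_000, 37)) + [99_999, 100_000, 1_000_000, 10_000_000]
mismatches = [n for n in candidates if is_valid_amount(f"{n:,}") != (n >= 20_000)]
print(mismatches)  # []
```

If regex is not a hard requirement, stripping the commas and comparing the integer directly is usually easier to maintain.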
Regular Expression for Currency
If you want to disallow a 0.00 value, and allow numbers without a digit grouping symbol, you can use
/^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test(your_str)
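Explanation:
^ - start of string
(?!0+\.0+$) - negative lookahead that fails the match if the input is zero
\d{1,3} - 1 to 3 digits
(?:,\d{3})* - 0+ sequences of a comma followed with 3 digits
\. - a literal dot
\d{2} - 2 digits (decimal part)
$ - end of string.
To also accept longer numbers written without the grouping symbol (such as 1150.25), add a single-digit alternative to the group: (?:,\d{3}|\d)*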
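// Variant with the extra |\d alternative: ungrouped digits are accepted too
document.body.innerHTML = /^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$/.test("1,150.25"); // true
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$/.test("0.25"); // true
// Strict variant: more than 3 integer digits must be grouped with commas
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("25"); // false, no decimal part
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("0.00"); // false, zero is rejected
document.body.innerHTML += "<br/>" + /^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$/.test("1150.25"); // false, comma missing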
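The difference between the two variants used in the snippet can also be shown side by side. The following Python sketch runs both patterns over the same sample values. The names STRICT and LOOSE are labels made up for this example.

```python
import re

# Strict: integer digits beyond the first 1-3 must come in ",ddd" groups.
STRICT = re.compile(r"^(?!0+\.0+$)\d{1,3}(?:,\d{3})*\.\d{2}$")
# Loose: the extra |\d alternative also accepts ungrouped digits.
LOOSE = re.compile(r"^(?!0+\.0+$)\d{1,3}(?:,\d{3}|\d)*\.\d{2}$")

samples = ["1,150.25", "0.25", "25", "0.00", "1150.25"]

strict = [STRICT.match(s) is not None for s in samples]
loose = [LOOSE.match(s) is not None for s in samples]

print(strict)  # [True, True, False, False, False]
print(loose)   # [True, True, False, False, True]
```

The only sample on which the two disagree is 1150.25, the ungrouped number.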
regex for money values in JavaScript
This should work:
isValid = str.search(/^\$?[\d,]+(\.\d*)?$/) >= 0;
A little more strict with comma placement (would reject 3,2.10, for example):
isValid = str.search(/^\$?\d+(,\d{3})*(\.\d*)?$/) >= 0;
To get a number out of it:
if(isValid) {
var num = Number(str.replace(/[\$,]/g, ''));
...
}
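The same validate-then-convert flow can be sketched in Python. The function name parse_money is hypothetical, and returning a Decimal instead of a float is a design choice of this sketch (it avoids binary rounding of cents), not something the answer above prescribes.

```python
import re
from decimal import Decimal
from typing import Optional

# Stricter pattern from above: optional $, digits, ",ddd" groups, optional decimals.
MONEY = re.compile(r"^\$?\d+(,\d{3})*(\.\d*)?$")

def parse_money(s: str) -> Optional[Decimal]:
    """Hypothetical helper: return the amount, or None if s is not valid."""
    if MONEY.match(s) is None:
        return None
    # Drop the dollar sign and the grouping commas before converting.
    return Decimal(re.sub(r"[$,]", "", s))

print(parse_money("$1,234.50"))  # 1234.50
print(parse_money("3,2.10"))     # None
```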