Regular Expression to Match Exactly 5 Digits

I am reading a text file and want to use a regex to pull out numbers with exactly 5 digits, ignoring letters.

Try this...

var str = 'f 34 545 323 12345 54321 123456',
matches = str.match(/\b\d{5}\b/g);

console.log(matches); // ["12345", "54321"]

The word boundary \b is your friend here.

Update

My regex will get a number like this 12345, but not like a12345. The other answers provide great regexes if you require the latter.
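
To make the difference concrete, here is a small sketch comparing the word-boundary pattern with a digit-lookaround variant. The second pattern and the sample string are assumptions made up for illustration (they are not taken from the other answers), and the lookbehind requires a JavaScript engine that supports it.

```javascript
// Sample input, made up for illustration.
const sample = 'a12345 12345 123456';

// \b needs a word/non-word transition, so the digits glued to "a" are skipped.
const withWordBoundary = sample.match(/\b\d{5}\b/g);

// Assumed alternative: only require that no digit sits on either side.
const withDigitLookarounds = sample.match(/(?<!\d)\d{5}(?!\d)/g);

console.log(withWordBoundary);     // ["12345"]
console.log(withDigitLookarounds); // ["12345", "12345"]
```

Neither pattern picks anything out of 123456, because both refuse a run of more than five digits.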

Regex for a five-digit numerical String

The regex you are using, [0-9]{5}, is quite close to what you wish to achieve. The problem is that it only says "match 5 digits", so it also succeeds on any longer string that contains five consecutive digits.

What you need to do is add the ^ and $ anchors, which instruct the regex engine to make sure that the string is made up entirely of what you want. Thus, [0-9]{5} becomes ^[0-9]{5}$ or ^\d{5}$.
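
As a quick check of the effect of the anchors, the following sketch runs both forms against a few made-up inputs (the variable names are illustrative only):

```javascript
// Unanchored: five digits anywhere in the string are enough.
const unanchored = /[0-9]{5}/;
// Anchored: the whole string must be exactly five digits.
const fiveDigits = /^\d{5}$/;

console.log(unanchored.test('123456')); // true  (finds "12345" inside)
console.log(fiveDigits.test('123456')); // false (six digits)
console.log(fiveDigits.test('12345'));  // true
console.log(fiveDigits.test('a12345')); // false (leading letter)
```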

Regex to match exactly 5 numbers and one optional space

Note: Using a regular expression to solve this problem might not be the best answer. As answered below, it may be easier to just count the digits and spaces with a simple function!
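
For comparison, a counting approach might look roughly like the sketch below. The function name is hypothetical, and it assumes the rules are "exactly 5 ASCII digits, at most one space, nothing else".

```javascript
// Hypothetical helper: true when `input` has exactly 5 digits, at most one
// space, and no other characters.
function hasFiveDigitsAndOptionalSpace(input) {
  let digits = 0;
  let spaces = 0;
  for (const ch of input) {
    if (ch >= '0' && ch <= '9') {
      digits += 1;
    } else if (ch === ' ') {
      spaces += 1;
    } else {
      return false; // any other character disqualifies the string
    }
  }
  return digits === 5 && spaces <= 1;
}

console.log(hasFiveDigitsAndOptionalSpace('123 45'));  // true
console.log(hasFiveDigitsAndOptionalSpace('12 3 45')); // false (two spaces)
```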

However, since the question was asking for a regex answer, and in some scenarios you may be forced to solve this with a regex (e.g. if you're tied down to a certain library's implementation), the following answer may be helpful:

This regex matches lines containing exactly 5 digits:

^(?=(\D*\d){5}\D*$)

This regex matches lines containing at most one space:

^(?=[^ ]* ?[^ ]*$)

If we put them together, and also ensure that the string contains only digits and spaces ([\d ]*$), we get:

^(?=(\D*\d){5}\D*$)(?=[^ ]* ?[^ ]*$)[\d ]*$

You could also use [\d ]{5,6} instead of [\d ]* on the end of that pattern, to the same effect.
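
A short JavaScript sketch can confirm the behaviour of both endings. The sample strings and variable names are made up for illustration.

```javascript
const withStar = /^(?=(\D*\d){5}\D*$)(?=[^ ]* ?[^ ]*$)[\d ]*$/;
const withRange = /^(?=(\D*\d){5}\D*$)(?=[^ ]* ?[^ ]*$)[\d ]{5,6}$/;

// The first three samples should pass, the remaining four should fail.
const samples = ['12345', '123 45', ' 12345', '12 3 45', '1234', '123456', '12a45'];
for (const s of samples) {
  // Both endings give the same verdict for every sample.
  console.log(JSON.stringify(s), withStar.test(s), withRange.test(s));
}
```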

Explanation:

This regular expression is using lookaheads. These are zero-width pattern matchers, which means both parts of the pattern are "anchored" to the start of the string.

  • \d means "any digit", and \D means "any non-digit".

  • A literal space character means "space", and [^ ] means "any non-space".

  • The \D*\d is being repeated 5 times, to ensure exactly 5 digits are in the string.
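
The "zero-width" property can be observed directly. In this sketch (sample strings are made up), the digit-counting lookahead is used on its own: it succeeds, yet the matched text is empty and sits at index 0.

```javascript
const onlyLookahead = /^(?=(\D*\d){5}\D*$)/;
const result = onlyLookahead.exec('a1b2c3d4e5');

console.log(result !== null);  // true: the string contains exactly 5 digits
console.log(result[0].length); // 0: the lookahead consumed no characters
console.log(result.index);     // 0: the match is anchored at the start
console.log(onlyLookahead.test('a1b2c3d4e5f6')); // false: 6 digits
```

Because nothing is consumed, a second lookahead placed right after the first starts checking from the same position.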

Note that if you actually wanted the "optional space" to include things like tabs, then you could instead use \s and \S.
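
One way that substitution could look is sketched below. This adaptation is an assumption rather than part of the original answer: the literal space becomes \s and [^ ] becomes \S throughout.

```javascript
// Assumed adaptation: any single whitespace character is allowed once.
const fiveDigitsOneOptionalWhitespace = /^(?=(\D*\d){5}\D*$)(?=\S*\s?\S*$)[\d\s]*$/;

console.log(fiveDigitsOneOptionalWhitespace.test('123\t45'));  // true (one tab)
console.log(fiveDigitsOneOptionalWhitespace.test('123 45'));   // true (one space)
console.log(fiveDigitsOneOptionalWhitespace.test('1\t23 45')); // false (tab and space)
```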


Update: Since this question appears to have gotten quite a bit of traction, I wanted to clarify something about this answer.

There are several "simpler" variant solutions to my answer above, such as:

// Only look for digits and spaces, not "non-digits" and "non-spaces":
^(?=( ?\d){5} *$)(?=\d* ?\d*$)

// Like above, but also simplifying the second lookahead:
^(?=( ?\d){5} *$)\d* ?\d*$

// Or even splitting it into two, simpler, problems with an "or" operator:
^(?:\d{5}|(?=\d* \d*$).{6})$
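
As a sanity check, the sketch below runs the three variants against the same made-up inputs. Note that the second variant is written here with a trailing $ so that the consuming part must reach the end of the string; without it, an input such as "12 3 45" would still be reported as a match.

```javascript
const variants = [
  /^(?=( ?\d){5} *$)(?=\d* ?\d*$)/,
  /^(?=( ?\d){5} *$)\d* ?\d*$/,
  /^(?:\d{5}|(?=\d* \d*$).{6})$/,
];

// Illustrative inputs: the first list should pass, the second should fail.
const accepted = ['12345', '123 45', ' 12345', '12345 '];
const rejected = ['12 3 45', '1234', '123456', '12a45', '1234 '];

const allAgree = variants.every(
  (re) => accepted.every((s) => re.test(s)) && rejected.every((s) => !re.test(s))
);
console.log(allAgree); // true
```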

Or, if we can assume that any input containing a space is exactly 6 characters long, even this is sufficient:

^(?:\d{5}|\d* \d*)$

So with that in mind, why might you want to use the original solution, for similar problems? Because it's generic. Look again at my original answer, re-written with free-spacing:

^
(?=(\D*\d){5}\D*$) # Must contain exactly 5 digits
(?=[^ ]* ?[^ ]*$) # Must contain 0 or 1 spaces
[\d ]*$ # Must contain ONLY digits and spaces

This pattern of using successive look-aheads can be used in various scenarios, to write patterns that are highly structured and (perhaps surprisingly) easy to extend.

For example, suppose the rules changed and you now wanted to match 2-3 spaces, 1 . and any number of hyphens. It's actually very easy to update the regex:

^
(?=(\D*\d){5}\D*$) # Must contain exactly 5 digits
(?=([^ ]* ){2,3}[^ ]*$) # Must contain 2 or 3 spaces
(?=[^.]*\.[^.]*$) # Must contain 1 period
[\d .-]*$ # Must contain ONLY digits, spaces, periods and hyphens
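
JavaScript has no free-spacing flag, so one possible way to keep the same one-rule-per-line structure there is to assemble the pattern from strings. This is only a sketch; the variable names and sample inputs are illustrative.

```javascript
// Each rule is one lookahead; String.raw keeps the backslashes readable.
const rules = [
  String.raw`(?=(\D*\d){5}\D*$)`,      // exactly 5 digits
  String.raw`(?=([^ ]* ){2,3}[^ ]*$)`, // 2 or 3 spaces
  String.raw`(?=[^.]*\.[^.]*$)`,       // exactly 1 period
];
// Only digits, spaces, periods and hyphens may appear at all.
const allowedChars = String.raw`[\d .-]*$`;

const extended = new RegExp('^' + rules.join('') + allowedChars);

console.log(extended.test('12 3.4 5'));   // true
console.log(extended.test('1-2 3.4 5 ')); // true (3 spaces, 1 hyphen)
console.log(extended.test('12 3.45'));    // false (only 1 space)
console.log(extended.test('12 3.4 5.'));  // false (2 periods)
```

Adding or removing a rule is then a matter of editing one array entry.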

...So in summary, there are "simpler" regex solutions, and quite possibly a better non-regex solution to OP's specific problem. But what I have provided is a generic, extensible design pattern for matching patterns of this nature.

A regex to allow exactly 5 digit number with one optional white space before and after the number?

I might suggest just using lookarounds here:

/(?<![ ]{2})\b\d{5}\b(?![ ]{2})/

This pattern says to:

(?<![ ]{2})  assert that 2 (or more) spaces do NOT precede the ZIP code
\b\d{5}\b match a 5 digit ZIP code
(?![ ]{2}) assert that 2 (or more) spaces do not follow
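
A few made-up inputs show how the two assertions behave. This sketch assumes a JavaScript engine with lookbehind support.

```javascript
const zipWithLooseSpacing = /(?<![ ]{2})\b\d{5}\b(?![ ]{2})/;

console.log(zipWithLooseSpacing.test('zip 12345 here'));  // true (one space each side)
console.log(zipWithLooseSpacing.test('12345'));           // true (no spaces at all)
console.log(zipWithLooseSpacing.test('zip  12345 here')); // false (two spaces before)
console.log(zipWithLooseSpacing.test('zip 12345  here')); // false (two spaces after)
```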

matching all 5 digits numbers other than 3 specific ones-without negative lookbehind

This matches every 5-digit number except the three you listed: 10000, 11000, and 68000.

The ranges are:

00000 - 09999

10001 - 10999

11001 - 67999

68001 - 99999

^(?:0\d{4}|(?:1000[1-9]|100[1-9]\d|10[1-9]\d{2})|(?:1100[1-9]|110[1-9]\d|11[1-9]\d{2}|1[2-9]\d{3}|[2-5]\d{4}|6[0-7]\d{3})|(?:6800[1-9]|680[1-9]\d|68[1-9]\d{2}|69\d{3}|[7-9]\d{4}))$

Expanded for readability:

 ^ 
(?:
0 \d{4}
| (?:
1000 [1-9]
| 100 [1-9] \d
| 10 [1-9] \d{2}
)
| (?:
1100 [1-9]
| 110 [1-9] \d
| 11 [1-9] \d{2}
| 1 [2-9] \d{3}
| [2-5] \d{4}
| 6 [0-7] \d{3}
)
| (?:
6800 [1-9]
| 680 [1-9] \d
| 68 [1-9] \d{2}
| 69 \d{3}
| [7-9] \d{4}
)
)
$
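
Since a pattern this long is easy to get wrong, a brute-force check over every 5-digit string is a cheap way to gain confidence. The sketch below is one way to do that in JavaScript; the variable names are illustrative.

```javascript
const allButThree = /^(?:0\d{4}|(?:1000[1-9]|100[1-9]\d|10[1-9]\d{2})|(?:1100[1-9]|110[1-9]\d|11[1-9]\d{2}|1[2-9]\d{3}|[2-5]\d{4}|6[0-7]\d{3})|(?:6800[1-9]|680[1-9]\d|68[1-9]\d{2}|69\d{3}|[7-9]\d{4}))$/;

// Try every value from 00000 to 99999 and collect the ones that fail.
const rejectedNumbers = [];
for (let n = 0; n <= 99999; n += 1) {
  const candidate = String(n).padStart(5, '0');
  if (!allButThree.test(candidate)) {
    rejectedNumbers.push(candidate);
  }
}

console.log(rejectedNumbers); // ["10000", "11000", "68000"]
```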

Check if a string consists of exactly 5 digits in Java

For the sake of completeness (even though the question has changed completely):

boolean b = matcher.find();

find() succeeds if the regex matches anywhere in the input string. If you use matcher.matches() instead, you get the expected behaviour: the regex must match the ENTIRE string.

Alternatively, you can skip the compile step altogether (not recommended if the regex is going to be used several times) and just write:

String regex = "\\d{5}";
String test = "123456";
if(test.matches(regex)){ ... };

Which is essentially what you had in the original question.

How can I match a number that is at least 5-digits-long or more using RegEx?

You can use \d{5,}, which matches 5 digits or more, then:

  • If you want this number as a word use \b\d{5,}\b. \b matches word boundaries.
  • If that’s a number on its own line, use ^\d{5,}$. ^ matches the beginning of the line while $ matches its end.
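
Both forms can be tried out in a few lines of JavaScript. The input strings are made up, and note that in JavaScript the m flag is what makes ^ and $ apply to each line rather than to the whole string.

```javascript
// Numbers of 5+ digits that stand alone as words.
const wordMatches = 'id 1234 and 123456 and 98765'.match(/\b\d{5,}\b/g);
console.log(wordMatches); // ["123456", "98765"]

// Lines that consist of nothing but a number of 5+ digits.
const lineMatches = '12345\nabc 67890\n1234567'.match(/^\d{5,}$/gm);
console.log(lineMatches); // ["12345", "1234567"]
```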

R Regexp - extract number with 5 digits

You could simply use sub to grab the digits, IMO regmatches is not necessary for this simple case.

x <- 'stundenwerte_FF_00691_19260101_20131231_hist.zip'
sub('\\D*(\\d{5}).*', '\\1', x)
# [1] "00691"

Edit: If you have other strings that contain digits in front, you would slightly modify the expression.

sub('.*_(\\d{5})_.*', '\\1', x)

