Named Capturing Groups in JavaScript Regex

Named capturing groups in JavaScript regex?

ECMAScript 2018 introduces named capturing groups into JavaScript regexes.

Example:

const auth = 'Bearer AUTHORIZATION_TOKEN'
const { groups: { token } } = /Bearer (?<token>[^ $]*)/.exec(auth)
console.log(token) // "AUTHORIZATION_TOKEN"

If you need to support older browsers, anything you can do with named capturing groups can also be done with normal (numbered) capturing groups. You just need to keep track of the numbers, which may be cumbersome if the order of the capturing groups in your regex changes.
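As a rough illustration of the bookkeeping problem, the sketch below matches the same date with numbered and with named groups. The sample strings and variable names are made up for this example.

```javascript
// Numbered groups: the meaning of each index depends on group order.
const numbered = /(\d{4})-(\d{2})-(\d{2})/.exec('2018-06-30');
const yearByIndex = numbered[1]; // year is group 1 here

// Adding a group in front shifts every index by one.
const shifted = /(\w+) (\d{4})-(\d{2})-(\d{2})/.exec('released 2018-06-30');
const labelByIndex = shifted[1];       // group 1 is now the label
const shiftedYearByIndex = shifted[2]; // the year moved to group 2

// Named groups: the lookup stays the same wherever the group sits.
const named = /(?<label>\w+) (?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
  .exec('released 2018-06-30');
const yearByName = named.groups.year;

console.log(yearByIndex, shiftedYearByIndex, yearByName);
```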

There are only two "structural" advantages of named capturing groups I can think of:

  1. In some regex flavors (.NET and JGSoft, as far as I know), you can use the same name for different groups in your regex (see here for an example where this matters). But most regex flavors do not support this functionality anyway.

  2. If you need to refer to numbered capturing groups in a situation where they are surrounded by digits, you can run into a problem. Let's say you want to append a zero to a digit and therefore want to replace (\d) with $10. In JavaScript, this will work (as long as you have fewer than 10 capturing groups in your regex), but Perl will think you're looking for backreference number 10 instead of number 1 followed by a 0. In Perl, you can use ${1}0 in this case.
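For the second point, a small sketch of how this looks in JavaScript, assuming an engine with ES2018 named groups. The group name `digit` is chosen only for this example; the `$<name>` replacement syntax removes the ambiguity entirely.

```javascript
// With a single capturing group, "$10" is read as "$1" followed by "0".
const numberedResult = '5'.replace(/(\d)/, '$10');

// With a named group the reference is delimited, so no guessing is needed.
const namedResult = '5'.replace(/(?<digit>\d)/, '$<digit>0');

console.log(numberedResult, namedResult);
```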

Other than that, named capturing groups are just "syntactic sugar". It helps to use capturing groups only when you really need them and to use non-capturing groups (?:...) in all other circumstances.
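To make the point about non-capturing groups concrete, here is a minimal sketch with a made-up URL. Only the host name is wanted, so the other groups are written as (?:...) and do not take up a slot in the result.

```javascript
const url = 'https://www.example.com/path';

// Every parenthesis captures: three groups, although only one is needed.
const capturing = /(https?):\/\/(www\.)?([^/]+)/.exec(url);

// Grouping without capturing: the host becomes group 1.
const lean = /(?:https?):\/\/(?:www\.)?([^/]+)/.exec(url);

console.log(capturing.length, lean.length, lean[1]);
```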

The bigger problem (in my opinion) with JavaScript is that it does not support verbose regexes which would make the creation of readable, complex regular expressions a lot easier.
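Without a verbose flag, one possible workaround is to assemble the pattern from small, commented pieces. This is only a sketch of that idea, not a replacement for real free-spacing mode; `parts` and `isoDate` are hypothetical names.

```javascript
// Each piece is a small regex literal with its own comment.
const parts = [
  /(?<year>\d{4})/,  // four-digit year
  /-/,               // separator
  /(?<month>\d{2})/, // two-digit month
  /-/,               // separator
  /(?<day>\d{2})/,   // two-digit day
];

// Join the sources into one anchored pattern.
const isoDate = new RegExp('^' + parts.map((p) => p.source).join('') + '$');

const parsed = isoDate.exec('2018-06-30');
console.log(parsed.groups.year, parsed.groups.month, parsed.groups.day);
```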

Steve Levithan's XRegExp library solves these problems.

Javascript: Named Capture Groups

JavaScript engines that predate ECMAScript 2018 do not support named capture groups.

In those environments you will have to use numbered groups.

For instance:

var myregex = /([^=]+)=(.*)/;
var matchArray = myregex.exec(yourString);
if (matchArray != null) {
  var element = matchArray[1]; // text before the first "="
  var id = matchArray[2];      // text after it
}

Option 2: XRegExp

The alternative regex library for JavaScript, XRegExp, supports named captures as well as other important regex features missing from older JS regex engines, such as lookbehinds.

Can you forward-reference a named capture group in JS regex?

Use capture groups around both occurrences. Then you can copy one to the replacement to keep it, while replacing the other.

string.replace(/([tns])(\1)/g, "X$2"); // replace first with X
string.replace(/([tns])(\1)/g, "$1X"); // replace second with X
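In engines with ES2018 named groups, the same idea can be sketched with names instead of numbers: \k<name> is the named backreference inside the pattern, and $<name> refers to a group in the replacement string. The group names `first` and `second` and the sample word are chosen only for this example.

```javascript
const word = 'letter';

// The second group must repeat whatever the first group matched.
const pairs = /(?<first>[tns])(?<second>\k<first>)/g;

const firstReplaced = word.replace(pairs, 'X$<second>'); // replace first with X
const secondReplaced = word.replace(pairs, '$<first>X'); // replace second with X

console.log(firstReplaced, secondReplaced);
```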

Overlapping named capturing groups

Sure, just place gender inside the style group:

const validateMpn = (mpn) => {
  const regex = /(?<style>(?<gender>\d{2})\d{4})(?<width>\d{1}[ABDE])(?<color_code>\d{3})\.(?<size_code>\d{3})/gi
  const match = regex.exec(mpn)
  if (!match) { return null }
  return match.groups
}
const str1 = '1102961D048.075'
const str2 = '1200322A001.085'
const match1 = validateMpn(str1)
const match2 = validateMpn(str2)
console.log(match1)
console.log(match2)

How do you destructure named capture groups?

"It doesn't compile because groups could be null."

No, it doesn't compile because .exec() can return null, when the regex does not match. Attempting to access a property like .groups on that will cause a TypeError: Cannot read properties of null.

You'll need a fallback value (using nullish coalescing and default initialisers) to destructure in that case:

const { groups: {token} = {} } = /Bearer (?<token>[^ $]*)/.exec(auth) ?? {}

or simpler with optional chaining:

const { token } = /Bearer (?<token>[^ $]*)/.exec(auth)?.groups ?? {}
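Wrapped in a small helper, the optional-chaining variant might look like the sketch below. The function name `extractBearerToken` and the sample headers are hypothetical.

```javascript
// Returns the token, or undefined when the header does not match.
function extractBearerToken(header) {
  const { token } = /Bearer (?<token>[^ $]*)/.exec(header)?.groups ?? {};
  return token;
}

const found = extractBearerToken('Bearer AUTHORIZATION_TOKEN');
const missing = extractBearerToken('Basic dXNlcjpwYXNz'); // no TypeError here

console.log(found, missing);
```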

Use named regex groups to output an array of matches

A regex match object will only contain one string for a given named capture group. For what you're trying to do, you'll have to do it in two steps: first separate out the parts of the input, then map it to the array of objects while checking which group was captured to identify the sort of group it needs:

const str = '{hello} good {sir}, a [great] sunny [day] to you.';
const matches = [...str.matchAll(/{([^{]+)}|\[([^\]]+)\]|([^[{]+)/g)]
  .map(match => ({
    group: match[1] ? 'braces' : match[2] ? 'brackets' : 'other',
    word: match[1] || match[2] || match[3]
  }));

console.log(matches);
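If named groups are preferred, one possible variation is to name each alternative and look up which name actually captured something. This is a sketch under the assumption that exactly one alternative participates in each match; the names `tagPattern` and `tagged` are made up here.

```javascript
const text = '{hello} good {sir}, a [great] sunny [day] to you.';
const tagPattern =
  /\{(?<braces>[^{]+)\}|\[(?<brackets>[^\]]+)\]|(?<other>[^[{]+)/g;

const tagged = [...text.matchAll(tagPattern)].map(({ groups }) => {
  // Groups that did not participate in the match are undefined.
  const [group, word] = Object.entries(groups)
    .find(([, value]) => value !== undefined);
  return { group, word };
});

console.log(tagged);
```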

Regex Group Capture

Capture groups are provided in the match array starting at index 1:

var str = "<br><strong>Name:</strong> John Smith<br>";
var re = /\<strong>Name\s*:\<\/strong>\s*([^\<]*)/g;
var match = re.exec(str);
while (match != null) {
  console.log(match[1]); // <==== first capture group
  match = re.exec(str);
}

Javascript global match with capturing groups

As per the MDN docs for String.prototype.match():

If the regular expression does not include the g flag, returns the same result as RegExp.exec(). The returned Array has an extra input property, which contains the original string that was parsed. In addition, it has an index property, which represents the zero-based index of the match in the string.

If the regular expression includes the g flag, the method returns an Array containing all matched substrings rather than match objects. Captured groups are not returned. If there were no matches, the method returns null.


If you want to obtain capture groups and the global flag is set, you need to use RegExp.exec() instead.

var myRe = /(\d)(\d)/g;
var str = '12 34';
var myArray;
while (myArray = myRe.exec(str)) {
  console.log(myArray);
}
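In engines that support String.prototype.matchAll() (ES2020), the same loop can be sketched without managing the regex state by hand. matchAll() requires the g flag and yields one match array, capture groups included, per match. The variable names are chosen for this example.

```javascript
const digitPairs = /(\d)(\d)/g;
const input = '12 34';

// Each entry is a full match array: [match, group1, group2].
const allMatches = [...input.matchAll(digitPairs)];

for (const [whole, first, second] of allMatches) {
  console.log(whole, first, second);
}
```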

How to manipulate regex named capturing groups

You don't have any named capturing groups here, only plain capturing groups - use a replacer function that replaces with the first captured group, concatenated with the second capturing group cast to a number plus 1:

const data = `firstGroup1firstGroup33`;
const result = data.replace(/(firstGroup)(\d+)/g, (_, first, num) => first + (Number(num) + 1));
console.log(result);

Named capture groups in the function version of `replace()`

The linked MDN documentation from the question now describes this new parameter, and explicitly defines exactly what was observed:

The function has the following signature:

function replacer(match, p1, p2, /* …, */ pN, offset, string, groups) {
  return replacement;
}

...

groups

An object whose keys are the used group names, and whose values are the matched portions (undefined if not matched). Only present if the pattern contains at least one named capturing group.

The exact number of arguments depends on whether the first argument is a RegExp object — and, if so, how many capture groups it has.
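A short sketch of the signature in use, assuming an engine that passes the groups argument. The pattern has two capturing groups, so groups arrives as the sixth argument; the sample name and group names are made up.

```javascript
const fullName = 'John Smith';

const swapped = fullName.replace(
  /(?<first>\w+) (?<last>\w+)/,
  // match, p1, p2, offset, string, then the groups object last
  (match, p1, p2, offset, string, groups) => `${groups.last}, ${groups.first}`
);

console.log(swapped);
```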


Interestingly, the browser compatibility section at the bottom does not distinguish this new parameter (which did not work in Firefox at the time) from the rest of the method. It simply says "Firefox 1 full support", even though other MDN articles are usually very good at listing, per browser and version, when each sub-feature became supported.


