How to Access Named Capturing Groups in a .NET Regex

How do I access named capturing groups in a .NET Regex?

Use the group collection of the Match object, indexing it with the capturing group name, e.g.

foreach (Match m in mc)
{
    MessageBox.Show(m.Groups["link"].Value);
}
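
For context, here is a minimal, self-contained sketch of the same idea as a console program. The pattern, the sample HTML, and the names NamedGroupDemo and ExtractLinks are assumptions made up for illustration; only the Groups["link"] lookup comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Returns the value of the "link" group for every match in the input.
    public static string[] ExtractLinks(string input)
    {
        // Hypothetical pattern: captures an href value into a group named "link".
        var regex = new Regex(@"href=""(?<link>[^""]+)""");
        MatchCollection mc = regex.Matches(input);

        var links = new string[mc.Count];
        for (int i = 0; i < mc.Count; i++)
        {
            // Index the group collection by the capturing group name.
            links[i] = mc[i].Groups["link"].Value;
        }
        return links;
    }

    public static void Main()
    {
        string html = "<a href=\"https://example.com/a\">A</a> <a href=\"https://example.com/b\">B</a>";
        foreach (string link in ExtractLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

If the group did not take part in a match, Groups["link"].Success is false and Value is an empty string, so check Success when that distinction matters.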

What is the regex pattern for named capturing groups in .NET?

 string pattern = @"(?<Person>[\w ]+) has been to (?<NumberOfGames>\d+) bingo games\. The last was on (?<Day>\w+) (?<Date>\d\d/\d\d/\d{4})\. She won with the Numbers: (?<Numbers>.*?)$";

Other answers already show how to pull out the groups; this pattern is one that matches your input.

Using a named capture group causes a different match

See .NET Grouping Constructs regex docs:

Named matched subexpressions are numbered consecutively from left to right after matched subexpressions.

So, your pattern groups are parsed in this order:

^(Start\.)?(?<capturedGroup>.+?)(\.|\.\2)?(End)?$
 ^---1---^ ^-------- 4 --------^^---2---^ ^-3-^

When debugging, you can check the actual numeric group IDs, for example by calling Regex.GroupNumberFromName("capturedGroup") on the compiled expression.

You just need to either use the named group backreference, \k<capturedGroup>, or use \4 instead of \2 (which is not that intuitive, so I'd rather you use the former solution).

  • ^(Start\.)?(?<capturedGroup>.+?)(\.|\.\k<capturedGroup>)?(End)?$
  • ^(Start\.)?(?<capturedGroup>.+?)(\.|\.\4)?(End)?$
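
To check the numbering yourself, something along these lines should work. It is only a sketch: the class name GroupNumberDemo, the helper NumberOfNamedGroup, and the sample input are made up for illustration, while GetGroupNumbers, GroupNameFromNumber, and GroupNumberFromName are regular members of System.Text.RegularExpressions.Regex.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNumberDemo
{
    // The corrected pattern from the answer, using the named backreference.
    public const string Pattern =
        @"^(Start\.)?(?<capturedGroup>.+?)(\.|\.\k<capturedGroup>)?(End)?$";

    // Returns the number the regex engine assigned to the named group.
    public static int NumberOfNamedGroup()
    {
        return new Regex(Pattern).GroupNumberFromName("capturedGroup");
    }

    public static void Main()
    {
        var regex = new Regex(Pattern);

        // List every group number together with its name.
        foreach (int number in regex.GetGroupNumbers())
        {
            Console.WriteLine("{0} -> {1}", number, regex.GroupNameFromNumber(number));
        }

        // Hypothetical input, chosen only to show that both lookups agree.
        Match m = regex.Match("Start.abc.End");
        int named = NumberOfNamedGroup();
        Console.WriteLine(m.Groups[named].Value == m.Groups["capturedGroup"].Value);
    }
}
```

Because the unnamed groups take numbers 1 to 3, the named group ends up as number 4, which is why \2 in the original pattern referred to a different group than intended.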

RegEx Capturing Groups in C#

One approach is to test whether the expression behaves the same way in another language.

We might also simplify the expression:

^(.*?)([\s:]+)?{([\s\S].*)?.$

Here we have three capturing groups; the first and third hold the desired key and value.

RegEx

You can modify, simplify, or change your expressions at regex101.com.

RegEx Circuit

You can also visualize your expressions at jex.im.

JavaScript Demo

const regex = /^(.*?)([\s:]+)?{([\s\S].*)?.$/gm;
const str = `key:{value}
key:{valu{0}e}
key:{valu{0}e}
key:   {val-u{0}e}
key:  {val__[!]-u{0}{1}e}`;
const subst = `$1,$3`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

Trying to get multiple RegEx matches with named groups in C#

You need to iterate through all the matches of the string. Regex.Match will only return the first match.

public static string[] ValidatePattern(string pattern, string input, List<string> groupNames)
{
    Regex regex = new Regex(pattern);
    var matches = regex.Matches(input);

    List<string> results = new List<string>();
    foreach (Match match in matches)
    {
        foreach (var name in groupNames)
        {
            // A group that did not participate in the match yields an empty string.
            var group = match.Groups[name];
            results.Add(group.Success ? group.Value : string.Empty);
        }
    }
    return results.ToArray();
}

Named capturing group to different names

You can nest the named capturing groups and use

^(?<someName>[^ ]*) (?<someNewFancyNae>(?<someNameWeWishToDeprecate>[^ ]*))$

Or,

^(?<someName>\S+)\s+(?<someNewFancyNae>(?<someNameWeWishToDeprecate>\S+))$
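
As a quick check that both names really expose the same text, here is a small sketch. The class NestedGroupDemo, the method Capture, and the input "foo bar" are made up for illustration; the pattern is the second one above, copied as-is.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedGroupDemo
{
    // Returns the three named values in pattern order, or an empty array on no match.
    public static string[] Capture(string input)
    {
        var regex = new Regex(
            @"^(?<someName>\S+)\s+(?<someNewFancyNae>(?<someNameWeWishToDeprecate>\S+))$");
        Match m = regex.Match(input);
        if (!m.Success)
        {
            return new string[0];
        }

        return new[]
        {
            m.Groups["someName"].Value,
            m.Groups["someNewFancyNae"].Value,
            // The inner group spans the same text as the outer one.
            m.Groups["someNameWeWishToDeprecate"].Value
        };
    }

    public static void Main()
    {
        foreach (string value in Capture("foo bar"))
        {
            Console.WriteLine(value);
        }
    }
}
```

Callers that still read the old group name keep working, and new code can switch to the new name at its own pace.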

How do I make Regex capture only named groups

You always have group 0: that's the entire match. Numbered groups start at 1 and are ordered by the position of the opening parenthesis that defines each group. Your regular expression (formatted for clarity):

(?<code>
    ^
    (?<l1> [\d]{2} )
    /
    (?<l2> [\d]{3} )
    /
    (?<l3> [\d]{2} )
    $
  |
    ^
    (?<l1>[\d]{2})
    /
    (?<l2>[\d]{3})
    $
  |
    (?<l1> ^[\d]{2} $ )
)

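Your expression will backtrack, so you might consider simplifying it. With RegexOptions.ExplicitCapture, unnamed parentheses do not capture, so only the named groups (plus group 0) appear in the result. This is probably clearer and more efficient: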

static Regex rxCode = new Regex(@"
    ^                          # match start-of-line, followed by
    (?<code>                   # a mandatory group ('code'), consisting of
      (?<g1> \d\d )            # - 2 decimal digits ('g1'), followed by
      (                        # - an optional group, consisting of
        /                      #   - a literal '/', followed by
        (?<g2> \d\d\d )        #   - 3 decimal digits ('g2'), followed by
        (                      #   - an optional group, consisting of
          /                    #     - a literal '/', followed by
          (?<g3> \d\d )        #     - 2 decimal digits ('g3')
        )?                     #   - END: optional group
      )?                       # - END: optional group
    )                          # - END: named group ('code'), followed by
    $                          # - end-of-line
    " , RegexOptions.IgnorePatternWhitespace|RegexOptions.ExplicitCapture );

Once you have that, something like this:

string[] texts = { "12" , "12/345" , "12/345/67" , } ;

foreach ( string text in texts )
{
    Match m = rxCode.Match( text ) ;
    Console.WriteLine("{0}: match was {1}" , text , m.Success ? "successful" : "NOT successful" ) ;
    if ( m.Success )
    {
        Console.WriteLine( " code: {0}" , m.Groups["code"].Value ) ;
        Console.WriteLine( " g1: {0}" , m.Groups["g1"].Value ) ;
        Console.WriteLine( " g2: {0}" , m.Groups["g2"].Value ) ;
        Console.WriteLine( " g3: {0}" , m.Groups["g3"].Value ) ;
    }
}

produces the expected output:

12: match was successful
code: 12
g1: 12
g2:
g3:
12/345: match was successful
code: 12/345
g1: 12
g2: 345
g3:
12/345/67: match was successful
code: 12/345/67
g1: 12
g2: 345
g3: 67

.NET regex optional named groups

Match each path segment with [^/]+ (one or more characters other than /), and make the whole group that contains the slash optional, rather than only the named value group: (/(?<value>.+)?) becomes (?:/(?<value>.+))?. The wrapper can be a non-capturing group as shown, or you can use the inline (?n) ExplicitCapture modifier so that unnamed groups do not capture.

You may use

^client/(?<id>[^/]+)/box/(?<type>[^/]+)(?:/(?<value>[^/]+))?
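
A short sketch of how the optional group can be consumed follows. The class OptionalGroupDemo, the method TryGetValue, and the sample paths are assumptions for illustration; the pattern is the one above. The key point is to test Group.Success, because an optional group that did not participate still exists in the Groups collection with an empty Value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    private static readonly Regex Rx = new Regex(
        @"^client/(?<id>[^/]+)/box/(?<type>[^/]+)(?:/(?<value>[^/]+))?");

    // Returns true only when the path matches AND the optional "value" segment is present.
    public static bool TryGetValue(string path, out string value)
    {
        value = string.Empty;
        Match m = Rx.Match(path);
        if (!m.Success)
        {
            return false;
        }

        Group g = m.Groups["value"];
        if (!g.Success)
        {
            // The path matched, but the optional segment was absent.
            return false;
        }

        value = g.Value;
        return true;
    }

    public static void Main()
    {
        // Hypothetical sample paths.
        foreach (string path in new[] { "client/12/box/large", "client/12/box/large/red" })
        {
            string value;
            bool found = TryGetValue(path, out value);
            Console.WriteLine("{0}: found={1}, value='{2}'", path, found, value);
        }
    }
}
```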

How do I get the name of captured groups in a C# Regex?

Use GetGroupNames to get the list of groups in an expression and then iterate over those, using the names as keys into the groups collection.

For example,

GroupCollection groups = regex.Match(line).Groups;

foreach (string groupName in regex.GetGroupNames())
{
    Console.WriteLine(
        "Group: {0}, Value: {1}",
        groupName,
        groups[groupName].Value);
}
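
Note that GetGroupNames also returns the numbered groups under names such as "0" (the whole match). If only the explicitly named groups are wanted, one possible approach is to skip names that parse as integers. This is a sketch under that assumption; the class GroupNameDemo, the method NamedValues, and the sample pattern and input are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupNameDemo
{
    // Collects only the explicitly named groups of the first match.
    public static Dictionary<string, string> NamedValues(Regex regex, string line)
    {
        var result = new Dictionary<string, string>();
        Match match = regex.Match(line);
        if (!match.Success)
        {
            return result;
        }

        foreach (string groupName in regex.GetGroupNames())
        {
            // Numbered groups show up under names such as "0"; skip them.
            if (int.TryParse(groupName, out _))
            {
                continue;
            }
            result[groupName] = match.Groups[groupName].Value;
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical pattern and input.
        var regex = new Regex(@"(?<key>\w+)=(?<value>\w+)");
        foreach (KeyValuePair<string, string> pair in NamedValues(regex, "color=red"))
        {
            Console.WriteLine("Group: {0}, Value: {1}", pair.Key, pair.Value);
        }
    }
}
```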
