How to Do Named Capture in Ruby

Use String#match with named captures rather than scan:

m = "555-333-7777".match(/(?<area>\d{3})-(?<city>\d{3})-(?<number>\d{4})/)
m # => #<MatchData "555-333-7777" area:"555" city:"333" number:"7777">
m[:area] # => "555"
m[:city] # => "333"

If you want an actual hash, you can use something like this:

m.names.zip(m.captures).to_h # => {"area"=>"555", "city"=>"333", "number"=>"7777"}

Or this (Ruby 2.4 or later):

m.named_captures # => {"area"=>"555", "city"=>"333", "number"=>"7777"}
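Both approaches above return string keys. If symbol keys are preferred, here is a minimal sketch, assuming Ruby 2.5 or later for Hash#transform_keys (the variable name phone is only illustrative):

```ruby
m = "555-333-7777".match(/(?<area>\d{3})-(?<city>\d{3})-(?<number>\d{4})/)

# named_captures returns string keys; transform_keys (Ruby 2.5+) converts them.
phone = m.named_captures.transform_keys(&:to_sym)

phone[:area]   # => "555"
phone[:number] # => "7777"
```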

Ruby Named Capture

style, price, stock = <<~_.scan(/^(?:Style |Price \$|Stock \# )(.+)/).flatten
Style 130690 113
Price $335.00
Stock # 932811
_
# => "130690 113", "335.00", "932811"
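The scan version relies on the order of the lines. A named-capture variant of the same extraction might look like the sketch below; the group names style, price and stock are assumptions chosen for illustration, and the pattern assumes the three lines always appear in this order.

```ruby
text = <<~TEXT
  Style 130690 113
  Price $335.00
  Stock # 932811
TEXT

# One named group per field; each \n ties a field to its own line.
pattern = /^Style (?<style>.+)\nPrice \$(?<price>.+)\nStock \# (?<stock>.+)$/
m = text.match(pattern)

style = m[:style]
price = m[:price]
stock = m[:stock]
```

Here m is nil if the text does not have all three lines, so real code would want to check for that before indexing.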

Ruby - best way to extract regex capture groups?

Since Ruby 2.4, MatchData has had named_captures, which can be used like this. Just add the ?<some_name> syntax inside a capture group.

/(\w)(\w)/.match("ab").captures # => ["a", "b"]
/(\w)(\w)/.match("ab").named_captures # => {}

/(?<some_name>\w)(\w)/.match("ab").captures # => ["a"]
/(?<some_name>\w)(\w)/.match("ab").named_captures # => {"some_name"=>"a"}

Even more relevant, you can reference a named capture by name!

result = /(?<some_name>\w)(\w)/.match("ab")
result["some_name"] # => "a"
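One detail in the output above is easy to miss: once a pattern contains a named group, the unnamed (\w) group no longer captures, which is why captures returns only ["a"]. A short sketch of that behavior, along with symbol-key access:

```ruby
result = /(?<some_name>\w)(\w)/.match("ab")

by_symbol = result[:some_name] # a symbol key works as well as a string
by_index  = result[1]          # named groups are still numbered
whole     = result[0]          # the full match still spans both characters
captures  = result.captures    # the unnamed (\w) group is not captured
```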

Regex with named capture groups getting all matches in Ruby

Named captures are suitable only for one matching result.

Ruby's analogue of findall is String#scan. You can either use scan result as an array, or pass a block to it:

irb> s = "123--abc,123--abc,123--abc"
=> "123--abc,123--abc,123--abc"

irb> s.scan(/(\d*)--([a-z]*)/)
=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]

irb> s.scan(/(\d*)--([a-z]*)/) do |number, chars|
irb* p [number,chars]
irb> end
["123", "abc"]
["123", "abc"]
["123", "abc"]
=> "123--abc,123--abc,123--abc"
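If the names are still wanted for every match, one option is to read the MatchData for each match inside the scan block. This is a sketch rather than the only way to do it; the sample string and the group names number and chars are illustrative.

```ruby
s = "123--abc,456--def,789--ghi"
pattern = /(?<number>\d+)--(?<chars>[a-z]+)/

rows = []
# Inside the block, Regexp.last_match is the MatchData for the current match.
s.scan(pattern) { rows << Regexp.last_match.named_captures }

numbers = rows.map { |row| row["number"] }
```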

Named capture group doesn't work with dynamic regex

The problem with the first approach is that using string interpolation in the regex literal disables the assignment of the local variables. From Regexp#=~:

If =~ is used with a regexp literal with named captures, captured strings (or nil) is assigned to local variables named by the capture names.

... snipped...

This assignment is implemented in the Ruby parser. The parser detects ‘regexp-literal =~ expression’ for the assignment. The regexp must be a literal without interpolation and placed at left hand side.

... snipped ...

A regexp interpolation, #{}, also disables the assignment.

You can always use Regexp#match to get the captures, but I'm not sure of any way to automatically assign local variables like this (honestly, I didn't know =~ would do so):

match_data = /(?<g1>#{permitted_keys.join('|')})_content_type/.match(key)
match_data['g1']
# => "banner"

or if you like dealing with globals:

/(?<g1>#{permitted_keys.join('|')})_content_type/ =~ key
$~['g1']
# => "banner"
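A self-contained version of the Regexp#match approach is sketched below. The values of permitted_keys and key are hypothetical, since the original question's data is not shown here; Regexp.union is used in place of join('|') so that any special characters in the keys are escaped.

```ruby
permitted_keys = %w[banner footer]  # hypothetical values
key = "banner_content_type"         # hypothetical input

# Interpolation is fine here because we read the capture from MatchData
# instead of relying on local-variable assignment.
pattern = /(?<g1>#{Regexp.union(permitted_keys)})_content_type/

match_data = pattern.match(key)
section = match_data && match_data[:g1]

no_match = pattern.match("sidebar_content_type")
```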

Using named capture groups inside Ruby gsub blocks (regex)

You are looking for

"foo /(bar)".gsub(/(?<my_word> \(.*?\) )/x) do |match|
  puts "$1 = #{$1} and $my_word = #{$~[:my_word]}"
end
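Note that gsub uses the block's return value as the replacement text. A sketch that actually substitutes using the named group, either from the block or with a \k<name> reference in a replacement string (the angle and square brackets are just illustrative markers):

```ruby
text = "foo /(bar)"
pattern = /(?<my_word> \(.*?\) )/x

# Block form: the return value of the block replaces each match.
with_block = text.gsub(pattern) { "<#{Regexp.last_match[:my_word]}>" }

# String form: \k<name> refers to the named group in the replacement.
with_string = text.gsub(pattern, '[\k<my_word>]')
```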

Using Named Captures with regex match in Ruby's case...when?

Named captures set local variables only with this syntax:

regex-literal =~ string

They are not set with any other syntax (see the rdoc in re.c):

regex-variable =~ string

string =~ regex

regex.match(string)

case string
when regex
else
end

I like named captures too, but I don't like this behavior.
For now, we have to use $~ inside a case expression.

case string
when /(?<name>.)/
  $~[:name]
else
end
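A slightly fuller sketch of the same idea, using Regexp.last_match (the non-global spelling of $~). The method name describe and the patterns are made up for illustration:

```ruby
def describe(string)
  case string
  when /\A(?<year>\d{4})-(?<month>\d{2})\z/
    # Regexp#=== has set the last match, so the names are available here.
    m = Regexp.last_match
    "year=#{m[:year]} month=#{m[:month]}"
  when /\A(?<word>[a-z]+)\z/
    "word=#{Regexp.last_match[:word]}"
  else
    "no match"
  end
end
```

For example, describe("2024-05") goes through the first branch and describe("hello") through the second.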

Optional named group in Ruby RegExp

You need to add an optional non-capturing group, (?:\s+(?<http_x_forwarded_for>\S+))?, after the last field pattern. In other words, the named capturing group goes inside an optional non-capturing one, with \s+ placed before it to consume the one or more whitespace characters that precede the field.

Use

^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$

Note that I replaced [^ ] with \S, which is the more natural way to match non-whitespace characters in a regex.
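To see the optional group in action, here is a sketch that runs the pattern against two made-up log lines, one with the trailing field and one without. The sample lines and variable names are assumptions, not taken from the original question.

```ruby
log_pattern = /^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$/

# Hypothetical sample lines for illustration.
base = '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "curl/8.0"'
with_xff = "#{base} 203.0.113.7"

m1 = log_pattern.match(with_xff)
m2 = log_pattern.match(base)

forwarded = m1[:http_x_forwarded_for]  # present when the field exists
missing   = m2[:http_x_forwarded_for]  # nil when the optional group is skipped
```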

Why does capturing named groups in Ruby result in undefined local variable or method errors?

Named Captures Must Use Literals

You are encountering some limitations of Ruby's regular expression library. The Regexp#=~ method limits named captures as follows:

  • The assignment does not occur if the regexp is not a literal.
  • A regexp interpolation, #{}, also disables the assignment.
  • The assignment does not occur if the regexp is placed on the right hand side.

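You'll need to decide whether you want automatic local-variable assignment or interpolation in your regular expressions. You currently cannot have both, although the named groups themselves remain available through the MatchData returned by Regexp#match (or through $~).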
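A small sketch contrasting the two cases. The patterns and the names year, month and day are illustrative only:

```ruby
# Regexp literal on the left-hand side: the parser creates `year` and `month`.
if /(?<year>\d{4})-(?<month>\d{2})/ =~ "2024-05"
  assigned = [year, month]
end

# Regexp held in a variable: the match succeeds, but no local is created.
pattern = /(?<day>\d{2})/
pattern =~ "17"
day_defined = defined?(day)              # nil, since `day` was never assigned
day_value   = Regexp.last_match[:day]    # the capture is still reachable
```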


