How to do named capture in Ruby
You should use match with named captures, not scan:
m = "555-333-7777".match(/(?<area>\d{3})-(?<city>\d{3})-(?<number>\d{4})/)
m # => #<MatchData "555-333-7777" area:"555" city:"333" number:"7777">
m[:area] # => "555"
m[:city] # => "333"
If you want an actual hash, you can use something like this:
m.names.zip(m.captures).to_h # => {"area"=>"555", "city"=>"333", "number"=>"7777"}
Or this (Ruby 2.4 or later):
m.named_captures # => {"area"=>"555", "city"=>"333", "number"=>"7777"}
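One thing the snippets above do not show is that match returns nil when the pattern does not match, so calling named_captures on the result directly would raise NoMethodError. A minimal sketch of guarding against that, assuming an empty hash is an acceptable fallback (PHONE and phone_parts are hypothetical names used only for illustration):

```ruby
# Hypothetical constant and helper, not part of the original answer.
PHONE = /(?<area>\d{3})-(?<city>\d{3})-(?<number>\d{4})/

# Returns a hash of the named captures, or an empty hash when nothing matches.
def phone_parts(text)
  m = PHONE.match(text) # nil when the pattern does not match
  m ? m.named_captures : {}
end

phone_parts("555-333-7777") # => {"area"=>"555", "city"=>"333", "number"=>"7777"}
phone_parts("no digits")    # => {}
```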
Ruby Named Capture
style, price, stock = <<~_.scan(/^(?:Style |Price \$|Stock \# )(.+)/).flatten
Style 130690 113
Price $335.00
Stock # 932811
_
# => ["130690 113", "335.00", "932811"]
Ruby - best way to extract regex capture groups?
Since Ruby 2.4, MatchData has had named_captures, which can be used like this. Just add the ?<some_name> syntax inside a capture group.
/(\w)(\w)/.match("ab").captures # => ["a", "b"]
/(\w)(\w)/.match("ab").named_captures # => {}
/(?<some_name>\w)(\w)/.match("ab").captures # => ["a"]
/(?<some_name>\w)(\w)/.match("ab").named_captures # => {"some_name"=>"a"}
Even more relevant, you can reference a named capture by name!
result = /(?<some_name>\w)(\w)/.match("ab")
result["some_name"] # => "a"
Regex with named capture groups getting all matches in Ruby
Named captures are suitable only for one matching result.
Ruby's analogue of findall is String#scan. You can either use the scan result as an array, or pass a block to it:
irb> s = "123--abc,123--abc,123--abc"
=> "123--abc,123--abc,123--abc"
irb> s.scan(/(\d*)--([a-z]*)/)
=> [["123", "abc"], ["123", "abc"], ["123", "abc"]]
irb> s.scan(/(\d*)--([a-z]*)/) do |number, chars|
irb* p [number,chars]
irb> end
["123", "abc"]
["123", "abc"]
["123", "abc"]
=> "123--abc,123--abc,123--abc"
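If you still want name-keyed results for every match, one possible approach is to pair each scan result with Regexp#names. This is only a sketch: PAIR and all_named_matches are hypothetical names, and it assumes the pattern contains named groups only, so that scan returns one array of captures per match.

```ruby
# Hypothetical pattern with two named groups.
PAIR = /(?<number>\d+)--(?<chars>[a-z]+)/

# Builds one hash per match by zipping the group names with the captures
# that scan returns for that match.
def all_named_matches(text, pattern)
  text.scan(pattern).map { |captures| pattern.names.zip(captures).to_h }
end

all_named_matches("123--abc,456--def", PAIR)
# => [{"number"=>"123", "chars"=>"abc"}, {"number"=>"456", "chars"=>"def"}]
```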
Named capture group doesn't work with dynamic regex
The problem with the first approach is that using string interpolation in the regex literal disables the assignment of the local variables. From the Regexp#=~ documentation:
If =~ is used with a regexp literal with named captures, captured strings (or nil) is assigned to local variables named by the capture names.
... snipped ...
This assignment is implemented in the Ruby parser. The parser detects ‘regexp-literal =~ expression’ for the assignment. The regexp must be a literal without interpolation and placed at left hand side.
... snipped ...
A regexp interpolation, #{}, also disables the assignment.
You can always just use Regexp#match to get the captures, but I'm not sure of any way to automatically assign local variables like this (honestly, I didn't know =~ would do so):
match_data = /(?<g1>#{permitted_keys.join('|')})_content_type/.match(key)
match_data['g1']
# => "banner"
or if you like dealing with globals:
/(?<g1>#{permitted_keys.join('|')})_content_type/ =~ key
$~['g1']
# => "banner"
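For a self-contained version of the same idea, here is a rough sketch that assigns the capture explicitly. The helper name content_type_prefix and the sample keys are assumptions for illustration. It also swaps join('|') for Regexp.union, which escapes regex metacharacters in the keys, and uses String#[] with a capture name, which returns nil when nothing matches.

```ruby
# Hypothetical helper: permitted_keys is assumed to be an array of strings.
def content_type_prefix(key, permitted_keys)
  # Interpolation disables automatic local-variable assignment,
  # so the capture is read by name instead.
  pattern = /(?<g1>#{Regexp.union(permitted_keys)})_content_type/
  key[pattern, "g1"]
end

content_type_prefix("banner_content_type", %w[banner footer]) # => "banner"
content_type_prefix("other_content_type", %w[banner footer])  # => nil
```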
Using named capture groups inside Ruby gsub blocks (regex)
You are looking for $~, which holds the MatchData for the current match inside the block:
"foo /(bar)".gsub(/(?<my_word> \(.*?\) )/x) do |match|
  puts "$1 = #{$1} and $my_word = #{$~[:my_word]}"
end
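A small sketch of the same technique where the block's return value is actually used as the replacement. The method names and sample strings are made up for illustration. The second method shows the \k<name> back-reference that Ruby accepts in a replacement string, which may be enough when no Ruby logic is needed.

```ruby
# Block form: $~ is the MatchData for the match currently being replaced.
def bracket_words(text)
  text.gsub(/\((?<my_word>.*?)\)/) { "[#{$~[:my_word].upcase}]" }
end

# String form: \k<my_word> refers to the named group in the replacement.
def angle_words(text)
  text.gsub(/\((?<my_word>.*?)\)/, '<\k<my_word>>')
end

bracket_words("foo (bar) baz (qux)") # => "foo [BAR] baz [QUX]"
angle_words("foo (bar)")             # => "foo <bar>"
```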
Using Named Captures with regex match in Ruby's case...when?
Named captures set local variables only when you use this syntax:
regex-literal =~ string
They are not set with any of the following (see the rdoc in re.c):
regex-variable =~ string
string =~ regex
regex.match(string)
case string
when regex
else
end
I like named captures too, but I don't like this behavior.
For now, we have to use $~ inside a case expression:
case string
when /(?<name>.)/
$~[:name]
else
end
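A fuller sketch of the case/when workaround, assuming a hypothetical describe method and made-up patterns. Regexp#=== sets $~ (also available as Regexp.last_match) when a when clause matches, so the named groups can be read from it inside that branch.

```ruby
# Hypothetical method showing named captures read via $~ in each branch.
def describe(string)
  case string
  when /\A(?<year>\d{4})-(?<month>\d{2})\z/
    m = Regexp.last_match # same object as $~
    "year=#{m[:year]} month=#{m[:month]}"
  when /\A(?<word>[a-z]+)\z/
    "word=#{$~[:word]}"
  else
    "no match"
  end
end

describe("2024-05") # => "year=2024 month=05"
describe("hello")   # => "word=hello"
describe("???")     # => "no match"
```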
Optional named group in Ruby RegExp
You need to add an optional non-capturing group, (?:\s+(?<http_x_forwarded_for>\S+))?, after the last field pattern. That means the named capturing group should be inside an optional non-capturing one, and \s+ should be placed before it to take into account any 1+ whitespace chars before the field.
Use
^(?<remote>\S*) (?<host>\S*) (?<user>\S*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^"]*?)(?:\s+\S*)?)?" (?<code>\S*) (?<size>\S*)(?: "(?<referer>[^"]*)" "(?<agent>[^"]*)"(?:\s+(?<http_x_forwarded_for>\S+))?)?$
Note that I replaced [^ ] with \S, which is the more natural way to match non-whitespace chars in a regex.
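To see the effect of wrapping a named group in an optional non-capturing group, here is a much-reduced sketch. LINE and its three fields are a simplified stand-in for the full log pattern, not the original regex. When the optional part is absent, the match still succeeds and the named group is simply nil.

```ruby
# Simplified, hypothetical pattern: two required fields and one optional field.
LINE = /\A(?<host>\S+) (?<code>\d{3})(?:\s+(?<forwarded_for>\S+))?\z/

with_field    = LINE.match("example.com 200 10.0.0.1")
without_field = LINE.match("example.com 200")

with_field[:forwarded_for]    # => "10.0.0.1"
without_field[:forwarded_for] # => nil
without_field[:code]          # => "200"
```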
Why does capturing named groups in Ruby result in undefined local variable or method errors?
Named Captures Must Use Literals
You are encountering some limitations of Ruby's regular expression library. The Regexp#=~ method limits named captures as follows:
- The assignment does not occur if the regexp is not a literal.
- A regexp interpolation, #{}, also disables the assignment.
- The assignment does not occur if the regexp is placed on the right hand side.
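You'll need to decide whether you want automatic local-variable assignment from named captures or interpolation in your regular expressions. You currently cannot have both; with an interpolated or non-literal regexp, read the named captures from the MatchData instead.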
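The two situations can be compared side by side in a short sketch. Both method names are hypothetical. The first relies on the parser-level assignment, which only happens for a regexp literal on the left of =~. The second works for any Regexp object, including one built with interpolation, by going through MatchData.

```ruby
# Literal on the left of =~: the parser creates the local variable `year`.
def year_from_literal(text)
  if /(?<year>\d{4})/ =~ text
    year
  end
end

# Any other form (variable, interpolation, right-hand side): no local is
# created, so the capture is read from the MatchData by name.
def year_from_pattern(text, pattern)
  m = pattern.match(text)
  m && m[:year]
end

year_from_literal("released in 2024")                   # => "2024"
year_from_pattern("released in 2024", /(?<year>\d{4})/) # => "2024"
```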