How to Get the Match Data For All Occurrences of a Ruby Regular Expression in a String

How to match all occurrences of a regex

Using scan should do the trick:

string.scan(/regex/)
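As a rough sketch of what scan gives back (the sample string and the method names below are made up for illustration): without capture groups you get an array of matched strings, and with capture groups you get an array of arrays, one inner array per match.

```ruby
# Sample input chosen for illustration only.
def scan_demo_text
  "abc12def34ghijklmno567pqrs"
end

# No capture groups: each element is the whole matched substring.
def all_numbers(text)
  text.scan(/\d+/)
end

# With capture groups: each element is an array holding that match's groups.
def letter_digit_pairs(text)
  text.scan(/([a-z])(\d)/)
end

p all_numbers(scan_demo_text)        # => ["12", "34", "567"]
p letter_digit_pairs(scan_demo_text) # => [["c", "1"], ["f", "3"], ["o", "5"]]
```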

How do I get the match data for all occurrences of a Ruby regular expression in a string?

You want

"abc12def34ghijklmno567pqrs".to_enum(:scan, /\d+/).map { Regexp.last_match }

which gives you

[#<MatchData "12">, #<MatchData "34">, #<MatchData "567">] 

The "trick" is, as you see, to build an enumerator in order to get each last_match.
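To show why MatchData can be worth the extra step, here is a minimal sketch that wraps the enumerator trick in a helper (match_data_for is a hypothetical name, not a built-in) and reads the position of each match, which plain scan does not report.

```ruby
# Hypothetical helper: collect one MatchData per occurrence of pattern in text.
def match_data_for(text, pattern)
  text.to_enum(:scan, pattern).map { Regexp.last_match }
end

matches = match_data_for("abc12def34ghijklmno567pqrs", /\d+/)

matches.each do |m|
  # m[0] is the matched text; m.begin(0) and m.end(0) are its character offsets.
  puts "#{m[0]} at #{m.begin(0)}...#{m.end(0)}"
end
# prints:
# 12 at 3...5
# 34 at 8...10
# 567 at 19...22
```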

How to access the results of .match as string value in Crystal lang

What if I only want the first match, as a string?

puts "Happy days"[/[a-z]+/i]?
puts "Happy days".match(/[a-z]+/i).try &.[0]

It will try to match a string against /[a-z]+/i regex and if there is a match, Group 0, i.e. the whole match, will be output. Note that the ? after [...] will make it fail gracefully if there is no match found. If you just use puts "??!!"[/[a-z]+/i], an exception will be thrown.

If you want the functionality similar to String#scan that returns all matches found in the input, you may use (shortened version only left as per @Amadan's remark):

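matches = str.scan(re).map(&.[0])

Output of the code above (assuming str = "Happy days" and re = /[a-z]+/i):

["Happy", "days"]

Note that:

  • String#scan returns an array of Regex::MatchData, one per match.
  • Calling [0] on a match returns the actual matched text. Calling .string returns the original input string instead, so map(&.string) would yield ["Happy days", "Happy days"].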

input.gsub(numbers) { |m| p $~ } Matching data in Ruby for all occurrences in a string

Since I wrote that answer, let me try to explain.

$~ is one of Ruby's predefined global variables. It holds the MatchData from the most recent pattern match (nil if that match failed). The same value is available as Regexp.last_match.

As stated in the documentation, gsub with a block is commonly used to modify a string, but here we rely on the fact that it calls the block once for every match. The block variable m is just the matched String, so if we need the whole MatchData instance we have to use the predefined global $~. In the example above we simply print each MatchData with p $~.

The trick is that $~ always refers to the last MatchData. So all you need is the $~ variable, despite its repulsive look. Or you might assign it to a friendlier name:

my_beauty_name_match_data_var = $~

and play with the latter. Hope it helps.
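A minimal sketch of that idea follows. The parameter names input and numbers mirror the heading, but their values here are assumed, and collect_match_data is a hypothetical helper name.

```ruby
# Hypothetical helper: gather the MatchData for every match of `numbers` in `input`.
def collect_match_data(input, numbers)
  found = []
  # gsub calls the block once per match. The string gsub returns is discarded;
  # the block is only used to grab $~, the MatchData for the current match.
  input.gsub(numbers) { |m| found << $~; m }
  found
end

# Assumed sample data, not taken from the original question.
found = collect_match_data("a1 b22 c333", /(?<digits>\d+)/)
found.each { |md| puts "#{md[:digits]} starts at #{md.begin(0)}" }
# prints:
# 1 starts at 1
# 22 starts at 4
# 333 starts at 8
```

Because the block returns m unchanged, the substitution leaves the text as it was; only the side effect of collecting $~ matters here.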

Ruby: Find all occurrences of a pattern in a string, manipulate, and then replace

You can use gsub with a block, so

string.gsub(/\s[aeiou]\w{1,}/i) do |word|
  word.upcase
end
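For a concrete run, here is the same call wrapped in a method with a made-up input sentence (upcase_vowel_words is a hypothetical name). Note that the pattern requires leading whitespace, so a vowel-initial word at the very start of the string is left alone.

```ruby
# Upcases every word that starts with a vowel and is preceded by whitespace.
def upcase_vowel_words(string)
  string.gsub(/\s[aeiou]\w{1,}/i) do |word|
    # word includes the leading whitespace character, which upcase leaves unchanged.
    word.upcase
  end
end

puts upcase_vowel_words("an apple is on every table")
# prints: an APPLE IS ON EVERY table
```

gsub returns a new string and leaves the receiver untouched; gsub! is the in-place variant.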

How to return first match sub-string of a string using Ruby regex?

scan will return all substrings that match the pattern. You can use match, scan or [] to achieve your goal:

report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'

report_path.match(/\d{8}_\d{6}/)[0]
# => "20200904_151507"

report_path.scan(/\d{8}_\d{6}/)[0]
# => "20200904_151507"

# String#[] supports regex
report_path[/\d{8}_\d{6}/]
# => "20200904_151507"

Note that match returns a MatchData object (or nil when nothing matches), which holds the first match together with any capture groups. scan returns an Array containing all matches.

Here we're calling [0] on the MatchData to get the whole matched text, and [0] on the Array from scan to get its first element.
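One thing to watch for is the no-match case: match returns nil, so calling [0] on it raises NoMethodError, while String#[] simply returns nil. A small sketch, with hypothetical method names and a shortened sample path:

```ruby
# String#[] returns nil instead of raising when the pattern does not match.
def first_timestamp(path)
  path[/\d{8}_\d{6}/]
end

# match returns nil on failure, so use safe navigation before indexing.
def first_timestamp_via_match(path)
  path.match(/\d{8}_\d{6}/)&.[](0)
end

p first_timestamp("/reports/20200904_151507/run.csv")           # => "20200904_151507"
p first_timestamp_via_match("/reports/20200904_151507/run.csv") # => "20200904_151507"
p first_timestamp("/no/timestamp/here")                         # => nil
p first_timestamp_via_match("/no/timestamp/here")               # => nil
```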


Capture groups:

A regex lets us capture multiple substrings with one pattern. We can use () to create capture groups, and (?'some_name'<pattern>) to create named capture groups.

report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'

matches = report_path.match(/(\d{8})_(\d{6})/)
matches[0] #=> "20200904_151507"
matches[1] #=> "20200904"
matches[2] #=> "151507"


matches = report_path.match(/(?'date'\d{8})_(?'id'\d{6})/)
matches[0] #=> "20200904_151507"
matches["date"] #=> "20200904"
matches["id"] #=> "151507"
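When the groups are named, MatchData#named_captures returns them all at once as a Hash, which can be handier than indexing one name at a time. A small sketch with a hypothetical helper name and a shortened sample path:

```ruby
# Returns a Hash of group name => captured text, or nil if the path has no timestamp.
def timestamp_parts(path)
  m = path.match(/(?'date'\d{8})_(?'id'\d{6})/)
  m&.named_captures
end

parts = timestamp_parts("/reports/20200904_151507/run.csv")
puts parts["date"] # prints: 20200904
puts parts["id"]   # prints: 151507
```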

We can even use (named) capture groups with []

From String#[] documentation:

If a Regexp is supplied, the matching portion of the string is returned. If a capture follows the regular expression, which may be a capture group index or name, follows the regular expression that component of the MatchData is returned instead.

report_path = '/usr/share/filebeat/reports/ui/local/20200904_151507/API/API_Test_suite/20200904_151508/20200904_151508.csv'

# returns the full match if no second parameter is passed
report_path[/(\d{8})_(\d{6})/]
# => "20200904_151507"

# returns capture group number 2
report_path[/(\d{8})_(\d{6})/, 2]
# => "151507"

# returns the capture group called "date"
report_path[/(?'date'\d{8})_(?'id'\d{6})/, 'date']
# => "20200904"

ruby regex to match multiple occurrences of pattern

Try this:

str.match(/\[\[(.*)\]\].*\[\[(.*)\]\]/).captures
# => ["lead:first_name", "client:last_name"]

With many occurrences:

str
# => "Some [[lead:first_name]] random text[[lead:first_name]] and more [[lead:first_name]] stuff [[client:last_name]]"
str.scan(/\[(\w+:\w+)\]/)
# => [["lead:first_name"], ["lead:first_name"], ["lead:first_name"], ["client:last_name"]]
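Building on the scan call above, a possible variation matches both pairs of brackets, splits each placeholder at the colon with two capture groups, and drops duplicates. The method names and the "object"/"field" wording are hypothetical; the sample text is the one shown above.

```ruby
# Sample text taken from the example above.
def sample_template
  "Some [[lead:first_name]] random text[[lead:first_name]] and more " \
    "[[lead:first_name]] stuff [[client:last_name]]"
end

# Returns unique [object, field] pairs, one per distinct [[object:field]] placeholder.
def placeholder_pairs(str)
  # Two capture groups per match, so scan yields two-element arrays.
  str.scan(/\[\[(\w+):(\w+)\]\]/).uniq
end

p placeholder_pairs(sample_template)
# => [["lead", "first_name"], ["client", "last_name"]]
```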

Looking for Regexp#match_all

See "Is there a function like String#scan, but returning array of MatchDatas?"

It looks like your best bet is to use String#scan and Regexp.last_match.


