$1 and \1 in Ruby

\1 is a backreference which will only work in the same sub or gsub method call, e.g.:

"foobar".sub(/foo(.*)/, '\1\1') # => "barbar"

$1 is a global variable which can be used in later code:

if "foobar" =~ /foo(.*)/
  puts "The matching word was #{$1}"
end

Output:

The matching word was bar
# => nil

Ruby: What does $1 mean?

According to Avdi Grimm from RubyTapas:

$1 is a global variable which can be used in later code:

if "foobar" =~ /foo(.*)/
  puts "The matching word was #{$1}"
end

Output:

The matching word was bar

In short, $1, $2, and so on are global variables set by some Ruby library methods, especially those dealing with regular expressions, so that programmers can use the match results in later code.
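As a minimal sketch of how these variables behave (the method name capture_demo is made up for illustration), the following shows $1 being set by a successful match, mirrored by Regexp.last_match, and reset to nil by a failed match:

```ruby
# Hypothetical helper for illustration; not part of the original answer.
def capture_demo
  "foobar" =~ /foo(.*)/
  after_success = $1                     # capture from the successful match
  same_value    = Regexp.last_match(1)   # equivalent, without the $ syntax

  "foobar" =~ /nomatch(.*)/
  after_failure = $1                     # a failed match resets $1 to nil

  [after_success, same_value, after_failure]
end

p capture_demo
```

Copying $1 into a local variable right after the match, as done here, avoids surprises when a later match overwrites it.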

See this for more such variables available in Ruby.

\\1 vs. $1. Why am I getting different results in Ruby?

The problem is that $1 is a reference to the match of the first group from the last regex match. Its value is evaluated when it is passed to the method (String#sub!), not after the matching is done.

Therefore, the t comes from your previous experiment with \1. If you open a fresh repl and run your second example, you will get TypeError: no implicit conversion of nil into String. This is because $1 is nil at the time you call the first String#sub!.
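The evaluation order can be sketched as follows. The strings and method names are invented for illustration and are not the asker's original code. Each case lives in its own method so that it starts without a previous match:

```ruby
# $1 as an argument is evaluated before sub runs, so it reflects an EARLIER match.
def stale_capture
  "zzz" =~ /(z)/             # an unrelated earlier match sets $1 to "z"
  "hello".sub(/(l)/, $1)     # the argument is already "z" when sub is called
end

# With no earlier match, $1 is nil and sub raises TypeError.
def no_previous_match
  "hello".sub(/(l)/, $1)
rescue TypeError => e
  e.class
end

# '\1' is resolved by sub itself, against the match it just made.
def backreference
  "hello".sub(/(l)/, '\1\1')
end

# In the block form, the block runs after the match, so $1 is current.
def block_form
  "hello".sub(/(l)/) { $1.upcase }
end

p stale_capture, no_previous_match, backreference, block_form
```

If the replacement needs to be computed from the capture, the block form is usually the safer choice.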

Using $1, $2, etc. global variables inside method definition

Why is the output different?

A proc in Ruby has lexical scope. This means that when it finds a variable that is not defined, it is resolved within the context where the proc was defined, not where it was called. This explains the behavior of your code.

You can see the block is defined before the regexp, and this can cause confusion. The problem involves a magic Ruby variable, and it works quite differently from other variables. Citing @JörgWMittag:

It's rather simple, really: the reason why $SAFE doesn't behave like you would expect from a global variable is because it isn't a global variable. It's a magic unicorn thingamajiggy.

There are quite a few of those magic unicorn thingamajiggies in Ruby, and they are unfortunately not very well documented (not at all documented, in fact), as the developers of the alternative Ruby implementations found out the hard way. These thingamajiggies all behave differently and (seemingly) inconsistently, and pretty much the only two things they have in common is that they look like global variables but don't behave like them.

Some have local scope. Some have thread-local scope. Some magically change without anyone ever assigning to them. Some have magic meaning for the interpreter and change how the language behaves. Some have other weird semantics attached to them.

If you really want to find out exactly how the $1 and $2 variables work, I assume the only "documentation" you will find is rubyspec, a spec for Ruby written the hard way by the Rubinius folks. Happy hacking, but be prepared for the pain.
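One of those unusual semantics can be observed directly: despite the $ prefix, $1 is local to the method in which the match happened. A small sketch, with method names that are hypothetical:

```ruby
# The match happens inside this method, so $1 is set only in its frame.
def match_inside
  "foobar" =~ /foo(.*)/
  $1
end

# The caller has made no match of its own, so its $1 is still nil.
def caller_view
  returned = match_inside
  [returned, $1]
end

p caller_view
```

This is why a proc that reads $1 sees the value from the place where the proc was written, not from the method that later runs the match.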



Is there a way to pass a block to gsub from another context with $1, $2 variables setup the right way?

You can achieve what you want with the following modification (but I bet you already know that):

require 'pp'
def hello(z)
  # z = proc {|m| pp $1}
  "hello".gsub(/(o)/, &z)
end
z = proc {|m| pp m}
hello(z)

I'm not aware of a way to change the scope of a proc on the fly. But would you really want to do this?
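One possible workaround, sketched here under the assumption that the outside proc may be changed to accept a MatchData argument, is to keep a small block inside the method and hand the match object over explicitly. The parameter name handler is made up for this example:

```ruby
def hello(handler)
  # This block is defined inside hello, so Regexp.last_match refers to
  # the match gsub just made; the MatchData is passed to the outside proc.
  "hello".gsub(/(o)/) { handler.call(Regexp.last_match) }
end

# Defined in another context, but it no longer depends on $1.
handler = proc { |match| match[1].upcase }

puts hello(handler)
```

The proc receives every capture through the MatchData object, so nothing depends on where it was defined.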

How do I modify \1 in gsub?

Ok, thanks to the comment by Narfanator I found the following: "$1 and \1 in Ruby".

The solution was super easy:

require 'cgi'
s = "this is a string which has Text 123 1234 12345 "
s = s.gsub(/(Text \d+ \d+ \d+)/) { |x| "\"" + x + "\":https://site.com/query?value=" + CGI::escape(x) }

What are Ruby's numbered global variables

They're captures from the most recent pattern match (just as in Perl; Ruby initially lifted a lot of syntax from Perl, although it's largely gotten over it by now :). $1, $2, etc. refer to parenthesized captures within a regex: given /a(.)b(.)c/, $1 will be the character between a and b and $2 the character between b and c. $` and $' mean the strings before and after the string that matched the entire regex (which is itself in $&), respectively.
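A short sketch of these variables in use; the input string and the method name match_parts are chosen only for illustration:

```ruby
# Collect the special variables right after a match against /a(.)b(.)c/.
def match_parts(text)
  return nil unless text =~ /a(.)b(.)c/

  {
    first:   $1,   # character between a and b
    second:  $2,   # character between b and c
    before:  $`,   # text before the whole match
    matched: $&,   # the whole match
    after:   $'    # text after the whole match
  }
end

p match_parts("xxaXbYczz")
```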

There is actually some sense to these, if only historically; you can find it in perldoc perlvar, which generally does a good job of documenting the intended mnemonics and history of Perl variables, and mostly still applies to the globals in Ruby. The numbered captures are replacements for the capture backreference regex syntax (\1, \2, etc.); Perl switched from the latter to the former somewhere in the 3.x versions, because using the backreference syntax outside of the regex complicated parsing too much. (By the time Perl 5 rolled around, the parser had been sufficiently rewritten that the syntax was again available, and promptly reused for references/"pointers". Ruby opted for using a name-quote : instead, which is closer to the Lisp and Smalltalk style; since Ruby started out as a Perl-alike with Smalltalk-style OO, this made more sense linguistically.) The same applies to $&, which in historical regex syntax is simply & (but you can't use that outside the replacement part of a substitution, so it became a variable $& instead). $` and $' are both "cutesy": "back-quote" and "forward-quote" from the matched string.

Finding the first duplicate character in the string Ruby

s.chars.map { |c| [c, s.count(c)] }.drop_while{|i| i[1] <= 1}.first[0]

With the refined form from Cary Swoveland:

s.each_char.find { |c| s.count(c) > 1 }

Meaning of \\1 in madlib.gsub!(/\(\(\s*(.+?)\s*\)\)/, "<%= q_to_a('\\1') %>")

Say you were looking for a string that contained two identical words separated by whitespace. You might begin by writing (.+?)\s, but you would need some way to represent the same content that the first set of parentheses matched. This is exactly what \n (or \\n) does, where n is a number: it refers to the content of the nth set of parentheses. So if you wanted to extract the "ef ef" from "ab cd ef ef gh ik", you could use the following regular expression:

(.+?)\s\1
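To try this expression in Ruby, a minimal sketch (the method name repeated_word is hypothetical) could look like this:

```ruby
# Returns [whole match, first capture], or nil when nothing repeats.
def repeated_word(text)
  match = text.match(/(.+?)\s\1/)
  match && [match[0], match[1]]
end

p repeated_word("ab cd ef ef gh ik")
p repeated_word("ab cd")
```

Here \1 inside the pattern must match exactly the text that the first group captured, which is why only the repeated "ef" qualifies.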

In your particular case, as others have pointed out, the \\1 refers specifically to the content of (.+?). Note that all the other parentheses are preceded by escape characters so that they match actual parentheses in the text, and so they are not counted as groups.

Evaluating expressions in Ruby

The fact that your second example does not work has nothing to do with the order in which the if is evaluated.

Instead, use \1 and \2 as backreferences in the replacement string.

str = "This is a string"
str.gsub!(/(\w+)/, '1\1')
print str

Also note that a double-quoted "\1" is processed as an escape sequence rather than passed to gsub as a backreference; you need '\1' (or "\\1").
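The difference between the quoting styles can be checked by looking at the bytes each literal produces. This is a small sketch; the method name tag_words is made up for the example:

```ruby
p '\1'.bytes     # backslash and "1": two bytes
p "\1".bytes     # octal escape: a single byte with value 1
p "\\1" == '\1'  # an escaped backslash gives the same string as single quotes

# Prefix every word using the given replacement string.
def tag_words(text, replacement)
  text.gsub(/(\w+)/, replacement)
end

p tag_words("This is a string", '1\1')  # backreference is expanded by gsub
p tag_words("This is a string", "1\1")  # each word becomes "1" plus byte 1
```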


