The Ruby %r{ } Expression

%r{} is equivalent to the /.../ notation, but lets you use '/' in your regexp without having to escape it:

%r{/home/user}

is equivalent to:

/\/home\/user/

This is purely a syntactic convenience, for legibility.
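As a quick check of the equivalence, here is a minimal sketch. The constant names and the sample path are made up for illustration; only the two literals come from the answer above.

```ruby
# Two spellings of the same pattern: the literal text "/home/user".
SLASH_FORM   = /\/home\/user/   # slashes must be escaped inside /.../
PERCENT_FORM = %r{/home/user}   # no escaping needed inside %r{...}

# Hypothetical sample path, used only for illustration.
SAMPLE_PATH = "/home/user/notes.txt"

# Both literals match the same input.
puts SLASH_FORM.match?(SAMPLE_PATH)
puts PERCENT_FORM.match?(SAMPLE_PATH)
```

Both lines print true; the choice between the two forms only affects how the source reads.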

Edit:

Note that you can use almost any non-alphanumeric character, or a matching bracket pair, as the delimiter instead of '{}'.
These variants work just as well:

%r!/home/user!
%r'/home/user'
%r(/home/user)
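A small sketch to confirm that the delimiter does not change the pattern. The constant names are hypothetical, and %r[...] is one more bracket pair added to the variants listed above.

```ruby
# The same pattern written with four different %r delimiters.
DELIMITER_VARIANTS = [
  %r{/home/user},
  %r!/home/user!,
  %r(/home/user),
  %r[/home/user]
].freeze

# Every variant carries the same pattern text.
puts DELIMITER_VARIANTS.map(&:source).uniq.inspect

# Bracket-style delimiters nest, so a balanced pair inside the
# pattern (here the {3} quantifier) needs no escaping.
THREE_DIGITS = %r{\d{3}}
puts THREE_DIGITS.match?("abc123")
```

With a non-bracket delimiter such as '!', the delimiter character itself would have to be escaped if it appeared inside the pattern.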

Edit 2:

Note that the x modifier (as in %r{}x) makes the regexp ignore unescaped whitespace and # comments inside the pattern, making complex regexps more readable. Example from GitHub's Ruby style guide:

regexp = %r{
  start         # some text
  \s            # white space char
  (group)       # first group
  (?:alt1|alt2) # some alternation
  end
}x
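To see the x modifier at work on something that actually matches, here is a hedged sketch. The date format, DATE_PATTERN and parse_date are invented for this example and are not part of the style guide.

```ruby
# Extended-mode pattern for a hypothetical "YYYY-MM-DD" date string.
# Whitespace and the trailing comments are ignored because of the x flag.
DATE_PATTERN = %r{
  \A
  (\d{4})   # year
  -
  (\d{2})   # month
  -
  (\d{2})   # day
  \z
}x

# Returns the three captured parts as strings, or nil when the text
# does not match.
def parse_date(text)
  match = DATE_PATTERN.match(text)
  match && match.captures
end

p parse_date("2011-05-17")
p parse_date("2011-5-17")
```

The first call prints the three captured parts and the second prints nil, since the month has only one digit.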

Ruby Rails %r and %w

A Regexp holds a regular expression, used to match a pattern against strings. Regexps are created using the /.../ and %r{...} literals, and by the Regexp::new constructor.

%r and %w seem to be doing the same thing so I'm confused..

%w{ fred.gif fred.jpg FRED.Jpg}
# => ["fred.gif", "fred.jpg", "FRED.Jpg"]
%r{ a b }
# => / a b /

No, they are not the same, as you can see above: %w builds an Array of Strings, while %r builds a Regexp.
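A short sketch that makes the difference explicit by looking at the classes. The constant names are hypothetical; the literals mirror the ones above.

```ruby
# %w splits on whitespace and returns an Array of Strings.
FILE_NAMES = %w{fred.gif fred.jpg FRED.Jpg}

# %r returns a Regexp; the spaces are part of the pattern.
SPACED_PATTERN = %r{ a b }

puts FILE_NAMES.class
puts SPACED_PATTERN.class
puts SPACED_PATTERN.source.inspect

# The two are often used together: filter the %w list with a %r pattern.
puts FILE_NAMES.grep(%r{\.jpg\z}i).inspect
```

The last line keeps only the names ending in .jpg, ignoring case.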

One thing I noticed with %r{} is that you don't need to escape slashes.

# /../ literals:
url.match /http:\/\/example\.com\//
# => #<MatchData "http://example.com/">

# %r{} literals:
url.match %r{http://example\.com/}
# => #<MatchData "http://example.com/">

Use %r only for regular expressions matching more than one '/' character.

# bad
%r(\s+)

# still bad
%r(^/(.*)$)
# should be /^\/(.*)$/

# good
%r(^/blog/2011/(.*)$)

Why does %r{ around } my Regex break my gsub?

Ruby allows you to begin your regular expressions with %r followed by a delimiter of your choice. This is useful when the pattern you are describing contains a lot of forward-slash characters because these slashes do not need to be escaped in that syntax.

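Therefore /\A\/#{locale}\/?/ and %r{/\A\/#{locale}\/?/} are not equal: inside %r{}, the leading and trailing slashes are no longer delimiters but literal '/' characters that the pattern has to match. Use %r{\A/#{locale}/?} instead.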

r = /\A\/#{locale}\/?/

r == %r{/\A\/#{locale}\/?/}
#=> false
r == %r{\A/#{locale}/?}
#=> true
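Here is a hedged sketch of how this plays out in a substitution. The method names, the sample paths and the use of Regexp.escape are assumptions for illustration; the original question's gsub call is not shown here.

```ruby
# Broken form: the /.../ delimiters were copied inside %r{}, so the pattern
# demands a literal "/" before \A (start of string), which can never match.
def broken_locale_pattern(locale)
  %r{/\A\/#{Regexp.escape(locale)}\/?/}
end

# Fixed form, following the answer above: no delimiter slashes, no escapes.
def locale_pattern(locale)
  %r{\A/#{Regexp.escape(locale)}/?}
end

# Hypothetical helper: replaces a leading "/<locale>/" with "/".
# Like the original pattern, it does not check for a segment boundary
# after the locale.
def strip_locale(path, locale)
  path.sub(locale_pattern(locale), "/")
end

puts broken_locale_pattern("en").match?("/en/posts/1")
puts strip_locale("/en/posts/1", "en")
puts strip_locale("/fr/posts/1", "en")
```

The broken pattern never matches, so a gsub or sub built on it silently leaves every string unchanged.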

Why use %r only for regular expressions matching more than one '/' character?

apparently you aren't supposed to use it if there's only one forward slash in the regex.

This is not true.

I've seen this mentioned in multiple Ruby style guides (namely here and here)

You are jumping to a conclusion based on limited observation. It is just these people's own decision. Whoever sympathizes with them might try to spread this practice, though.

However, I see some rationale in such a claim. Ruby often has several ways to express the same thing. Choosing among them at random makes the code harder to read and can induce human errors, so we should stick to a single notation when possible. This applies to regex literals as well. Since // is the most concise and the unmarked regex literal, we should stick to it whenever possible.

Whether to use the %r notation should thus depend on whether the pros (not having to escape slashes) outweigh the cons (departing from the standard // notation and/or using a longer notation). It seems that those people judged that a single slash (a single escape) does not make the pros sufficient to outweigh the cons, but two or more do. And that makes sense: the %r{} notation takes two more characters than //, and each escaped slash costs one extra character, so it is a tie when the latter needs two escapes, and beyond that, %r{} becomes the shorter notation.
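The character-count argument can be checked with a little arithmetic. This is only a sketch: both helper names are made up, and each takes the pattern text with its slashes unescaped.

```ruby
# Length of the pattern written as /.../ : two delimiter characters,
# plus one extra backslash for every slash that must be escaped.
def slash_literal_length(pattern_text)
  2 + pattern_text.length + pattern_text.count("/")
end

# Length of the pattern written as %r{...} : four delimiter characters
# (%, r, { and }) and no escapes for slashes.
def percent_literal_length(pattern_text)
  4 + pattern_text.length
end

["^/(.*)$", "/home/user", "^/blog/2011/(.*)$"].each do |text|
  puts "#{text}: // takes #{slash_literal_length(text)}, " \
       "%r{} takes #{percent_literal_length(text)}"
end
```

With one slash the // form is one character shorter, with two slashes the forms tie, and with three the %r{} form wins by one.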

How does %r(..) differ from /../ in Regexp creation in Ruby?

There is absolutely no difference between %r/foo/ and /foo/.


irb(main):001:0> %r[foo]
=> /foo/
irb(main):002:0> %r{foo}
=> /foo/
irb(main):003:0> /foo/
=> /foo/

The interpreter parses the source script when it loads it, and both forms are compiled to a Regexp, so at run-time they are the same.
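Beyond eyeballing the irb output, the objects can be compared directly. A minimal sketch, with FORMS as a hypothetical name:

```ruby
# The same pattern written with each literal form, keyed by its spelling.
FORMS = {
  "/foo/"   => /foo/,
  "%r{foo}" => %r{foo},
  "%r[foo]" => %r[foo],
  "%r/foo/" => %r/foo/
}.freeze

# Regexp#== compares the pattern source and the options, not the spelling.
FORMS.each do |label, regexp|
  puts "#{label}: source=#{regexp.source.inspect} " \
       "options=#{regexp.options} equal=#{regexp == /foo/}"
end
```

Every line reports the same source, the same options and equal=true.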

The only difference is the source-code, not the executable. Try this:

require 'benchmark'

str = (('a'..'z').to_a * 256).join + 'foo'
n = 1_000_000

puts RUBY_VERSION, n
puts

Benchmark.bm do |b|
  b.report('%r') { n.times { str[%r/foo/] } }
  b.report('/') { n.times { str[/foo/] } }
end

Which outputs:

1.9.3
1000000

       user     system      total        real
%r 8.000000   0.000000   8.000000 (  8.014767)
/  8.000000   0.000000   8.000000 (  8.010062)

That's on an old MacBook Pro running 10.8.2. Think about it, that's 6,656,000,000 (26 * 256 * 1,000,000) characters being searched and both returned what's essentially the same value. Coincidence? I think not.

Running this on a machine and getting an answer that varies significantly between the two tests on that CPU would indicate a difference in run-time performance of the two syntactically different ways of specifying the same thing. I seriously doubt that will happen.


EDIT:

Running it multiple times shows the randomness in action. I adjusted the code a bit to make it do five loops across the benchmarks this morning. The system was scanning the disk while running the tests so they took a little longer, but they still show minor random differences between the two runs:

require 'benchmark'

str = (('a'..'z').to_a * 256).join + 'foo'
n = 1_000_000

puts RUBY_VERSION, n
puts

regex = 'foo'
Benchmark.bm(2) do |b|
  5.times do
    b.report('%r') { n.times { str[%r/#{ regex }/] } }
    b.report('/') { n.times { str[/#{ regex }/] } }
  end
end

And the results:

      # user     system      total        real
%r 12.440000 0.030000 12.470000 ( 12.475312)
/ 12.420000 0.030000 12.450000 ( 12.455737)
%r 12.400000 0.020000 12.420000 ( 12.431750)
/ 12.400000 0.020000 12.420000 ( 12.417107)
%r 12.430000 0.030000 12.460000 ( 12.467275)
/ 12.390000 0.020000 12.410000 ( 12.418452)
%r 12.400000 0.030000 12.430000 ( 12.432781)
/ 12.390000 0.020000 12.410000 ( 12.412609)
%r 12.410000 0.020000 12.430000 ( 12.427783)
/ 12.420000 0.020000 12.440000 ( 12.449336)

Running about two seconds later:

      # user     system      total        real
%r 12.360000 0.020000 12.380000 ( 12.390146)
/ 12.370000 0.030000 12.400000 ( 12.391151)
%r 12.370000 0.020000 12.390000 ( 12.397819)
/ 12.380000 0.020000 12.400000 ( 12.399413)
%r 12.410000 0.020000 12.430000 ( 12.440236)
/ 12.420000 0.030000 12.450000 ( 12.438158)
%r 12.560000 0.040000 12.600000 ( 12.969364)
/ 12.640000 0.050000 12.690000 ( 12.810051)
%r 13.160000 0.120000 13.280000 ( 14.624694) # <-- opened new browser window
/ 12.650000 0.040000 12.690000 ( 13.040637)

There is no consistent difference in speed.

What are `\.` and `i` in a regular expression?

%r{} is used for regular expressions.

\. matches a literal '.' character. You need the \ to escape it because a bare . means something else entirely (it matches any character).

i is used for case insensitive searches.

Essentially, your regex is matching for anything that ends in .gif, .jpg or .png. These could also be something like .GiF because of the case insensitive search.
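A hedged sketch of that behaviour. The pattern below is an assumed reconstruction of the regex being discussed, and the file names are invented for illustration.

```ruby
# Assumed reconstruction: an escaped dot, one of three extensions,
# end of line, and the i modifier for case-insensitive matching.
IMAGE_EXTENSION = %r{\.(gif|jpg|png)$}i

# Hypothetical file names.
CANDIDATES = ["photo.GiF", "logo.png", "notes.txt", "jpg"].freeze

# Keep only the names the pattern accepts.
puts CANDIDATES.grep(IMAGE_EXTENSION).inspect
```

Only "photo.GiF" and "logo.png" survive: "notes.txt" has the wrong extension, and "jpg" has no dot in front of it.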

What does an i at the end of a regular expression mean?

The i modifier means the regex will ignore case when matching text. You can read more about other regex modifiers here.

# with i modifier
%r{.(gif|jpg|png)$}i === ".JpG" #=> true
%r{.(gif|jpg|png)$}i === ".jpg" #=> true

# without i modifier
%r{.(gif|jpg|png)$} === ".JpG" #=> false
%r{.(gif|jpg|png)$} === ".jpg" #=> true

Note: . in your regex means 'any single character except newline', not 'dot character'. If you need to match dot character, use backslash to escape it: \.

%r{.(gif|jpg|png)$} === "ajpg"  # => true
%r{\.(gif|jpg|png)$} === "ajpg" # => false
%r{\.(gif|jpg|png)$} === ".jpg" # => true

Ruby Regular Expression: 1-2 digits required after period if period is present

EDITED

I think this will solve it:

/^\d{1,}(\.\d{1,2}){0,1}$/

My test case:

2.3.0 :129 > regex = /^\d{1,}(\.\d{1,2}){0,1}$/
=> /^\d{1,}(\.\d{1,2}){0,1}$/
2.3.0 :161 > regex.match("1.9")
=> #<MatchData "1.9" 1:".9">
2.3.0 :162 > regex.match("1")
=> #<MatchData "1" 1:nil>
2.3.0 :163 > regex.match("12")
=> #<MatchData "12" 1:nil>
2.3.0 :164 > regex.match("1211.1")
=> #<MatchData "1211.1" 1:".1">
2.3.0 :165 > regex.match("121234.14")
=> #<MatchData "121234.14" 1:".14">
2.3.0 :166 > regex.match("z")
=> nil
2.3.0 :167 > regex.match("z1")
=> nil
2.3.0 :168 > regex.match("z.5")
=> nil
2.3.0 :169 > regex.match("z.55")
=> nil
2.3.0 :170 > regex.match(" .9")
=> nil
2.3.0 :171 > regex.match("34.")
=> nil
2.3.0 :172 > regex.match("4..3")
=> nil
2.3.0 :173 > regex.match("4..55")
=> nil
2.3.0 :174 > regex.match("4.333")
=> nil
2.3.0 :175 > regex.match("111,222.44")
=> nil
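One caveat worth sketching: in Ruby, ^ and $ anchor at line boundaries, not at the start and end of the whole string, so the pattern above accepts multi-line input that contains a valid line. The \A and \z variant below is a suggested alternative, not part of the original answer, and the constant names are made up.

```ruby
# The pattern from the answer: ^ and $ match at the start and end of any line.
LINE_ANCHORED = /^\d{1,}(\.\d{1,2}){0,1}$/

# Suggested variant: \A and \z match only at the start and end of the
# whole string. + and ? are shorthand for {1,} and {0,1}.
STRING_ANCHORED = /\A\d+(\.\d{1,2})?\z/

puts LINE_ANCHORED.match?("12.34")
puts STRING_ANCHORED.match?("12.34")

# Multi-line input: the first line is valid, the second is not a number.
puts LINE_ANCHORED.match?("1.9\nabc")
puts STRING_ANCHORED.match?("1.9\nabc")
```

Both patterns accept "12.34", but only the line-anchored one accepts "1.9\nabc", which matters when the regex is used to validate user input.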

