Does Ruby Support Conditional Regular Expressions

Struggling with conditional regular expressions

You're looking for ^ID: (\d+)(?: Status: (\d+))?$

Edit:
Since the question is tagged Ruby, it's worth mentioning that, according to both
this question and this flavour comparison, Ruby doesn't do conditional regex.

http://www.regular-expressions.info is a great source on the subject.
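
As a minimal sketch of how the optional-group pattern from this answer behaves in Ruby (the helper name parse_id_line and the sample lines are made up for illustration):

```ruby
# Optional trailing group: the " Status: n" part may or may not be present.
ID_LINE = /^ID: (\d+)(?: Status: (\d+))?$/

# Hypothetical helper: returns [id, status], or nil if the line does not match.
# status is nil when the optional group did not participate in the match.
def parse_id_line(line)
  m = ID_LINE.match(line)
  m && [m[1], m[2]]
end

p parse_id_line("ID: 42 Status: 7")  # => ["42", "7"]
p parse_id_line("ID: 42")            # => ["42", nil]
p parse_id_line("Status: 7")         # => nil
```

No conditional construct is needed here: the non-capturing group followed by ? makes the whole Status part optional.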

Does Ruby support conditional regular expressions

No, Ruby does not support that in 1.8 or 1.9. Ruby 2.0 and later use the Onigmo engine, which does accept the (?(1)then|else) syntax.
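
Where conditionals are not available, the same logic can often be spelled out as plain alternation. The sketch below is only an illustration: the rule ("a closing > is required only if an opening < was present") and the names BRACKETED_OR_BARE and well_formed? are assumptions, not taken from the question.

```ruby
# A conditional such as ^(<)?\w+(?(1)>)$ says:
# "if an opening < was matched, require a closing >".
# The same rule written as two explicit alternatives:
BRACKETED_OR_BARE = /\A(?:<\w+>|\w+)\z/

# Hypothetical helper: true when the token satisfies the rule above.
def well_formed?(token)
  !!(token =~ BRACKETED_OR_BARE)
end

p well_formed?("<tag>")  # => true
p well_formed?("tag")    # => true
p well_formed?("<tag")   # => false
p well_formed?("tag>")   # => false
```

The alternation repeats the shared \w+ part, which is the usual price of avoiding the conditional.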

Ruby regular expression for conditional expression

You forgot the space after the comma.
Try /^constraint :(\w*), '(.*)'$/

Although to be more general, I'd go with this: /^constraint :([^,]+),\s*'(.*)'$/.
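
A quick way to check the more general pattern, assuming input lines of the form shown in the regex (the sample lines and the helper name parse_constraint are hypothetical):

```ruby
# The more general pattern: any non-comma name, optional whitespace after the comma.
CONSTRAINT = /^constraint :([^,]+),\s*'(.*)'$/

# Hypothetical helper: returns [name, quoted_text], or nil if the line does not match.
def parse_constraint(line)
  m = CONSTRAINT.match(line)
  m && [m[1], m[2]]
end

p parse_constraint("constraint :age, 'must be positive'")  # => ["age", "must be positive"]
p parse_constraint("constraint :age,'must be positive'")   # => ["age", "must be positive"]
p parse_constraint("validates :age")                       # => nil
```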

Conditional regex in Ruby

Your regex (?=.*(USD))(?(1)\d+|[a-zA-Z]) does not work because of how its two parts interact:

  • (?=.*(USD)) - a positive lookahead, tried at every position in the string (when scan is used). It requires the substring USD to appear after any 0 or more chars other than line break chars, as many as possible. In other words, a match is only possible if USD occurs somewhere to the right of the current position on the same line.
  • (?(1)\d+|[a-zA-Z]) - a conditional construct that matches 1+ digits if Group 1 matched (that is, if USD was found); otherwise, an ASCII letter is tried. However, the second alternative will never be tried, because the lookahead already requires USD to be present for any match to occur.

Look at the regex debugger output for USD 100; it shows exactly what happens when the (?=.*(USD))(?(1)\d+|[a-zA-Z]) regex tries to find a match:

  • Steps 1-22: The lookahead pattern is tried first. The key point is that the match fails immediately if the positive lookahead does not find a match. In this case, USD is found at the start of the string (the first time the pattern is tried, the regex index is at the string start position), so the lookahead succeeds.
  • Steps 23-25: Since a lookahead is a non-consuming pattern, the regex index is still at the string start position. The lookahead says "go ahead", and the conditional construct is entered. The (?(1) condition is met because Group 1, USD, was matched, so the first ("then") part is triggered. \d+ does not find any digits, since there is a letter U at the start. The match fails at the string start position, but there are more positions in the string to test, because there is no \A or ^ anchor that would restrict the match to the start of the string/line.
  • Step 26: The regex engine index is advanced one char to the right; it is now right before the letter S.
  • Steps 27-40: The regex engine tries to find 0+ chars followed by USD to the right of the current location, but fails (U is already "behind" the index).
  • From there, execution proceeds as described above: the regex fails to find USD to the right of each remaining position, and the overall match fails.

If the USD is somewhere to the right of 100, then you'd get a match.

So, the lookahead does not set any search range; it simply allows the rest of the pattern to be tried (if its own pattern matches) or blocks the match (if its pattern is not found).
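
The behaviour described above can be reproduced with a short sketch. It assumes a Ruby version whose regex engine accepts the (?(1)...|...) conditional syntax used in the question; the constant and helper names are made up for illustration.

```ruby
# The regex from the question. Assumption: the running Ruby accepts
# (?(1)then|else) conditionals.
USD_CONDITIONAL = /(?=.*(USD))(?(1)\d+|[a-zA-Z])/

# Hypothetical helper: returns the first matched substring, or nil.
def first_match(text)
  text[USD_CONDITIONAL]
end

p first_match("USD 100")  # => nil   (\d+ fails at U; no USD to the right of later positions)
p first_match("100 USD")  # => "100" (USD lies to the right of the digits)
p first_match("YEN 100")  # => nil   (the lookahead never succeeds)
```

Note that the letter branch never produces a match in any of these cases, which is the point made about the second alternative.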

You may use

.scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact

Pattern details

  • ^USD.*?\K(\d+) - USD at the start of a line (in Ruby, ^ anchors at line starts), then any 0 or more chars other than line break chars, as few as possible; \K then drops the text matched so far from the result, and 1+ digits are captured into Group 1
  • | - or
  • ([a-zA-Z]) - any ASCII letter captured into Group 2.

See Ruby demo:

p "USD 100".scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact
# => ["100"]
p "YEN 100".scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact
# => ["Y", "E", "N"]

Using regex in Ruby if condition

if params[:test] =~ /foo/
  # Successful match
else
  # Match attempt failed
end

Works for me. Check what params[:test] actually contains.
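
A minimal sketch of what such a check might look like outside a web framework. The plain Hash standing in for params and the helper name check_test_param are assumptions for illustration only.

```ruby
# Hypothetical helper wrapping the if/else above; params is a plain Hash here.
def check_test_param(params)
  if params[:test] =~ /foo/
    "matched"
  else
    "no match"
  end
end

p check_test_param({ :test => "some foo here" })  # => "matched"
p check_test_param({ :test => "FOO" })            # => "no match" (the regex is case-sensitive)
p check_test_param({ "test" => "foo" })           # => "no match" (string key, so params[:test] is nil)
p check_test_param({})                            # => "no match"

# When debugging, printing the raw value with p shows nil, whitespace, and so on.
p({ :test => " foo\n" }[:test])                   # => " foo\n"
```

The third case is a common culprit with a plain Hash: the value exists, but under a different kind of key.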

How do you read this ternary condition in Ruby?

A (slightly) less confusing way to write this is:

str.split(/',\s*'/).map do |match|
  if match[0] == ?,
    match
  else
    "some string"
  end
end.join

I think multiline ternary statements are horrible, especially since an if expression returns a value in Ruby.

Probably the most confusing thing here is the ?, which is a character literal. In Ruby 1.8 this evaluates to the ASCII value of the character (in this case 44); in Ruby 1.9 it is just a one-character string (in this case ",").

The reason for using a character literal instead of just "," is that the return value of calling [] on a string changed in Ruby 1.9. In 1.8 it returned the ASCII value of the character at that position; in 1.9 it returns a single-character string. Using ?, here avoids having to worry about the differences in String#[] between Ruby 1.8 and 1.9.

Ultimately the conditional is just checking whether the first character in match is a comma (,). If so, it keeps the value the same; otherwise it replaces it with "some string".
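
To see the rewritten block in action on Ruby 1.9 or later, here is a small sketch. The input string, the constant COMMA, and the helper name replace_unless_comma are invented for illustration.

```ruby
COMMA = ?,   # character literal; a one-character String on Ruby 1.9+

# Hypothetical helper wrapping the if/else version shown above.
def replace_unless_comma(str)
  str.split(/',\s*'/).map do |match|
    if match[0] == COMMA
      match            # piece starts with a comma: keep it unchanged
    else
      "some string"    # any other piece is replaced
    end
  end.join
end

p COMMA                                  # => ","
p replace_unless_comma("a', ',b', 'c")   # => "some string,bsome string"
```

Here the input splits into "a", ",b" and "c"; only the middle piece starts with a comma, so it is the only one kept.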

Ruby: Refactoring conditional statements containing regular expressions

For the record, here is how I ended up refactoring it.

Each 'subject' of handlers gets its own module and file:

module RequestHandlers
  module Customers

    def self.included(mock_server)
      mock_server.add_handler 'post /customers', :new_customer
      mock_server.add_handler 'post /customers/(.*)', :update_customer
    end

    def new_customer(route, method_url, params)
      # create new customer
    end

    def update_customer(route, method_url, params)
      # update existing customer
    end

  end
end

The main class is where the handlers live and get used:

class MockServer

  # Handlers are ordered by priority
  @@handlers = []

  def self.add_handler(route, name)
    @@handlers << {
      :route => %r{^#{route}$},
      :name => name
    }
  end

  # Include the handler modules only after add_handler is defined,
  # because each module's included hook calls it immediately.
  include RequestHandlers::Customers
  include RequestHandlers::Items
  # etc.

  def mock_request(method, url, params={})
    method_url = "#{method} #{url}"
    handler = @@handlers.find {|h| method_url =~ h[:route] }

    if handler
      self.send(handler[:name], handler[:route], method_url, params)
    else
      # raise (not throw) is the way to signal an error in Ruby
      raise 'Unrecognized request'
    end
  end
end
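
For anyone who wants to try the pattern, here is a condensed, self-contained sketch of the same idea. The handler bodies (returning strings) and the use of ArgumentError are placeholders assumed for illustration; the real handlers would do actual work.

```ruby
module RequestHandlers
  module Customers
    # Runs when the module is included; registers this module's routes.
    def self.included(mock_server)
      mock_server.add_handler 'post /customers', :new_customer
      mock_server.add_handler 'post /customers/(.*)', :update_customer
    end

    # Placeholder body (assumption): just reports what it would create.
    def new_customer(route, method_url, params)
      "created #{params[:name]}"
    end

    # Placeholder body (assumption): extracts the id captured by the route regex.
    def update_customer(route, method_url, params)
      "updated #{method_url[route, 1]}"
    end
  end
end

class MockServer
  # Handlers are ordered by priority: the first matching route wins.
  @@handlers = []

  def self.add_handler(route, name)
    @@handlers << { :route => %r{^#{route}$}, :name => name }
  end

  # Must come after add_handler is defined, since the included hook calls it.
  include RequestHandlers::Customers

  def mock_request(method, url, params = {})
    method_url = "#{method} #{url}"
    handler = @@handlers.find { |h| method_url =~ h[:route] }
    raise ArgumentError, 'Unrecognized request' unless handler
    send(handler[:name], handler[:route], method_url, params)
  end
end

server = MockServer.new
p server.mock_request('post', '/customers', { :name => 'Ann' })  # => "created Ann"
p server.mock_request('post', '/customers/42')                   # => "updated 42"
```

Because each route is anchored with ^ and $, 'post /customers' does not swallow 'post /customers/42', and the second handler receives the request.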

ruby regex for multiple words conditionally match

Assuming the only prefixes are Prof., Dr., Mr., Mrs., Prin., and Ms., you can try:

s = "Prof. Dr. John Doe"
# Escape the dot: an unescaped . matches any character, not just a period.
s.gsub(/\b(?:Prof|Dr|Mr|Mrs|Prin|Ms)\./, "").strip
# => "John Doe"

For the second question (storing the removed prefixes in another string):

s = "Prof. Dr. John Doe"
s.scan(/\b(?:Prof|Dr|Mr|Mrs|Prin|Ms)\./).join("")
# => "Prof.Dr."
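
Both steps can be combined into one small helper. This is only a sketch: the names TITLE and split_titles are hypothetical, and the title list is the one assumed in this answer. The dots are escaped so that names which merely start with the same letters are left alone.

```ruby
# Titles followed by a literal period; \b keeps the match at a word start.
TITLE = /\b(?:Prof|Dr|Mrs|Mr|Prin|Ms)\./

# Hypothetical helper: returns [removed_titles, remaining_name].
def split_titles(full_name)
  titles = full_name.scan(TITLE)
  name = full_name.gsub(TITLE, "").strip
  [titles, name]
end

p split_titles("Prof. Dr. John Doe")  # => [["Prof.", "Dr."], "John Doe"]
p split_titles("Drew Doe")            # => [[], "Drew Doe"]
```

With an unescaped dot, /Dr./ would also match the "Dre" in "Drew", which is why the period is written as \. here.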

