struggling with conditional regular expressions
You're looking for ^ID: (\d+)(?: Status: (\d+))?$
edit:
Since the question is tagged Ruby it's worth mentioning that according to both
this question and this flavour-comparison, Ruby doesn't do conditional regex.
http://www.regular-expressions.info is a great source on the subject.
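A minimal sketch of how the optional-group pattern above behaves in Ruby. The `parse_id_line` helper and the sample strings are made up for illustration; the second capture is simply `nil` when the Status part is absent.

```ruby
# Pattern from the answer: the Status part sits in an optional non-capturing group.
ID_LINE = /^ID: (\d+)(?: Status: (\d+))?$/

# Hypothetical helper: returns [id, status], or nil when the line does not match.
def parse_id_line(line)
  m = ID_LINE.match(line)
  m && [m[1], m[2]]
end

p parse_id_line("ID: 42")            # => ["42", nil]
p parse_id_line("ID: 42 Status: 7")  # => ["42", "7"]
p parse_id_line("ID: 42 Status:")    # => nil
```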
Does Ruby support conditional regular expressions
No, Ruby does not support that (neither in 1.8 nor in 1.9).
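For readers on newer versions: Ruby 2.0 and later use the Onigmo engine, which does accept the (?(cond)yes|no) syntax. The sketch below assumes such a Ruby; the pattern, the `balanced?` helper and the inputs are illustrative only.

```ruby
# Conditional group: require ")" only if the optional "(" (Group 1) matched.
# Assumes Ruby 2.0 or later.
BALANCED = /\A(\()?\d+(?(1)\))\z/

# Hypothetical helper returning true or false.
def balanced?(text)
  !BALANCED.match(text).nil?
end

p balanced?("(123)")  # => true
p balanced?("123")    # => true
p balanced?("(123")   # => false
p balanced?("123)")   # => false
```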
Ruby regular expression for conditional expression
You forgot the space after the comma.
Try /^constraint :(\w*), '(.*)'$/
Although, to be more general, I'd go with this: /^constraint :([^,]+),\s*'(.*)'$/
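A quick check of the two patterns, as a sketch. The sample lines are hypothetical; only the regexes come from the answer.

```ruby
STRICT  = /^constraint :(\w*), '(.*)'$/       # exactly one space after the comma
GENERAL = /^constraint :([^,]+),\s*'(.*)'$/   # any amount of whitespace, including none

line = "constraint :email, 'must be unique'"
p line.match(STRICT).captures    # => ["email", "must be unique"]

# No space after the comma: only the general pattern matches.
tight = "constraint :email,'must be unique'"
p tight.match(STRICT)            # => nil
p tight.match(GENERAL).captures  # => ["email", "must be unique"]
```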
Conditional regex in Ruby
Your regex (?=.*(USD))(?(1)\d+|[a-zA-Z]) does not work as you expect. It consists of two parts:
- (?=.*(USD)) - a positive lookahead, triggered at every location inside the string (if scan is used), that matches a USD substring after any 0 or more chars other than line break chars, as many as possible (this means there will only be a match if USD appears at or to the right of the current position on the line)
- (?(1)\d+|[a-zA-Z]) - a conditional construct that matches 1+ digits if Group 1 matched (if there is USD); otherwise, an ASCII letter is tried. However, the second alternative will never be tried, because you required USD to be present in the string for a match to occur.
Look at the regex debugger for the USD 100 string; it shows exactly what happens when the (?=.*(USD))(?(1)\d+|[a-zA-Z]) regex tries to find a match:
- Steps 1 to 22: The lookahead pattern is tried first. The point here is that the match fails immediately if the positive lookahead pattern does not find a match. In this case, USD is found at the start of the string (since the first time the pattern is tried, the regex index is at the string start position). The lookahead found a match.
- Steps 23-25: Since a lookahead is a non-consuming pattern, the regex index is still at the string start position. The lookahead says "go ahead", and the conditional construct is entered. The (?(1) condition is met: Group 1, USD, was matched. So, the first, "then", part is triggered. \d+ does not find any digits, since there is a U letter at the start. The regex match fails at the string start position, but there are more positions in the string to test, since there is no \A or ^ anchor that would only allow a match at the start of the string/line.
- Step 26: The regex engine index is advanced one char to the right; now, it is right before the letter S.
- Steps 27-40: The regex engine wants to find 0+ chars and then USD immediately to the right of the current location, but fails (U is already "behind" the index).
- Then, the execution is just the same as described above: the regex fails to match USD anywhere to the right of the current location and eventually fails.
If USD is somewhere to the right of 100, then you get a match.
So, the lookahead does not set any search range; it simply allows matching the rest of the pattern (if its own pattern matches) or not (if its pattern is not found).
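The behaviour described above can be reproduced with a few one-liners. This is a sketch that assumes Ruby 2.0 or later (needed for the conditional syntax); the "100 USD" and "YEN 100" inputs are added here for contrast.

```ruby
# Pattern from the question.
CONDITIONAL = /(?=.*(USD))(?(1)\d+|[a-zA-Z])/

# USD is found by the lookahead at index 0, but \d+ cannot match "U";
# at every later index the lookahead itself fails.
p "USD 100"[CONDITIONAL]  # => nil

# USD lies to the right of the digits, so the lookahead succeeds at index 0
# and \d+ matches there.
p "100 USD"[CONDITIONAL]  # => "100"

# Without USD the lookahead never succeeds, so the [a-zA-Z] branch is never reached.
p "YEN 100"[CONDITIONAL]  # => nil
```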
You may use
.scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact
Pattern details
- ^USD.*?\K(\d+) - either USD at the start of a line (in Ruby, ^ matches at line starts; use \A for the string start), then any 0 or more chars other than line break chars, as few as possible; then the text matched so far is dropped and 1+ digits are captured into Group 1
- | - or
- ([a-zA-Z]) - any ASCII letter captured into Group 2.
See Ruby demo:
p "USD 100".scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact
# => ["100"]
p "YEN 100".scan(/^USD.*?\K(\d+)|([a-zA-Z])/).flatten.compact
# => ["Y", "E", "N"]
Using regex in Ruby if condition
if params[:test] =~ /foo/
  # Successful match
else
  # Match attempt failed
end
Works for me. Check what is actually in params[:test].
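A small sketch of what the condition evaluates to in a few cases. The `foo?` helper and the sample hashes are hypothetical stand-ins for the real request parameters.

```ruby
# `params` stands in for the request parameters; the values are made up.
def foo?(params)
  value = params[:test]
  # =~ returns the index of the match (truthy) or nil; nil =~ /foo/ is nil too.
  !(value =~ /foo/).nil?
end

p foo?(test: "some foo here")  # => true
p foo?(test: "bar")            # => false
p foo?({})                     # => false, the key is missing so the value is nil
p foo?("test" => "foo")        # => false, a plain Hash with a string key has no :test
```

If the last two cases look like what you are seeing, printing params[:test].inspect right before the condition usually shows why.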
How do you read this ternary condition in Ruby?
A (slightly) less confusing way to write this is:
str.split(/',\s*'/).map do |match|
  if match[0] == ?,
    match
  else
    "some string"
  end
end.join
I think multiline ternary statements are horrible, especially since if blocks return a value in Ruby.
Probably the most confusing thing here is the ?, which is a character literal. In Ruby 1.8 this means the ASCII value of the character (in this case 44); in Ruby 1.9 it is just a string (in this case ",").
The reason for using a character literal instead of just "," is that the return value of calling [] on a string changed in Ruby 1.9. In 1.8 it returned the ASCII value of the character at that position; in 1.9 it returns a single-character string. Using ?, here avoids having to worry about the differences in String#[] between Ruby 1.8 and 1.9.
Ultimately the conditional is just checking whether the first character in match is a comma; if so, it keeps the value the same, otherwise it sets it to "some string".
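On Ruby 1.9 and later the comparison can be checked directly. A short sketch with made-up strings:

```ruby
# On Ruby 1.9+, a character literal is a one-character String.
comma = ?,
p comma                 # => ","

# String#[] with an index also returns a one-character String.
p ",abc"[0]             # => ","
p ",abc"[0] == comma    # => true
p "abc"[0] == comma     # => false

# start_with? expresses the same check without a character literal.
p ",abc".start_with?(",")  # => true
```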
Ruby: Refactoring conditional statements containing regular expressions
For the record, here is how I ended up refactoring it.
Each 'subject' of handlers gets its own module and file:
module RequestHandlers
  module Customers
    def Customers.included(mock_server)
      mock_server.add_handler 'post /customers', :new_customer
      mock_server.add_handler 'post /customers/(.*)', :update_customer
    end

    def new_customer(route, method_url, params)
      # create new customer
    end

    def update_customer(route, method_url, params)
      # update existing customer
    end
  end
end
The main class is where the handlers live and get used:
class MockServer
  # Handlers are ordered by priority
  @@handlers = []

  # Must be defined before the includes below, because each module's
  # included hook calls add_handler as soon as it is mixed in.
  def self.add_handler(route, name)
    @@handlers << {
      :route => %r{^#{route}$},
      :name => name
    }
  end

  include RequestHandlers::Customers
  include RequestHandlers::Items
  # etc.

  def mock_request(method, url, params={})
    method_url = "#{method} #{url}"
    handler = @@handlers.find {|h| method_url =~ h[:route] }
    if handler
      send(handler[:name], handler[:route], method_url, params)
    else
      # raise rather than throw: throw is for catch/throw control flow
      raise 'Unrecognized request'
    end
  end
end
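A condensed, self-contained sketch of the same registration pattern, so it can be run as-is. The `TinyServer` name and the handler return values are placeholders invented for this example; they are not part of the original refactoring.

```ruby
module Customers
  # Runs when the module is included; registers this module's routes.
  def self.included(server)
    server.add_handler 'post /customers', :new_customer
    server.add_handler 'post /customers/(.*)', :update_customer
  end

  def new_customer(_route, _method_url, _params)
    :created
  end

  def update_customer(route, method_url, _params)
    # The capture group in the route yields the customer id.
    [:updated, method_url[route, 1]]
  end
end

class TinyServer
  @@handlers = []

  def self.add_handler(route, name)
    @@handlers << { route: %r{^#{route}$}, name: name }
  end

  # Included after add_handler exists, since the hook calls it.
  include Customers

  def mock_request(method, url, params = {})
    method_url = "#{method} #{url}"
    handler = @@handlers.find { |h| method_url =~ h[:route] }
    raise ArgumentError, "Unrecognized request: #{method_url}" unless handler

    send(handler[:name], handler[:route], method_url, params)
  end
end

server = TinyServer.new
p server.mock_request('post', '/customers')     # => :created
p server.mock_request('post', '/customers/17')  # => [:updated, "17"]
```

Because find returns the first matching entry, the order of the add_handler calls decides which handler wins when two routes overlap.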
ruby regex for multiple words conditionally match
Assuming the prefixes are only Prof., Dr., Mr., Mrs., Prin. and Ms., you can try:
s = "Prof. Dr. John Doe"
s.gsub(/\b(?:Prof|Dr|Mrs?|Prin|Ms)\./, "").strip
# => "John Doe"
The dots are escaped so that they match a literal "." rather than any character.
For the second question (storing the removed prefixes in another string):
s = "Prof. Dr. John Doe"
s.scan(/\b(?:Prof|Dr|Mrs?|Prin|Ms)\./).join("")
# => "Prof.Dr."
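One detail worth checking in a pattern like this is the dot: unescaped, it matches any character. A quick comparison, as a sketch with made-up names:

```ruby
LOOSE  = /Prof.|Dr.|Mr.|Mrs.|Prin.|Ms./    # "." matches any character
STRICT = /\b(?:Prof|Dr|Mrs?|Prin|Ms)\./    # "\." matches a literal dot only

# "Mr." in the loose pattern swallows "Mrs" and leaves the dot behind.
p "Mrs. Smith".gsub(LOOSE, "").strip      # => ". Smith"
p "Mrs. Smith".gsub(STRICT, "").strip     # => "Smith"

# "Dr." in the loose pattern also eats the start of "Drew".
p "Mr. Drew Barr".gsub(LOOSE, "").strip   # => "w Barr"
p "Mr. Drew Barr".gsub(STRICT, "").strip  # => "Drew Barr"

# Collecting the removed prefixes instead of dropping them.
p "Prof. Dr. John Doe".scan(STRICT).join(" ")  # => "Prof. Dr."
```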