Ruby Gsub/Regex Modifiers

Ruby gsub / regex modifiers?

Zenspider's Quickref contains a section explaining which escape sequences can be used in regular expressions and another listing the pseudo-variables that are set by a regexp match. In the second argument to gsub, write the name of the variable with a backslash instead of a $ (for example \1 instead of $1), and it will be replaced with the value of that variable after the regexp is applied. In a double-quoted string you need two backslashes (\\1).

When using the block form of gsub, you can use the variables ($1, $2 and so on) directly. If the block returns a string containing e.g. \1, that will not be replaced with $1; the backslash substitution only happens in the two-argument form.
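As a rough illustration of the two forms described above, here is a minimal sketch. The sample string and variable names are made up for the example.

```ruby
text = "John Smith" # hypothetical sample input

# Two-argument form, single quotes: \1 and \2 refer to the capture groups.
swapped_single = text.gsub(/(\w+) (\w+)/, '\2, \1')

# Two-argument form, double quotes: each backslash has to be doubled.
swapped_double = text.gsub(/(\w+) (\w+)/, "\\2, \\1")

# Block form: $1 and $2 are available directly.
swapped_block = text.gsub(/(\w+) (\w+)/) { "#{$2}, #{$1}" }

# A backslash sequence returned from the block is inserted literally.
literal = text.gsub(/(\w+) (\w+)/) { '\1' }

puts swapped_single # Smith, John
puts swapped_double # Smith, John
puts swapped_block  # Smith, John
puts literal        # \1
```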

What would /i at the end of gsub regex mean?

The i modifier is used to perform case-insensitive matching. By using this modifier, letters in the pattern match both upper and lower case. Be sure to check out the Regexp documentation.
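A small sketch of the difference, using a made-up input string:

```ruby
text = "Ruby RUBY ruby" # hypothetical sample input

# Without /i only the exact lower-case spelling is replaced.
case_sensitive = text.gsub(/ruby/, "gem")

# With /i every spelling of the word is replaced.
case_insensitive = text.gsub(/ruby/i, "gem")

puts case_sensitive   # Ruby RUBY gem
puts case_insensitive # gem gem gem
```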

Ruby non-greedy modifier did not apply?

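Regex matches from left to right. Your regex ,.+?="",? matches at the first comma in the string a,b="cde",f="",g="hi",j="", the one between a and b. From there the lazy .+? expands only as far as it must to reach the first ="", which is the one after ,f, so the match swallows ,b="cde",f="", and you get the actual result.

What you want is ,[^=]+?="",? which matches one or more characters that are not an equal sign before ="", so a match can no longer run across b="cde". Note that the trailing ,? also consumes the comma that follows a match; leave it off (,[^=]+="") to get a,b="cde",g="hi" as the result.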
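To see what each pattern actually matches, one option is to inspect the matches with scan before substituting anything. This is only a sketch using the string quoted above; the variable names are made up.

```ruby
s = 'a,b="cde",f="",g="hi",j=""'

# Lazy dot: starts at the first comma and stretches to the first ="".
lazy_dot = s.scan(/,.+?="",?/)

# Negated class: cannot cross an equal sign, so b="cde" is skipped.
# The optional trailing comma is still part of the first match.
negated = s.scan(/,[^=]+?="",?/)

p lazy_dot # [",b=\"cde\",f=\"\",", ",j=\"\""]
p negated  # [",f=\"\",", ",j=\"\""]
```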

How do I gsub this string using regex?

You have put your regular expression inside a string, so gsub looks for that literal text instead of applying the pattern. Use a regexp literal:

>> t2 = t1.collect{|n| n.gsub(/^name.*$/, "")}
=> ["\n"]

If you also want to get rid of the newline, use the m regex modifier, which lets . match the newline as well:

>> t2 = t1.collect{|n| n.gsub(/^name.*$/m, "")}
=> [""]
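The contents of t1 are not shown in the question, so the following sketch assumes a one-element array whose string starts with name and ends with a newline. It also includes the string-pattern case that matches nothing.

```ruby
t1 = ["name: example\n"] # assumed contents, not from the question

# A String pattern is matched literally, so nothing is replaced.
as_string = t1.collect { |n| n.gsub("^name.*$", "") }

# Regexp literal: . stops at the newline, which is left behind.
without_m = t1.collect { |n| n.gsub(/^name.*$/, "") }

# With /m the dot also matches the newline, so it is removed too.
with_m = t1.collect { |n| n.gsub(/^name.*$/m, "") }

p as_string # ["name: example\n"]
p without_m # ["\n"]
p with_m    # [""]
```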

How do I match the following Ruby gsub in Python's re?

You should know that Matz has chosen to confuse everybody by renaming the modifier: what all other regex flavors call /s (SINGLELINE or DOTALL), Ruby calls /m (MULTILINE). It controls whether the . token treats newlines as "any character" or not.

Conversely, what other flavors call the /m or MULTILINE modifier (which determines whether ^ and $ match at the start/end of lines instead of just the start/end of the entire string) doesn't exist at all in Ruby. Those anchors always match at the start/end of lines.

So, to translate your code from Ruby into Python, you need to do

import re

def reindent(line, numIndent):
    return re.sub(r'(.)^', r'\1' + ' ' * numIndent, line, flags=re.DOTALL|re.MULTILINE)

If your goal is to indent all lines but the first one (which is what this is doing), you can simplify the regex:

def reindent(line, numIndent):
    return re.sub(r'(?<!\A)^', ' ' * numIndent, line, flags=re.MULTILINE)

Result:

>>> s = "The following lines\nare indented,\naren't they?"
>>> print(reindent(s,1))
The following lines
 are indented,
 aren't they?
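On the Ruby side, the anchor and dot behaviour described above can be checked with a short sketch like this one. The sample text is made up; \A is Ruby's anchor for the start of the entire string.

```ruby
text = "first line\nsecond line" # hypothetical sample input

# ^ matches at the start of every line, with no modifier needed.
line_starts = text.scan(/^\w+/)

# \A matches only at the start of the whole string.
string_start = text.scan(/\A\w+/)

# Without /m the dot does not match a newline; with /m it does.
dot_default = /line.second/.match?(text)
dot_with_m = /line.second/m.match?(text)

p line_starts  # ["first", "second"]
p string_start # ["first"]
p dot_default  # false
p dot_with_m   # true
```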

Ruby gsub issues

By default, . does not match line-break characters. If you enable the m modifier in Ruby (in other languages, this is the s modifier), it should work:

str.gsub!(/==EX.*?==EXCLUDE/m, '')

Here's a live demo on Rubular: http://rubular.com/r/YxLSB1Iq95
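A minimal sketch of the same idea, assuming the text to drop sits between two ==EXCLUDE markers. The input string is invented for the example. The non-bang gsub is used here because gsub! returns nil when nothing was replaced.

```ruby
# Hypothetical input: the block between the two markers should go.
str = "keep this\n==EXCLUDE\nsecret\nlines\n==EXCLUDE\nkeep that"

# Without /m the dot cannot cross the newlines, so nothing matches.
unchanged = str.gsub(/==EX.*?==EXCLUDE/, '')

# With /m the lazy .*? runs across lines up to the closing marker.
cleaned = str.gsub(/==EX.*?==EXCLUDE/m, '')

p unchanged == str # true
p cleaned          # "keep this\n\nkeep that"
```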

How do I use gsub to search and replace using a regex?

You can do it like this:

a.gsub(/#(\S+)/, '<a href="/tags/\1">\0</a>')

Your replacement doesn't work because inside a double-quoted string the backslashes must be doubled:

a.gsub(/#(\S+)/, "<a href='/tags/\\1'>\\0</a>")

Note that the /i modifier is not needed here.
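The following sketch shows what goes wrong with single backslashes inside double quotes: Ruby reads "\1" and "\0" as octal character escapes before gsub ever sees them. The sample string is made up.

```ruby
a = "I like #ruby" # hypothetical sample input

# Single quotes: \1 and \0 reach gsub untouched.
single = a.gsub(/#(\S+)/, '<a href="/tags/\1">\0</a>')

# Double quotes with doubled backslashes: same effect.
double = a.gsub(/#(\S+)/, "<a href='/tags/\\1'>\\0</a>")

# Double quotes with single backslashes: "\1" and "\0" become the
# control characters U+0001 and U+0000, not backreferences.
broken = a.gsub(/#(\S+)/, "<a href='/tags/\1'>\0</a>")

puts single # I like <a href="/tags/ruby">#ruby</a>
puts double # I like <a href='/tags/ruby'>#ruby</a>
p broken.include?("\u0001") # true
```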


