Ruby Regex to Capture Everything Between Two Strings (Inclusive)

I believe you're looking for a non-greedy regex, like this:

/<div class="the_class">(.*?)<\/div>/m

Note the added ?. Now the capturing group will capture as little as possible (non-greedy) instead of as much as possible (greedy).
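To see the difference, here is a minimal sketch that runs the greedy and non-greedy versions against the same input. The sample HTML string is made up for illustration and is not from the original question.

```ruby
# Hypothetical sample input containing two matching divs.
html = '<div class="the_class">first</div><div class="the_class">second</div>'

# String#[] with a regex and a group index returns that capture group (or nil).
greedy = html[/<div class="the_class">(.*)<\/div>/m, 1]
lazy   = html[/<div class="the_class">(.*?)<\/div>/m, 1]

p greedy  # runs through to the last </div>
p lazy    # stops at the first </div>
```

With this input the lazy version yields only the contents of the first div, while the greedy one swallows everything up to the final closing tag.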

Regex Match all characters between two strings

For example

(?<=This is)(.*)(?=sentence)

I used a lookbehind (?<=) and a lookahead (?=) so that "This is" and "sentence" are not included in the match, but this depends on your use case; you can also simply write This is(.*)sentence.

The important thing here is that you activate the "dotall" mode of your regex engine, so that . matches newlines. How you do this depends on your regex engine (in Ruby, it is the /m modifier).

The next thing is whether you use .* or .*?. The first is greedy and will match up to the last "sentence" in your string; the second is lazy and will match up to the next "sentence" in your string.
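A small Ruby sketch of the greedy and lazy variants, using the lookaround pattern from above with the /m modifier switched on. The two-sentence input is an assumption chosen so that "sentence" appears twice.

```ruby
# Hypothetical input: two lines, each ending in "sentence.".
text = "This is a first sentence.\nThis is a second sentence."

# String#[] with a regex returns the whole match; the lookarounds keep
# "This is" and "sentence" out of it.
greedy = text[/(?<=This is)(.*)(?=sentence)/m]
lazy   = text[/(?<=This is)(.*?)(?=sentence)/m]

p greedy  # extends to the last "sentence", crossing the newline
p lazy    # stops before the first "sentence"
```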

Update

This is(?s)(.*)sentence

Here the (?s) turns on the dotall modifier, making . match newline characters.
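The (?s) spelling is engine-specific. In Ruby, dotall behaviour is spelled m rather than s, so a Ruby version of the pattern would, as far as I know, need (?m) instead. A minimal sketch, using the scoped form (?m:...) and a made-up input string:

```ruby
# Hypothetical input with a newline between "This is" and "sentence".
text = "This is a\nlong sentence"

# (?m:...) enables Ruby's dot-matches-newline mode only inside the group.
inner = text[/This is(?m:(.*))sentence/, 1]

p inner
```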

Update 2:

(?<=is \()(.*?)(?=\s*\))

This matches the text inside the parentheses in your example "This is (a simple) sentence".
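A quick way to check this in Ruby is sketched below. The second input, with extra whitespace before the closing parenthesis, is an added assumption to show what the \s* in the lookahead is for.

```ruby
pattern = /(?<=is \()(.*?)(?=\s*\))/

# The example from the question.
plain  = "This is (a simple) sentence"[pattern]
# Hypothetical variant with trailing spaces inside the parentheses.
padded = "This is (a simple   ) sentence"[pattern]

p plain   # text between the parentheses
p padded  # same text; the trailing spaces are left out of the match
```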

RegEx to capture everything between two strings but avoid capturing commas

As mentioned in the comments, a regular expression can't alter the text that was matched; it only matches something or not.

If you're willing to stop the match at the first comma, rather than including all the rest with the commas removed, you can use this:

(?<=<title\>)(.*?)(?=(,|\s*<\/title>))

https://regex101.com/r/PPb1ba/1
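Both options can be sketched in Ruby as follows. The title string is invented for illustration, and the pattern is a slightly simplified spelling of the one above (no backslash before >, no extra group in the lookahead). Removing the commas is done as a separate step after matching, here with String#delete.

```ruby
# Hypothetical input.
html = "<title>Hello, World, Again</title>"

# Option 1: stop the match at the first comma (or at the closing tag).
up_to_comma = html[/(?<=<title>)(.*?)(?=,|\s*<\/title>)/]

# Option 2: capture the full title, then strip the commas afterwards.
without_commas = html[/<title>(.*?)<\/title>/m, 1].delete(",")

p up_to_comma
p without_commas
```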

Ruby Regular expression to extract string between first ( and last )

A greedy expression with a capture, e.g.:

/\((.*)\)/
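A short sketch of how this behaves on nested parentheses, with a made-up input string:

```ruby
# Hypothetical input with nested parentheses.
text = "call(foo(bar), baz) done"

# Greedy .* runs from the first "(" to the last ")".
inner = text[/\((.*)\)/, 1]

p inner
```

If the text can span several lines, the m modifier would also be needed so that . matches newlines.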

ruby print selected lines of text in between 2 strings

Don't do it line by line, just slurp the whole thing into a string and rip it apart:

s    = File.read('index.html')
want = s.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)[1]

And now everything between your markers is in want. Don't forget the m modifier on the regex.

While you're mangling your input you can strip out the stray leading and trailing whitespace too:

want = s.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)[1].strip

As Tudor notes below, you might want to use a non-greedy (.*?) for the group if you think there is any chance of multiple <!-- end posts --> markers; it doesn't hurt to be a little paranoid when they really are out to get you.
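The effect of a stray second end marker can be sketched like this. An in-memory heredoc stands in for File.read('index.html'), and its contents are invented for the example.

```ruby
# Hypothetical page contents with two end markers.
s = <<~HTML
  <p>header</p>
  <!-- begin posts -->
  <p>post one</p>
  <!-- end posts -->
  <p>footer</p>
  <!-- end posts -->
HTML

greedy = s.match(/<!-- begin posts -->(.*)<!-- end posts -->/m)[1].strip
lazy   = s.match(/<!-- begin posts -->(.*?)<!-- end posts -->/m)[1].strip

puts greedy  # includes the first end marker and the footer
puts lazy    # only the post
```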

References:

  • File.read (actually IO.read)
  • String#match
  • String#strip

UPDATE: the match method on a string returns a MatchData object. Its array access operator is used to access the matching parts; as the documentation puts it:

... mtch[0] is equivalent to the special variable $&, and returns the entire matched string. mtch[1], mtch[2], and so on return the values of the matched backreferences (portions of the pattern between parentheses).

There's only one group in the regex, so [1] gets you the contents of that group without the surrounding HTML comment delimiters.
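A minimal sketch of the difference between [0] and [1], plus a guard for the case where the markers are missing (String#match returns nil then, so indexing it directly would raise). The input strings are assumptions for illustration.

```ruby
pattern = /<!-- begin posts -->(.*?)<!-- end posts -->/m

s = "intro <!-- begin posts -->the posts<!-- end posts --> outro"
m = s.match(pattern)

whole = m[0]  # the entire match, markers included
inner = m[1]  # only the contents of the first group

# No markers: match returns nil, so check before indexing.
missing  = "no markers here".match(pattern)
fallback = missing ? missing[1] : nil

p whole
p inner
p fallback
```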

How do I match any character across multiple lines in a regular expression?

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end causes the dot to match all characters including newlines.
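In Ruby the equivalent modifier is m rather than s. A small sketch comparing the pattern with and without it, on a made-up two-line input:

```ruby
# Hypothetical input: the marker sits on the second line.
text = "line one\nline two<FooBar>"

without_m = text[/(.*)<FooBar>/, 1]   # . stops at the newline
with_m    = text[/(.*)<FooBar>/m, 1]  # . also matches the newline

p without_m
p with_m
```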

using exclusive ranges with regular expressions Ruby

Use String#sub with capturing groups:

"square".sub(/(.*qu)(.*)/, '\2\1ay')
# => "aresquay"

Ruby Match - escape a string with special characters

You probably want Regexp.escape:

service = properties.match(/^com\.google\.(#{Regexp.escape(serviceName)})\.public$/)

Additionally, you had surrounded your inclusion of serviceName with a [...]+, which means one or more characters from the list of characters in [...].

E.g. the regexp [commonapi]+ accepts moconaipimconn, or indeed a string of any length made up only of characters from the service name you actually wanted to capture.
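The three variants can be compared side by side in a short sketch. The service name "common.api" and the property strings are hypothetical; the dot in the name is there to show what Regexp.escape protects against.

```ruby
# Hypothetical service name containing a regex metacharacter (the dot).
service_name = "common.api"

escaped    = /^com\.google\.(#{Regexp.escape(service_name)})\.public$/
unescaped  = /^com\.google\.(#{service_name})\.public$/
char_class = /^com\.google\.([commonapi]+)\.public$/

captured = "com.google.common.api.public".match(escaped)[1]

# Without escaping, "." matches any character, so "commonXapi" slips through.
loose_unescaped = "com.google.commonXapi.public".match?(unescaped)
strict_escaped  = "com.google.commonXapi.public".match?(escaped)

# A character class accepts any mix of those letters, in any order.
loose_class = "com.google.nomoccapi.public".match?(char_class)

p captured
p loose_unescaped
p strict_escaped
p loose_class
```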
