how to use one line regular expression to get matched content
You need the Regexp#match method. If you write /\[(.*?)\](.*)/.match('[ruby] regex'), this will return a MatchData object. If we call that object matches, then, among other things:
- matches[0] returns the whole matched string.
- matches[n] returns the nth capturing group ($n).
- matches.to_a returns an array consisting of matches[0] through matches[N].
- matches.captures returns an array consisting of just the capturing groups (matches[1] through matches[N]).
- matches.pre_match returns everything before the matched string.
- matches.post_match returns everything after the matched string.
There are more methods, which correspond to other special variables, etc.; you can check MatchData's docs for more. Thus, in this specific case, all you need to write is
tag, keyword = /\[(.*?)\](.*)/.match('[ruby] regex').captures
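To see what each of the accessors listed above returns for this regex, here is a small sketch. The input with a `tags: ` prefix and the `parse_tag` helper are made up for illustration; the nil guard is there because Regexp#match returns nil when nothing matches, in which case calling captures directly would raise.

```ruby
TAG_AND_KEYWORD = /\[(.*?)\](.*)/

matches = TAG_AND_KEYWORD.match('tags: [ruby] regex')

whole    = matches[0]          # the whole matched string
tag      = matches[1]          # first capturing group
keyword  = matches[2]          # second capturing group
all      = matches.to_a        # [whole, tag, keyword]
captures = matches.captures    # [tag, keyword]
before   = matches.pre_match   # text before the match
after    = matches.post_match  # text after the match

# Hypothetical helper: returns [tag, keyword], or nil when the input has no tag.
def parse_tag(input)
  TAG_AND_KEYWORD.match(input)&.captures
end

p whole, captures, before, after
p parse_tag('no brackets here')
```

Note that the second group keeps the space after the closing bracket, so a call to strip may be wanted before using the keyword.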
Edit 1: Alright, for your harder task, you're going to instead want the String#scan method, which @Theo used; however, we're going to use a different regex. The following code should work:
# You could inline the regex, but comments would probably be nice.
tag_and_text = / \[([^\]]*)\] # Match a bracket-delimited tag,
                 \s*          # ignore spaces,
                 ([^\[]*) /x  # and match non-tag search text.
input = '[ruby] [regex] [rails] one line [foo] [bar] baz'
tags, texts = input.scan(tag_and_text).transpose
The input.scan(tag_and_text) will return a list of tag–search-text pairs:
[["ruby", ""], ["regex", ""], ["rails", "one line "], ["foo", ""], ["bar", "baz"]]
The transpose call flips that, so that you have a pair consisting of a tag list and a search-text list:
[["ruby", "regex", "rails", "foo", "bar"], ["", "", "one line ", "", "baz"]]
You can then do whatever you want with the results. I might suggest, for instance:
search_str = texts.join(' ').strip.gsub(/\s+/, ' ')
This will concatenate the search snippets with single spaces, get rid of leading and trailing whitespace, and replace runs of multiple spaces with a single space.
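Putting the scan, transpose, and cleanup steps together, a minimal sketch might look like the following. The `split_query` helper and its handling of tag-less input are assumptions added for illustration: when the input contains no tags, scan returns an empty array, so both variables come back nil after the transpose.

```ruby
TAG_AND_TEXT = /\[([^\]]*)\]\s*([^\[]*)/

# Hypothetical helper: splits a query into its tags and a normalized search string.
def split_query(input)
  tags, texts = input.scan(TAG_AND_TEXT).transpose
  # With no tags at all, scan returns [] and both variables are nil.
  return [[], input.strip.gsub(/\s+/, ' ')] if tags.nil?

  [tags, texts.join(' ').strip.gsub(/\s+/, ' ')]
end

tags, search_str = split_query('[ruby] [regex] [rails] one line [foo] [bar] baz')
p tags        # ["ruby", "regex", "rails", "foo", "bar"]
p search_str  # "one line baz"
```

Text that appears before the first tag is not part of any match, so this regex does not capture it.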
Regular expression to match a line that doesn't contain a word
The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:
^((?!hede).)*$
Non-capturing variant:
^(?:(?!hede).)*$
The regex above will match any string, or line without a line break, not containing the (sub)string 'hede'. As mentioned, this is not something regex is "good" at (or should do), but still, it is possible.
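As a quick check of that behavior, here is a sketch in Ruby using the non-capturing form. The sample strings are made up, and the anchors are written as \A and \z because in Ruby ^ and $ anchor each line rather than the whole string.

```ruby
# \A and \z anchor the whole string; in Ruby, ^ and $ would anchor each line instead.
NO_HEDE = /\A(?:(?!hede).)*\z/

samples = ['ABhedeCD', 'ABCD', 'hede', '', 'he de']
kept = samples.select { |s| s.match?(NO_HEDE) }
p kept  # ["ABCD", "", "he de"]
```

For plain substring exclusion like this, `samples.reject { |s| s.include?('hede') }` gives the same result without a regex; the look-ahead form is mainly useful when only a pattern can be supplied.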
And if you need to match line break chars as well, use the DOT-ALL modifier (the trailing s in the following pattern):
/^((?!hede).)*$/s
or use it inline:
/(?s)^((?!hede).)*$/
(where the /.../ are the regex delimiters, i.e., not part of the pattern)
If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:
/^((?!hede)[\s\S])*$/
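The patterns above use PCRE-style flags. As a sketch of the same three options in Ruby, note that Ruby spells the dot-all flag /m rather than /s; the two-line sample strings are made up for illustration.

```ruby
multi_line = "AB\nCD"

plain      = /\A(?:(?!hede).)*\z/         # dot stops at "\n"
dot_all    = /\A(?:(?!hede).)*\z/m        # Ruby's /m lets dot match "\n"
inline     = /\A(?m:(?:(?!hede).)*)\z/    # same flag, scoped inline
char_class = /\A(?:(?!hede)[\s\S])*\z/    # no flag needed

p multi_line.match?(plain)        # false
p multi_line.match?(dot_all)      # true
p multi_line.match?(inline)       # true
p multi_line.match?(char_class)   # true
p "AB\nhede".match?(dot_all)      # false: "hede" is still rejected
```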
Explanation
A string is just a list of n characters. Before and after each character, there's an empty string. So a list of n characters will have n+1 empty strings. Consider the string "ABhedeCD":
    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘
index    0      1      2      3      4      5      6      7
where the e's are the empty strings. The regex (?!hede). looks ahead to see if there's no substring "hede" to be seen, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they don't consume any characters. They only assert/validate something.
So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group, and repeated zero or more times: ((?!hede).)*. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$
As you can see, the input "ABhedeCD" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
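The walk-through above can be reproduced with a short sketch that tests the look-ahead at each position of "ABhedeCD" in turn. Slicing the string at each index is just one way to simulate "the empty string before character i"; the variable names are made up for illustration.

```ruby
s = 'ABhedeCD'

# For each index i, test the look-ahead at the empty string just before s[i].
lookahead_ok = (0...s.length).map { |i| s[i..-1].match?(/\A(?!hede)/) }
failing = (0...s.length).reject { |i| lookahead_ok[i] }
p failing   # [2], i.e. e3, right before the "h"

# The repeated group stops where the look-ahead first fails...
prefix = s[/\A(?:(?!hede).)*/]
p prefix    # "AB"

# ...so the end anchor cannot be reached and the whole match fails.
p s.match?(/\A(?:(?!hede).)*\z/)   # false
```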
How to match regex pattern on single line only?
Remove the s or DOTALL flag and change your regex to the following:
^.*?((\byo\b.*?(cut me:)[\s\S]*))
With the DOTALL flag enabled, . will match newline characters, so your match can span multiple lines, including lines before yo or between yo and cut me. By removing this flag you can ensure that you only match the line with both yo and cut me, and then change the .* at the end to [\s\S]*, which will match any character including newlines so that you can match to the end of the string.
http://regex101.com/r/sX2kL0
Edit: Note that this takes a slightly different approach than the other answer: it matches the portion of the string that you want deleted, so you can replace that portion with an empty string to remove it.
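A small sketch of this behavior in Ruby, where the dot does not match newlines unless a flag is given. It assumes the pattern is meant to put word boundaries on both sides of yo (written here as \byo\b), and the sample inputs are made up, since the original question's text is not shown.

```ruby
# Assumption: \byo\b (word boundaries on both sides of "yo").
PATTERN = /^.*?((\byo\b.*?(cut me:)[\s\S]*))/

same_line  = "intro\nsay yo then cut me: tail\nlast line"
split_line = "say yo\nthen cut me: tail"

m = PATTERN.match(same_line)
p m[1]                        # "yo then cut me: tail\nlast line"
p PATTERN.match(split_line)   # nil: "yo" and "cut me:" are on different lines
p same_line.sub(PATTERN, '')  # "intro\n"
```

Replacing the whole match removes everything from the start of the matching line onward; whether the text before yo on that line should also go depends on the original requirement.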
Regex to match only the first line?
That sounds more like a job for the filehandle buffer.
You should be able to match the first line with:
/^(.*)$/m
(As always, this is PCRE syntax.)
The /m modifier makes ^ and $ match at embedded newlines. Since there's no /g modifier, it will just process the first occurrence, which is the first line, and then stop.
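For comparison, here is a sketch of the same idea in Ruby. Ruby differs from PCRE here: ^ and $ already match at line boundaries without any flag, and Ruby's /m means dot-all instead, so the flag is left off. The sample text is made up.

```ruby
text = "first line\nsecond line\nthird line\n"

# String#[] with a regex returns only the first match, like a search without /g.
first_by_regex = text[/^(.*)$/, 1]

# Non-regex alternative: read lines lazily and take the first one.
first_by_lines = text.each_line.first&.chomp

p first_by_regex   # "first line"
p first_by_lines   # "first line"
```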
If you're using a shell, use:
head -n1 file
or as a filter:
commandmakingoutput | head -n1
Please clarify your question in case this is not what you're looking for.
how to use regular expression to match strings followed by some keyword and multiple lines
You can use this regex:
^(?s).*keyword1.*?(keyword2 yyyy).*$
Explanation:
- ^ --> start of string
- (?s) --> Enables dot to match new lines
- .*keyword1.*? --> Matches any characters (greedily) up to keyword1, followed by as few characters as possible (non-greedy match) before the next part of the pattern
- (keyword2 yyyy) --> matches the string of your interest
- .*$ --> followed by any characters and finally end of input
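The breakdown above can be tried out with a short Ruby sketch. Ruby spells the dot-all flag /m rather than (?s), and \A and \z are used to anchor the whole input; the sample text is made up for illustration.

```ruby
# Ruby's /m flag plays the role of (?s); \A and \z anchor the whole input.
pattern = /\A.*keyword1.*?(keyword2 yyyy).*\z/m

text = <<~TEXT
  some header
  keyword1 first value
  unrelated line
  keyword2 yyyy
  some footer
TEXT

m = pattern.match(text)
p m[1]   # "keyword2 yyyy"

# Without the flag, the dot cannot cross line breaks and nothing matches.
p text.match?(/\A.*keyword1.*?(keyword2 yyyy).*\z/)   # false
```

Because the first .* is greedy, the match is taken relative to the last keyword1 in the input when there are several.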
How do I match any character across multiple lines in a regular expression?
It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:
/(.*)<FooBar>/s
The s at the end causes the dot to match all characters including newlines.
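As an example of how the modifier differs by language, Ruby uses /m for this purpose. A minimal sketch with a made-up two-line input:

```ruby
text = "line one\nline two<FooBar>"

without_flag = text[/(.*)<FooBar>/, 1]    # dot stops at "\n"
with_flag    = text[/(.*)<FooBar>/m, 1]   # dot also matches "\n"

p without_flag   # "line two"
p with_flag      # "line one\nline two"
```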
Regex match line with string AND without another string
Try: (?=^.*await)(?!^.+ConfigureAwait).+
Explanation:
(?=^.*await) - positive lookahead: assert that what follows is ^ (the beginning of a line), then zero or more of any character (due to .*), then the word await. Concisely: assert that there is await in the line.
(?!^.+ConfigureAwait) - negative lookahead: similar to the above, but negated :) Assert that the line doesn't contain ConfigureAwait.
.+ - match one or more of any character (except a newline).
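A short Ruby sketch of the pattern in use, both on an array of lines and on a multi-line string (in Ruby, ^ matches at the start of every line). The C#-style sample lines and the `LoadAsync` name are made up for illustration.

```ruby
AWAIT_WITHOUT_CONFIGURE = /(?=^.*await)(?!^.+ConfigureAwait).+/

lines = [
  'var a = await LoadAsync();',
  'var b = await LoadAsync().ConfigureAwait(false);',
  'return a + b;'
]

# Array#grep keeps the elements the regex matches.
flagged = lines.grep(AWAIT_WITHOUT_CONFIGURE)
p flagged   # ["var a = await LoadAsync();"]

# The same pattern applied to the whole source text at once.
from_text = lines.join("\n").scan(AWAIT_WITHOUT_CONFIGURE)
p from_text # ["var a = await LoadAsync();"]
```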