Regex, How to Match Multiple Lines

How do I match any character across multiple lines in a regular expression?

It depends on the language, but there should be a modifier that you can add to the regex pattern. In PHP it is:

/(.*)<FooBar>/s

The s at the end causes the dot to match all characters including newlines.
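For comparison, here is a minimal sketch of the same behavior in Python, where the `re.DOTALL` flag plays the role of PHP's trailing `s`. The sample text is made up for illustration.

```python
import re

# Invented sample: the marker sits on the second line.
text = "first line\nsecond line <FooBar>"

# Without DOTALL the dot stops at newlines, so only text on the
# marker's own line can be captured.
no_flag = re.search(r"(.*)<FooBar>", text)

# With DOTALL the dot also matches newlines, so the capture
# reaches back to the start of the string.
with_flag = re.search(r"(.*)<FooBar>", text, re.DOTALL)

print(repr(no_flag.group(1)))    # 'second line '
print(repr(with_flag.group(1)))  # 'first line\nsecond line '
```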

regex: match multiple lines until a line contains

You need a positive lookahead.

foo\.[ab][\s\S]*?(?=\n.*?=|$)
  • [\s\S]*? lazily matches any character, including newlines
  • (?=\n.*?=|$) stops the match once the next line contains an =, or once the end ($) is reached.
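The pattern can be tried out in Python as follows. This is only a sketch: the input below is invented, since the original question's data is not shown here.

```python
import re

# Invented sample: two "foo." entries, the first spanning several lines.
text = "foo.a = 1\n  continued\n  more\nbar = 2\nfoo.b = 3"

pattern = re.compile(r"foo\.[ab][\s\S]*?(?=\n.*?=|$)")

# Each match runs lazily until the next line contains "=",
# or until the end of the string.
matches = pattern.findall(text)

for m in matches:
    print(repr(m))
```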

See demo at regex101

Regex matching over multiple lines

Your attempts were pretty close. In the first one you probably need to set the flag that allows the . to match line feeds, which it normally doesn't. In the second, you need to make the catch-all .* non-greedy by adding a ? after it. Otherwise .* tries to match the entire rest of the text.

It would be something like this:

/^ <br>\n\d+\s[a-zA-Z"“](.*?\n)*?<hr\/>/

That said, this is the kind of job Perl handles well. Much of today's advanced regex syntax originated in Perl.

use strict;
use diagnostics;

our $text =<<EOF;
The figure that now stood by its bows was tall and swart, with one white tooth <br>
evilly protruding from its steel-like lips. <br>
 <br>
1 "Hardly" had they pulled out from under the ship’s lee, when a <br>
fourth keel, coming from the windward side, pulled round under the stern, <br>
and showed the five strangers <br>
127 <br>
<br>
<hr/>
More text.
EOF

our $regex = qr{^ <br>\n\d+ +[A-Z"“].*?<hr/>}ism;
$text =~ s/($regex)/<!-- Removed -->/;
print "Removed text:\n[$1]\n\n";
print "New text:\n[$text]\n";

That prints:

Removed text:
[ <br>
1 "Hardly" had they pulled out from under the ship’s lee, when a <br>
fourth keel, coming from the windward side, pulled round under the stern, <br>
and showed the five strangers <br>
127 <br>
<br>
<hr/>]

New text:
[The figure that now stood by its bows was tall and swart, with one white tooth <br>
evilly protruding from its steel-like lips. <br>
<!-- Removed -->
More text.
]

The qr operator builds a regular expression so that it can be stored in a variable. The ^ at the beginning anchors the match at the beginning of a line. The ism at the end stands for case insensitive, single line, and multiple embedded lines: i ignores case, s allows . to match line feeds, and m allows ^ to match at the beginning of lines embedded in the string. To do a global replacement, add a g flag to the end of the substitution: s///g
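As a rough cross-check of what the three flags do, the same pattern can be sketched in Python, where `re.I`, `re.S` and `re.M` correspond to Perl's i, s and m. The input is a shortened, made-up version of the text above, and `count=1` mimics a substitution without the g flag.

```python
import re

# Shortened sample; note the leading space on the second line.
text = (
    "evilly protruding from its steel-like lips. <br>\n"
    " <br>\n"
    '1 "Hardly" had they pulled out <br>\n'
    "127 <br>\n"
    "<hr/>\n"
    "More text.\n"
)

# re.I: ignore case, re.S: dot matches newlines,
# re.M: ^ matches at the start of every line.
pattern = re.compile(r'^ <br>\n\d+ +[A-Z"“].*?<hr/>', re.I | re.S | re.M)

# count=1 replaces only the first match.
new_text, replaced = pattern.subn("<!-- Removed -->", text, count=1)

print(new_text)
```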

The Perl regex documentation explains everything.
https://perldoc.perl.org/perlretut

See also Multiline replace in perl with extended expressions not working.

HTH

regex to match terms in multiple lines

You can use this regex to match across the lines in Javascript:

/^(?=[^]*term1)(?=[^]*term2)(?![^]*term3)[^]*$/

In JS, [^] matches any character including new line.


If not using JS or want to make this regex portable to other flavors then one can use:

/^(?=[\D\d]*term1)(?=[\D\d]*term2)(?![\D\d]*term3)[\D\d]*$/
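The portable variant can be checked in Python, which does not support `[^]` but accepts `[\D\d]`. The three sample strings below are invented for the sketch.

```python
import re

pattern = re.compile(
    r"^(?=[\D\d]*term1)(?=[\D\d]*term2)(?![\D\d]*term3)[\D\d]*$"
)

# Invented inputs: terms may sit on different lines.
both_terms = "has term1 here\nand term2 there"
with_term3 = "term1\nterm2\nterm3"
missing_term2 = "only term1\nnothing else"

# match() anchors at the start of the string, like the leading ^.
results = [
    pattern.match(s) is not None
    for s in (both_terms, with_term3, missing_term2)
]

print(results)  # [True, False, False]
```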

Matching across multiple lines regular expression

You need to use something like

^0[\s\S]*?[\n\r]Unique:

and replace with Unique:.

  • ^ - start of a line
  • 0 - a literal 0
  • [\s\S]*? - zero or more characters, including newlines, as few as possible
  • [\n\r] - a line break character
  • Unique: - the literal text Unique:

Another possible regex is:

^0[^\r]*(?:\r(?!Unique:)[^\r]*)*

where \r is the line ending used in the current file. Replace with an empty string.

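Note that in flavors where m means "dot matches newline" (Ruby's Oniguruma engine, for example), you could also use (?m)^0.*?[\r\n]Unique: (to replace with Unique:). There the (?m) option is documented as:

m: multi-line (dot(.) match newline)

In PCRE, Python, Java and .NET, (?m) only changes how ^ and $ behave, and the dot-matches-newline option is (?s) instead.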
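Here is a small Python sketch of the first replacement. In Python, `^` needs `re.MULTILINE` to match at each line start, and `[\s\S]` avoids any dependence on how the dot treats newlines. The sample text is invented.

```python
import re

# Invented sample: two blocks that start with "0" and end before "Unique:".
text = "0 first\nsecond\nUnique: a\n0 third\nUnique: b\n"

pattern = re.compile(r"^0[\s\S]*?[\n\r]Unique:", re.MULTILINE)

# Everything from the leading 0 up to and including "Unique:" is
# replaced by "Unique:" alone.
result = pattern.sub("Unique:", text)

print(result)
```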

Regex matching pattern in multiple lines without specific word in the match

You might use

^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;

In parts, the pattern matches:

  • ^ Start of string
  • PAT_A Match literally
  • [^;\n]* Optionally match any char except ; or a newline
  • (?: Non capture group (to repeat as a whole)
    • \n(?![^\n;]*NOT_MATCH_THIS) Match a newline, then assert that the following line does not contain NOT_MATCH_THIS (the [^\n;]* keeps the lookahead on that line and stops it at any ;)
    • [^;\n]* If the previous assertion is true, match the whole line (not containing a ;)
  • )* Close the non capture group, and optionally repeat matching all lines
  • \n[^;\n]* Match a newline, and any char except ; or a newline
  • PAT_B[^;]*; Then match PAT_B followed by any char except ; followed by matching the ;
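The breakdown above can be exercised with a short Python sketch. The two inputs are invented, and `re.MULTILINE` is an assumption here so that `^` can match at the start of any line.

```python
import re

pattern = re.compile(
    r"^PAT_A[^;\n]*(?:\n(?![^\n;]*NOT_MATCH_THIS)[^;\n]*)*\n[^;\n]*PAT_B[^;]*;",
    re.MULTILINE,
)

# Invented inputs: a statement spanning three lines, ending in ";".
allowed = "PAT_A start\n  middle line\n  PAT_B end;"
blocked = "PAT_A start\n  NOT_MATCH_THIS here\n  PAT_B end;"

allowed_match = pattern.search(allowed)
blocked_match = pattern.search(blocked)

print(allowed_match is not None)  # True
print(blocked_match is not None)  # False
```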


Regex to select multiple lines

Use the (?s) DOTALL modifier to make the dot match newline characters.

(?s)Quick.*?over.*?dog

OR

Add word boundary \b if necessary. \b matches between a word character and a non-word character.

(?s)\bQuick\b.*?\bover\b.*?\bdog\b

OR

If you're running JavaScript, [\s\S]*? matches any character including line breaks. Note that JavaScript had no dotall modifier s before ES2018.

\bQuick\b[\s\S]*?\bover\b[\s\S]*?\bdog\b
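Both styles can be compared side by side in Python, which accepts the inline `(?s)` modifier. The sample sentence below is invented for the sketch.

```python
import re

# Invented sample with the three words on separate lines.
text = "The Quick brown fox\njumps over\nthe lazy dog."

# Inline (?s) makes the dot match newlines.
dotall = re.search(r"(?s)\bQuick\b.*?\bover\b.*?\bdog\b", text)

# [\s\S] gives the same result without any modifier.
char_class = re.search(r"\bQuick\b[\s\S]*?\bover\b[\s\S]*?\bdog\b", text)

print(repr(dotall.group(0)))
print(dotall.group(0) == char_class.group(0))  # True
```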


Regex code not collecting multiple lines of matching pattern

I couldn't help but respond to this as I am familiar with both regex and guitar haha.

For your short regex, please see the following regex on regex101.com:
https://regex101.com/r/NqGhoh/1/

The multiline modifier is required.

The main problem with this is that you are handling newlines on the front and back of the expression. I have modified the expression in a couple ways:

  • Made the regex match newlines only on the end, always looking for a ^ at the beginning.
  • Matched the carriage return and newline combination as \r?\n, since a carriage return, when present, should always be followed by a newline.
  • Used non-capturing groups to reduce overhead and to keep the match results simpler. This is the ?: just inside the parentheses. It means the group won't be captured in the result, just used for grouping.
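The three changes can be illustrated with a Python sketch. The tab-line format and the pattern below are hypothetical stand-ins, not the regex from the linked regex101 page.

```python
import re

# Hypothetical pattern: one or more consecutive tab lines such as "e|--0--|".
# ^ anchors each line start (MULTILINE), \r?\n accepts both line-ending
# styles, and (?:...) groups without capturing.
tab_lines = re.compile(r"^(?:[EADGBe]\|[-0-9]+\|\r?\n)+", re.MULTILINE)

# Invented inputs, one with Windows line endings and one with Unix ones.
windows_text = "Intro\r\ne|--0--|\r\nB|--1--|\r\nG|--0--|\r\nlyrics\r\n"
unix_text = windows_text.replace("\r\n", "\n")

win_match = tab_lines.search(windows_text)
unix_match = tab_lines.search(unix_text)

print(repr(win_match.group(0)))
print(win_match.groups())  # () because the group is non-capturing
```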

I started testing your longer regex and may update that as well, though it sounds like you already know what to do with the shorter one corrected.

Regex - Match multiple lines that don't end with character

This PCRE expression should deliver the required result:

/^.*?(?<! _)$/gms

This is using the negative lookbehind (?<! _) in combination with the multiline flag (m) to match up to a line end that is not preceded by _. The single-line flag (s) ensures that the dot also matches newlines.
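A Python sketch of the same expression, with `re.M` and `re.S` standing in for the m and s flags and `findall` for g. The sample text is invented, with " _" used as a line-continuation marker.

```python
import re

# Invented sample: lines ending in " _" continue onto the next line.
text = "first part _\nsecond part\nsingle line\nlast _\nend"

pattern = re.compile(r"^.*?(?<! _)$", re.M | re.S)

# Each match starts at a line start and runs to the first line end
# that is not preceded by " _".
chunks = pattern.findall(text)

for chunk in chunks:
    print(repr(chunk))
```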

Here's a regex101 example.

javascript regex to match multiple lines

Before ES2018, JavaScript lacked the s (singleline/dotall) regex option, but you can work around that by replacing . with [\s\S] (match any character that is a whitespace or that is not a whitespace, which basically means match everything). Also, make your quantifier lazy and get rid of the spaces in the pattern, since there's also no x (extended) option in JS:

var regex = /###([\s\S]*?)###/;
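The `[\s\S]` idiom is not specific to JavaScript. Below is a minimal sketch of the same pattern in Python, run against an invented input, to show that the lazy quantifier keeps each match between one pair of markers.

```python
import re

# Invented sample: two ###-delimited blocks, the first spanning two lines.
text = "### first\nblock ### middle ### second ###"

# The lazy [\s\S]*? stops at the nearest closing ###.
blocks = re.findall(r"###([\s\S]*?)###", text)

print(blocks)
```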

Example: