How to Split a Long Regular Expression into Multiple Lines in JavaScript

How to split a long regular expression into multiple lines in JavaScript?

[Edit 2022/08] I created a small GitHub repository for building regular expressions with spaces, comments, and templating.


You could convert it to a string and create the expression by calling new RegExp():

// Every backslash of the original literal is doubled, including the ones in [\]\\ and \]
var myRE = new RegExp([
  '^(([^<>()[\\]\\\\.,;:\\s@"]+(\\.[^<>()[\\]\\\\.,;:\\s@"]+)*)',
  '|(".+"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.',
  '[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+',
  '[a-zA-Z]{2,}))$'
].join(''));

Notes:

  1. When converting the expression literal to a string, you need to double every backslash, because backslashes are consumed when a string literal is evaluated. (See Kayo's comment for more detail.)

  2. RegExp accepts modifiers (flags) as a second parameter:

    /regex/g => new RegExp('regex', 'g')
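
Both notes can be checked with a minimal sketch. The pattern and variable names here are illustrative assumptions, not taken from the answer; the point is only that a literal and its string-built counterpart are equivalent once every backslash is doubled, and that an undoubled backslash silently disappears.

```javascript
// Illustrative pattern (an assumption): a decimal number such as "3.14".
const literal = /^\d+\.\d+$/g;

// The same pattern built from string pieces: every backslash is doubled,
// and the flag moves to the second argument.
const fromStrings = new RegExp(['^\\d+', '\\.', '\\d+$'].join(''), 'g');

// Forgetting to double the backslash: '\d' in a string literal is just 'd'.
const broken = new RegExp('^\d+$');

console.log(fromStrings.source === literal.source); // true
console.log(fromStrings.flags === literal.flags);   // true
console.log(broken.source);                         // ^d+$
console.log(broken.test('123'));                    // false
```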

[Addition ES2015 (tagged template)]

In ES2015 and later you can use tagged templates. See the snippet.

Note:

  • The disadvantage here is that you can't use plain whitespace in the regular expression string, because the tag strips it; always use \s, \s+, \s{1,x}, \t, \n, etc.

(() => {
  // Tag function: strip all whitespace from the raw template text, then compile it.
  // Only the text before the first interpolation (str.raw[0]) is used as the pattern;
  // the first interpolated value, if any, is taken as the flags (see ${"gi"} below).
  const createRegExp = (str, opts) =>
    new RegExp(str.raw[0].replace(/\s/gm, ""), opts || "");

  const yourRE = createRegExp`
    ^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|
    (\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|
    (([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$`;
  console.log(yourRE);

  const anotherLongRE = createRegExp`
    (\byyyy\b)|(\bm\b)|(\bd\b)|(\bh\b)|(\bmi\b)|(\bs\b)|(\bms\b)|
    (\bwd\b)|(\bmm\b)|(\bdd\b)|(\bhh\b)|(\bMI\b)|(\bS\b)|(\bMS\b)|
    (\bM\b)|(\bMM\b)|(\bdow\b)|(\bDOW\b)
    ${"gi"}`;
  console.log(anotherLongRE);
})();
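
The tag in the snippet strips whitespace only. A variant that also drops end-of-line comments might look like the sketch below. The helper name `verbose` and the comment convention are assumptions made for illustration, not part of the answer or of any library: a `#` starts a comment only at the start of a line or after whitespace, so a literal `#` would have to be written as `\#` or `[#]`.

```javascript
// Hypothetical helper: verbose(flags) returns a template tag that builds a RegExp.
const verbose = (flags) => (strings, ...values) => {
  // Interleave the raw text chunks with any interpolated values.
  const raw = strings.raw.reduce(
    (acc, chunk, i) => acc + chunk + (i < values.length ? String(values[i]) : ""),
    ""
  );
  const pattern = raw
    .split("\n")
    // Drop a "# comment" that starts the line or follows whitespace.
    .map((line) => line.replace(/(^|\s)#.*$/, ""))
    .join("")
    // Remove all remaining whitespace, as the original tag does.
    .replace(/\s+/g, "");
  return new RegExp(pattern, flags);
};

// Illustrative pattern (an assumption): an ISO-style date.
const dateRE = verbose("i")`
  ^(\d{4})   # year
  -(\d{2})   # month
  -(\d{2})$  # day
`;

console.log(dateRE.source);             // ^(\d{4})-(\d{2})-(\d{2})$
console.log(dateRE.test("2022-08-15")); // true
```

As with the original tag, whitespace that should be matched still has to be written as \s, \t, and so on.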

JavaScript Regex formatted on multiple lines

JavaScript doesn't have support for the /x flag (which allows spaces and comments), but you can split your expression over multiple lines using strings. For example:

var re = new RegExp(
  "foo" +
  "bar" +
  "baz",
  "ig");

Of course, this has the usual drawback of building a regex from strings: every backslash has to be doubled (for example, \\d instead of \d).

You could also use XRegExp with its (?x) flag, but you are still limited by the regular JS string quoting.
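
One way to sidestep the double escaping while still concatenating pieces is the standard String.raw tag, which keeps backslashes as typed. The following is a minimal sketch with an illustrative pattern that is not from the answer:

```javascript
// String.raw leaves backslashes untouched, so each piece reads like part of a regex literal.
const re = new RegExp(
  String.raw`^\d{3}` +    // three digits
  String.raw`[-.\s]?` +   // optional separator
  String.raw`\d{4}$`,     // four digits
  "i"
);

console.log(re.source);           // ^\d{3}[-.\s]?\d{4}$
console.log(re.test("555-1234")); // true
```

A piece written this way cannot end in a lone backslash or contain an unescaped backtick, so those cases still need a regular string.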

Splitting long regex expression into multiple lines?

You can use the IgnorePatternWhitespace regex option, via its inline form, (?x):

if ($_ -notmatch '(?x)
e_eld\.s|
od_eld\.s|
oe_eld\.s|
of_eld\.s|
og_eld\.s|
1c_eld\.s|
2c_eld\.s|
3c_eld\.s|
4c_eld\.s|
1c_eld\.s|
o2_eld\.s|
o3_eld\.s|
o4_eld\.s|
o5_eld\.s')
{
# stuff
}

Also note that I've translated !(... -match ...) into the simpler ... -notmatch ...; several PowerShell comparison operators have negated forms prefixed with -not (for example, -notlike, -notcontains, and -notin).

The purpose of this option is to promote readability of regexes by:

  • allowing you to use whitespace for human-friendly formatting, without that whitespace becoming part of what is to be matched. Whitespace that you do want to match must then be signaled explicitly, such as with an escaped space (\ ), [ ], or \s.

  • enabling (single-line) comments, prefixed with #; a literal # character must then be escaped as \#.

A simple example:

# Yields $true
'foo1' -match '(?x)
fo+ # word part
\d? # optional trailing digit
$ # and nothing else
'

Regex / JavaScript: Split string to separate lines by max characters per line with looking n chars backwards for a possible whitespace?

You may use

var s = "Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry. Loremipsumissimplydummytextofthe printing and typesetting industry. Loremipsumis simply dummytext ofthe printing and typesetting industry.";
// Group 1: a run of 30 non-whitespace characters (broken with a hyphen).
// Group 2: up to 30 characters that are not followed by a non-whitespace character.
var regex = /\s*(?:(\S{30})|([\s\S]{1,30})(?!\S))/g;
console.log(
  s.replace(regex, function ($0, $1, $2) {
    return $1 ? $1 + "-\n" : $2 + "\n";
  })
);
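
If the line width should be configurable, the same pattern can be assembled from pieces, which also splits it over several lines. This is only a sketch: the function name `wrap` and its parameters are hypothetical, and `width` is assumed to be a positive integer.

```javascript
// Hypothetical helper, not part of the answer: wrap `text` into lines of at most
// `width` characters (a hard-broken word gets one extra "-").
function wrap(text, width) {
  const re = new RegExp(
    String.raw`\s*(?:` +
      // Either a run of `width` non-whitespace characters (hard break) ...
      String.raw`(\S{${width}})` +
      // ... or up to `width` characters not followed by a non-whitespace character.
      String.raw`|([\s\S]{1,${width}})(?!\S)` +
      String.raw`)`,
    "g"
  );
  return text.replace(re, (match, hard, soft) =>
    hard ? hard + "-\n" : soft + "\n"
  );
}

console.log(JSON.stringify(wrap("aaa bbb ccc", 7))); // "aaa bbb\nccc\n"
console.log(JSON.stringify(wrap("abcdefgh", 5)));    // "abcde-\nfgh\n"
```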

How to split long regular expression rules to multiple lines in Python

You can split your regex pattern by quoting each segment separately. No line-continuation backslashes are needed, because adjacent string literals inside the parentheses are concatenated automatically.

test = re.compile(('(?P<full_path>.+):\d+:\s+warning:\s+Member'
                   '\s+(?P<member_name>.+)\s+\((?P<member_type>%s)\) '
                   'of (class|group|namespace)\s+(?P<class_name>.+)'
                   '\s+is not documented') % (self.__MEMBER_TYPES), re.IGNORECASE)

You can also use raw strings (the r prefix), which keep sequences such as \d from being treated as string escapes; the prefix has to go before each segment.

See the docs.

How do I split a string with multiple separators in JavaScript?

Pass in a regexp as the parameter:

js> "Hello awesome, world!".split(/[\s,]+/)
Hello,awesome,world!

Edited to add:

You can get the last element by selecting the length of the array minus 1:

>>> bits = "Hello awesome, world!".split(/[\s,]+/)
["Hello", "awesome", "world!"]
>>> bit = bits[bits.length - 1]
"world!"

... and if the pattern doesn't match:

>>> bits = "Hello awesome, world!".split(/foo/)
["Hello awesome, world!"]
>>> bits[bits.length - 1]
"Hello awesome, world!"

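One edge case worth knowing when splitting on a separator pattern: if the string starts or ends with a separator, the result contains empty strings, so the last element may not be the last word. A minimal sketch, using an illustrative input that is not from the answer:

```javascript
// Illustrative input (an assumption): separators at both ends of the string.
const parts = ", Hello awesome, world! ".split(/[\s,]+/);
console.log(parts); // [ '', 'Hello', 'awesome', 'world!', '' ]

// Drop the empty strings produced by the leading and trailing separators.
const words = parts.filter(Boolean);
console.log(words);                   // [ 'Hello', 'awesome', 'world!' ]
console.log(words[words.length - 1]); // world!
```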
