Regex for Match/Replacing JavaScript Comments (Both Multiline and Inline)

RegEx for match/replacing JavaScript comments (both multiline and inline)

try this,

(\/\*[\w\'\s\r\n\*]*\*\/)|(\/\/[\w\s\']*)|(\<![\-\-\s\w\>\/]*\>)

should work :)
Sample Image

replace single line javascript comments with multiline style comments in Notepad++ using regular expressions

Quick way

Use this regex:

(?<=\s)//([^\n\r]*)

Replace with:

/\*$1\*/

Explanatory way

1 - Double slashes // weren't going to replace, because you had them in that capturing group. So you'll capture and replace them again.

2 - Most of the time there is nothing but a space or a newline (\n) before comments begin. I included this as a lookbehind to ensure. By this way, URLs and DOCTYPE won't get touched.

* I don't confirm this searching and replacing method, however it may work with most cases.

Update

Take care of settings. You should have you cursor at the very beginning of the file content.

Sample Image

Then do a Replace All

Sample Image

Regex pattern to match anything between /* and */ in js

Because the . does not match line breaks. Try this instead:

string.match(/\/\*([\s\S]*?)\*\//g);

I also made the * ungreedy otherwise the first /* and last */ would be matched. I.e., the entire input:

/*         
<h1>some text</h1>
<div>more stuff $${}{} etc </div>
*/
var somethingImportant = 42;
/*
...
*/

Realize that you will goof up on input like this though:

var s = " ... /* ... ";

RegEx for match/replacing JavaScript comments (both multiline and inline)

try this,

(\/\*[\w\'\s\r\n\*]*\*\/)|(\/\/[\w\s\']*)|(\<![\-\-\s\w\>\/]*\>)

should work :)
Sample Image

Perl regex to find and replace php multiline comments

\/\*[\s\S]*?\*\/
  • \/\* match /*
  • [\s\S]*? match anything (non-greedy)
  • until \*\/; match */

Regex to replace block comment with line comment

If you are interested in a single regex pass in the modern JavaScript engine (and other regex engines supporting infinite length patterns in lookbehinds), you can use

/(?<=^(\/)\*(?:(?!^\/\*)[\s\S])*?\r?\n)(?=[\s\S]*?^\*\/)|(?:\r?\n)?(?:^\/\*|^\*\/)/gm

Replace with $1$1, see the regex demo.

Details

  • (?<=^(\/)\*(?:(?!^\/\*)[\s\S])*?\r?\n) - a positive lookbehind that matches a location that is immediately preceded with
    • ^(\/)\* - /* substring at the start of a line (with / captured into Group 1)
    • (?:(?!^\/\*)[\s\S])*? - any char, zero or more occurrences, as few as possible, not starting a /* char sequence that appears at the start of a line
    • \r?\n - a CRLF or LF ending
  • (?=[\s\S]*?^\*\/) - a positive lookahead that requires any 0 or more chars as few as possible followed with */ at the start of a line, immediately to the right of the current location
  • | - or
  • (?:\r?\n)? - an optional CRLF or LF linebreak
  • (?:^\/\*|^\*\/) - and then either /* or */ at the start of a line.

Comprehensive RegExp to remove JavaScript comments

I like challenges :)

Here's my working solution:

/((["'])(?:\\[\s\S]|.)*?\2|\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/)|\/\/.*?$|\/\*[\s\S]*?\*\//gm

Replace that with $1.

Fiddle here: http://jsfiddle.net/LucasTrz/DtGq8/6/

Of course, as it has been pointed out countless times, a proper parser would probably be better, but still...

NB: I used a regex literal in the fiddle insted of a regex string, too much escaping can destroy your brain.


Breakdown

((["'])(?:\\[\s\S]|.)*?\2|\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/) <-- the part to keep
|\/\/.*?$ <-- line comments
|\/\*[\s\S]*?\*\/ <-- inline comments

The part to keep

(["'])(?:\\[\s\S]|.)*?\2                   <-- strings
\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/ <-- regex literals

Strings

    ["']              match a quote and capture it
(?:\\[\s\S]|.)*? match escaped characters or unescpaed characters, don't capture
\2 match the same type of quote as the one that opened the string

Regex literals

    \/                          match a forward slash
(?![*\/]) ... not followed by a * or / (that would start a comment)
(?:\\.|\[(?:\\.|.)\]|.)*? match any sequence of escaped/unescaped text, or a regex character class
\/ ... until the closing slash

The part to remove

|\/\/.*?$              <-- line comments
|\/\*[\s\S]*?\*\/ <-- inline comments

Line comments

    \/\/         match two forward slashes
.*?$ then everything until the end of the line

Inline comments

    \/\*         match /*
[\s\S]*? then as few as possible of anything, see note below
\*\/ match */

I had to use [\s\S] instead of . because unfortunately JavaScript doesn't support the regex s modifier (singleline - this one allows . to match newlines as well)

This regex will work in the following corner cases:

  • Regex patterns containing / in character classes: /[/]/
  • Escaped newlines in string literals

Final boss fight

And just for the fun of it... here's the eye-bleeding hardcore version:

/((["'])(?:\\[\s\S]|.)*?\2|(?:[^\w\s]|^)\s*\/(?![*\/])(?:\\.|\[(?:\\.|.)\]|.)*?\/(?=[gmiy]{0,4}\s*(?![*\/])(?:\W|$)))|\/\/.*?$|\/\*[\s\S]*?\*\//gm

This adds the following twisted edge case (fiddle, regex101):

Code = /* Comment */ /Code regex/g  ; // Comment
Code = Code / Code /* Comment */ /g ; // Comment
Code = /Code regex/g /* Comment */ ; // Comment

This is highly heuristical code, you probably shouldn't use it (even less so than the previous regex) and just let that edge case blow.

Regex to perform global match on javascript block comments

This is one of those "ugh, yeah, of course!" moments.

The exec() function will generate an array with 1 element, being the matched element. Except it doesn't, the first element is the full match, which is great unless there are capture groups. If there are, then in additional to result[0] being the full pattern match, result[1] will be the first capture group, result[2] the second, and so on.

For example:

  1. (/l/g).exec("l") gives us ["l"]
  2. (/(l)/g).exec("l") gives us ["l", "l"]

You RE isn't so much the problem (although running the string through a stream filter that takes out block comments is probably easier to work with) as it's more a case of the assumption that you can just use .join() on the exec results that's been causing you problems. If you have capture groups, and you have a result, join results.slice(1), or call results.splice(1,0) before joining to get rid of the leading element, so you don't accidentally include the full match.



Related Topics



Leave a reply



Submit