Regular Expression to Remove CSS Comments

Using Regex to remove css comments

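This is normally enough (assuming cssLines is a string containing the entire contents of your CSS file):

 Regex.Replace(cssLines, @"/\*.*?\*/", string.Empty, RegexOptions.Singleline);

Note that the Singleline option makes . match newline characters, so the pattern also matches multi-line comments. The lazy .*? is used rather than .+? so that the empty comment /**/ is matched on its own instead of being merged with the next comment.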
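For readers outside .NET, here is a minimal sketch of the same idea in Python. The function name strip_css_comments and the sample string are made up for illustration; re.DOTALL plays the role of RegexOptions.Singleline. It assumes the stylesheet has no string literals containing /*, which a plain regex like this cannot tell apart from a real comment.

```python
import re

# Lazy match between /* and */; re.DOTALL lets "." cross newlines,
# which is what RegexOptions.Singleline does in .NET.
CSS_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_css_comments(css: str) -> str:
    """Return css with every /* ... */ comment removed (hypothetical helper)."""
    return CSS_COMMENT.sub("", css)


sample = "a { color: red; } /* one\ntwo */ b { margin: 0; } /**/"
print(repr(strip_css_comments(sample)))
```

Only the comments are removed; the whitespace around them is left in place.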

Regular expression to find and remove comments in CSS

If you're running the match in C#, have you tried RegexOptions?

Match m = Regex.Match(word, pattern, RegexOptions.Multiline);

"Multiline mode. Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire string."
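To make the quoted definition concrete, here is a small Python sketch (re.MULTILINE and re.DOTALL are Python's counterparts of RegexOptions.Multiline and RegexOptions.Singleline; the sample text is invented). It suggests why Multiline alone does not help with comments that span lines: it changes the anchors, not the dot.

```python
import re

text = "a /* x\ny */ b"

# re.MULTILINE only changes ^ and $: ^ now also matches right after the newline.
line_starts = [m.start() for m in re.finditer(r"^", text, re.MULTILINE)]

# "." still stops at the newline, so the two-line comment is not found ...
without_dotall = re.search(r"/\*.*?\*/", text, re.MULTILINE)

# ... until re.DOTALL is used.
with_dotall = re.search(r"/\*.*?\*/", text, re.DOTALL)

print(line_starts, without_dotall, with_dotall.group())
```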

Also see Strip out C Style Multi-line Comments

EDIT:

OK, it looks like the issue is with the regex itself. Here is a working example using the pattern from http://ostermiller.org/findcomment.html. The author does a good job of deriving the regex and demonstrating the pitfalls and deficiencies of various approaches. Note: RegexOptions.Multiline and RegexOptions.Singleline do not affect the result, because the pattern contains no ., ^ or $.

string input = @"this is some stuff right here
/* blah blah blah
blah blah blah
blah blah blah */ and this is more stuff /* blah */
right here.";

string pattern = @"(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)";
string output = Regex.Replace(input, pattern, string.Empty, RegexOptions.Singleline);
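The same experiment can be reproduced in Python as a rough check. This is a sketch, not a port of the C# program: the variable names are arbitrary, and the outer capturing group is dropped because nothing uses it. Running the substitution with no flags, with re.MULTILINE and with re.DOTALL gives the same text, since the pattern relies on character classes rather than on the dot or on anchors.

```python
import re

text = (
    "this is some stuff right here\n"
    "/* blah blah blah\n"
    "blah blah blah\n"
    "blah blah blah */ and this is more stuff /* blah */\n"
    "right here."
)

# Pattern from the answer above, minus the outermost capturing group.
pattern = r"/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/"

# Same substitution under three flag settings.
results = [re.sub(pattern, "", text, flags=f) for f in (0, re.MULTILINE, re.DOTALL)]
print(results[0])
```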

Remove CSS inside comments using PHP

You can get this code to work by adding the DOTALL modifier s (see http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php):

$css = preg_replace( '#(/\\* Start Comment \\*/).*?(/\\* End Comment \\*/)#s', '', $css );

Additionally, each * must be escaped, because an unescaped * is a quantifier. The / characters are fine as they are, since the pattern uses a different delimiter (#).
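As an illustration of the same marker-based removal outside PHP, here is a hedged Python sketch. The marker texts are taken from the pattern above; the variable names and the sample CSS are assumptions. re.escape takes care of escaping the * characters, and re.DOTALL corresponds to the s modifier.

```python
import re

START = "/* Start Comment */"
END = "/* End Comment */"

# Everything from the start marker to the nearest end marker, across lines.
BLOCK = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)

css = "a{}\n/* Start Comment */\nb{}\n/* End Comment */\nc{}"
cleaned = BLOCK.sub("", css)
print(repr(cleaned))
```

The rule for b{} between the two markers is removed along with the markers themselves.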

error when removing CSS comments via REGEX

Do not use zero-width assertions inside character classes.

  • ^, $, \A, \b, \B, \Z, \z, \G - anchors and (non-)word boundaries - do not make sense inside character classes, since they do not match any character. ^ and \b mean something different inside a character class: ^ negates the class when it comes right after the opening [ and is a literal ^ otherwise, while \b means a backspace character.

  • You can't use \R (any line break sequence) there either.

The two patterns with \A inside a character class must be re-written as a grouping construct, (...), with an alternation operator |:

"`(\A|[\n;]+)/\*.+?\*/`s"=>"$1", 
"`(\A|[;\s]+)//.+\R`"=>"$1\n",

I removed the redundant modifiers and the capturing groups you are not using, and replaced [\r\n] with \R. The first pattern, "`(\A|[\n;]+)/\*.+?\*/`s"=>"$1", can also be rewritten in a more efficient way that needs no s modifier, because it contains no dot:

"`(\A|[\n;]+)/\*[^*]*\*+(?:[^/*][^*]*\*+)*/`"=>"$1"
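The following Python sketch compares the lazy-dot form with the unrolled form on a few invented samples, and then applies the unrolled form with the (\A|[\n;]+) prefix. Python's re module is used here only as a stand-in for PCRE; the names LAZY, UNROLLED and PREFIXED are hypothetical.

```python
import re

LAZY = re.compile(r"/\*.+?\*/", re.DOTALL)
# Unrolled form: runs of non-stars, then stars, repeated until a "/" follows.
UNROLLED = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

samples = [
    "a /* x */ b",
    "/* a * b ** c */",
    "/* multi\nline */ tail",
    "x /***/ y",
]
same = all(LAZY.sub("", s) == UNROLLED.sub("", s) for s in samples)

# With the prefix, only comments at the very start of the text or right
# after a newline or semicolon are removed; group 1 puts the prefix back.
PREFIXED = re.compile(r"(\A|[\n;]+)/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")
out = PREFIXED.sub(r"\1", "/* head */a{};/* x */b{} /* kept */")
print(same, out)
```

The unrolled version avoids the character-by-character backtracking of .+? by consuming whole runs with negated character classes.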

Note that in PHP 7.3, according to the "Upgrade history of the bundled PCRE library" table, the regex library is PCRE 10.32. See PCRE to PCRE2 migration:

Until PHP 7.2, PHP used the 8.x versions of the legacy PCRE library, and from PHP 7.3, PHP will use PCRE2. Note that PCRE2 is considered to be a new library although it's based on and largely compatible with PCRE (8.x).

According to this resource, the updated library is stricter about regex patterns and now treats user errors that were formerly tolerated as real errors:

  • Modifier S is now on by default. PCRE does some extra optimization.
  • Option X is disabled by default. It makes PCRE do more syntax validation than before.
  • Unicode 10 is used, while it was Unicode 7. This means more emojis, more characters, and more sets. Unicode regex may be impacted.
  • Some invalid patterns may be impacted.

In simple words, PCRE2 validates patterns more strictly, so after the upgrade some of your existing patterns may no longer compile.

PHP Regex to remove specific CSS Comment

If the rule is that you want to remove every comment that has no other code on its line, then something like this should work:

/^(\s*\/\*.*?\*\/\s*)$/m

The 'm' option makes ^ and $ match the beginning and end of a line. Do you expect the comments to run for more than one line?

EDIT:

I'm pretty sure this fits the bill:

/(^|\n)\s*\/\*.*?\*\/\s*/s

Do you understand what it's doing?
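One way to see what the two patterns do is to run them on a small sample. The sketch below uses Python's re module in place of preg_replace; the sample CSS is invented, and replacing with group 1 in the second call is an assumption, since the answer does not show the replacement string.

```python
import re

css = (
    "a { color: red; } /* keep */\n"
    "/* drop me */\n"
    "b { margin: 0; }\n"
    "  /* drop\n"
    "     me too */\n"
    "c {}"
)

# First pattern: re.MULTILINE anchors ^ and $ to each line, but "." cannot
# cross a newline, so only the one-line comment goes (its line stays, empty).
single_line = re.sub(r"^(\s*/\*.*?\*/\s*)$", "", css, flags=re.MULTILINE)

# Second pattern: re.DOTALL lets ".*?" span lines. Substituting group 1
# (assumed replacement) keeps the line break that preceded the comment.
any_length = re.sub(r"(^|\n)\s*/\*.*?\*/\s*", r"\1", css, flags=re.DOTALL)

print(single_line)
print(any_length)
```

In both cases the comment that shares a line with code (/* keep */) is left alone. Note that neither pattern checks what follows the closing */ very carefully, so a line that starts with a comment and then continues with code may need extra handling.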

sed command to strip CSS comments not working

This task is doable with regexes. However, a line-oriented tool such as sed makes it unnecessarily difficult, because comments can span several lines. To me this looks like accidental complexity.

I wouldn't push this any further. Here is an npm module for this so you can add it to your builds. Here is an online css minifier so you can use it ad-hoc.

I don't know what kind of site you're building. However, a css preprocessor might simplify your work anyway. Here is a good overview.

How to remove C-style comments from code

I've considered the comments (so far) and changed the regex to:

(?:\/\/(?:\\\n|[^\n])*\n)|(?:\/\*[\s\S]*?\*\/)|((?:R"([^(\\\s]{0,16})\([^)]*\)\2")|(?:@"[^"]*?")|(?:"(?:\?\?'|\\\\|\\"|\\\n|[^"])*?")|(?:'(?:\\\\|\\'|\\\n|[^'])*?'))

It handles Biffen's C++11 raw string literals (as well as C# verbatim strings), and it has been changed according to Wiktor's suggestions.

I split the handling of single and double quotes because their logic differs (and to avoid the non-working backreference ;).

It's undoubtedly more complex, but still far from the solutions I've seen out there which hardly cover any of the string issues. And it could be stripped of parts not applicable to a specific language.
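To show the core trick with the language-specific parts stripped away, here is a simplified Python sketch. It is not the regex above: raw strings, verbatim strings, trigraphs and line continuations are left out, and the line-comment branch keeps the trailing newline. The idea it shares with the original is that string literals are matched and captured so they can be put back, while comments are matched without a capture and therefore dropped. The names TOKEN and strip_comments are hypothetical.

```python
import re

TOKEN = re.compile(
    r"""
      //[^\n]*                 # line comment (the newline itself is kept)
    | /\*.*?\*/                # block comment, possibly spanning lines
    | ( "(?:\\.|[^"\\])*"      # group 1: double-quoted string with escapes
      | '(?:\\.|[^'\\])*'      #          or single-quoted literal
      )
    """,
    re.DOTALL | re.VERBOSE,
)


def strip_comments(code: str) -> str:
    """Drop comments but keep string literals untouched (simplified sketch)."""
    # group(1) is None when a comment matched, so comments become "".
    return TOKEN.sub(lambda m: m.group(1) or "", code)


src = 'char *s = "/* not a comment */"; // trailing\nint x = 1; /* gone */ char c = \'"\';'
print(strip_comments(src))
```

Because the string alternatives consume a whole literal in one match, a /* or // inside quotes never gets the chance to start a comment match.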

One comment suggested supporting more languages. That would make the RE (even more) complex and unmanageable. It should be relatively easy to adapt though.

Updated regex101 example.

Thanks everyone for the input so far. And keep the suggestions coming.

Regards

Edit: Update Raw String - this time I actually read the spec. ;)


