Regex to match a C-style multiline comment
Try this regex (it only handles comments that sit on a single line):
String src ="How are things today /* this is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("/\\*.*?\\*/","");//single line comments
System.out.println(result);
Regex explained:
/ matches the character "/" literally
\* matches the character "*" literally
. matches any single character except a line terminator
*? repeats the preceding token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\* matches the character "*" literally
/ matches the character "/" literally
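To make the token-by-token breakdown concrete, here is a minimal sketch that writes the same pattern with Pattern.COMMENTS, so each piece carries its own note. The class name VerboseCommentRegex and the helper strip are made up for this illustration; they are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class VerboseCommentRegex {
    // Same pattern as "/\\*.*?\\*/". With Pattern.COMMENTS, whitespace and
    // everything from '#' to the end of a line inside the pattern are ignored.
    static final Pattern BLOCK_COMMENT = Pattern.compile(
            "/ \\*    # a literal slash, then a literal asterisk\n"
          + ".*?      # any character except a line terminator, as few as possible\n"
          + "\\* /    # a literal asterisk, then a literal slash\n",
            Pattern.COMMENTS);

    // Removes every match of the pattern from src.
    static String strip(String src) {
        return BLOCK_COMMENT.matcher(src).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "How are things today /* this is comment */ and is your code"
                   + " /* this is another comment */ working?";
        System.out.println(strip(src));
    }
}
```

Because the dot still excludes line terminators here, a comment containing \n is left untouched by this version.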
Alternatively, adding (?s) (DOTALL mode, in which . also matches line terminators) gives a regex that handles both single-line and multi-line comments:
//note the added \n, which the previous regex would not match across
String src ="How are things today /* this\n is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("(?s)/\\*.*?\\*/","");
System.out.println(result);
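The difference between the two variants can be checked side by side. This is a small sketch that uses the Pattern.DOTALL flag, which is the flag form of the inline (?s) modifier; the class name DotAllCommentStripper and its strip helper are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class DotAllCommentStripper {
    // Without DOTALL, '.' stops at line terminators.
    static final Pattern SINGLE_LINE = Pattern.compile("/\\*.*?\\*/");
    // Pattern.DOTALL behaves like prefixing the regex with (?s).
    static final Pattern MULTI_LINE = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Removes every match of the given pattern from src.
    static String strip(Pattern p, String src) {
        return p.matcher(src).replaceAll("");
    }

    public static void main(String[] args) {
        String src = "a /* one\n two */ b /* three */ c";
        // Only the comment that sits on one line is removed here.
        System.out.println(strip(SINGLE_LINE, src));
        // Both comments are removed here.
        System.out.println(strip(MULTI_LINE, src));
    }
}
```

Compiling the Pattern once and reusing it also avoids recompiling the regex on every replaceAll call.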
Reference:
- https://www.regular-expressions.info/examplesprogrammer.html
regex: Matching multiline comments?
Here, try this:
(\/\*\*)(.|\n)+?(\*\/)
This should do exactly what you want it to do. The first capture group just matches the opening /**. The second group matches any character, including a newline, and the + repeats that group one or more times. The ? makes the quantifier lazy, so it matches only up to the next occurrence of */ and we don't match from the start of the first comment to the end of the second comment and everything in between.
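The effect of the lazy quantifier is easy to see by counting matches. The following is a sketch in Java, assuming java.util.regex; the class LazyVsGreedy and the helper findAll are invented for this example, and the greedy variant is shown only for contrast.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class LazyVsGreedy {
    // Collects the full text of every match of regex in src.
    static List<String> findAll(String regex, String src) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(src);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String src = "/** first */ int x; /** second\n line */ int y;";
        // Lazy +? stops at the nearest closing delimiter: two separate comments.
        System.out.println(findAll("(/\\*\\*)(.|\\n)+?(\\*/)", src).size());
        // Greedy + runs to the last closing delimiter: one oversized match.
        System.out.println(findAll("(/\\*\\*)(.|\\n)+(\\*/)", src).size());
    }
}
```

On very long comments an alternation such as (.|\n) repeated many times can be slow or exhaust the stack in some engines, so a DOTALL flag or a negated character class is often preferred.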
Matching Multiline C++ style comments using Regex
Here is a working regex for the sample you provided, tested on regexr.com:
\/\*+((([^\*])+)|([\*]+(?!\/)))[*]+\/
or:
\/\*.*?\*\/
Improving/Fixing a Regex for C style block comments
Some problems I see with your regex:
There's no need for the |[\r\n] sequences in your regex; a negated character class like [^*] matches everything except *, including line separators. It's only the . (dot) metacharacter that doesn't match those.
Once you're inside the comment, the only character you have to look for is an asterisk; as long as you don't see one of those, you can gobble up as many characters as you want. That means it makes no sense to use [^*] when you can use [^*]+ instead. In fact, you might as well put that in an atomic group -- (?>[^*]+) -- because you'll never have any reason to give up any of those not-asterisks once you've matched them.
Filtering out extraneous junk, the final alternative inside your outermost parens is \*+[^*/], which means "one or more asterisks, followed by a character that isn't an asterisk or a slash". That will always match the asterisk at the end of the comment, and it will always have to give it up again because the next character is a slash. In fact, if there are twenty asterisks leading up to the final slash, that part of your regex will match them all, then it will give them all up, one by one. Then the final part -- \*+/ -- will match them for keeps.
For maximum performance, I would use this regex:
/\*(?>(?:(?>[^*]+)|\*(?!/))*)\*/
This will match a well-formed comment very quickly, but more importantly, if it starts to match something that isn't a valid comment, it will fail as quickly as possible.
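Java's java.util.regex also supports atomic groups, so the regex above can be tried there unchanged apart from string escaping. The sketch below is only an illustration; the class AtomicCommentRegex and the helper firstComment are hypothetical names, and it makes no performance measurements.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only for illustration.
class AtomicCommentRegex {
    // The atomic-group regex from the answer, escaped for a Java string literal.
    static final Pattern COMMENT =
            Pattern.compile("/\\*(?>(?:(?>[^*]+)|\\*(?!/))*)\\*/");

    // Returns the first block comment found in src, or null if there is none.
    static String firstComment(String src) {
        Matcher m = COMMENT.matcher(src);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Inner asterisks and a newline are handled without any DOTALL flag,
        // because the negated class [^*] matches line terminators too.
        System.out.println(firstComment("x /* a ** b\n c */ y"));
        // A run of asterisks before the closing slash still matches.
        System.out.println(firstComment("/****/"));
        // An unterminated comment yields no match at all.
        System.out.println(firstComment("/* never closed"));
    }
}
```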
Courtesy of David, here's a version that matches nested comments with any level of nesting:
(?s)/\*(?>/\*(?<LEVEL>)|\*/(?<-LEVEL>)|(?!/\*|\*/).)+(?(LEVEL)(?!))\*/
It uses .NET's Balancing Groups, so it won't work in any other flavor. For the sake of completeness, here's another version (from RegexBuddy's Library) that uses the Recursive Groups syntax supported by Perl, PCRE and Oniguruma/Onigmo:
/\*(?>[^*/]+|\*[^/]|/[^*])*(?>(?R)(?>[^*/]+|\*[^/]|/[^*])*)*\*/
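Neither balancing groups nor recursion is available in java.util.regex, so in Java nested comments need a different technique. One possibility is a small hand-written scanner with a depth counter instead of a regex. The sketch below is an assumption-laden illustration, not a port of either regex: the class NestedCommentStripper is a made-up name, a stray */ outside any comment is kept as ordinary text, and the contents of an unterminated comment are simply dropped.

```java
// Hypothetical helper class, used only for illustration.
class NestedCommentStripper {
    // Removes block comments, treating an inner "/*" as one more nesting level.
    static String strip(String src) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of currently open comments
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("/*", i)) {
                depth++;            // open a (possibly nested) comment
                i += 2;
            } else if (depth > 0 && src.startsWith("*/", i)) {
                depth--;            // close the innermost open comment
                i += 2;
            } else {
                if (depth == 0) {
                    out.append(src.charAt(i)); // keep text outside comments
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("a /* x /* y */ z */ b"));
    }
}
```

Like the regexes in this thread, this sketch does not know about string literals, so a "/*" inside a quoted string would still be treated as the start of a comment.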
How to filter out c-type comments with regex?
We can try doing a regex replacement on the following pattern:
/\*.*?\*/
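This matches an old-school C-style comment that sits on a single line; for comments that span lines, add RegexOptions.Singleline (or the inline (?s) modifier) so that the dot also matches newlines. It works by using a lazy dot .*? to match only the content within a single comment, stopping at the end of that comment. We can then replace each match with an empty string to remove the comments from the input.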
Code:
Dim input As String = "/* 1111 */ one /*2222*/two /*3333 */ three/* 4444*/ four /*/**/ five /**/"
Dim output As String = Regex.Replace(input, "/\*.*?\*/", "")
Console.WriteLine(input)
Console.WriteLine(output)
This prints the original input, followed by the stripped string (shown here with runs of spaces collapsed):
one two three four five
C++ regex search for multi-line comments (between /* */)
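Make the regular expression match only up to the first occurrence of */ by turning the quantifier * into its non-greedy version. (In the default ECMAScript grammar of std::regex, . does not match line terminators, so for a comment that actually spans lines you would also replace . with [\s\S].) The non-greedy version is obtained by adding a question mark after the quantifier: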
std::regex rx("/\\*(.*?)\\*/");