Java - Regular Expression Finding Comments in Code

How to find (* comments *) using Java's regexp?

A simple pattern to handle that is:

\(\*(.*?)\*\)

Example: http://www.rubular.com/r/afqLCDssIx

You probably also want to set the single-line (DOTALL) flag so that . also matches line breaks and comments spanning several lines are found: (?s)\(\*(.*?)\*\)
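
As a rough illustration, the pattern with the (?s) flag could be used from Java as sketched below. The class name PascalComments and the method commentBodies are made up for this example; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PascalComments {
    // (?s) turns on DOTALL so '.' also matches line terminators;
    // the lazy .*? stops at the first closing "*)".
    private static final Pattern COMMENT = Pattern.compile("(?s)\\(\\*(.*?)\\*\\)");

    /** Hypothetical helper: returns the text between each "(*" and "*)" pair, untrimmed. */
    static List<String> commentBodies(String source) {
        List<String> bodies = new ArrayList<>();
        Matcher m = COMMENT.matcher(source);
        while (m.find()) {
            bodies.add(m.group(1)); // group 1 is the (.*?) part
        }
        return bodies;
    }

    public static void main(String[] args) {
        String source = "x := 1; (* first *)\ny := 2; (* spans\ntwo lines *)";
        for (String body : commentBodies(source)) {
            System.out.println("[" + body + "]");
        }
    }
}
```

Every backslash of the regex is doubled because it sits inside a Java string literal.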

Note that it doesn't handle cases like (* inside strings, or other odd combinations. Your best bet is to use a parser, for example ANTLR, which already has a ready-made Pascal grammar.
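
If a full parser is more than you need, a small hand-written scanner can at least skip string literals. The sketch below is not ANTLR and not a complete Pascal lexer: it assumes strings are delimited by single quotes, ignores { } comments, and the names PascalCommentScanner and comments are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class PascalCommentScanner {
    /** Hypothetical helper: returns each "(* ... *)" comment found outside string literals. */
    static List<String> comments(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            if (src.charAt(i) == '\'') {
                // Skip a single-quoted string literal. A doubled quote ('')
                // simply closes one literal and opens the next.
                int close = src.indexOf('\'', i + 1);
                if (close < 0) break;          // unterminated string: stop scanning
                i = close + 1;
            } else if (src.startsWith("(*", i)) {
                int close = src.indexOf("*)", i + 2);
                if (close < 0) break;          // unterminated comment: stop scanning
                out.add(src.substring(i, close + 2));
                i = close + 2;
            } else {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "s := '(* not a comment *)'; (* real *)";
        System.out.println(comments(src));
    }
}
```

The regex alone would report the text inside the string as a comment; the scanner only reports the real one.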

Regular expression finding empty comments in code

Try this:

\/\*\*[\s\t\r\n]*[ \*]*[\s\t\r\n]*\*/

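This should match any string which starts with /**, ends with */ and contains only whitespace plus a single run of spaces or asterisks in between. Since \s already covers \t, \r and \n, the class [\s\t\r\n] can be shortened to \s. A comment with asterisks on several separate lines is not matched; /\*\*[\s*]*\*/ covers that case as well.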
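
A minimal sketch of using that pattern from Java to strip empty Javadoc blocks follows. The class name EmptyJavadoc and the method stripEmptyJavadoc are hypothetical; the pattern is the one from the answer, with the backslashes doubled for the string literal and the unneeded escape before the first / dropped.

```java
import java.util.regex.Pattern;

class EmptyJavadoc {
    // The answer's pattern, written as a Java string literal.
    private static final Pattern EMPTY =
            Pattern.compile("/\\*\\*[\\s\\t\\r\\n]*[ \\*]*[\\s\\t\\r\\n]*\\*/");

    /** Hypothetical helper: removes every empty Javadoc block from the source. */
    static String stripEmptyJavadoc(String source) {
        return EMPTY.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String source = "/**\n *\n */\nvoid f() {}\n/** Keeps this. */\nvoid g() {}";
        System.out.println(stripEmptyJavadoc(source));
    }
}
```

Only the first block is removed; the comment that contains text is left alone.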

Regular expression for single line java comments

Not A Great Idea, but if you have to...

As you can see in the comments I'm not terribly fond of the idea, but since you asked for it, this will work with your input (see demo):

if\s*\([^\{]*\{(?:[ \t]*//.*)?[ \t]*(?:[\r\n]*[ \t]*(?://.*)?)*[\r\n]*[ \t]*(?://.*)?}
  • The if\s*\([^\{]*\{ gets us to the opening brace
  • The (?:[ \t]*//.*)?[ \t]* gets us to the end of the line, matching an optional comment along the way
  • The (?:[\r\n]*[ \t]*(?://.*)?)* matches a series of lines with optional comments
  • The [\r\n]*[ \t]*(?://.*)?} gets us to the final brace.

In a Java string literal the regex backslashes need to be doubled, so try this code:

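import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// subjectString holds the source text to search
List<String> matchList = new ArrayList<String>();
try {
    Pattern regex = Pattern.compile("if\\s*\\([^\\{]*\\{(?:[ \t]*//.*)?[ \t]*(?:[\r\n]*[ \t]*(?://.*)?)*[\r\n]*[ \t]*(?://.*)?}");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        // Collect each if block whose body is empty or holds only // comments
        matchList.add(regexMatcher.group());
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}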

else if

In the comments you say you may not want to match else if.

In that case, use this:

(?<!else )if\s*\([^\{]*\{(?:[ \t]*//.*)?[ \t]*(?:[\r\n]*[ \t]*(?://.*)?)*[\r\n]*[ \t]*(?://.*)?}

In code:

List<String> matchList = new ArrayList<String>();
try {
    // The negative lookbehind (?<!else ) rejects an "if" that directly follows "else "
    Pattern regex = Pattern.compile("(?<!else )if\\s*\\([^\\{]*\\{(?:[ \t]*//.*)?[ \t]*(?:[\r\n]*[ \t]*(?://.*)?)*[\r\n]*[ \t]*(?://.*)?}");
    Matcher regexMatcher = regex.matcher(subjectString);
    while (regexMatcher.find()) {
        matchList.add(regexMatcher.group());
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}

Let me know if you have questions!

Regex to match a C-style multiline comment

Try this regex, which handles comments that stay on a single line:

String src ="How are things today /* this is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("/\\*.*?\\*/","");//single line comments
System.out.println(result);

REGEX explained:

  • / matches the character "/" literally
  • \* matches the character "*" literally
  • . matches any single character except a line terminator
  • *? repeats the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
  • \* matches the character "*" literally
  • / matches the character "/" literally

Alternatively, adding (?s) makes . match line breaks as well, so the regex handles both single-line and multi-line comments:

// Note the added \n, which the previous regex cannot match across
String src ="How are things today /* this\n is comment */ and is your code /* this is another comment */ working?";
String result=src.replaceAll("(?s)/\\*.*?\\*/","");
System.out.println(result);

Reference:

  • https://www.regular-expressions.info/examplesprogrammer.html

regex: Matching multiline comments?

Here, try this:

(\/\*\*)(.|\n)+?(\*\/)

This should do what you want. The first capture group matches the opening /**. The second group matches any single character, including a newline, and the + repeats it one or more times. The ? makes the repetition lazy, so it stops at the next occurrence of */ instead of running from the start of the first comment to the end of the last one. The third group matches the closing */.
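
In Java specifically, a repeated alternation such as (.|\n)+? is known to risk a StackOverflowError on very long comments, so a variant that relies on Pattern.DOTALL may be safer. The sketch below is one such variant, not the pattern from the answer; it drops the capture groups, and the names JavadocBlocks and find are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JavadocBlocks {
    // Same idea as (\/\*\*)(.|\n)+?(\*\/): with DOTALL, '.' matches line
    // terminators too, so no alternation group is needed.
    private static final Pattern BLOCK =
            Pattern.compile("/\\*\\*.+?\\*/", Pattern.DOTALL);

    /** Hypothetical helper: returns every comment that opens with two asterisks. */
    static List<String> find(String source) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(source);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String source = "/** a */ int x; /** b\n c */ int y; /* plain */";
        for (String block : find(source)) {
            System.out.println(block);
        }
    }
}
```

The plain /* ... */ comment is skipped because the pattern requires two asterisks after the slash.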


