Regex That Will Match a Java Method Declaration


Have you considered matching the actual possible keywords, such as:

(?:(?:public|private|protected|static)\s+)*

It might be a bit more likely to match correctly, though it might also make the regex harder to read...
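A minimal sketch of how such a keyword group could be used, assuming the whitespace is meant to follow every modifier and not just the last alternative. The class name `ModifierPrefixDemo` and the helper `modifierPrefix` are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ModifierPrefixDemo {
    // Zero or more modifiers, each one followed by at least one whitespace character.
    private static final Pattern MODIFIERS =
            Pattern.compile("^\\s*((?:(?:public|private|protected|static)\\s+)*)");

    /** Returns the leading modifiers of a declaration line, trimmed ("" if there are none). */
    static String modifierPrefix(String line) {
        Matcher m = MODIFIERS.matcher(line);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        System.out.println(modifierPrefix("public static void main(String[] args)")); // public static
        System.out.println(modifierPrefix("int size()").isEmpty());                   // true
    }
}
```

Because each keyword must be followed by whitespace, an identifier that merely starts with a keyword (say `staticCounter`) is not mistaken for a modifier.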

Java Regular Expression to match any method signature

This looks more like homework or finger exercise than an actual project. Here is a jumpstart:

^\s*([a-zA-Z_]\w+)\s*(?:\(|\G)(\s*([a-zA-Z_]\w+)\s+([a-zA-Z_]\w+),?)*\)\s*$

Update: The following pattern allows unlimited args (the anchors are, IMHO, not needed):

^\s*(?:^\s*([a-zA-Z_]\w+)\s*\(\s*|\G,\s*)(\s*([a-zA-Z_]\w+)\s+([a-zA-Z_]\w+),?)*\)\s*$

Original demo code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class T
{
    public static void main(String[] args)
    {

        final String regex = "^\\s*\n"
                + "([a-zA-Z_]\\w+) # function name\n"
                + "\\s*\n"
                + "(?:\\(|\\G)\n"
                + "(\\s*([a-zA-Z_]\\w+)\\s+([a-zA-Z_]\\w+),?)* #args\n"
                + "\\)\n"
                + "\\s*$";

        final String string = " _validName__ ( _TypeName _variable)\n"
                + " 7invalidName (_TypeName variable)\n"
                + "badName_ (_pp8p_7 _s5de)\n"
                + " Valid4Name_()\n"
                + " validName(7BadType variable)\n"
                + " validName_(InvalidParam)\n"
                + " validName(Type1 arg1, Type2 arg2)";

        final Pattern pattern = Pattern.compile(regex, Pattern.COMMENTS | Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}


How to match a method block using regex?

Thanks to all of you. After some consideration, I worked out an approach that is reasonably reliable for my situation, so I am sharing it here.

String regex = "\\s*public\\s+static\\s+[\\w\\.\\<\\>,\\s]+\\s+getFieldsConfig\\(.*?\\)\\s*\\{.*?\\}(?=\\s*(public|private|protected|static))";

String regex2 = "\\s*public\\s+static\\s+[\\w\\.\\<\\>,\\s]+\\s+getFieldsConfig\\(.*?\\)\\s*\\{.*?\\}(?=(\\s*}\\s*$))";

regex = "(" + regex + ")|(" + regex2 + "){1}?";

Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);

It can match my method body well.

PS Yes, regex may not be a suitable way to parse a method strictly. Generally speaking, though, a regex takes less effort than writing a parser and works well enough in specific situations. Adjust it to your own code and it should work for you.

Regex to match a Java method signature

Let's abstract this out a bit, and say we want to match a (possibly empty) list of digits separated by commas.

(empty)
12
12,34
12,34,56

The pattern is therefore

^$|^\d+(,\d+)*$

Now you can try to replace the components to match what you want:

  • Instead of \d+, whatever regex you use to match type name and identifier
  • Maybe allow \s* around the comma
  • Maybe you'd even add the special varargs last argument (which can also be the first and only)
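As a rough sketch of those substitutions, the list pattern can be assembled from smaller pieces. The names `IDENT`, `PARAM`, `VARARG` and `isParamList` are my own, and the type regex is deliberately simplistic (no generics, no annotations, no `final`), so treat this as a starting point and not a complete parameter grammar:

```java
import java.util.regex.Pattern;

class ParamListDemo {
    // Illustrative building blocks (assumptions, not a full Java grammar).
    static final String IDENT = "[A-Za-z_$][\\w$]*";
    static final String PARAM = IDENT + "(?:\\[\\])*\\s+" + IDENT;     // e.g. int[] xs
    static final String VARARG = IDENT + "\\s*\\.\\.\\.\\s*" + IDENT;  // e.g. String... args

    // Same shape as ^$|^\d+(,\d+)*$ with \d+ replaced by PARAM, optional
    // whitespace around the commas, and an optional varargs parameter in last place.
    static final Pattern PARAM_LIST = Pattern.compile(
            "\\s*(?:" + PARAM + "(?:\\s*,\\s*" + PARAM + ")*(?:\\s*,\\s*" + VARARG + ")?"
            + "|" + VARARG + ")?\\s*");

    /** True if the text between the parentheses looks like a parameter list. */
    static boolean isParamList(String text) {
        return PARAM_LIST.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isParamList(""));                      // true
        System.out.println(isParamList("int a, String b"));       // true
        System.out.println(isParamList("int a, String... rest")); // true
        System.out.println(isParamList("String... a, int b"));    // false
        System.out.println(isParamList("int a,"));                // false
    }
}
```

Using `matches()` plays the role of the `^` and `$` anchors, and making the whole group optional covers the empty list.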

Note that if you allow generic type parameters, then you definitely can't use regex, since you can nest the <...> and the language of balanced parentheses of arbitrary depth is not regular.

Although you can argue that in practice, no one would ever nest type parameters deeper than, say, 3 levels, so then it becomes regular again.
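To illustrate the bounded-depth argument, here is a hedged sketch that expands a type regex to a fixed nesting depth. The method names `typeRegex` and `isType` are hypothetical, and wildcards, bounds and arrays are ignored to keep it short:

```java
import java.util.regex.Pattern;

class GenericDepthDemo {
    // A possibly qualified name such as List or java.util.List (simplified).
    private static final String NAME = "[A-Za-z_$][\\w$.]*";

    /** Builds a regex for a type whose <...> arguments nest at most `depth` levels deep. */
    static String typeRegex(int depth) {
        if (depth == 0) {
            return NAME;
        }
        String inner = typeRegex(depth - 1);
        return NAME + "(?:\\s*<\\s*" + inner + "(?:\\s*,\\s*" + inner + ")*\\s*>)?";
    }

    static boolean isType(String text, int depth) {
        return Pattern.matches(typeRegex(depth), text);
    }

    public static void main(String[] args) {
        System.out.println(isType("Map<String, List<Integer>>", 3)); // true
        System.out.println(isType("A<B<C<D>>>", 3));                 // true
        System.out.println(isType("A<B<C<D<E>>>>", 3));              // false, four levels deep
    }
}
```

The generated pattern roughly doubles in size with each extra level, which is one more practical reason to keep the depth small.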

That said, a proper parser is really the best tool for this. Just look for an implementation of the Java grammar, say, in ANTLR.


See also

  • JLS 18.1 The Grammar of the Java Programming Language
  • Java 1.6 Grammar in ANTLR

RegEx that captures a method and its body

You can't do this. It's impossible.

The 'regular' in 'Regular Expression' refers to a certain subset of grammars; the so-called 'Regular Grammars'.

Here's the thing:

  • Non-Regular Grammars cannot be parsed with regular expressions.
  • Java (the language) is Non-Regular.

Thus, you can't use regular expressions for this, QED.

So, how do you parse java?

There are many ways; so far, java is still so-called LL(k) parseable, which means that just about every 'parser/grammar' library out there will be capable of parsing java code, and many such libraries ship with a java grammar as an example. These usually aren't quite perfect, but pretty good.

A basic web search gets you many options. Alternatively, javac is free (but GPL, you'd have to GPL anything you build with it), and ecj (the parser that powers eclipse, amongst other things) is open source with a more permissive license. It's also faster. It's also far harder to use, so there's that.

These are fairly complex tools. However, java is a very complex language (most programming languages are). Parsing them is decidedly non-trivial.
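For a feel of what the parser route looks like, here is a minimal sketch that uses the Compiler Tree API shipped with the JDK (`javax.tools` plus `com.sun.source`). It assumes you run on a full JDK, not a bare JRE; the class `MethodLister`, the helper `methodNames` and the file name `Sample.java` are invented for the example:

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class MethodLister {
    /** Parses the source text with the JDK's parser and returns the declared method names. */
    static List<String> methodNames(String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null if no compiler is available
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Sample.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source; // serve the source from memory instead of from disk
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        List<String> names = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) { // parse only, nothing is compiled
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree method, Void unused) {
                        names.add(method.getName().toString());
                        return super.visitMethod(method, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String source = "class Sample {\n"
                + "    public void test() {\n"
                + "        {}\n"
                + "        String x = \"{\";\n"
                + "    }\n"
                + "    int size() { return 0; }\n"
                + "}\n";
        System.out.println(methodNames(source)); // [test, size]
    }
}
```

Each `MethodTree` also exposes the body, parameters and modifiers, so the brace inside the string literal needs no special handling on your side.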

Before you think: Geez, surely it can't be this hard, consider:

public void test() {
{}

String x = "{";
}

Which is legal java.

Or:

public void test() {
// method body
\u007D

That really is legal java, that \u007D thing closes it. Of course...

public void test() {
//{} \u007D
}

Here the \u007D does not close the method. It is a real closing brace, but it sits inside a comment.

Another one to consider:

public void test() {
class Foo {
String y = """
}
""";
}
}

Hopefully, considering the above, you realize you stand absolutely no chance whatsoever unless you use a parser that knows about the entire language spec.
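To make that concrete, here is a small sketch of the kind of naive brace counting that a regex-style approach boils down to. The class `NaiveBraceCountDemo` and its method are made up for the demonstration; the inputs are the string-literal and unicode-escape examples from above:

```java
class NaiveBraceCountDemo {
    /** Counts '{' minus '}' with no knowledge of strings, comments or unicode escapes. */
    static int naiveBraceBalance(String source) {
        int depth = 0;
        for (char c : source.toCharArray()) {
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Legal method whose body contains a brace inside a string literal.
        String withStringLiteral = "public void test() {\n"
                + "    {}\n"
                + "    String x = \"{\";\n"
                + "}\n";
        // Legal method closed by an escaped brace (kept as six plain characters here).
        String withEscapedBrace = "public void test() {\n"
                + "// method body\n"
                + "\\u007D\n";

        System.out.println(naiveBraceBalance(withStringLiteral)); // 1, although the code is balanced
        System.out.println(naiveBraceBalance(withEscapedBrace));  // 1, the escaped brace is missed
    }
}
```

Both inputs are complete, legal methods, yet the counter reports an unclosed brace for each of them.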

Java Regular Expression for detecting class/interface/etc declaration

Change your regex as shown below to match both types of string format.

line.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{");

Example:

String s1 = "public interface IGame {";
String s2 = "private class Game {";
System.out.println(s1.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{"));
System.out.println(s2.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{"));

Output:

true
true

How to list all methods in a Java class with regex

(public|protected|private|static|\s) +[\w\<\>\[\]]+\s+(\w+) *\([^\)]*\) *(\{?|[^;])

With this you can, but please search before asking; I found this answer just by using the search ^^.
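A short sketch of applying that pattern with `Matcher.find()`, where group 2 holds the method name. The class `ListMethodsDemo` and the sample source are invented for illustration, and the result is only a best-effort guess: an indented line such as `return foo(bar);` also fits the pattern and would be reported as a method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListMethodsDemo {
    // The pattern from above, escaped for a Java string literal; group 2 is the method name.
    private static final Pattern METHOD = Pattern.compile(
            "(public|protected|private|static|\\s) +[\\w\\<\\>\\[\\]]+\\s+(\\w+) *\\([^\\)]*\\) *(\\{?|[^;])");

    /** Collects the name captured by group 2 for every match found in the source text. */
    static List<String> methodNames(String source) {
        List<String> names = new ArrayList<>();
        Matcher m = METHOD.matcher(source);
        while (m.find()) {
            names.add(m.group(2));
        }
        return names;
    }

    public static void main(String[] args) {
        String source = "public class Sample {\n"
                + "    public int size() {\n"
                + "        return 0;\n"
                + "    }\n"
                + "    private static String name(String prefix, int n) {\n"
                + "        return prefix + n;\n"
                + "    }\n"
                + "}\n";
        System.out.println(methodNames(source)); // [size, name]
    }
}
```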


