Java: Parse Java Source Code, Extract Methods

Download the java parser from https://javaparser.org/

You'll have to write some code that invokes the parser, which returns a CompilationUnit:

InputStream in = null;
CompilationUnit cu = null;
try
{
    in = new SEDInputStream(filename);
    cu = JavaParser.parse(in);
}
catch (ParseException x)
{
    // handle parse exceptions here.
}
finally
{
    // in is still null if the stream could not be opened
    if (in != null)
    {
        in.close();
    }
}
return cu;

Note: SEDInputStream is a subclass of InputStream. You can use a FileInputStream instead if you want.


You'll have to create a visitor. Your visitor will be easy because you're only interested in methods:

public class MethodVisitor extends VoidVisitorAdapter
{
    @Override
    public void visit(MethodDeclaration n, Object arg)
    {
        // extract method information here.
        // put it into a HashMap
    }
}

To invoke the visitor, do this:

MethodVisitor visitor = new MethodVisitor();
visitor.visit(cu, null);

Extract method calls from Java code

You need a Java Code Parser for this task. Here is an example which uses Java Parser:

import java.io.FileInputStream;
import java.util.Arrays;
import java.util.List;
// Also import the JavaParser classes used below (JavaParser, CompilationUnit,
// MethodCallExpr, Expression, BinaryExpr, VoidVisitorAdapter); their package
// depends on the JavaParser version you downloaded.

public class MethodCallPrinter
{
    public static void main(String[] args) throws Exception
    {
        // The program parses its own source file.
        FileInputStream in = new FileInputStream("MethodCallPrinter.java");

        CompilationUnit cu;
        try
        {
            cu = JavaParser.parse(in);
        }
        finally
        {
            in.close();
        }
        new MethodVisitor().visit(cu, null);
    }

    private static class MethodVisitor extends VoidVisitorAdapter
    {
        @Override
        public void visit(MethodCallExpr methodCall, Object arg)
        {
            System.out.print("Method call: " + methodCall.getName() + "\n");
            List<Expression> args = methodCall.getArgs();
            if (args != null)
                handleExpressions(args);
        }

        // Looks for method calls nested inside arguments and binary expressions.
        private void handleExpressions(List<Expression> expressions)
        {
            for (Expression expr : expressions)
            {
                if (expr instanceof MethodCallExpr)
                    visit((MethodCallExpr) expr, null);
                else if (expr instanceof BinaryExpr)
                {
                    BinaryExpr binExpr = (BinaryExpr) expr;
                    handleExpressions(Arrays.asList(binExpr.getLeft(), binExpr.getRight()));
                }
            }
        }
    }
}

Output:

Method call: parse
Method call: close
Method call: visit
Method call: print
Method call: getName
Method call: getArgs
Method call: handleExpressions
Method call: visit
Method call: handleExpressions
Method call: asList
Method call: getLeft
Method call: getRight

JavaScript: Parse Java source code, extract method

The AST is just another JSON object. Try jsonpath.

npm install jsonpath

To extract all methods, just filter on condition node=="MethodDeclaration":

var jp = require('jsonpath');
var methods = jp.query(ast, '$.types..[?(@.node=="MethodDeclaration")]');
console.log(methods);

See the jsonpath package documentation for more JSONPath syntax.
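For readers who would rather stay in Java, what the query above does can be roughly sketched as a recursive walk over nested maps and lists that keeps every object whose "node" entry is "MethodDeclaration". This is only an illustration: the AST shape (a "node" key per object) is inferred from the query, and AstNodeFilter, findNodes, node, "name" and "children" are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AstNodeFilter {
    /** Collects, in pre-order, every map under 'value' whose "node" entry equals nodeType. */
    static List<Map<String, Object>> findNodes(Object value, String nodeType) {
        List<Map<String, Object>> found = new ArrayList<>();
        collect(value, nodeType, found);
        return found;
    }

    @SuppressWarnings("unchecked")
    private static void collect(Object value, String nodeType, List<Map<String, Object>> found) {
        if (value instanceof Map) {
            Map<String, Object> map = (Map<String, Object>) value;
            if (nodeType.equals(map.get("node"))) {
                found.add(map);
            }
            // Descend into every value, like the ".." operator in the query.
            for (Object child : map.values()) {
                collect(child, nodeType, found);
            }
        } else if (value instanceof List) {
            for (Object child : (List<?>) value) {
                collect(child, nodeType, found);
            }
        }
    }

    /** Builds a tiny hand-made AST node; the key names are assumptions. */
    static Map<String, Object> node(String type, String name, Object... children) {
        Map<String, Object> m = new HashMap<>();
        m.put("node", type);
        m.put("name", name);
        m.put("children", Arrays.asList(children));
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> type = node("TypeDeclaration", "Foo",
                node("FieldDeclaration", "x"),
                node("MethodDeclaration", "add"),
                node("MethodDeclaration", "sub"));
        for (Map<String, Object> method : findNodes(Arrays.asList(type), "MethodDeclaration")) {
            System.out.println(method.get("name")); // prints add, then sub
        }
    }
}
```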

How do I parse a Java file to retrieve its function names?

You could use the ANTLR Java grammar https://github.com/antlr/grammars-v4/blob/master/java8/Java8.g4 to generate a full-blown Java parser and then use it to extract whatever information you need from Java files.

Static code parser for Java source code to extract methods / comments

You can use Eclipse's ASTParser. It's super simple to use.

Find a quick standalone example here.

Parsing Java Source Code

I'd go with ANTLR and use an existing Java grammar: https://github.com/antlr/grammars-v4

Get method as a file from .java file

@SanzidaSultana, I've provided code to extract the methods from a class. It uses regular expressions to pull each method out of the class file. This implementation has limitations, but it should help you solve your problem. The code is below, with an example.

The parser is very easy to use. All you need is the JavaMethodParser class: use it to extract each method together with its name. See below:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

import java.util.Collections;
import java.util.List;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Note: this is a very simple parser, as simple as possible.
 * It assumes Foo.java is correctly written and can only detect
 * a few errors when Foo.java is not in the proper format,
 * so keep that in mind.
 */
class JavaMethodParser {
    private List<String> methodList;
    private List<String> methodNameList;
    private int cnt;

    public JavaMethodParser(String classContents) {
        // create the lists up front so the accessors also work when no class is found
        methodList = new ArrayList<>();
        methodNameList = new ArrayList<>();

        Pattern classPattern = Pattern.compile("[a-zA-Z]*\\s*class\\s+([_a-zA-Z]+)\\s*\\{(.*)}$", Pattern.DOTALL);

        // now match
        Matcher classMatcher = classPattern.matcher(classContents);

        if (classMatcher.find()) {
            String methodContents = classMatcher.group(2);

            // now parse the methods
            Pattern methodPattern = Pattern.compile("[a-zA-Z]*\\s*[a-zA-Z]+\\s+([_a-zA-Z]+)\\s*?\\(.*?\\)\\s*?\\{", Pattern.DOTALL);

            Matcher methodMatcher = methodPattern.matcher(methodContents);

            List<String> methodStartList = new ArrayList<>();

            // collect each method header and method name
            while (methodMatcher.find()) {
                String methodName = methodMatcher.group(1);
                String methodStart = methodMatcher.group();
                methodStartList.add(methodStart);
                methodNameList.add(methodName);
            }

            // reversing, because it is easier to split
            // methods off from the end of methodContents
            Collections.reverse(methodStartList);
            Collections.reverse(methodNameList);

            String buf = methodContents;
            for (String methodStart : methodStartList) {
                String[] t = buf.split(Pattern.quote(methodStart));
                String method = methodStart + t[1];

                methodList.add(method);

                buf = t[0];
            }
        } else {
            System.out.println("error: class not found");
            // throw an error, because this is not even a class,
            // or do whatever you think necessary
        }

        // initializing cnt
        cnt = -1;
    }

    public boolean hasMoreMethods() {
        cnt += 1; // because cnt is initialized with -1
        return cnt < methodList.size();
    }

    public String getMethodName() {
        return methodNameList.get(cnt);
    }

    public String getMethod() {
        return methodList.get(cnt);
    }

    public int countMethods() {
        return methodList.size();
    }
}

public class SOTest {
    public static void main(String[] args) {
        try {
            Scanner in = new Scanner(new File("Foo.java"));
            String classContents = in.useDelimiter("\\Z").next().trim();

            JavaMethodParser jmp = new JavaMethodParser(classContents);

            while (jmp.hasMoreMethods()) {
                System.out.println("name: " + jmp.getMethodName());
                System.out.println("definition:\n" + jmp.getMethod());
                System.out.println();
            }

            in.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

The input to this program is Foo.java which is written as below:

public class Foo {

private int u, v;
private String x;

public int add(int a, int b) {
if(a + b < 0) {
return 0;
}
return a + b;
}

public int sub(int a, int b) {
return a - b;
}
}

And the output is:

name: sub
definition:
public int sub(int a, int b) {
return a - b;
}

name: add
definition:
public int add(int a, int b) {
if(a + b < 0) {
return 0;
}
return a + b;
}
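To make the limitation mentioned earlier concrete, here is a small sketch that runs the same method-header regex on two inputs. RegexLimitationDemo and findMethodNames are hypothetical names used only for this illustration. It shows two cases the pattern gets wrong: an "else if (...) {" is reported as a method named "if", and a method whose name contains a digit is missed because the name group only allows letters and underscores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLimitationDemo {
    // Same method-header pattern as in JavaMethodParser.
    private static final Pattern METHOD_PATTERN = Pattern.compile(
            "[a-zA-Z]*\\s*[a-zA-Z]+\\s+([_a-zA-Z]+)\\s*?\\(.*?\\)\\s*?\\{", Pattern.DOTALL);

    /** Returns the names the pattern reports, in the order they are found. */
    static List<String> findMethodNames(String classBody) {
        List<String> names = new ArrayList<>();
        Matcher m = METHOD_PATTERN.matcher(classBody);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String withElseIf =
                "public int sign(int a) {\n"
              + "    if (a > 0) {\n"
              + "        return 1;\n"
              + "    } else if (a < 0) {\n"
              + "        return -1;\n"
              + "    }\n"
              + "    return 0;\n"
              + "}\n";
        // "else if (a < 0) {" looks like "<type> <name>(...) {" to the regex.
        System.out.println(findMethodNames(withElseIf)); // prints [sign, if]

        String withDigit =
                "int add2(int a) {\n"
              + "    return a + 2;\n"
              + "}\n";
        // The digit in "add2" is not allowed by the name group, so nothing matches.
        System.out.println(findMethodNames(withDigit)); // prints []
    }
}
```

If your sources contain constructs like these, a real parser such as the ones mentioned in the other answers is the safer choice.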

I guess you know how to write something to a file using Java, so I will leave that part to you.
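In case it helps, a minimal sketch of that file-writing step could look like the following. MethodFileWriter, writeMethods, the one-file-per-method layout and the ".txt" extension are all assumptions made for illustration; the map would be filled from getMethodName() and getMethod() in the loop above. Note that overloaded methods share a name, so with this naming scheme a later one would overwrite an earlier one.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

class MethodFileWriter {
    /** Writes each method's source text to outputDir/<methodName>.txt as UTF-8. */
    static void writeMethods(Map<String, String> methodsByName, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        for (Map.Entry<String, String> entry : methodsByName.entrySet()) {
            Path target = outputDir.resolve(entry.getKey() + ".txt");
            Files.write(target, entry.getValue().getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in the real program these come from JavaMethodParser.
        Map<String, String> methods = new LinkedHashMap<>();
        methods.put("sub", "public int sub(int a, int b) {\nreturn a - b;\n}");
        methods.put("add", "public int add(int a, int b) {\nreturn a + b;\n}");

        Path dir = Files.createTempDirectory("methods");
        writeMethods(methods, dir);
        System.out.println("wrote " + methods.size() + " files to " + dir);
    }
}
```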

[P.S.]: If anything is unclear, let me know in the comment section. Also let me know whether it's working for you or not.

How to extract metadata from Java methods using AST?

This class was written a long time ago. It was originally about four different classes, spread out in a package called JavaParserBridge. It tremendously simplifies what you are trying to do. I have stripped out all the unnecessary stuff and boiled it down to about 100 lines. It took about an hour...

I hope this all makes sense. I usually add a lot of comments to code, but since this depends on another library and is essentially just one big constructor, I will instead leave you with the documentation page for Java Parser.

To use this class, just pass the source-code file for a Java class as a single java.lang.String to the method named getMethods(String), which returns a Java Vector<Method>. Each element of the returned Vector is an instance of Method holding all of the meta information that you requested in your question.

IMPORTANT: You can get the JAR file for this package from the GitHub page. You need the JAR named:
javaparser-core-3.16.2.jar

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.MethodDeclaration;
import com.github.javaparser.ast.body.Parameter;
import com.github.javaparser.ast.type.ReferenceType;
import com.github.javaparser.ast.NodeList;
import com.github.javaparser.ast.Modifier; // Modifiers are the keywords such as "public, private, static, etc..."
import com.github.javaparser.printer.lexicalpreservation.LexicalPreservingPrinter;
import com.github.javaparser.printer.lexicalpreservation.PhantomNodeLogic;

import java.io.IOException;
import java.util.Vector;

public class Method
{
    public final String name, signature, jdComment, body, returnType;
    public final String[] modifiers, parameterNames, parameterTypes, exceptions;

    private Method(MethodDeclaration md)
    {
        NodeList<Parameter> paramList = md.getParameters();
        NodeList<ReferenceType> exceptionList = md.getThrownExceptions();
        NodeList<Modifier> modifiersList = md.getModifiers();

        this.name = md.getNameAsString();
        this.signature = md.getDeclarationAsString();
        this.jdComment = (md.hasJavaDocComment() ? md.getJavadocComment().get().toString() : null);
        this.returnType = md.getType().toString();
        this.modifiers = new String[modifiersList.size()];
        this.parameterNames = new String[paramList.size()];
        this.parameterTypes = new String[paramList.size()];
        this.exceptions = new String[exceptionList.size()];

        // Abstract and interface methods have no body, hence the null.
        this.body = (md.getBody().isPresent()
            ? LexicalPreservingPrinter.print
                (LexicalPreservingPrinter.setup(md.getBody().get()))
            : null);

        int i = 0;
        for (Modifier modifier : modifiersList) modifiers[i++] = modifier.toString();

        i = 0;
        for (Parameter p : paramList)
        {
            parameterNames[i] = p.getName().toString();
            parameterTypes[i] = p.getType().toString();
            i++;
        }

        i = 0;
        for (ReferenceType r : exceptionList) this.exceptions[i++] = r.toString();
    }

    public static Vector<Method> getMethods(String sourceFileAsString) throws IOException
    {
        // This is the "Return Value" for this method (a Vector)
        final Vector<Method> methods = new Vector<>();

        // This asks Java Parser to parse the source code file.
        // The String parameter 'sourceFileAsString' should hold the
        // full text of the source file.

        CompilationUnit cu = StaticJavaParser.parse(sourceFileAsString);

        // This will "walk" all of the methods that were parsed by
        // StaticJavaParser, and retrieve the method information.
        // The method information is stored in a class simply called "Method"

        cu.walk(MethodDeclaration.class, (MethodDeclaration md) -> methods.add(new Method(md)));

        // There is one important thing to do: clear the cache.
        // Memory leaks will occur if you do not.

        PhantomNodeLogic.cleanUpCache();

        // return the Vector<Method>
        return methods;
    }
}
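Since getMethods(String) expects the whole source file as one String, a small helper for loading it might look like the sketch below. SourceLoader and readSource are hypothetical names, "Foo.java" is a placeholder path, and UTF-8 is an assumed encoding. The call into the Method class is shown only as a comment because it needs javaparser-core on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SourceLoader {
    /** Reads a whole source file into one String, decoding it as UTF-8 (an assumption). */
    static String readSource(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "Foo.java" is a placeholder; pass a real path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "Foo.java");
        String source = readSource(file);
        System.out.println("read " + source.length() + " characters from " + file);

        // With javaparser-core on the classpath, the next step would be:
        // Vector<Method> methods = Method.getMethods(source);
    }
}
```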

