If/Else Statements in Antlr Using Listeners

If/else statements in ANTLR using listeners

By default, ANTLR 4 generates listeners. But if you give org.antlr.v4.Tool the command line parameter -visitor, ANTLR generates visitor classes for you. These work much like listeners, but give you more control over which (sub) trees are walked/visited. This is particularly useful if you want to exclude certain (sub) trees (like else/if blocks, as in your case). While this can be done using listeners, it's much cleaner to do this with a visitor. Using listeners, you'll need to introduce global variables that keep track if a (sub) tree needs to be evaluated, and which do not.

As it happens to be, I'm working on a small ANTLR 4 tutorial. It's not done yet, but I'll post a small working example that demonstrates the use of these visitor classes and an if statement construct.



1. Grammar

Here's a simple grammar supporting basic expressions, if-, while- and log-statements:

Mu.g4

grammar Mu;

parse
: block EOF
;

block
: stat*
;

stat
: assignment
| if_stat
| while_stat
| log
| OTHER {System.err.println("unknown char: " + $OTHER.text);}
;

assignment
: ID ASSIGN expr SCOL
;

if_stat
: IF condition_block (ELSE IF condition_block)* (ELSE stat_block)?
;

condition_block
: expr stat_block
;

stat_block
: OBRACE block CBRACE
| stat
;

while_stat
: WHILE expr stat_block
;

log
: LOG expr SCOL
;

expr
: expr POW<assoc=right> expr #powExpr
| MINUS expr #unaryMinusExpr
| NOT expr #notExpr
| expr op=(MULT | DIV | MOD) expr #multiplicationExpr
| expr op=(PLUS | MINUS) expr #additiveExpr
| expr op=(LTEQ | GTEQ | LT | GT) expr #relationalExpr
| expr op=(EQ | NEQ) expr #equalityExpr
| expr AND expr #andExpr
| expr OR expr #orExpr
| atom #atomExpr
;

atom
: OPAR expr CPAR #parExpr
| (INT | FLOAT) #numberAtom
| (TRUE | FALSE) #booleanAtom
| ID #idAtom
| STRING #stringAtom
| NIL #nilAtom
;

OR : '||';
AND : '&&';
EQ : '==';
NEQ : '!=';
GT : '>';
LT : '<';
GTEQ : '>=';
LTEQ : '<=';
PLUS : '+';
MINUS : '-';
MULT : '*';
DIV : '/';
MOD : '%';
POW : '^';
NOT : '!';

SCOL : ';';
ASSIGN : '=';
OPAR : '(';
CPAR : ')';
OBRACE : '{';
CBRACE : '}';

TRUE : 'true';
FALSE : 'false';
NIL : 'nil';
IF : 'if';
ELSE : 'else';
WHILE : 'while';
LOG : 'log';

ID
: [a-zA-Z_] [a-zA-Z_0-9]*
;

INT
: [0-9]+
;

FLOAT
: [0-9]+ '.' [0-9]*
| '.' [0-9]+
;

STRING
: '"' (~["\r\n] | '""')* '"'
;

COMMENT
: '#' ~[\r\n]* -> skip
;

SPACE
: [ \t\r\n] -> skip
;

OTHER
: .
;

Now let's say you would like to parse, and evaluate, input like this:

test.mu

a = true;
b = false;

if a && b {
log "1 :: a=" + a +", b=" + b;
}
else if a || b {
log "2 :: a=" + a +", b=" + b;
}
else {
log "3 :: a=" + a +", b=" + b;
}

log "Done!";


2. Visitor I

Start by generating the parser and visitor classes:

java -cp antlr-4.0-complete.jar org.antlr.v4.Tool Mu.g4 -visitor

The command above would have generated, among others the file MuBaseVisitor<T>. This is the class we're going to extend with out own logic:

EvalVisitor.java

public class EvalVisitor extends MuBaseVisitor<Value> {
// ...
}

where Value is just a wrapper for any of our language's types (String, Boolean, Double):

Value.java

public class Value {

public static Value VOID = new Value(new Object());

final Object value;

public Value(Object value) {
this.value = value;
}

public Boolean asBoolean() {
return (Boolean)value;
}

public Double asDouble() {
return (Double)value;
}

public String asString() {
return String.valueOf(value);
}

public boolean isDouble() {
return value instanceof Double;
}

@Override
public int hashCode() {

if(value == null) {
return 0;
}

return this.value.hashCode();
}

@Override
public boolean equals(Object o) {

if(value == o) {
return true;
}

if(value == null || o == null || o.getClass() != this.getClass()) {
return false;
}

Value that = (Value)o;

return this.value.equals(that.value);
}

@Override
public String toString() {
return String.valueOf(value);
}
}


3. Test I

To test the classes, use the following Main class:

Main.java

import org.antlr.v4.runtime.ANTLRFileStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;

public class Main {
public static void main(String[] args) throws Exception {
MuLexer lexer = new MuLexer(new ANTLRFileStream("test.mu"));
MuParser parser = new MuParser(new CommonTokenStream(lexer));
ParseTree tree = parser.parse();
EvalVisitor visitor = new EvalVisitor();
visitor.visit(tree);
}
}

and compile and run the source files:

javac -cp antlr-4.0-complete.jar *.java
java -cp .:antlr-4.0-complete.jar Main

(on Windows, the last command would be: java -cp .;antlr-4.0-complete.jar Main)

After running Main, nothing happens (of course?). This is because we didn't implement any of the rules in our EvalVisitor class. To be able to evaluate the file test.mu properly, we need to provide a proper implementation for the following rules:

  • if_stat
  • andExpr
  • orExpr
  • plusExpr
  • assignment
  • idAtom
  • booleanAtom
  • stringAtom
  • log

#4. Visitor II & Test II

Here's a implementation of these rules:

import org.antlr.v4.runtime.misc.NotNull;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EvalVisitor extends MuBaseVisitor<Value> {

// used to compare floating point numbers
public static final double SMALL_VALUE = 0.00000000001;

// store variables (there's only one global scope!)
private Map<String, Value> memory = new HashMap<String, Value>();

// assignment/id overrides
@Override
public Value visitAssignment(MuParser.AssignmentContext ctx) {
String id = ctx.ID().getText();
Value value = this.visit(ctx.expr());
return memory.put(id, value);
}

@Override
public Value visitIdAtom(MuParser.IdAtomContext ctx) {
String id = ctx.getText();
Value value = memory.get(id);
if(value == null) {
throw new RuntimeException("no such variable: " + id);
}
return value;
}

// atom overrides
@Override
public Value visitStringAtom(MuParser.StringAtomContext ctx) {
String str = ctx.getText();
// strip quotes
str = str.substring(1, str.length() - 1).replace("\"\"", "\"");
return new Value(str);
}

@Override
public Value visitNumberAtom(MuParser.NumberAtomContext ctx) {
return new Value(Double.valueOf(ctx.getText()));
}

@Override
public Value visitBooleanAtom(MuParser.BooleanAtomContext ctx) {
return new Value(Boolean.valueOf(ctx.getText()));
}

@Override
public Value visitNilAtom(MuParser.NilAtomContext ctx) {
return new Value(null);
}

// expr overrides
@Override
public Value visitParExpr(MuParser.ParExprContext ctx) {
return this.visit(ctx.expr());
}

@Override
public Value visitPowExpr(MuParser.PowExprContext ctx) {
Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));
return new Value(Math.pow(left.asDouble(), right.asDouble()));
}

@Override
public Value visitUnaryMinusExpr(MuParser.UnaryMinusExprContext ctx) {
Value value = this.visit(ctx.expr());
return new Value(-value.asDouble());
}

@Override
public Value visitNotExpr(MuParser.NotExprContext ctx) {
Value value = this.visit(ctx.expr());
return new Value(!value.asBoolean());
}

@Override
public Value visitMultiplicationExpr(@NotNull MuParser.MultiplicationExprContext ctx) {

Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));

switch (ctx.op.getType()) {
case MuParser.MULT:
return new Value(left.asDouble() * right.asDouble());
case MuParser.DIV:
return new Value(left.asDouble() / right.asDouble());
case MuParser.MOD:
return new Value(left.asDouble() % right.asDouble());
default:
throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]);
}
}

@Override
public Value visitAdditiveExpr(@NotNull MuParser.AdditiveExprContext ctx) {

Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));

switch (ctx.op.getType()) {
case MuParser.PLUS:
return left.isDouble() && right.isDouble() ?
new Value(left.asDouble() + right.asDouble()) :
new Value(left.asString() + right.asString());
case MuParser.MINUS:
return new Value(left.asDouble() - right.asDouble());
default:
throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]);
}
}

@Override
public Value visitRelationalExpr(@NotNull MuParser.RelationalExprContext ctx) {

Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));

switch (ctx.op.getType()) {
case MuParser.LT:
return new Value(left.asDouble() < right.asDouble());
case MuParser.LTEQ:
return new Value(left.asDouble() <= right.asDouble());
case MuParser.GT:
return new Value(left.asDouble() > right.asDouble());
case MuParser.GTEQ:
return new Value(left.asDouble() >= right.asDouble());
default:
throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]);
}
}

@Override
public Value visitEqualityExpr(@NotNull MuParser.EqualityExprContext ctx) {

Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));

switch (ctx.op.getType()) {
case MuParser.EQ:
return left.isDouble() && right.isDouble() ?
new Value(Math.abs(left.asDouble() - right.asDouble()) < SMALL_VALUE) :
new Value(left.equals(right));
case MuParser.NEQ:
return left.isDouble() && right.isDouble() ?
new Value(Math.abs(left.asDouble() - right.asDouble()) >= SMALL_VALUE) :
new Value(!left.equals(right));
default:
throw new RuntimeException("unknown operator: " + MuParser.tokenNames[ctx.op.getType()]);
}
}

@Override
public Value visitAndExpr(MuParser.AndExprContext ctx) {
Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));
return new Value(left.asBoolean() && right.asBoolean());
}

@Override
public Value visitOrExpr(MuParser.OrExprContext ctx) {
Value left = this.visit(ctx.expr(0));
Value right = this.visit(ctx.expr(1));
return new Value(left.asBoolean() || right.asBoolean());
}

// log override
@Override
public Value visitLog(MuParser.LogContext ctx) {
Value value = this.visit(ctx.expr());
System.out.println(value);
return value;
}

// if override
@Override
public Value visitIf_stat(MuParser.If_statContext ctx) {

List<MuParser.Condition_blockContext> conditions = ctx.condition_block();

boolean evaluatedBlock = false;

for(MuParser.Condition_blockContext condition : conditions) {

Value evaluated = this.visit(condition.expr());

if(evaluated.asBoolean()) {
evaluatedBlock = true;
// evaluate this block whose expr==true
this.visit(condition.stat_block());
break;
}
}

if(!evaluatedBlock && ctx.stat_block() != null) {
// evaluate the else-stat_block (if present == not null)
this.visit(ctx.stat_block());
}

return Value.VOID;
}

// while override
@Override
public Value visitWhile_stat(MuParser.While_statContext ctx) {

Value value = this.visit(ctx.expr());

while(value.asBoolean()) {

// evaluate the code block
this.visit(ctx.stat_block());

// evaluate the expression
value = this.visit(ctx.expr());
}

return Value.VOID;
}
}

When you re-compile and run Main, the following would be printed to your console:

2 :: a=true, b=false
Done!

For an implementation of all other rules, see: https://github.com/bkiers/Mu

EDIT

From @pwwpche, in the comments:

for those using jdk1.8 and encounter IndexOutOfBoundsException, antlr 4.0
is somehow not compatible with jdk1.8. Download antlr-4.6-complete.jar, and
replace expr POW<assoc=right> expr with <assoc=right>expr POW expr will
eliminate the error and warnings.

Interpreting IF statements in ANTLR

If you're using Python, then you should be able to use lambda expressions -- instead returning a value in parser actions, you can return a lambda expression and evaluate it only when neccessary.

Antlr4 Listener subtree check condition

I don't know Go, but in Java you could do something like this:

// In this example, the grammar is called `T.g4`
class WhileLastMoveListener extends TBaseListener {

private boolean insideWhileCondition = false;

@Override
public void enterWhile_condition(TParser.While_conditionContext ctx) {
this.insideWhileCondition = true;
}

@Override
public void exitWhile_condition(TParser.While_conditionContext ctx) {
this.insideWhileCondition = false;
}

@Override
public void enterF_lastmove(TParser.F_lastmoveContext ctx) {
if (this.insideWhileCondition) {
// Found a `f_lastmove` rule inside a while `while_condition`
}
}
}

Antlr4 Listeners and Visitors - which to implement?

If you plan to directly use the parser output for interpretation, the visitor is a good choice. You have full control of the traversal, so in conditionals only one branch is visited, loops can be visited n times and so on.

If you translate the input to a lower level, e.g. virtual machine instructions, both patterns may be useful.

You might take a look at "Language Implementation Patterns", which covers the basic interpreter implementations.

I mostly use the visitor pattern, as it's more flexible.



Related Topics



Leave a reply



Submit