boolean expression parser in java
I've coded this using Javaluator.
It's not exactly the output you are looking for, but I think it could be a start point.
package test;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import net.astesana.javaluator.*;
public class TreeBooleanEvaluator extends AbstractEvaluator<String> {
/** The logical AND operator.*/
final static Operator AND = new Operator("&&", 2, Operator.Associativity.LEFT, 2);
/** The logical OR operator.*/
final static Operator OR = new Operator("||", 2, Operator.Associativity.LEFT, 1);
private static final Parameters PARAMETERS;
static {
// Create the evaluator's parameters
PARAMETERS = new Parameters();
// Add the supported operators
PARAMETERS.add(AND);
PARAMETERS.add(OR);
// Add the parentheses
PARAMETERS.addExpressionBracket(BracketPair.PARENTHESES);
}
public TreeBooleanEvaluator() {
super(PARAMETERS);
}
@Override
protected String toValue(String literal, Object evaluationContext) {
return literal;
}
private boolean getValue(String literal) {
if ("T".equals(literal) || literal.endsWith("=true")) return true;
else if ("F".equals(literal) || literal.endsWith("=false")) return false;
throw new IllegalArgumentException("Unknown literal : "+literal);
}
@Override
protected String evaluate(Operator operator, Iterator<String> operands,
Object evaluationContext) {
List<String> tree = (List<String>) evaluationContext;
String o1 = operands.next();
String o2 = operands.next();
Boolean result;
if (operator == OR) {
result = getValue(o1) || getValue(o2);
} else if (operator == AND) {
result = getValue(o1) && getValue(o2);
} else {
throw new IllegalArgumentException();
}
String eval = "("+o1+" "+operator.getSymbol()+" "+o2+")="+result;
tree.add(eval);
return eval;
}
public static void main(String[] args) {
TreeBooleanEvaluator evaluator = new TreeBooleanEvaluator();
doIt(evaluator, "T && ( F || ( F && T ) )");
doIt(evaluator, "(T && T) || ( F && T )");
}
private static void doIt(TreeBooleanEvaluator evaluator, String expression) {
List<String> sequence = new ArrayList<String>();
evaluator.evaluate(expression, sequence);
System.out.println ("Evaluation sequence for :"+expression);
for (String string : sequence) {
System.out.println (string);
}
System.out.println ();
}
}
Here is the ouput:
Evaluation sequence for :T && ( F || ( F && T ) )
(F && T)=false
(F || (F && T)=false)=false
(T && (F || (F && T)=false)=false)=falseEvaluation sequence for :(T && T) || ( F && T )
(T && T)=true
(F && T)=false
((T && T)=true || (F && T)=false)=true
Parse Boolean Expression in Java
The first step is to tokenize the input. Define
enum Token {VAR, LP, RP, NOT, AND, OR, END}
LP
and RP
are parentheses. Now define a tokenizer class that looks something like this:
class Tokenizer {
Tokenizer(String input) {...}
void reset() {...}
Token getNext() {...}
String getVarName() {...}
}
Calling getNext()
on your example in a loop should return
LP LP NOT VAR AND LP LP NOT VAR OR VAR RP OR LP VAR OR VAR RP RP RP OR LP VAR AND NOT VAR RP RP AND VAR END
Calling getVarName()
immediately after a VAR
has been returned by getNext()
gives you the name of the variable (e.g. "t42"
).
There are many ways to implement little scanners like this. You should do this first and make sure it's bulletproof by testing. Trying to build a parser on top of a flaky scanner is torture.
As I said in comments, I'd consider recursive descent parsing. If you have a suitable grammar, writing an RD parser is a very short step as the Dragon Book (also mentioned above) shows.
A reasonable grammar (using tokens as above) is
Expr -> Term AND Term
| Term OR Term
| Term END
Term -> NOT Term
| Opnd
Opnd -> VAR
| LP Expr RP
For example, here is how you'd get started. It shows the first rule converted to a function:
class Evaluator {
final Tokenizer tokenizer = ...; // Contains the expression text.
final Map<String, Boolean> env = ... // Environment: variables to values.
Token lookAhead; // Holds the token we're parsing right now.
Evaluator(Tokenizer tokenizer, Map<String, Boolean> env) { ... }
void advance() {
lookAhead = tokenizer.getNext();
}
boolean expr() {
boolean leftHandSide = term(); // Parse the left hand side recursively.
Token op = lookAhead; // Remember the operator.
if (op == Token.END) return leftHandSide; // Oops. That's all.
advance(); // Skip past the operator.
boolean rightHandSide = term(); // Parse the right hand side recursively.
if (op == Token.AND) return leftHandSide && rightHandSide; // Evaluate!
if (op == Token.OR) return leftHandSide || rightHandSide;
dieWithSyntaxError("Expected op, found " + op);
}
boolean term() {...}
boolean opnd() {...}
}
The environment is used when a VAR is parsed. Its boolean value is env.get(tokenizer.getVarName())
.
So to process the file, you'll
For each line
For each variable tX in the expression
See if the line contains the string tX is bound to in its text field.
If so, put the mapping tX -> true in the environment
else put tX -> false
Reset the tokenizer
Call Evaluator.evaluate(tokenizer, environment)
If it returns true, print the line, else skip it.
This is the simplest approach I can think of. About 150 lines. Many optimizations are possible.
Added
Well since I can no longer take away the thrill of discovery, here is my version:
import static java.lang.Character.isDigit;
import static java.lang.Character.isWhitespace;
import java.util.HashMap;
import java.util.Map;
import static java.util.stream.Collectors.toMap;
public class TextExpressionSearch {
enum Token { VAR, LP, RP, NOT, AND, OR, END }
static class Tokenizer {
final String input;
int pos = 0;
String var;
Tokenizer(String input) {
this.input = input;
}
void reset() {
pos = 0;
var = null;
}
String getRead() {
return input.substring(0, pos);
}
Token getNext() {
var = null;
while (pos < input.length() && isWhitespace(input.charAt(pos))) {
++pos;
}
if (pos >= input.length()) {
return Token.END;
}
int start = pos++;
switch (input.charAt(start)) {
case 't':
while (pos < input.length() && isDigit(input.charAt(pos))) {
++pos;
}
var = input.substring(start, pos);
return Token.VAR;
case '(':
return Token.LP;
case ')':
return Token.RP;
case 'n':
if (input.startsWith("ot", pos)) {
pos += 2;
return Token.NOT;
}
break;
case 'a':
if (input.startsWith("nd", pos)) {
pos += 2;
return Token.AND;
}
break;
case 'o':
if (input.startsWith("r", pos)) {
pos += 1;
return Token.OR;
}
break;
}
throw new AssertionError("Can't tokenize: " + input.substring(start));
}
}
static class Evaluator {
final Tokenizer tokenizer;
final Map<String, Boolean> env;
Token lookAhead;
Evaluator(Tokenizer tokenizer, Map<String, Boolean> env) {
this.tokenizer = tokenizer;
this.env = env;
advance();
}
boolean die(String msg) {
throw new AssertionError(msg + "\nRead: " + tokenizer.getRead());
}
void advance() {
lookAhead = tokenizer.getNext();
}
void match(Token token) {
if (lookAhead != token) {
die("Expected " + token + ", found " + lookAhead);
}
advance();
}
boolean evaluate() {
boolean exprVal = expr();
match(Token.END);
return exprVal;
}
boolean expr() {
boolean lhs = negated();
switch (lookAhead) {
case AND:
advance();
return negated() && lhs;
case OR:
advance();
return negated() || lhs;
case END:
return lhs;
}
return die("Expected expr, found " + lookAhead);
}
boolean negated() {
switch (lookAhead) {
case NOT:
advance();
return !negated();
default:
return operand();
}
}
boolean operand() {
switch (lookAhead) {
case VAR:
if (!env.containsKey(tokenizer.var)) {
die("Undefined variable: " + tokenizer.var);
}
boolean varVal = env.get(tokenizer.var);
advance();
return varVal;
case LP:
advance();
boolean exprVal = expr();
match(Token.RP);
return exprVal;
}
return die("Expected operand, found " + lookAhead);
}
}
public static void main(String [] args) {
String expr = "((not t1 and ((not t2 or t4) or (t3 or t4))) or (t5 and not t6)) and t7";
Map<String, String> bindings = new HashMap<>();
bindings.put("t1", "str1");
bindings.put("t2", "str2");
bindings.put("t3", "str3");
bindings.put("t4", "str4");
bindings.put("t5", "str5");
bindings.put("t6", "str6");
bindings.put("t7", "str7");
String [] lines = {"str5 str7", "str4 str2"};
Tokenizer tokenizer = new Tokenizer(expr);
for (String line : lines) {
Map<String, Boolean> env =
bindings.entrySet().stream()
.collect(toMap(e -> e.getKey(), e -> line.contains(e.getValue())));
tokenizer.reset();
if (new Evaluator(tokenizer, env).evaluate()) {
System.out.println(line);
}
}
}
}
Parse parenthesized Boolean String Expressions with Java
I found a solution using JavaScript:
String boolexpr1 = "(!(true||false)&&!((true)^(true)))"
boolean result = (Boolean) new ScriptEngineManager().getEngineByName("javascript").eval(boolexpr1);
Didn't expect it could be that simple.
Boolean Expression Evaluation in Java
You could use the scripting engine in Java6 and the choose any of the popular scripting languages like Scala, Ruby, Python, Groovy, and Javascript. Than all you have to do is make sure the expression you want to evaluate is in the right language. Groovy is probably the easiest and will integrate best.
I have used this method successfully for a feature offering capabilities much like a formula / calculated column in a popular spreadsheet application.
Nested Boolean Expression Parser using ANTLR
I'd just wrap all the expressions into a single expression
rule. Be sure to define the comparator
expressions alternative before your binary
expression alternative to make sure comparator
operators bind more tightly than OR
and AND
:
grammar SimpleBoolean;
parse
: expression EOF
;
expression
: LPAREN expression RPAREN #parenExpression
| NOT expression #notExpression
| left=expression op=comparator right=expression #comparatorExpression
| left=expression op=binary right=expression #binaryExpression
| bool #boolExpression
| IDENTIFIER #identifierExpression
| DECIMAL #decimalExpression
;
comparator
: GT | GE | LT | LE | EQ
;
binary
: AND | OR
;
bool
: TRUE | FALSE
;
AND : 'AND' ;
OR : 'OR' ;
NOT : 'NOT';
TRUE : 'TRUE' ;
FALSE : 'FALSE' ;
GT : '>' ;
GE : '>=' ;
LT : '<' ;
LE : '<=' ;
EQ : '=' ;
LPAREN : '(' ;
RPAREN : ')' ;
DECIMAL : '-'? [0-9]+ ( '.' [0-9]+ )? ;
IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]* ;
WS : [ \r\t\u000C\n]+ -> skip;
To test the grammar above, use the following quick-and-dirty test classes:
public class Main {
public static void main(String[] args) throws Exception {
Map<String, Object> variables = new HashMap<String, Object>() {{
put("A", true);
put("a", true);
put("B", false);
put("b", false);
put("C", 42.0);
put("c", 42.0);
put("D", -999.0);
put("d", -1999.0);
put("E", 42.001);
put("e", 142.001);
put("F", 42.001);
put("f", 42.001);
put("G", -1.0);
put("g", -1.0);
}};
String[] expressions = {
"1 > 2",
"1 >= 1.0",
"TRUE = FALSE",
"FALSE = FALSE",
"A OR B",
"B",
"A = B",
"c = C",
"E > D",
"B OR (c = B OR (A = A AND c = C AND E > D))",
"(A = a OR B = b OR C = c AND ((D = d AND E = e) OR (F = f AND G = g)))"
};
for (String expression : expressions) {
SimpleBooleanLexer lexer = new SimpleBooleanLexer(new ANTLRInputStream(expression));
SimpleBooleanParser parser = new SimpleBooleanParser(new CommonTokenStream(lexer));
Object result = new EvalVisitor(variables).visit(parser.parse());
System.out.printf("%-70s -> %s\n", expression, result);
}
}
}
class EvalVisitor extends SimpleBooleanBaseVisitor<Object> {
private final Map<String, Object> variables;
public EvalVisitor(Map<String, Object> variables) {
this.variables = variables;
}
@Override
public Object visitParse(SimpleBooleanParser.ParseContext ctx) {
return super.visit(ctx.expression());
}
@Override
public Object visitDecimalExpression(SimpleBooleanParser.DecimalExpressionContext ctx) {
return Double.valueOf(ctx.DECIMAL().getText());
}
@Override
public Object visitIdentifierExpression(SimpleBooleanParser.IdentifierExpressionContext ctx) {
return variables.get(ctx.IDENTIFIER().getText());
}
@Override
public Object visitNotExpression(SimpleBooleanParser.NotExpressionContext ctx) {
return !((Boolean)this.visit(ctx.expression()));
}
@Override
public Object visitParenExpression(SimpleBooleanParser.ParenExpressionContext ctx) {
return super.visit(ctx.expression());
}
@Override
public Object visitComparatorExpression(SimpleBooleanParser.ComparatorExpressionContext ctx) {
if (ctx.op.EQ() != null) {
return this.visit(ctx.left).equals(this.visit(ctx.right));
}
else if (ctx.op.LE() != null) {
return asDouble(ctx.left) <= asDouble(ctx.right);
}
else if (ctx.op.GE() != null) {
return asDouble(ctx.left) >= asDouble(ctx.right);
}
else if (ctx.op.LT() != null) {
return asDouble(ctx.left) < asDouble(ctx.right);
}
else if (ctx.op.GT() != null) {
return asDouble(ctx.left) > asDouble(ctx.right);
}
throw new RuntimeException("not implemented: comparator operator " + ctx.op.getText());
}
@Override
public Object visitBinaryExpression(SimpleBooleanParser.BinaryExpressionContext ctx) {
if (ctx.op.AND() != null) {
return asBoolean(ctx.left) && asBoolean(ctx.right);
}
else if (ctx.op.OR() != null) {
return asBoolean(ctx.left) || asBoolean(ctx.right);
}
throw new RuntimeException("not implemented: binary operator " + ctx.op.getText());
}
@Override
public Object visitBoolExpression(SimpleBooleanParser.BoolExpressionContext ctx) {
return Boolean.valueOf(ctx.getText());
}
private boolean asBoolean(SimpleBooleanParser.ExpressionContext ctx) {
return (boolean)visit(ctx);
}
private double asDouble(SimpleBooleanParser.ExpressionContext ctx) {
return (double)visit(ctx);
}
}
Running the Main
class will result in the following output:
1 > 2 -> false
1 >= 1.0 -> true
TRUE = FALSE -> false
FALSE = FALSE -> true
A OR B -> true
B -> false
A = B -> false
c = C -> true
E > D -> true
B OR (c = B OR (A = A AND c = C AND E > D)) -> true
(A = a OR B = b OR C = c AND ((D = d AND E = e) OR (F = f AND G = g))) -> true
evaluate boolean expression in java generate at runtime
Use http://docs.codehaus.org/display/JANINO/Home for minimum work. I can do much more than simple expressions.
Antlr4 parser for boolean logic
Both are equivalent in their ability to parse your sample input. If you simplify your input by removing the unnecessary parentheses, the output of this grammar looks pretty good too:
grammar filter;
filter: expression EOF;
expression
: LPAREN expression RPAREN
| expression (AND expression)+
| expression (OR expression)+
| atom;
atom : INT;
INT: DIGITS;
DIGITS : [0-9]+;
AND : 'AND';
OR : 'OR';
LPAREN : '(';
RPAREN : ')';
WS: [ \t\r\n]+ -> skip;
Which is what I suspect your first grammar looks like in its entirety.
Your second one requires too many parentheses for my liking (mainly in term
), and the breaking up of AND and OR into separate rules instead of alternatives doesn't seem as clean to me.
You can simplify even more though:
grammar filter;
filter: expression EOF;
expression
: LPAREN expression RPAREN # ParenExp
| expression AND expression # AndBlock
| expression OR expression # OrBlock
| atom # AtomExp
;
atom : INT;
INT: DIGITS;
DIGITS : [0-9]+;
AND : 'AND';
OR : 'OR';
LPAREN : '(';
RPAREN : ')';
WS: [ \t\r\n]+ -> skip;
This gives a tree with a different shape but still is equivalent. And note the use of the # AndBlock
and # OrBlock
labels... these "alternative labels" will cause your generated listener or visitor to have separate methods for each, allowing you to completely separate these two in your code semantically as well as syntactically. Perhaps that's what you're looking for?
I like this one the best because it's the simplest and clearer recursion, and offers specific code alternatives for AND and OR.
Related Topics
Port of Random Generator from C to Java
Java 'Final' Method: What Does It Promise
Jpa: What Is the Proper Pattern for Iterating Over Large Result Sets
Disable Spring Security for Options Http Method
Does Java Have Buffer Overflows
Use Java and Regex to Convert Casing in a String
How to Convert/Parse from String to Char in Java
Find a Class Somewhere Inside Dozens of Jar Files
How to Convert Pojo to JSON and Vice Versa
Differencebetween an Ordered and a Sorted Collection
Random "Element Is No Longer Attached to the Dom" Staleelementreferenceexception
How to Change the Size of the Font of a Jlabel to Take the Maximum Size
How to Write an Arraylist of Strings into a Text File
What's the Difference Between # , % and $ Signs in Struts Tags
Java, How to Get Current Index/Key in "For Each" Loop