Java Lambdas 20 Times Slower Than Anonymous Classes

Java lambdas 20 times slower than anonymous classes

You are obviously encountering the first-time initialization overhead of lambda expressions. As already mentioned in the comments, the classes for lambda expressions are generated at runtime rather than being loaded from your class path.

However, being generated isn’t the cause for the slowdown. After all, generating a class having a simple structure can be even faster than loading the same bytes from an external source. And the inner class has to be loaded too. But when the application hasn’t used lambda expressions before¹, even the framework for generating the lambda classes has to be loaded (Oracle’s current implementation uses ASM under the hood). This is the actual cause of the slowdown, loading and initialization of a dozen internally used classes, not the lambda expression itself².

You can easily verify this. In your current code using lambda expressions, you have two identical expressions (i1, i2) -> Integer.compare(i1.start, i2.start). The current implementation doesn’t recognize this (actually, the compiler doesn’t provide a hint neither). So here, two lambda instances, having even different classes, are generated. You can refactor the code to have only one comparator, similar to your inner class variant:

final Comparator<? super Interval> comparator
= (i1, i2) -> Integer.compare(i1.start, i2.start);
int start = Collections.binarySearch(intervals, newInterval, comparator);
int skip = start >= 0 ? start : -start - 1;
int end = Collections.binarySearch(intervals.subList(skip, intervals.size()),
new Interval(newInterval.end, 0),
comparator);

You won’t notice any significant performance difference, as it’s not the number of lambda expressions that matters, but just the class loading and initialization of the framework, which happens exactly once.

You can even max it out by inserting additional lambda expressions like

final Comparator<? super Interval> comparator1
= (i1, i2) -> Integer.compare(i1.start, i2.start);
final Comparator<? super Interval> comparator2
= (i1, i2) -> Integer.compare(i1.start, i2.start);
final Comparator<? super Interval> comparator3
= (i1, i2) -> Integer.compare(i1.start, i2.start);
final Comparator<? super Interval> comparator4
= (i1, i2) -> Integer.compare(i1.start, i2.start);
final Comparator<? super Interval> comparator5
= (i1, i2) -> Integer.compare(i1.start, i2.start);

without seeing any slowdown. It’s really the initial overhead of the very first lambda expression of the entire runtime you are noticing here. Since Leetcode itself apparently doesn’t use lambda expressions before entering your code, whose execution time gets measured, this overhead adds to your execution time here.

See also “How will Java lambda functions be compiled?” and “Does a lambda expression create an object on the heap every time it's executed?”

¹ This implies that JDK code that will be executed before handing control over to your application doesn’t use lambda expressions itself. Since this code stems from times before the introduction of lambda expressions, this is usually the case. With newer JDKs, modular software will be initialized by different, newer code, which seems to use lambda expressions, so the initialization of the runtime facility can’t be measured within the application anymore in these setups.

² The initialization time has been reduced significantly in newer JDKs. There are different possible causes, general performance improvements, dedicated lambda optimizations, or both. Improving initialization time in general, is an issue that the JDK developers did not forget.

Is there a performance gain by using lambda expressions?

method 1 optimization:

you don't need 2 loops and you can return immediately when you have a match, so you stop the traversing of the list at that point - (e.g. sunshine case you get a match on first element - you have 1 iteration, worst case your match is the last element - you have to traverse the whole list to hit that match)

private boolean func01 (List<String> list1, List<String> list2) {
for (String group : list1) {
if (list2.contains(group)) return true;
}

return false;
}

lambda equivalent optimization:

  • findAny().isPresent() - get an optional of an element matching the predicate and check if the Optional is present - this is equivalent of anyMatch() because both expressions return boolean

  • filter() will always traverse the whole list

  • anyMatch() has a short-circuit behavior - means it will stop on the first match

so you can re-write it as:

private boolean func02 (List<String> list1, List<String> list2) {
return list1.stream().anyMatch(list2::contains);
}

To answer your question - there's no significant performance difference in the two approaches, but take into consideration:

  • creating stream for a collection has a slight overhead

  • stream operations can be run in parallel (list1.stream().parallel().anyMatch(list2::contains)). For example in this case anyMatch() running in parallel threads over the same stream will periodically check if the previous threads found a match and will stop traversing the collection and not continue traversing the whole collection. So in theory for significantly large input lists, you should get better results with parallel streams.

Lambda expressions slower?

The first time a lambda is used, the JVM has to generate the byte code of the class and load it. The first time any lambda is used, most of the lambda code generating library is loaded.

I have found that using a Lambda Comparator can be slower. It is likely that in future version this will not be the case. More importantly, lambda code definitely needs to be loaded and warmed up to be fast.

Your code isn't run long enough to say. I would warm the code up for at least 2 - 10 seconds before taking any measurement.

When you use a Comparator to sort a large list it can be called N log2 N times which is a lot. Any inefficiency with show up as significant.

Lambda vs anonymous inner class performance: reducing the load on the ClassLoader?

Oracle has a presentation covering some of the performance differences. It appears that there are quite a few factors that impact performance of lambdas vs. anonymous classes.

http://www.oracle.com/technetwork/java/jvmls2013kuksen-2014088.pdf

Performance difference between Java 8 lambdas and anonymous inner classes

Oracle has posted a study comparing performance between Lambdas and anonymous classes

See JDK 8: Lambda Performance Study by Sergey Kuksenko, which is 74 slides long.

Summary: slow to warm up but when JIT inlines it worst case just as fast as anonymous class but can be faster.

Java lambdas have different variable requirements than anonymous inner classes

The chapter on lambda expression bodies states

Unlike code appearing in anonymous class declarations, the meaning of
names and the this and super keywords appearing in a lambda body,
along with the accessibility of referenced declarations, are the same
as in the surrounding context
(except that lambda parameters introduce
new names).

The transparency of this (both explicit and implicit) in the body of a
lambda expression - that is, treating it the same as in the
surrounding context - allows more flexibility for implementations, and
prevents the meaning of unqualified names in the body from being
dependent on overload resolution.

They're more strict because of that.

The surrounding context, in this case, is an assignment to a field and the issue at hand is an access of a field, val, a blank final field, in the right hand side of the expression.

The Java Language Specification states

Each local variable (§14.4) and every blank final field (§4.12.4,
§8.3.1.2) must have a definitely assigned value when any access of its
value occurs.

An access to its value consists of the simple name of the variable
(or, for a field, the simple name of the field qualified by this)
occurring anywhere in an expression except as the left-hand operand of
the simple assignment operator = (§15.26.1).

For every access of a local variable or blank final field x, x must be
definitely assigned before the access, or a compile-time error occurs.

It then goes on to say

Let C be a class, and let V be a blank final non-static member field
of C, declared in C. Then:

  • V is definitely unassigned (and moreover is not definitely assigned) before the leftmost instance initializer (§8.6) or instance variable
    initializer of C.

  • V is [un]assigned before an instance initializer or instance variable initializer of C other than the leftmost iff V is
    [un]assigned after the preceding instance initializer or instance
    variable initializer of C.

Your code basically looks like this

private final int val;
// leftmost instance variable initializer, val still unassigned
private final Callable<String> anonInnerGetValString = ...
// still unassigned after preceding variable initializer
private final Callable<String> lambdaGetValString = ...

The compiler therefore determines that val in unassigned when it's accessed within the initialization expression for lambdaGetValString.

Java: does lambda usage has negative influence on performance or memory use

In terms of memory not much difference.

In terms of performance, it all depends on the level of optimization that is performed by the JIT compiler, which depends on many things, such as the Jvm being used, the Jvm version, and the frequency and count of how often the code is run. In the early days of lambdas they were slower because Jvm developers had not yet been able to optimize their performance. Even today I would guess that the loop is fastest, then the anonymous class, then the lambda, but as I said with enough optimization in the compiled code (method inclining, stack allocation, etc) you may end up with the same thing.

Is there any runtime benefit of using lambda expression in Java?

lambdas do NOT create a new scope, they share the same scope as the enclosing block/ environment” is an almost correct statement (they do create a new scope, but not the way inner classes do), but doesn’t have anything to do with runtime performance. This has to do with correctness of the code.

Within an anonymous inner class, identifiers may get resolved through the lexical scope, finding a match in the surrounding scope, or by inheritance, finding a match in the class hierarchy of the anonymous inner class. The rules for resolving identifiers in this scenario are complex and easy to confuse.

Further, the body of an anonymous class creates a new scope that allows to create variables having the same name as local variables of the surrounding context, shadowing these variables.

In contrast, a lambda expression works like other expressions within the context they are written in. They do not inherit any members from the functional interface they will get converted to, they can not create new variables shadowing existing local variables and even this and super have the same meaning as within the surrounding context:

JLS§15.27.2. Lambda Body

Unlike code appearing in anonymous class declarations, the meaning of names and the this and super keywords appearing in a lambda body, along with the accessibility of referenced declarations, are the same as in the surrounding context (except that lambda parameters introduce new names).

So when you have the expressions x.foo(y) and () -> x.foo(y) within the same block, it will be obvious whether x and y will be the same x and y for both expressions and hence, it will be the same foo method in each case, which you can not say that simple for anonymous inner classes, as you have to analyze the entire inner class and its type hierarchy first.

This makes lambda expressions ideal for scenarios where you want to define a local function and, e.g. pass it to a method as a parameter, without even thinking about the actual interface being used. The interface itself does not influence the lambda expression beyond defining the functional signature.

But this also implies that there might be use cases of anonymous classes that can’t be covered by lambda expressions. But the purpose of lambda expressions isn’t to be a general replacement for anonymous inner classes.


When it comes to performance or easy of parallel processing, shmosel’s answer says it already. We can’t make such general statements without knowing which operation/problem we are looking at and which solutions we are actually comparing.



Related Topics



Leave a reply



Submit