Fastest way to iterate an Array in Java: loop variable vs enhanced for statement
If you're looping through an array, it shouldn't matter - the enhanced for loop uses array accesses anyway.
For example, consider this code:
public static void main(String[] args)
{
for (String x : args)
{
System.out.println(x);
}
}
When decompiled with javap -c Test
we get (for the main
method):
public static void main(java.lang.String[]);
Code:
0: aload_0
1: astore_1
2: aload_1
3: arraylength
4: istore_2
5: iconst_0
6: istore_3
7: iload_3
8: iload_2
9: if_icmpge 31
12: aload_1
13: iload_3
14: aaload
15: astore 4
17: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
20: aload 4
22: invokevirtual #3; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
25: iinc 3, 1
28: goto 7
31: return
Now change it to use an explicit array access:
public static void main(String[] args)
{
for (int i = 0; i < args.length; i++)
{
System.out.println(args[i]);
}
}
This decompiles to:
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: iload_1
3: aload_0
4: arraylength
5: if_icmpge 23
8: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
11: aload_0
12: iload_1
13: aaload
14: invokevirtual #3; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
17: iinc 1, 1
20: goto 2
23: return
There's a bit more setup code in the enhanced for loop, but they're basically doing the same thing. No iterators are involved. Furthermore, I'd expect them to get JITted to even more similar code.
Suggestion: if you really think it might make a significant difference (which it would only ever do if the body of the loop is absolutely miniscule) then you should benchmark it with your real application. That's the only situation which matters.
looping over array, performance difference between indexed and enhanced for loop
TL;DR: You are observing what happens when JIT compiler cannot trust that values
are not changing inside the loop. Additionally, in the tiny benchmark like this, Blackhole.consume
costs dominate, obscuring the results.
Simplifying the test:
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class JdkBenchmarks {
public int[] values;
@Setup
public void setupArray() {
int count = 1000;
values = new int[count];
for(int i = 0; i < count; i++) {
values[i] = i;
}
}
@Benchmark
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public void indexed(Blackhole bh) {
for(int i = 0; i < values.length; i++) {
bh.consume(values[i]);
}
}
@Benchmark
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public void indexed_cached(Blackhole bh) {
int[] vs = values;
int length = vs.length;
for(int i = 0; i < length; i++) {
bh.consume(vs[i]);
}
}
@Benchmark
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public void enhanced(Blackhole bh) {
for (int value : values) {
bh.consume(value);
}
}
}
Running both enhanced
and indexed_cached
under -prof perfasm
reveals this hot loop (I specifically did @CompilerControl(DONT_INLINE)
to let @Benchmark
method be compiled alone, which makes an easier to digest perfasm
output):
↗ 0x...4240: mov 0x10(%r8,%rsi,4),%r10d ; load values[i], blackhole it
22.68% │ 0x...4245: mov 0x14(%r8,%rsi,4),%r11d ; ... repeat 7 more times...
│ 0x...424a: mov 0x18(%r8,%rsi,4),%r10d ;
20.95% │ 0x...424f: mov 0x1c(%r8,%rsi,4),%r10d ;
0.02% │ 0x...4254: mov 0x20(%r8,%rsi,4),%r11d ;
24.73% │ 0x...4259: mov 0x24(%r8,%rsi,4),%r10d ;
0.24% │ 0x...425e: mov 0x28(%r8,%rsi,4),%r11d ;
20.04% │ 0x...4263: mov 0x2c(%r8,%rsi,4),%r10d ;
0.22% │ 0x...4268: add $0x8,%esi ; i += 8
│ 0x...426b: cmp %ebp,%esi ; compare i with length (in %ebp)
0.26% ╰ 0x...426d: jl 0x...4240 ; circle back if 8 more elements available
Very efficient!
Running indexed
with -prof perfasm
reveals:
↗ 0x...4170: mov 0xc(%r12,%r8,8),%r9d ; array bounds check, load values.length
3.42% │ 0x...4175: cmp %r9d,%r10d ; array bounds check, compare i
16.02% │ 0x...4178: jae 0x...41b1 ; failed? jump to exception handling
│ 0x...417a: lea (%r12,%r8,8),%r11 ; load values[i], part 1
0.04% │ 0x...417e: mov 0x10(%r11,%r10,4),%r11d ; load values[i], part 2
│ ; %r11d is blackholed
35.69% │ 0x...4183: mov 0xc(%rsi),%r8d ; get "values"
0.71% │ 0x...4187: mov 0x348(%r15),%r11 ; safepoint poll, part 1 (JVM infra)
4.03% │ 0x...418e: inc %r10d ; i++
0.12% │ 0x...4191: test %eax,(%r11) ; safepoint poll, part 2 (JVM infra)
27.74% │ 0x...4194: mov 0xc(%r12,%r8,8),%r9d ; load values.length
8.53% │ 0x...4199: cmp %r9d,%r10d ; check i < values.length
0.24% ╰ 0x...419c: jl 0x...4170 ; circle back if more
This is because Blackhole.consume
call is opaque to the compiler (like many other non-inlined calls), so it has to conservatively presume that values
can change in the middle of the loop!
Which means, compiler cannot stash values
in a register, it cannot trust the array bounds check to always succeed, it cannot even guarantee the loop terminates (hence safepoint polls), and on top of that, loop unrolling does not want to multiply that per-element mess even more.
So you get the penalty like this (TR 3970X, JDK 17.0.2 EA, Linux x86_64):
Benchmark Mode Cnt Score Error Units
JdkBenchmarks.enhanced avgt 5 144.962 ± 0.918 ns/op
JdkBenchmarks.indexed avgt 5 1030.981 ± 3.775 ns/op ; + 880 ns/op!
JdkBenchmarks.indexed_cached avgt 5 144.799 ± 0.643 ns/op ; same as enhanced
Additional fun part:
On most JDKs the dominating costs are the costs of calling the Blackhole.consume
in this test. Compared to the cost of array access, the cost of Java-style Blackhole
is quite bad. With JDK 17+ and JMH 1.34, the compiler Blackholes
would be used, and thus provide much more fidelity for the test.
Without compiler blackholes, the effect hides in the Blackhole
overhead nearly completely (>25x overhead means we can execute a lot of bad code preceding the Blackhole
call!):
Benchmark Mode Cnt Score Error Units
JdkBenchmarks.enhanced avgt 5 4062.866 ± 4.736 ns/op
JdkBenchmarks.indexed avgt 5 4071.620 ± 1.057 ns/op ; + 10 ns/op [whoops]
JdkBenchmarks.indexed_cached avgt 5 4061.390 ± 0.692 ns/op ; same as enhanced
It would re-manifest if we drop @CompilerControl(DONT_INLINE)
, because the resulting generated code would be much messier:
Benchmark Mode Cnt Score Error Units
JdkBenchmarks.enhanced avgt 5 4067.118 ± 40.699 ns/op
JdkBenchmarks.indexed avgt 5 4601.370 ± 0.632 ns/op ; + 530 ns/op
JdkBenchmarks.indexed_cached avgt 5 4064.455 ± 1.554 ns/op ; same as enhanced
Is there a performance difference between a for loop and a for-each loop?
From Item 46 in Effective Java by Joshua Bloch :
The for-each loop, introduced in
release 1.5, gets rid of the clutter
and the opportunity for error by
hiding the iterator or index variable
completely. The resulting idiom
applies equally to collections and
arrays:// The preferred idiom for iterating over collections and arrays
for (Element e : elements) {
doSomething(e);
}
When you see the colon (:), read it as
“in.” Thus, the loop above reads as
“for each element e in elements.” Note
that there is no performance penalty
for using the for-each loop, even for
arrays. In fact, it may offer a slight
performance advantage over an ordinary
for loop in some circumstances, as it
computes the limit of the array index
only once. While you can do this by
hand (Item 45), programmers don’t
always do so.
why is the enhanced for loop more efficient than the normal for loop
It's a bit of an oversimplification to say that the enhanced for loop is more efficient. It can be, but in many cases it's almost exactly the same as an old-school loop.
The first thing to note is that for collections the enhanced for loop uses an Iterator
, so if you manually iterate over a collection using an Iterator
then you should have pretty much the same performance than the enhanced for loop.
One place where the enhanced for loop is faster than a naively implemented traditional loop is something like this:
LinkedList<Object> list = ...;
// Loop 1:
int size = list.size();
for (int i = 0; i<size; i++) {
Object o = list.get(i);
/// do stuff
}
// Loop 2:
for (Object o : list) {
// do stuff
}
// Loop 3:
Iterator<Object> it = list.iterator();
while (it.hasNext()) {
Object o = it.next();
// do stuff
}
In this case Loop 1 will be slower than both Loop 2 and Loop 3, because it will have to (partially) traverse the list in each iteration to find the element at position i
. Loop 2 and 3, however will only ever step one element further in the list, due to the use of an Iterator
. Loop 2 and 3 will also have pretty much the same performance since Loop 3 is pretty much exactly what the compiler will produce when you write the code in Loop 2.
How does the enhanced for statement work for arrays, and how to get an iterator for an array?
If you want an Iterator
over an array, you could use one of the direct implementations out there instead of wrapping the array in a List
. For example:
Apache Commons Collections ArrayIterator
Or, this one, if you'd like to use generics:
com.Ostermiller.util.ArrayIterator
Note that if you want to have an Iterator
over primitive types, you can't, because a primitive type can't be a generic parameter. E.g., if you want an Iterator<int>
, you have to use an Iterator<Integer>
instead, which will result in a lot of autoboxing and -unboxing if that's backed by an int[]
.
What's the fastest way to loop through an array in JavaScript?
After performing this test with most modern browsers:
https://jsben.ch/wY5fo
Currently, the fastest form of loop (and in my opinion the most syntactically obvious).
A standard for-loop with length caching
var i = 0, len = myArray.length;
while (i < len) {
// your code
i++
}
I would say, this is definitely a case where I applaud JavaScript engine developers. A runtime should be optimized for clarity, not cleverness.
In a java enhanced for loop, is it safe to assume the expression to be looped over will be evaluated only once?
About the enhanced for statement, the Java Language Specifications writes:
The enhanced for statement has the
form:EnhancedForStatement:
for ( VariableModifiersopt Type Identifier: Expression) Statement
The Expression must either have type
Iterable
or else it must be of an
array type (§10.1), or a compile-time
error occurs.The scope of a local variable declared
in the FormalParameter part of an
enhancedfor
statement (§14.14) is
the contained StatementThe meaning of the enhanced
for
statement is given by translation into
a basicfor
statement.If the type of
Expression
is a
subtype ofIterable
, then letI
be
the type of the expression
Expression.iterator()
. The enhancedfor
statement is equivalent
to a basicfor
statement of the
form:for (I #i = Expression.iterator(); #i.hasNext(); ) {
VariableModifiersopt Type Identifier = #i.next();
Statement
}
Where
#i
is a compiler-generated
identifier that is distinct from any
other identifiers (compiler-generated
or otherwise) that are in scope (§6.3)
at the point where the enhanced for
statement occurs.Otherwise, the Expression necessarily
has an array type,T[]
. LetL1 ... Lm
be the (possibly empty) sequence of
labels immediately preceding the
enhancedfor
statement. Then the
meaning of the enhanced for statement
is given by the following basicfor
statement:T[] a = Expression;
L1: L2: ... Lm:
for (int i = 0; i < a.length; i++) {
VariableModifiersopt Type Identifier = a[i];
Statement
}
Where a and i are compiler-generated
identifiers that are distinct from any
other identifiers (compiler-generated
or otherwise) that are in scope at the
point where the enhanced for statement
occurs.
So in your case, genArray()
doesn't return a subtype of Iterable
but an array type, so your enhanced for
statement is equivalent to the following basic for
statement:
String[] a = genArray();
...
for (int i = 0; i < a.length; i++) {
String s = a[i];
// ...
}
And genArray()
will thus be called only once (but the currently accepted answer is partially wrong).
Enhanced for loop with array
From the Java tutorial on this subject:
The for-each construct is also applicable to arrays, where it hides
the index variable rather than the iterator. The following method
returns the sum of the values in an int array:// Returns the sum of the elements of a
int sum(int[] a) {
int result = 0;
for (int i : a)
result += i;
return result;
}
And from §14.14.2 of the JLS (the Java Language Specification):
The enhanced for statement has the form:
EnhancedForStatement:
for ( FormalParameter : Expression ) Statement
The type of the
Expression
must beIterable
or an array type, or a
compile-time error occurs.
But note that arrays don't implement Iterable
; from §10.1 of the JLS:
The direct superclass of an array type is
Object
.Every array type implements the interfaces
Cloneable
andjava.io.Serializable
.
Which is more efficient, a for-each loop, or an iterator?
If you are just wandering over the collection to read all of the values, then there is no difference between using an iterator or the new for loop syntax, as the new syntax just uses the iterator underwater.
If however, you mean by loop the old "c-style" loop:
for(int i=0; i<list.size(); i++) {
Object o = list.get(i);
}
Then the new for loop, or iterator, can be a lot more efficient, depending on the underlying data structure. The reason for this is that for some data structures, get(i)
is an O(n) operation, which makes the loop an O(n2) operation. A traditional linked list is an example of such a data structure. All iterators have as a fundamental requirement that next()
should be an O(1) operation, making the loop O(n).
To verify that the iterator is used underwater by the new for loop syntax, compare the generated bytecodes from the following two Java snippets. First the for loop:
List<Integer> a = new ArrayList<Integer>();
for (Integer integer : a)
{
integer.toString();
}
// Byte code
ALOAD 1
INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
ASTORE 3
GOTO L2
L3
ALOAD 3
INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
CHECKCAST java/lang/Integer
ASTORE 2
ALOAD 2
INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
POP
L2
ALOAD 3
INVOKEINTERFACE java/util/Iterator.hasNext()Z
IFNE L3
And second, the iterator:
List<Integer> a = new ArrayList<Integer>();
for (Iterator iterator = a.iterator(); iterator.hasNext();)
{
Integer integer = (Integer) iterator.next();
integer.toString();
}
// Bytecode:
ALOAD 1
INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
ASTORE 2
GOTO L7
L8
ALOAD 2
INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
CHECKCAST java/lang/Integer
ASTORE 3
ALOAD 3
INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
POP
L7
ALOAD 2
INVOKEINTERFACE java/util/Iterator.hasNext()Z
IFNE L8
As you can see, the generated byte code is effectively identical, so there is no performance penalty to using either form. Therefore, you should choose the form of loop that is most aesthetically appealing to you, for most people that will be the for-each loop, as that has less boilerplate code.
Related Topics
Error: Java_Home Is Not Defined Correctly Executing Maven
Threads Configuration Based on No. of Cpu-Cores
How to Convert Set<String> to String[]
Writing Custom Kafka Serializer
Maven "Build Path Specifies Execution Environment J2Se-1.5", Even Though I Changed It to 1.7
How to Achieve Conditional Resource Import in a Spring Xml Context
Retrieve an Image Stored as Blob on a MySQL Db
How to Convert Int to Unsigned Byte and Back
Tomcat7 Starts Too Late on Ubuntu 14.04 X64 [Digitalocean]
How to Solve the "A Generic Array of T Is Created for a Varargs Parameter" Compiler Warning
Is -Djava.Library.Path=... Equivalent to System.Setproperty("Java.Library.Path", ...)
How to Serialize Only the Id of a Child with Jackson
How to Configure Log4J to Log Different Log Levels to Different Files for the Same Logger
Running Selenium Scripts with Jmeter
Is Doing a Lot in Constructors Bad
Jpa, How to Use the Same Class (Entity) to Map Different Tables
Compile Error: Package Javax.Servlet Does Not Exist
Create Java Runtime Image on One Platform for Another Using Jlink