Which Is More Efficient, a For-Each Loop, or an Iterator

Which is more efficient, a for-each loop, or an iterator?

If you are just wandering over the collection to read all of the values, then there is no difference between using an iterator or the new for loop syntax, as the new syntax just uses the iterator underwater.

If however, you mean by loop the old "c-style" loop:

for(int i=0; i<list.size(); i++) {
Object o = list.get(i);
}

Then the new for loop, or iterator, can be a lot more efficient, depending on the underlying data structure. The reason for this is that for some data structures, get(i) is an O(n) operation, which makes the loop an O(n2) operation. A traditional linked list is an example of such a data structure. All iterators have as a fundamental requirement that next() should be an O(1) operation, making the loop O(n).

To verify that the iterator is used underwater by the new for loop syntax, compare the generated bytecodes from the following two Java snippets. First the for loop:

List<Integer>  a = new ArrayList<Integer>();
for (Integer integer : a)
{
integer.toString();
}
// Byte code
ALOAD 1
INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
ASTORE 3
GOTO L2
L3
ALOAD 3
INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
CHECKCAST java/lang/Integer
ASTORE 2
ALOAD 2
INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
POP
L2
ALOAD 3
INVOKEINTERFACE java/util/Iterator.hasNext()Z
IFNE L3

And second, the iterator:

List<Integer>  a = new ArrayList<Integer>();
for (Iterator iterator = a.iterator(); iterator.hasNext();)
{
Integer integer = (Integer) iterator.next();
integer.toString();
}
// Bytecode:
ALOAD 1
INVOKEINTERFACE java/util/List.iterator()Ljava/util/Iterator;
ASTORE 2
GOTO L7
L8
ALOAD 2
INVOKEINTERFACE java/util/Iterator.next()Ljava/lang/Object;
CHECKCAST java/lang/Integer
ASTORE 3
ALOAD 3
INVOKEVIRTUAL java/lang/Integer.toString()Ljava/lang/String;
POP
L7
ALOAD 2
INVOKEINTERFACE java/util/Iterator.hasNext()Z
IFNE L8

As you can see, the generated byte code is effectively identical, so there is no performance penalty to using either form. Therefore, you should choose the form of loop that is most aesthetically appealing to you, for most people that will be the for-each loop, as that has less boilerplate code.

Performance of traditional for loop vs Iterator/foreach in Java

Assuming this is what you meant:

// traditional for loop
for (int i = 0; i < collection.size(); i++) {
T obj = collection.get(i);
// snip
}

// using iterator
Iterator<T> iter = collection.iterator();
while (iter.hasNext()) {
T obj = iter.next();
// snip
}

// using iterator internally (confirm it yourself using javap -c)
for (T obj : collection) {
// snip
}

Iterator is faster for collections with no random access (e.g. TreeSet, HashMap, LinkedList). For arrays and ArrayLists, performance differences should be negligible.

Edit: I believe that micro-benchmarking is root of pretty much evil, just like early optimization. But then again, I think it's good to have a feeling for the implications of such quite trivial things. Hence I've run a small test:

  • iterate over a LinkedList and an ArrayList respecively
  • with 100,000 "random" strings
  • summing up their length (just something to avoid that compiler optimizes away the whole loop)
  • using all 3 loop styles (iterator, for each, for with counter)

Results are similar for all but "for with counter" with LinkedList. All the other five took less than 20 milliseconds to iterate over the whole list. Using list.get(i) on a LinkedList 100,000 times took more than 2 minutes (!) to complete (60,000 times slower). Wow! :) Hence it's best to use an iterator (explicitly or implicitly using for each), especially if you don't know what type and size of list your dealing with.

Should I use an Iterator or a forloop to iterate?

A foreach is equivalent to an iterator--it's syntactic sugar for the same thing. So you should always choose foreach over iterator whenever you can, simply because it's convenient and results in more concise code.

I've written about this in another answer: https://stackoverflow.com/questions/22110482/uses-and-syntax-for-for-each-loop-in-java/22110517#22110517

As stated by @RonnyShapiro, there are situations where you need to use an iterator, but in many cases a foreach should suffice. Note that a foreach is not a normal for loop. The normal for loop is needed when access to the index is required. Although you could manually create a separate index int-variable with foreach it is not ideal, from a variable-scope point of view.

Here's some more information: Which is more efficient, a for-each loop, or an iterator?

When accessing collections, a foreach is significantly faster than the basic for loop's array access. When accessing arrays, however--at least with primitive and wrapper-arrays--access via indexes is way faster. See below.


Indexes are 23-40 percent faster than iterators when accessing int or Integer arrays. Here is the output from the below testing class, which sums the numbers in a 100-element primitive-int array (A is iterator, B is index):

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 358,597,622 nanoseconds
Test B: 269,167,681 nanoseconds
B faster by 89,429,941 nanoseconds (24.438799231635727% faster)

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 377,461,823 nanoseconds
Test B: 278,694,271 nanoseconds
B faster by 98,767,552 nanoseconds (25.666236154695838% faster)

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 288,953,495 nanoseconds
Test B: 207,050,523 nanoseconds
B faster by 81,902,972 nanoseconds (27.844689860906513% faster)

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 375,373,765 nanoseconds
Test B: 283,813,875 nanoseconds
B faster by 91,559,890 nanoseconds (23.891659337194227% faster)

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 375,790,818 nanoseconds
Test B: 220,770,915 nanoseconds
B faster by 155,019,903 nanoseconds (40.75164734599769% faster)

[C:\java_code\]java TimeIteratorVsIndexIntArray 1000000
Test A: 326,373,762 nanoseconds
Test B: 202,555,566 nanoseconds
B faster by 123,818,196 nanoseconds (37.437545972215744% faster)

The full testing class:

   import  java.text.NumberFormat;
import java.util.Locale;
/**
<P>{@code java TimeIteratorVsIndexIntArray 1000000}</P>

@see <CODE><A HREF="https://stackoverflow.com/questions/180158/how-do-i-time-a-methods-execution-in-java">https://stackoverflow.com/questions/180158/how-do-i-time-a-methods-execution-in-java</A></CODE>
**/
public class TimeIteratorVsIndexIntArray {
public static final NumberFormat nf = NumberFormat.getNumberInstance(Locale.US);
public static final void main(String[] tryCount_inParamIdx0) {
int testCount;
//Get try-count from command-line parameter
try {
testCount = Integer.parseInt(tryCount_inParamIdx0[0]);
} catch(ArrayIndexOutOfBoundsException | NumberFormatException x) {
throw new IllegalArgumentException("Missing or invalid command line parameter: The number of testCount for each test. " + x);
}

//Test proper...START
int[] intArray = new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100};

long lStart = System.nanoTime();
for(int i = 0; i < testCount; i++) {
testIterator(intArray);
}
long lADuration = outputGetNanoDuration("A", lStart);

lStart = System.nanoTime();
for(int i = 0; i < testCount; i++) {
testFor(intArray);
}
long lBDuration = outputGetNanoDuration("B", lStart);

outputGetABTestNanoDifference(lADuration, lBDuration, "A", "B");
}
private static final void testIterator(int[] int_array) {
int total = 0;
for(int i = 0; i < int_array.length; i++) {
total += int_array[i];
}
}
private static final void testFor(int[] int_array) {
int total = 0;
for(int i : int_array) {
total += i;
}
}
//Test proper...END

//Timer testing utilities...START
public static final long outputGetNanoDuration(String s_testName, long l_nanoStart) {
long lDuration = System.nanoTime() - l_nanoStart;
System.out.println("Test " + s_testName + ": " + nf.format(lDuration) + " nanoseconds");
return lDuration;
}

public static final long outputGetABTestNanoDifference(long l_aDuration, long l_bDuration, String s_aTestName, String s_bTestName) {
long lDiff = -1;
double dPct = -1.0;
String sFaster = null;
if(l_aDuration > l_bDuration) {
lDiff = l_aDuration - l_bDuration;
dPct = 100.00 - (l_bDuration * 100.0 / l_aDuration + 0.5);
sFaster = "B";
} else {
lDiff = l_bDuration - l_aDuration;
dPct = 100.00 - (l_aDuration * 100.0 / l_bDuration + 0.5);
sFaster = "A";
}
System.out.println(sFaster + " faster by " + nf.format(lDiff) + " nanoseconds (" + dPct + "% faster)");
return lDiff;
}
//Timer testing utilities...END
}

I also ran this for an Integer array, and indexes are still the clear winner, but only between 18 and 25 percent faster.

For a List of Integers, however, iterators are faster. Just change the int-array in the above code to

List<Integer> intList = Arrays.asList(new Integer[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100});

and make the necssary changes to the test-function (int[] to List<Integer>, length to size(), etc)

[C:\java_code\]java TimeIteratorVsIndexIntegerList 1000000
Test A: 3,429,929,976 nanoseconds
Test B: 5,262,782,488 nanoseconds
A faster by 1,832,852,512 nanoseconds (34.326681820485675% faster)

[C:\java_code\]java TimeIteratorVsIndexIntegerList 1000000
Test A: 2,907,391,427 nanoseconds
Test B: 3,957,718,459 nanoseconds
A faster by 1,050,327,032 nanoseconds (26.038700083921256% faster)

[C:\java_code\]java TimeIteratorVsIndexIntegerList 1000000
Test A: 2,566,004,688 nanoseconds
Test B: 4,221,746,521 nanoseconds
A faster by 1,655,741,833 nanoseconds (38.71935684115413% faster)

[C:\java_code\]java TimeIteratorVsIndexIntegerList 1000000
Test A: 2,770,945,276 nanoseconds
Test B: 3,829,077,158 nanoseconds
A faster by 1,058,131,882 nanoseconds (27.134122749113843% faster)

[C:\java_code\]java TimeIteratorVsIndexIntegerList 1000000
Test A: 3,467,474,055 nanoseconds
Test B: 5,183,149,104 nanoseconds
A faster by 1,715,675,049 nanoseconds (32.60101667104192% faster)

[C:\java_code\]java TimeIteratorVsIndexIntList 1000000
Test A: 3,439,983,933 nanoseconds
Test B: 3,509,530,312 nanoseconds
A faster by 69,546,379 nanoseconds (1.4816434912159906% faster)

[C:\java_code\]java TimeIteratorVsIndexIntList 1000000
Test A: 3,451,101,466 nanoseconds
Test B: 5,057,979,210 nanoseconds
A faster by 1,606,877,744 nanoseconds (31.269164666060377% faster)

In one test they're almost equivalent, but still, iterator wins.

For-each vs Iterator. Which will be the better option

for-each is syntactic sugar for using iterators (approach 2).

You might need to use iterators if you need to modify collection in your loop. First approach will throw exception.

for (String i : list) {
System.out.println(i);
list.remove(i); // throws exception
}

Iterator it=list.iterator();
while (it.hasNext()){
System.out.println(it.next());
it.remove(); // valid here
}

Iterator vs for

First of all, there are 2 kinds of for loops, which behave very differently. One uses indices:

for (int i = 0; i < list.size(); i++) {
Thing t = list.get(i);
...
}

This kind of loop isn't always possible. For example, Lists have indices, but Sets don't, because they're unordered collections.

The other one, the foreach loop uses an Iterator behind the scenes:

for (Thing thing : list) {
...
}

This works with every kind of Iterable collection (or array)

And finally, you can use an Iterator, which also works with any Iterable:

for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
...
}

So you in fact have 3 loops to compare.

You can compare them in different terms: performance, readability, error-proneness, capability.

An Iterator can do things that a foreach loop can't. For example, you can remove elements while you're iterating, if the iterator supports it:

for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
if (shouldBeDeleted(thing) {
it.remove();
}
}

Lists also offer iterators that can iterate in both directions. A foreach loop only iterates from the beginning to an end.

But an Iterator is more dangerous and less readable. When a foreach loop is all you need, it's the most readable solution. With an iterator, you could do the following, which would be a bug:

for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
System.out.println(it.next().getFoo());
System.out.println(it.next().getBar());
}

A foreach loop doesn't allow for such a bug to happen.

Using indices to access elements is slightly more efficient with collections backed by an array. But if you change your mind and use a LinkedList instead of an ArrayList, suddenly the performance will be awful, because each time you access list.get(i), the linked list will have to loop though all its elements until the ith one. An Iterator (and thus the foreach loop) doesn't have this problem. It always uses the best possible way to iterate through elements of the given collection, because the collection itself has its own Iterator implementation.

My general rule of thumb is: use the foreach loop, unless you really need capabilities of an Iterator. I would only use for loop with indices with arrays, when I need access to the index inside the loop.

why is the enhanced for loop more efficient than the normal for loop

It's a bit of an oversimplification to say that the enhanced for loop is more efficient. It can be, but in many cases it's almost exactly the same as an old-school loop.

The first thing to note is that for collections the enhanced for loop uses an Iterator, so if you manually iterate over a collection using an Iterator then you should have pretty much the same performance than the enhanced for loop.

One place where the enhanced for loop is faster than a naively implemented traditional loop is something like this:

LinkedList<Object> list = ...;

// Loop 1:
int size = list.size();
for (int i = 0; i<size; i++) {
Object o = list.get(i);
/// do stuff
}

// Loop 2:
for (Object o : list) {
// do stuff
}

// Loop 3:
Iterator<Object> it = list.iterator();
while (it.hasNext()) {
Object o = it.next();
// do stuff
}

In this case Loop 1 will be slower than both Loop 2 and Loop 3, because it will have to (partially) traverse the list in each iteration to find the element at position i. Loop 2 and 3, however will only ever step one element further in the list, due to the use of an Iterator. Loop 2 and 3 will also have pretty much the same performance since Loop 3 is pretty much exactly what the compiler will produce when you write the code in Loop 2.

Is there a performance difference between a for loop and a for-each loop?

From Item 46 in Effective Java by Joshua Bloch :

The for-each loop, introduced in
release 1.5, gets rid of the clutter
and the opportunity for error by
hiding the iterator or index variable
completely. The resulting idiom
applies equally to collections and
arrays:

// The preferred idiom for iterating over collections and arrays
for (Element e : elements) {
doSomething(e);
}

When you see the colon (:), read it as
“in.” Thus, the loop above reads as
“for each element e in elements.” Note
that there is no performance penalty
for using the for-each loop, even for
arrays. In fact, it may offer a slight
performance advantage over an ordinary
for loop in some circumstances, as it
computes the limit of the array index
only once. While you can do this by
hand (Item 45), programmers don’t
always do so.

Java 8 Iterable.forEach() vs foreach loop

The better practice is to use for-each. Besides violating the Keep It Simple, Stupid principle, the new-fangled forEach() has at least the following deficiencies:

  • Can't use non-final variables. So, code like the following can't be turned into a forEach lambda:
Object prev = null;
for(Object curr : list)
{
if( prev != null )
foo(prev, curr);
prev = curr;
}
  • Can't handle checked exceptions. Lambdas aren't actually forbidden from throwing checked exceptions, but common functional interfaces like Consumer don't declare any. Therefore, any code that throws checked exceptions must wrap them in try-catch or Throwables.propagate(). But even if you do that, it's not always clear what happens to the thrown exception. It could get swallowed somewhere in the guts of forEach()

  • Limited flow-control. A return in a lambda equals a continue in a for-each, but there is no equivalent to a break. It's also difficult to do things like return values, short circuit, or set flags (which would have alleviated things a bit, if it wasn't a violation of the no non-final variables rule). "This is not just an optimization, but critical when you consider that some sequences (like reading the lines in a file) may have side-effects, or you may have an infinite sequence."

  • Might execute in parallel, which is a horrible, horrible thing for all but the 0.1% of your code that needs to be optimized. Any parallel code has to be thought through (even if it doesn't use locks, volatiles, and other particularly nasty aspects of traditional multi-threaded execution). Any bug will be tough to find.

  • Might hurt performance, because the JIT can't optimize forEach()+lambda to the same extent as plain loops, especially now that lambdas are new. By "optimization" I do not mean the overhead of calling lambdas (which is small), but to the sophisticated analysis and transformation that the modern JIT compiler performs on running code.

  • If you do need parallelism, it is probably much faster and not much more difficult to use an ExecutorService. Streams are both automagical (read: don't know much about your problem) and use a specialized (read: inefficient for the general case) parallelization strategy (fork-join recursive decomposition).

  • Makes debugging more confusing, because of the nested call hierarchy and, god forbid, parallel execution. The debugger may have issues displaying variables from the surrounding code, and things like step-through may not work as expected.

  • Streams in general are more difficult to code, read, and debug. Actually, this is true of complex "fluent" APIs in general. The combination of complex single statements, heavy use of generics, and lack of intermediate variables conspire to produce confusing error messages and frustrate debugging. Instead of "this method doesn't have an overload for type X" you get an error message closer to "somewhere you messed up the types, but we don't know where or how." Similarly, you can't step through and examine things in a debugger as easily as when the code is broken into multiple statements, and intermediate values are saved to variables. Finally, reading the code and understanding the types and behavior at each stage of execution may be non-trivial.

  • Sticks out like a sore thumb. The Java language already has the for-each statement. Why replace it with a function call? Why encourage hiding side-effects somewhere in expressions? Why encourage unwieldy one-liners? Mixing regular for-each and new forEach willy-nilly is bad style. Code should speak in idioms (patterns that are quick to comprehend due to their repetition), and the fewer idioms are used the clearer the code is and less time is spent deciding which idiom to use (a big time-drain for perfectionists like myself!).

As you can see, I'm not a big fan of the forEach() except in cases when it makes sense.

Particularly offensive to me is the fact that Stream does not implement Iterable (despite actually having method iterator) and cannot be used in a for-each, only with a forEach(). I recommend casting Streams into Iterables with (Iterable<T>)stream::iterator. A better alternative is to use StreamEx which fixes a number of Stream API problems, including implementing Iterable.

That said, forEach() is useful for the following:

  • Atomically iterating over a synchronized list. Prior to this, a list generated with Collections.synchronizedList() was atomic with respect to things like get or set, but was not thread-safe when iterating.

  • Parallel execution (using an appropriate parallel stream). This saves you a few lines of code vs using an ExecutorService, if your problem matches the performance assumptions built into Streams and Spliterators.

  • Specific containers which, like the synchronized list, benefit from being in control of iteration (although this is largely theoretical unless people can bring up more examples)

  • Calling a single function more cleanly by using forEach() and a method reference argument (ie, list.forEach (obj::someMethod)). However, keep in mind the points on checked exceptions, more difficult debugging, and reducing the number of idioms you use when writing code.

Articles I used for reference:

  • Everything about Java 8
  • Iteration Inside and Out (as pointed out by another poster)

EDIT: Looks like some of the original proposals for lambdas (such as http://www.javac.info/closures-v06a.html Google Cache) solved some of the issues I mentioned (while adding their own complications, of course).



Related Topics



Leave a reply



Submit