forEach vs forEachOrdered in Java 8 Stream

Stream.of("AAA","BBB","CCC").parallel().forEach(s->System.out.println("Output:"+s));
Stream.of("AAA","BBB","CCC").parallel().forEachOrdered(s->System.out.println("Output:"+s));

The second line will always output

Output:AAA
Output:BBB
Output:CCC

whereas the first one is not guaranteed to, since forEach does not preserve order on a parallel stream. forEachOrdered processes the elements of the stream in the encounter order defined by its source, regardless of whether the stream is sequential or parallel.

Quoting from forEach Javadoc:

The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism.

whereas the forEachOrdered Javadoc states:

Performs an action for each element of this stream, in the encounter order of the stream if the stream has a defined encounter order.
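To see the difference without relying on console interleaving, here is a minimal sketch that records the order in which each terminal operation hands over the elements. The class and method names (ForEachOrderDemo, withForEach, withForEachOrdered) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

class ForEachOrderDemo {

    // forEach on a parallel stream: all elements arrive, but in no guaranteed order.
    static List<String> withForEach() {
        // Synchronized list, because forEach may call the action from several threads at once.
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        Stream.of("AAA", "BBB", "CCC").parallel().forEach(seen::add);
        return seen;
    }

    // forEachOrdered on the same parallel stream: elements arrive in encounter order.
    static List<String> withForEachOrdered() {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        Stream.of("AAA", "BBB", "CCC").parallel().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("forEach:        " + withForEach());        // order may change between runs
        System.out.println("forEachOrdered: " + withForEachOrdered()); // [AAA, BBB, CCC]
    }
}
```

With only three elements forEach will often happen to print them in source order too; that is luck, not a guarantee.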

Difference between forEachOrdered() and sequential() methods of Java 8?

listOfIntegers.parallelStream().sequential().forEach() creates a parallel Stream and then converts it to a sequential Stream, so you might as well use listOfIntegers.stream().forEach() instead, and get a sequential Stream in the first place.

listOfIntegers.parallelStream().forEachOrdered(e -> System.out.print(e + " ")) performs the operation on a parallel Stream, but guarantees the elements will be consumed in the encounter order of the Stream (if the Stream has a defined encounter order). However, it can be executed on multiple threads.

I don't see a reason to ever use listOfIntegers.parallelStream().sequential(). If you want a sequential Stream, why create a parallel Stream first?
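A small sketch of the two cases, using isParallel() to show which mode the pipeline ends up in. The class and helper names are hypothetical; listOfIntegers is assumed to be a plain List<Integer>.

```java
import java.util.Arrays;
import java.util.List;

class SequentialVsOrderedDemo {

    // sequential() switches the whole pipeline back to sequential mode.
    static boolean parallelAfterSequential(List<Integer> listOfIntegers) {
        return listOfIntegers.parallelStream().sequential().isParallel();
    }

    // Without sequential(), the pipeline stays parallel.
    static boolean parallelWithoutSequential(List<Integer> listOfIntegers) {
        return listOfIntegers.parallelStream().isParallel();
    }

    // forEachOrdered keeps the pipeline parallel but runs the action in encounter order.
    // A plain StringBuilder is enough here: forEachOrdered processes one element at a time.
    static String orderedOutput(List<Integer> listOfIntegers) {
        StringBuilder out = new StringBuilder();
        listOfIntegers.parallelStream().forEachOrdered(e -> out.append(e).append(' '));
        return out.toString();
    }

    public static void main(String[] args) {
        List<Integer> listOfIntegers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(parallelAfterSequential(listOfIntegers));   // false
        System.out.println(parallelWithoutSequential(listOfIntegers)); // true
        System.out.println(orderedOutput(listOfIntegers));             // 1 2 3 4 5
    }
}
```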

What is the reason forEach in Java Streams API is unordered?

Defining forEach so that it preserves order, and relying on unordered to break it, would complicate things IMO. unordered does nothing more than set a flag in the Stream API internals, so checking that flag would have to be performed or enforced based on some conditions.

So let's say you would do:

someStream()
    .unordered()
    .forEach(System.out::println);

In this case, your proposal is to print the elements in no particular order, thus enforcing unordered here. But what if we did:

someSet().stream()
    .unordered()
    .forEach(System.out::println);

In this case would you want unordered to be enforced? After all, the source of a stream is a Set, which has no order, so in this case, enforcing unordered is just useless; but this means additional tests on the source of the stream internally. This can get quite tricky and complicated (as it already is btw).

To keep it simple, two methods were defined that clearly state what they will do. This is on par with, for example, findFirst vs findAny, or even Optional::isPresent and Optional::isEmpty (added in java-11).
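One way to peek at that flag is to ask the stream's spliterator whether it reports the ORDERED characteristic. This is only a sketch: the reported characteristics reflect how the OpenJDK stream implementation behaves as far as I know, not something the Javadoc promises, and the class and helper names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;
import java.util.stream.Stream;

class UnorderedFlagDemo {

    // Hypothetical helper: does the pipeline still report a defined encounter order?
    // Note that spliterator() is a terminal operation, so the stream is consumed.
    static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("AAA", "BBB", "CCC");
        Set<String> set = new HashSet<>(list);

        // A List source has an encounter order.
        System.out.println("list:             " + isOrdered(list.stream()));
        // unordered() clears the flag for the rest of the pipeline.
        System.out.println("list.unordered(): " + isOrdered(list.stream().unordered()));
        // A HashSet source never had an encounter order, so unordered() has nothing to clear.
        System.out.println("set:              " + isOrdered(set.stream()));
        System.out.println("set.unordered():  " + isOrdered(set.stream().unordered()));
    }
}
```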

Benefit of using forEachOrdered with Parallel streams

Depending on the situation, one does not lose all the benefits of parallelism by using forEachOrdered.

Assume that we have something as such:

stringList.parallelStream().map(String::toUpperCase)
.forEachOrdered(System.out::println);

In this case, the forEachOrdered terminal operation is guaranteed to print the uppercase strings in encounter order, but we should not assume that the elements are passed to the map intermediate operation in that same order. The map operation can be executed by multiple threads concurrently. So one may still benefit from parallelism; we are just not leveraging its full potential. To conclude, use forEachOrdered when it matters that the action is performed in the encounter order of the stream.
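Here is a rough sketch of that pipeline which also records the threads that ran map, so you can check on your own machine whether the mapping step was spread over several workers. The helper name upperCaseInOrder and the mapThreads parameter are made up for illustration, and with a small list everything may well run on a single thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class OrderedAfterMapDemo {

    // Returns the strings in the order forEachOrdered delivered them and
    // collects the names of the threads that executed map() into mapThreads.
    static List<String> upperCaseInOrder(List<String> stringList, Set<String> mapThreads) {
        List<String> delivered = Collections.synchronizedList(new ArrayList<>());
        stringList.parallelStream()
                .map(s -> {
                    mapThreads.add(Thread.currentThread().getName()); // may be any worker thread
                    return s.toUpperCase();
                })
                .forEachOrdered(delivered::add); // always in encounter order
        return delivered;
    }

    public static void main(String[] args) {
        Set<String> mapThreads = ConcurrentHashMap.newKeySet();
        List<String> stringList = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println(upperCaseInOrder(stringList, mapThreads)); // [A, B, C, D, E, F]
        System.out.println("map ran on: " + mapThreads);              // thread names vary per run
    }
}
```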

Edit, following your comment:

What happens when you skip map operation? I am more interested in
forEachOrdered right after parallelStream()

If you're referring to something like:

 stringList.parallelStream().forEachOrdered(action);

There is no benefit in doing that, and I doubt it's what the designers had in mind when they created the method. In such a case, it would make more sense to do:

stringList.stream().forEach(action);

To expand on your question "Why would anyone use forEachOrdered with parallel stream if we are losing parallelism": say you want to perform an action on each element in the stream's encounter order. In that case you need forEachOrdered, because the forEach terminal operation is nondeterministic on a parallel stream. Hence the two methods: forEach, which makes no ordering promise, and forEachOrdered, which respects encounter order even when the stream is parallel.

Time difference between forEach() and forEachOrdered in parallel stream

I think you should test this with a much larger number of elements. The result will then be much clearer, and the time difference should be significant.
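Something along these lines could serve as a starting point. It is a naive System.nanoTime sketch with made-up names, not a proper benchmark: there is no warm-up, and JIT compilation and GC will skew the numbers, so a harness such as JMH would give more trustworthy results.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.LongStream;

class ForEachTimingSketch {

    // Counts n elements with forEach on a parallel stream.
    static long countWithForEach(long n) {
        LongAdder counter = new LongAdder();
        LongStream.rangeClosed(1, n).parallel().forEach(i -> counter.increment());
        return counter.sum();
    }

    // Counts n elements with forEachOrdered on a parallel stream.
    static long countWithForEachOrdered(long n) {
        LongAdder counter = new LongAdder();
        LongStream.rangeClosed(1, n).parallel().forEachOrdered(i -> counter.increment());
        return counter.sum();
    }

    // Returns the elapsed wall-clock time of a single run, in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long n = 10_000_000L; // "some big value"; adjust to taste
        System.out.println("forEach:        " + timeNanos(() -> countWithForEach(n)) / 1_000_000 + " ms");
        System.out.println("forEachOrdered: " + timeNanos(() -> countWithForEachOrdered(n)) / 1_000_000 + " ms");
    }
}
```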

What is difference between Collection.stream().forEach() and Collection.forEach()?

For simple cases such as the one illustrated, they are mostly the same. However, there are a number of subtle differences that might be significant.

One issue is with ordering. With Stream.forEach, the order is undefined. An arbitrary order is unlikely to occur with sequential streams, but it is still within the specification for Stream.forEach to execute in some arbitrary order. This does occur frequently in parallel streams. By contrast, Iterable.forEach is always executed in the iteration order of the Iterable, if one is specified.

Another issue is with side effects. The action specified in Stream.forEach is required to be non-interfering. (See the java.util.stream package doc.) Iterable.forEach potentially has fewer restrictions. For the collections in java.util, Iterable.forEach will generally use that collection's Iterator, most of which are designed to be fail-fast and which will throw ConcurrentModificationException if the collection is structurally modified during the iteration. However, modifications that aren't structural are allowed during iteration. For example, the ArrayList class documentation says "merely setting the value of an element is not a structural modification." Thus, the action for ArrayList.forEach is allowed to set values in the underlying ArrayList without problems.
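A short sketch of the ArrayList case, with hypothetical class and method names. The exception in the second method is what the OpenJDK ArrayList does; fail-fast behavior is documented as best-effort, so don't write code that depends on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ArrayListForEachDemo {

    // Non-structural change: replacing values via set() during forEach is allowed.
    static List<String> lowerCaseInPlace(List<String> source) {
        List<String> list = new ArrayList<>(source);
        AtomicInteger index = new AtomicInteger();
        list.forEach(s -> list.set(index.getAndIncrement(), s.toLowerCase()));
        return list;
    }

    // Structural change: adding during forEach trips the fail-fast check.
    static boolean addDuringForEachThrows(List<String> source) {
        List<String> list = new ArrayList<>(source);
        try {
            list.forEach(s -> list.add(s + "!"));
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("AAA", "BBB", "CCC");
        System.out.println(lowerCaseInPlace(source));       // [aaa, bbb, ccc]
        System.out.println(addDuringForEachThrows(source)); // true
    }
}
```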

The concurrent collections are yet again different. Instead of fail-fast, they are designed to be weakly consistent. The full definition is in the java.util.concurrent package documentation. Briefly, though, consider ConcurrentLinkedDeque. The action passed to its forEach method is allowed to modify the underlying deque, even structurally, and ConcurrentModificationException is never thrown. However, the modification that occurs might or might not be visible in this iteration. (Hence the "weak" consistency.)
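For comparison, a minimal sketch (names are hypothetical) of a structural modification inside ConcurrentLinkedDeque.forEach. It only looks at the deque after the iteration has finished, because what the running iteration itself sees is exactly the part that is not guaranteed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

class WeaklyConsistentDemo {

    // Removes 'victim' from the deque from inside the forEach action.
    // No ConcurrentModificationException is thrown for this structural change.
    static List<String> removeWhileIterating(List<String> source, String victim) {
        ConcurrentLinkedDeque<String> deque = new ConcurrentLinkedDeque<>(source);
        deque.forEach(s -> {
            if (s.equals(victim)) {
                deque.remove(s);
            }
        });
        return new ArrayList<>(deque);
    }

    public static void main(String[] args) {
        System.out.println(removeWhileIterating(Arrays.asList("AAA", "BBB", "CCC"), "BBB")); // [AAA, CCC]
    }
}
```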

Still another difference is visible if Iterable.forEach is iterating over a synchronized collection. On such a collection, Iterable.forEach takes the collection's lock once and holds it across all the calls to the action method. The Stream.forEach call uses the collection's spliterator, which does not lock, and which relies on the prevailing rule of non-interference. The collection backing the stream could be modified during iteration, and if it is, a ConcurrentModificationException or inconsistent behavior could result.
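The locking difference can be observed with Thread.holdsLock. This sketch assumes the list comes from Collections.synchronizedList, whose documented lock is the returned wrapper itself; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class SynchronizedForEachDemo {

    // Iterable.forEach on the synchronized wrapper: the action runs while the lock is held.
    static boolean lockHeldInIterableForEach(List<String> syncList) {
        AtomicBoolean held = new AtomicBoolean();
        syncList.forEach(s -> held.set(Thread.holdsLock(syncList)));
        return held.get();
    }

    // Stream.forEach goes through the spliterator, which does not take the lock.
    static boolean lockHeldInStreamForEach(List<String> syncList) {
        AtomicBoolean held = new AtomicBoolean();
        syncList.stream().forEach(s -> held.set(Thread.holdsLock(syncList)));
        return held.get();
    }

    public static void main(String[] args) {
        List<String> syncList = Collections.synchronizedList(new ArrayList<>(Arrays.asList("AAA", "BBB")));
        System.out.println(lockHeldInIterableForEach(syncList)); // true
        System.out.println(lockHeldInStreamForEach(syncList));   // false
    }
}
```

If you do need to stream over a synchronized collection safely, wrap the whole stream call in a synchronized (syncList) block yourself.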


