What's the difference between map() and flatMap() methods in Java 8?
Both map and flatMap can be applied to a Stream<T>, and they both return a Stream<R>. The difference is that the map operation produces one output value for each input value, whereas the flatMap operation produces an arbitrary number (zero or more) of values for each input value.
This is reflected in the arguments to each operation.
The map operation takes a Function, which is called for each value in the input stream and produces one result value, which is sent to the output stream.
The flatMap operation takes a function that conceptually wants to consume one value and produce an arbitrary number of values. However, in Java, it's cumbersome for a method to return an arbitrary number of values, since methods can return only zero or one value. One could imagine an API where the mapper function for flatMap takes a value and returns an array or a List of values, which are then sent to the output. Given that this is the streams library, a particularly apt way to represent an arbitrary number of return values is for the mapper function itself to return a stream! The values from the stream returned by the mapper are drained from the stream and are passed to the output stream. The "clumps" of values returned by each call to the mapper function are not distinguished at all in the output stream, thus the output is said to have been "flattened."
Typical use is for the mapper function of flatMap to return Stream.empty() if it wants to send zero values, or something like Stream.of(a, b, c) if it wants to return several values. But of course any stream can be returned.
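As a minimal sketch of the paragraph above (the class and method names here are made up for illustration, not taken from the original answer), the following contrasts a one-to-one map with a flatMap whose mapper returns either Stream.empty() or a two-element stream:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
class MapVsFlatMap {
    // map: exactly one output value (the length) per input value.
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // flatMap: zero outputs for an empty string, two outputs otherwise.
    static List<String> expand(List<String> words) {
        return words.stream()
                .flatMap(w -> w.isEmpty()
                        ? Stream.<String>empty()
                        : Stream.of(w, w.toUpperCase()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "", "bc");
        System.out.println(lengths(words)); // [1, 0, 2]
        System.out.println(expand(words));  // [a, A, bc, BC]
    }
}
```

The output of expand has a different size from its input, and nothing in it marks which values came from which mapper call.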
What is the difference between Optional.flatMap and Optional.map?
Use map if the function returns the object you need or flatMap if the function returns an Optional. For example:
import java.util.Optional;

public class Test {
    public static void main(String[] args) {
        Optional<String> s = Optional.of("input");
        System.out.println(s.map(Test::getOutput));        // mapper returns a plain String
        System.out.println(s.flatMap(Test::getOutputOpt)); // mapper returns an Optional<String>
    }

    static String getOutput(String input) {
        return input == null ? null : "output for " + input;
    }

    static Optional<String> getOutputOpt(String input) {
        return input == null ? Optional.empty() : Optional.of("output for " + input);
    }
}
Both print statements print the same thing: Optional[output for input].
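To see why the choice matters, it may help to look at what happens when an Optional-returning function is passed to map instead. This is a small sketch reusing getOutputOpt from the example; the class name and the two wrapper methods are hypothetical:

```java
import java.util.Optional;

// Hypothetical demo class; viaMap and viaFlatMap are illustrative names.
class OptionalNesting {
    static Optional<String> getOutputOpt(String input) {
        return input == null ? Optional.empty() : Optional.of("output for " + input);
    }

    // map wraps the mapper's result, so an Optional-returning mapper gets nested.
    static Optional<Optional<String>> viaMap(Optional<String> s) {
        return s.map(OptionalNesting::getOutputOpt);
    }

    // flatMap uses the mapper's Optional directly, so there is no nesting.
    static Optional<String> viaFlatMap(Optional<String> s) {
        return s.flatMap(OptionalNesting::getOutputOpt);
    }

    public static void main(String[] args) {
        Optional<String> s = Optional.of("input");
        System.out.println(viaMap(s));     // Optional[Optional[output for input]]
        System.out.println(viaFlatMap(s)); // Optional[output for input]
    }
}
```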
map vs flatMap in reactor
map is for synchronous, non-blocking, 1-to-1 transformations
flatMap is for asynchronous (non-blocking) 1-to-N transformations
The difference is visible in the method signature:
map takes a Function<T, U> and returns a Flux<U>
flatMap takes a Function<T, Publisher<V>> and returns a Flux<V>
That's the major hint: you can pass a Function<T, Publisher<V>> to a map, but it wouldn't know what to do with the Publishers, and that would result in a Flux<Publisher<V>>, a sequence of inert publishers.
On the other hand, flatMap expects a Publisher<V> for each T. It knows what to do with it: subscribe to it and propagate its elements in the output sequence. As a result, the return type is Flux<V>: flatMap will flatten each inner Publisher<V> into the output sequence of all the Vs.
About the 1-N aspect:
for each <T> input element, flatMap maps it to a Publisher<V>. In some cases (e.g. an HTTP request), that publisher will emit only one item, in which case we're pretty close to an async map.
But that's the degenerate case. The generic case is that a Publisher can emit multiple elements, and flatMap works just as well.
For an example, imagine you have a reactive database and you flatMap from a sequence of user IDs, with a request that returns a user's set of Badge. You end up with a single Flux<Badge> of all the badges of all these users.
Is map really synchronous and non-blocking?
Yes: it is synchronous in the way the operator applies it (a simple method call, and then the operator emits the result) and non-blocking in the sense that the function itself shouldn't block the operator calling it. In other words, it shouldn't introduce latency. That's because a Flux is still asynchronous as a whole. If the function blocks mid-sequence, it will impact the rest of the Flux processing, or even other Flux instances.
If your map function is blocking/introduces latency but cannot be converted to return a Publisher, consider publishOn/subscribeOn to offload that blocking work to a separate thread.
What is the difference between map and flatMap and a good use case for each?
Here is an example of the difference, as a spark-shell session:
First, some data - two lines of text:
val rdd = sc.parallelize(Seq("Roses are red", "Violets are blue")) // lines
rdd.collect
res0: Array[String] = Array("Roses are red", "Violets are blue")
Now, map transforms an RDD of length N into another RDD of length N.
For example, it maps from two lines into two line-lengths:
rdd.map(_.length).collect
res1: Array[Int] = Array(13, 16)
But flatMap (loosely speaking) transforms an RDD of length N into a collection of N collections, then flattens these into a single RDD of results.
rdd.flatMap(_.split(" ")).collect
res2: Array[String] = Array("Roses", "are", "red", "Violets", "are", "blue")
We have multiple words per line, and multiple lines, but we end up with a single output array of words.
Just to illustrate that, flatMapping from a collection of lines to a collection of words looks like:
["aa bb cc", "", "dd"] => [["aa","bb","cc"],[],["dd"]] => ["aa","bb","cc","dd"]
The input and output RDDs will therefore typically be of different sizes for flatMap.
If we had tried to use map with our split function, we'd have ended up with nested structures (an RDD of arrays of words, with type RDD[Array[String]]) because we have to have exactly one result per input:
rdd.map(_.split(" ")).collect
res3: Array[Array[String]] = Array(
Array(Roses, are, red),
Array(Violets, are, blue)
)
Finally, one useful special case is mapping with a function which might not return an answer, and so returns an Option. We can use flatMap to filter out the elements that return None and extract the values from those that return a Some:
val rdd = sc.parallelize(Seq(1,2,3,4))
def myfn(x: Int): Option[Int] = if (x <= 2) Some(x * 10) else None
rdd.flatMap(myfn).collect
res4: Array[Int] = Array(10, 20)
(noting here that an Option behaves rather like a list that has either one element, or zero elements)
FlatMap vs Filter, Map Java
I, along with the Java language architects, agree with you that Stream#flatMap for a Stream<Optional<T>> isn't readable in Java 8, which is why they introduced Optional#stream in Java 9.
Using this, your code becomes much more readable:
.flatMap(Optional::stream)
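For context, here is a small self-contained sketch of that one-liner next to a filter/map formulation that also works on Java 8. The original question's code isn't shown here, so the class name, method names, and sample data below are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class; not the code from the original question.
class OptionalStreamDemo {
    // Java 9+: each Optional becomes a stream of zero or one element.
    static List<String> presentValues(List<Optional<String>> maybes) {
        return maybes.stream()
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    // Java 8 formulation: filter out the empties, then unwrap.
    static List<String> presentValuesJava8(List<Optional<String>> maybes) {
        return maybes.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Optional<String>> maybes =
                Arrays.asList(Optional.of("a"), Optional.<String>empty(), Optional.of("b"));
        System.out.println(presentValues(maybes));      // [a, b]
        System.out.println(presentValuesJava8(maybes)); // [a, b]
    }
}
```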
Java 8: Difference between map and flatMap for null-checking style
The difference is only that one will return Optional<?> and the other will return Optional<Optional<?>> (replace ? with the return type of processing()). Since you're discarding the returned value, there's no difference.
But it's best to avoid the mapping functions, which by convention should avoid side-effects, and instead use the more idiomatic ifPresent():
person.ifPresent(p -> car.ifPresent(c -> processing(p, c)));
This also works if processing() has a void return type, which isn't the case with a mapping function.
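A runnable sketch of the nested ifPresent() form might look like the following. The types of person and car aren't given in the question, so String stand-ins are assumed here, and the log list exists only to make the side effect observable:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class; String stands in for the unknown person/car types.
class IfPresentDemo {
    // Records each call so the side effect can be inspected.
    static final List<String> log = new ArrayList<>();

    // A void method: usable with ifPresent, but not as a mapping function.
    static void processing(String person, String car) {
        log.add(person + " drives " + car);
    }

    // processing runs only when both Optionals hold a value.
    static void run(Optional<String> person, Optional<String> car) {
        person.ifPresent(p -> car.ifPresent(c -> processing(p, c)));
    }

    public static void main(String[] args) {
        run(Optional.of("Ann"), Optional.of("Volvo"));
        run(Optional.of("Bob"), Optional.empty());
        System.out.println(log); // [Ann drives Volvo]
    }
}
```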
Does the flatMap method only flatten the stream and not map it?
It's only those specific method references that limit you. Using a lambda expression, you can still do both:
.flatMap(list -> list.stream().map(String::toUpperCase))
.collect(Collectors.toList());
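Since the snippet above is only the tail of a pipeline, here is one way it could be completed. The source collection isn't shown in the answer, so a List<List<String>> input and the names below are assumptions:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the List<List<String>> input is an assumption.
class FlattenAndMap {
    // The lambda both flattens each inner list and upper-cases its elements.
    static List<String> flattenUpper(List<List<String>> lists) {
        return lists.stream()
                .flatMap(list -> list.stream().map(String::toUpperCase))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> lists =
                Arrays.asList(Arrays.asList("ab"), Arrays.asList("cd", "ef"));
        System.out.println(flattenUpper(lists)); // [AB, CD, EF]
    }
}
```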
I should mention that it's only that sequence of method references that limited you, not that it's impossible to do it with method references. The key is the mapper you pass to it. Take these as examples:
Stream<String> uppercaseArray(String[] a) {
    return Arrays.stream(a).map(String::toUpperCase);
}

Stream.of(new String[] {"ab"}, new String[] {"cd", "ef"})
    .flatMap(this::uppercaseArray); // flatten and transform in one step

// Or a completely different perspective: one IntStream of all the chars
Stream.of("abc", "def").flatMapToInt(String::chars);
Java 8 Streams Shallow copy of map object, cross join using streams
You can use flatMap() with a function that generates a stream of new maps, much like the loop version does, and collect everything back into a list. Your stream version modifies existing maps in-place, and keeps overwriting previously added "4" elements with new ones.
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Demo {
    public static void main(String[] args) {
        List<Map<String, String>> listOfMap =
            List.of(Map.of("1", "a", "2", "b", "3", "c"),
                    Map.of("1", "d", "2", "e", "3", "f"));
        List<String> stringList = List.of("x", "y", "z");

        List<Map<String, String>> result =
            listOfMap.stream()
                .flatMap(map -> stringList.stream().map(elem -> {
                    // Copy the map so the original is never modified.
                    Map<String, String> newMap = new HashMap<>(map);
                    newMap.put("4", elem);
                    return newMap;
                }))
                .collect(Collectors.toList());

        for (Map<String, String> elem : result) {
            System.out.println(elem);
        }
    }
}
outputs
{1=a, 2=b, 3=c, 4=x}
{1=a, 2=b, 3=c, 4=y}
{1=a, 2=b, 3=c, 4=z}
{1=d, 2=e, 3=f, 4=x}
{1=d, 2=e, 3=f, 4=y}
{1=d, 2=e, 3=f, 4=z}