Java 8 Nested (Multi Level) Group By

You can’t group a single item by multiple keys unless you accept that the item may appear in multiple groups. In that case, you want to perform a kind of flatMap operation.

One way to achieve this is to use Stream.flatMap with a temporary pair holding the combinations of Item and SubItem before collecting. Due to the absence of a standard pair type, a typical solution is to use Map.Entry for that:

Map<T, Map<V, List<SubItem>>> result = pojo.getItems().stream()
    .flatMap(item -> item.getSubItems().stream()
        .map(sub -> new AbstractMap.SimpleImmutableEntry<>(item.getKey1(), sub)))
    .collect(Collectors.groupingBy(AbstractMap.SimpleImmutableEntry::getKey,
                Collectors.mapping(Map.Entry::getValue,
                    Collectors.groupingBy(SubItem::getKey2))));
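To try the Map.Entry approach end to end, here is a minimal self-contained sketch. The Item and SubItem classes, their String keys, and the sample data are hypothetical stand-ins, since the question's real classes are not shown here; it also uses Map.Entry::getKey for the outer classifier, which should behave the same as the method reference above.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EntryFlatMapDemo {

    // Hypothetical stand-in for the question's SubItem class.
    static class SubItem {
        private final String key2;
        private final String name;
        SubItem(String key2, String name) { this.key2 = key2; this.name = name; }
        String getKey2() { return key2; }
        String getName() { return name; }
    }

    // Hypothetical stand-in for the question's Item class.
    static class Item {
        private final String key1;
        private final List<SubItem> subItems;
        Item(String key1, SubItem... subItems) {
            this.key1 = key1;
            this.subItems = Arrays.asList(subItems);
        }
        String getKey1() { return key1; }
        List<SubItem> getSubItems() { return subItems; }
    }

    static Map<String, Map<String, List<SubItem>>> group(List<Item> items) {
        return items.stream()
            // Pair each sub-item with the outer key of its parent item.
            .flatMap(item -> item.getSubItems().stream()
                .map(sub -> new AbstractMap.SimpleImmutableEntry<>(item.getKey1(), sub)))
            // Outer level: group by the entry key; inner level: group the values by key2.
            .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue,
                    Collectors.groupingBy(SubItem::getKey2))));
    }

    static List<Item> sample() {
        return Arrays.asList(
            new Item("A", new SubItem("x", "a1"), new SubItem("y", "a2")),
            new Item("A", new SubItem("x", "a3")),
            new Item("B", new SubItem("y", "b1")));
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<SubItem>>> result = group(sample());
        // Sub-items a1 and a3 come from different items but share key1 "A" and key2 "x".
        System.out.println(result.get("A").get("x").size()); // prints 2
    }
}
```

Note that sub-items from two different items land in the same inner list whenever both keys match, which is usually what a two-level grouping is meant to do.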

An alternative that does not require these temporary objects is to perform the flatMap operation right in the collector, but unfortunately, Collectors.flatMapping is only available from Java 9 onward.

With that, the solution would look like

Map<T, Map<V, List<SubItem>>> result = pojo.getItems().stream()
    .collect(Collectors.groupingBy(Item::getKey1,
                Collectors.flatMapping(item -> item.getSubItems().stream(),
                    Collectors.groupingBy(SubItem::getKey2))));

If we are stuck on Java 8, we can add a similar collector to our code base, as it’s not hard to implement:

static <T, U, A, R> Collector<T, ?, R> flatMapping(
    Function<? super T, ? extends Stream<? extends U>> mapper,
    Collector<? super U, A, R> downstream) {

    BiConsumer<A, ? super U> acc = downstream.accumulator();
    return Collector.of(downstream.supplier(),
        (a, t) -> {
            // Close each mapped stream after use; a null stream counts as empty.
            try (Stream<? extends U> s = mapper.apply(t)) {
                if (s != null) s.forEachOrdered(u -> acc.accept(a, u));
            }
        },
        downstream.combiner(), downstream.finisher(),
        downstream.characteristics().toArray(new Collector.Characteristics[0]));
}
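As a usage sketch, the backported collector can be dropped into a small class and called like the Java 9 version. The Order class and the sample data below are hypothetical and only serve to exercise the collector; the collector body itself is the one shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMappingBackportDemo {

    // The backported collector from above.
    static <T, U, A, R> Collector<T, ?, R> flatMapping(
            Function<? super T, ? extends Stream<? extends U>> mapper,
            Collector<? super U, A, R> downstream) {
        BiConsumer<A, ? super U> acc = downstream.accumulator();
        return Collector.of(downstream.supplier(),
            (a, t) -> {
                try (Stream<? extends U> s = mapper.apply(t)) {
                    if (s != null) s.forEachOrdered(u -> acc.accept(a, u));
                }
            },
            downstream.combiner(), downstream.finisher(),
            downstream.characteristics().toArray(new Collector.Characteristics[0]));
    }

    // Hypothetical order type: a customer name plus the items ordered.
    static class Order {
        private final String customer;
        private final List<String> items;
        Order(String customer, String... items) {
            this.customer = customer;
            this.items = Arrays.asList(items);
        }
        String getCustomer() { return customer; }
        List<String> getItems() { return items; }
    }

    // Groups orders by customer and flattens their item lists into one set per customer.
    static Map<String, Set<String>> itemsByCustomer(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(Order::getCustomer,
                flatMapping(order -> order.getItems().stream(), Collectors.toSet())));
    }

    static List<Order> sample() {
        return Arrays.asList(
            new Order("alice", "pen", "ink"),
            new Order("bob", "paper"),
            new Order("alice", "pen", "paper"));
    }

    public static void main(String[] args) {
        System.out.println(itemsByCustomer(sample()));
    }
}
```

Because the local method is called unqualified, the same class should also compile on Java 9+, where it simply takes precedence over nothing: Collectors.flatMapping is only picked up when you reference it explicitly or via a static import.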

Collectors nested grouping-by with multiple fields

If you can use Java 9 or higher, you can use Collectors.flatMapping() to achieve that:

Map<String, Map<String, Long>> eventList = list.stream()
        .collect(Collectors.groupingBy(MyObject::getSite, Collectors.flatMapping(
                o -> Stream.of(o.getSource(), o.getSeverity()),
                Collectors.groupingBy(Function.identity(), Collectors.counting())
        )));

The result will be this:

{
    USA={maint=2, HARMLESS=2}, 
    GERMANY={CPU_Checker=1, MINOR=1}
}
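The snippet above can be run as the following sketch on Java 9 or later. The MyObject class and the three sample events are assumptions chosen to reproduce the counts shown in the result; the original input list is not part of this answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SiteCountDemo {

    // Hypothetical event type; only the three getters used by the pipeline matter.
    static class MyObject {
        private final String site;
        private final String source;
        private final String severity;
        MyObject(String site, String source, String severity) {
            this.site = site;
            this.source = source;
            this.severity = severity;
        }
        String getSite() { return site; }
        String getSource() { return source; }
        String getSeverity() { return severity; }
    }

    static Map<String, Map<String, Long>> countBySite(List<MyObject> list) {
        return list.stream()
            .collect(Collectors.groupingBy(MyObject::getSite, Collectors.flatMapping(
                // Each object contributes two inner keys: its source and its severity.
                o -> Stream.of(o.getSource(), o.getSeverity()),
                Collectors.groupingBy(Function.identity(), Collectors.counting()))));
    }

    // Assumed sample data: two USA events and one GERMANY event.
    static List<MyObject> sample() {
        return Arrays.asList(
            new MyObject("USA", "maint", "HARMLESS"),
            new MyObject("USA", "maint", "HARMLESS"),
            new MyObject("GERMANY", "CPU_Checker", "MINOR"));
    }

    public static void main(String[] args) {
        System.out.println(countBySite(sample()));
    }
}
```

One thing to keep in mind with this layout: source and severity values share the same inner map, so a source that happens to have the same name as a severity would be counted under one key.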

If you are not able to use Java 9, you can implement flatMapping() yourself. The question "Java 9 Collectors.flatMapping rewritten in Java 8" should help you with that.

How to create a nested Map using Collectors.groupingBy?

Abstract / Brief discussion

Having a map of maps of maps is questionable when seen from an object-oriented perspective, as it might seem that you're lacking some abstraction (i.e. you could create a class Result that encapsulates the results of the nested grouping). However, it's perfectly reasonable when considered from a purely data-oriented point of view.

So here I present two approaches: the first one is purely data-oriented (with nested groupingBy calls, hence nested maps), while the second one is more OO-friendly and does a better job of abstracting the grouping criteria. Just pick the one that better represents your intentions and coding standards/traditions and, more importantly, the one you like most.

Data-oriented approach

For the first approach, you can just nest the groupingBy calls:

Map<String, Map<String, Map<String, List<Booker>>>> result = list.stream()
    .collect(Collectors.groupingBy(ProductDto::getStatus,
             Collectors.groupingBy(ProductDto::getCategory,
             Collectors.groupingBy(ProductDto::getType,
                 Collectors.mapping(
                         ProductDto::getBooker,
                         Collectors.toList())))));

As you see, the result is a Map<String, Map<String, Map<String, List<Booker>>>>. This is because there might be more than one ProductDto instance with the same (status, category, type) combination.

Also, as you need Booker instances instead of ProductDto instances, I'm adapting the innermost groupingBy collector so that it collects Booker instances instead of ProductDto instances.

About reduction

If you need to have only one Booker instance instead of a List<Booker> as the value of the innermost map, you would need a way to reduce Booker instances, i.e. convert many instances into one by means of an associative operation (accumulating the sum of some attribute being the most common one).
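A minimal sketch of such a reduction, assuming the simplest case where the bookers of each group are reduced to the sum of one numeric attribute. The classes below are hypothetical, and the amount field in particular is invented for illustration; the real Booker class may offer something different to accumulate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedSumDemo {

    // Hypothetical Booker; `amount` is an invented attribute to have something to sum.
    static class Booker {
        private final String name;
        private final int amount;
        Booker(String name, int amount) { this.name = name; this.amount = amount; }
        String getName() { return name; }
        int getAmount() { return amount; }
    }

    // Hypothetical ProductDto carrying the three grouping properties and a Booker.
    static class ProductDto {
        private final String status;
        private final String category;
        private final String type;
        private final Booker booker;
        ProductDto(String status, String category, String type, Booker booker) {
            this.status = status;
            this.category = category;
            this.type = type;
            this.booker = booker;
        }
        String getStatus() { return status; }
        String getCategory() { return category; }
        String getType() { return type; }
        Booker getBooker() { return booker; }
    }

    static Map<String, Map<String, Map<String, Integer>>> totals(List<ProductDto> list) {
        return list.stream()
            .collect(Collectors.groupingBy(ProductDto::getStatus,
                Collectors.groupingBy(ProductDto::getCategory,
                    Collectors.groupingBy(ProductDto::getType,
                        // Reduce all bookers of one group to a single number.
                        Collectors.summingInt(dto -> dto.getBooker().getAmount())))));
    }

    static List<ProductDto> sample() {
        return Arrays.asList(
            new ProductDto("ACTIVE", "BOOK", "EBOOK", new Booker("ann", 10)),
            new ProductDto("ACTIVE", "BOOK", "EBOOK", new Booker("ben", 5)),
            new ProductDto("ACTIVE", "BOOK", "PRINT", new Booker("cat", 7)),
            new ProductDto("CLOSED", "TOY", "KIT", new Booker("dan", 1)));
    }

    public static void main(String[] args) {
        // The two EBOOK entries are reduced to one value: 10 + 5.
        System.out.println(totals(sample()).get("ACTIVE").get("BOOK").get("EBOOK")); // prints 15
    }
}
```

If the result has to be a Booker rather than a number, Collectors.reducing with an identity Booker and an associative merge function would take the place of summingInt.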

Object-oriented friendly approach

For the second approach, having a Map<String, Map<String, Map<String, List<Booker>>>> might be seen as bad practice or even as pure evil. So, instead of having a map of maps of maps of lists, you could have only one map of lists whose keys represent the combination of the 3 properties you want to group by.

The easiest way to do this is to use a List as the key, as lists already provide hashCode and equals implementations:

Map<List<String>, List<Booker>> result = list.stream()
    .collect(Collectors.groupingBy(
         dto -> Arrays.asList(dto.getStatus(), dto.getCategory(), dto.getType()),
         Collectors.mapping(
                 ProductDto::getBooker,
                 Collectors.toList())));

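If you are on Java 9+, you can use List.of instead of Arrays.asList, as List.of returns a fully immutable, space-efficient list. Note that List.of throws a NullPointerException if any element is null, so stay with Arrays.asList if one of the properties can be null.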

Java 8 multilevel grouping and reducing

Is there any reason you made the name in EmployeeInfo final? If you can change that, this solution will work.

Add these two methods to EmployeeInfo:

public void setName(String name) {
    this.name = name;
}
public void addAccount(String account) {
    this.accounts.add(account);
}

and then you can do this:

Collector<Employee, EmployeeInfo, EmployeeInfo> empToInfo = Collector.of(
    () -> new EmployeeInfo("", new ArrayList<String>()),
    (info, e) -> {
        info.addAccount(e.getAccount());
        info.setName(e.getName());
    },
    (p1, p2) -> p1.addToList(p2));

Collector<Employee, ?, Collection<EmployeeInfo>> byName = collectingAndThen(groupingBy(Employee::getName, empToInfo), 
                  (Map<String, EmployeeInfo> finisher) -> {return finisher.values();});

Map<Integer, Collection<EmployeeInfo>> r2 = employees.stream().collect(groupingBy(Employee::getDept, byName));

If you want to keep EmployeeInfo immutable, you can use reduction instead of mutable collection, like this:

Map<Integer, Collection<EmployeeInfo>> result2 = employees.stream().collect(groupingBy(Employee::getDept,
             collectingAndThen(groupingBy(Employee::getName, reducing(new EmployeeInfo("", new ArrayList<String>()), 
                                                                      empl3 -> new EmployeeInfo(empl3.getName(),Arrays.asList(empl3.getAccount())), 
                                                                      (inf1, inf2) -> inf1.addToList(inf2))), 
                                finisher -> finisher.values())));
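For reference, here is a self-contained sketch of the immutable variant. It swaps groupingBy plus reducing for Collectors.toMap with a merge function, which avoids the need for an empty identity EmployeeInfo. The Employee and EmployeeInfo classes and the merge method are assumptions, since the question's classes (including addToList) are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class EmployeeMergeDemo {

    // Hypothetical input row: one account of one employee in one department.
    static class Employee {
        private final int dept;
        private final String name;
        private final String account;
        Employee(int dept, String name, String account) {
            this.dept = dept;
            this.name = name;
            this.account = account;
        }
        int getDept() { return dept; }
        String getName() { return name; }
        String getAccount() { return account; }
    }

    // Immutable summary; merge() returns a new instance instead of mutating.
    static final class EmployeeInfo {
        private final String name;
        private final List<String> accounts;
        EmployeeInfo(String name, List<String> accounts) {
            this.name = name;
            this.accounts = Collections.unmodifiableList(new ArrayList<>(accounts));
        }
        String getName() { return name; }
        List<String> getAccounts() { return accounts; }
        EmployeeInfo merge(EmployeeInfo other) {
            List<String> all = new ArrayList<>(accounts);
            all.addAll(other.accounts);
            return new EmployeeInfo(name, all);
        }
    }

    static Map<Integer, Collection<EmployeeInfo>> summarize(List<Employee> employees) {
        // One EmployeeInfo per name; entries with the same name are merged.
        Collector<Employee, ?, Map<String, EmployeeInfo>> mergeByName = Collectors.toMap(
            Employee::getName,
            e -> new EmployeeInfo(e.getName(), Collections.singletonList(e.getAccount())),
            EmployeeInfo::merge);
        // Only the merged values are needed, not the name keys.
        Collector<Employee, ?, Collection<EmployeeInfo>> infos =
            Collectors.collectingAndThen(mergeByName, byName -> byName.values());
        return employees.stream().collect(Collectors.groupingBy(Employee::getDept, infos));
    }

    static List<Employee> sample() {
        return Arrays.asList(
            new Employee(1, "alice", "a1"),
            new Employee(1, "alice", "a2"),
            new Employee(1, "bob", "b1"),
            new Employee(2, "carol", "c1"));
    }

    public static void main(String[] args) {
        summarize(sample()).forEach((dept, list) -> list.forEach(
            info -> System.out.println(dept + " " + info.getName() + " " + info.getAccounts())));
    }
}
```

The trade-off is the same as with reducing: every merge allocates a new EmployeeInfo and a new list, which is fine for small groups but more expensive than the mutable collector for large ones.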

Group By in java 8 with multiple levels of grouping

Based on the grouping requirement, you should try using nested grouping along with Collectors.mapping such as:

Map<String, Map<String, List<String>>> groupingRequirement = workersList.stream()
        .collect(Collectors.groupingBy(Workers::getDirectorName,
                Collectors.groupingBy(Workers::getManagerName,
                        Collectors.mapping(Workers::getEmployeeName,
                                Collectors.toList()))));

After that, all that remains is to map the entries of the collected Map to objects of the desired type:

List<WorkersResponse> workersResponses = groupingRequirement.entrySet().stream()
        .map(e -> new WorkersResponse(e.getKey(), // director name
                e.getValue().entrySet()
                        .stream()
                        .map(ie -> new Manager(ie.getKey(), // manager name
                                ie.getValue()
                                        .stream()
                                        .map(Employee::new)
                                        .collect(Collectors.toList())))
                        .collect(Collectors.toList())))
        .collect(Collectors.toList());

How to use groupingBy of stream API for multi level nested classes?

As mentioned in the older question you referenced, you can use the flatMap approach, where each nesting level is flat-mapped down to the level required, as follows:

    Map<Triple<Long, Long, Long>, Double> result = allStudents.stream()
            .flatMap(s -> s.getCourses().stream().map(
                    c -> ImmutableTuple.of(s.getStudentId(), c)))
            .flatMap(sc -> sc.get().getTasks().stream().map(
                    t -> ImmutableTuple.of(sc.getFirst(), sc.get().getCourseId(), t)))
            .flatMap(sct -> sct.get().getAssessments().stream().map(
                    a -> ImmutableTuple.of(sct.getFirst(), sct.getSecond(), sct.get().taskId, a.getScore())))
            .collect(Collectors.groupingBy(
                    ImmutableQuadruple::remove,
                    Collectors.summingDouble(ImmutableQuadruple::get)
            ));

Note: This is using tuples from the typedtuples library

Java 8 nested groupingby

Try the following:

public Map<String, List<Map<String, String>>> doGrouping(
        List<String> columns,
        List<Map<String, String>> data) {

    return data.stream()
        .collect(Collectors.groupingBy(
            elem -> columns.stream()
                .map(elem::get)
                .collect(Collectors.joining())));
}

First, I streamed the data, which is a list of maps. I immediately collected the stream to a map of lists using Collectors.groupingBy with a key that is calculated for each element of the stream.

Calculating the key was the tricky part. For this, I streamed the given list of columns and transformed each of these columns into its corresponding value in the current element of the stream. I did this by means of the Stream.map method, passing elem::get as the mapping function. Finally, I collected this inner stream into a single string by using Collectors.joining, which concatenates each element of the stream into a final string in an efficient manner.

Edit: The code above works well if all the elements of columns exist as keys of the map elements in data. To be safer, use the following:

return data.stream()
    .collect(Collectors.groupingBy(
        elem -> columns.stream()
            .map(elem::get)
            .filter(Objects::nonNull)
            .collect(Collectors.joining())));

This version filters out null elements from the stream, which might occur if some map element does not contain a key specified in the columns list.
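One caveat with a concatenated key is that different column values can produce the same string, for example "ab" + "c" and "a" + "bc". The sketch below contrasts the joined key with a List key, which keeps the values separate and also tolerates missing (null) values. The column names "a" and "b", the helper row, and the sample rows are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ColumnKeyDemo {

    // The approach from above: concatenate the column values into one string key.
    static Map<String, List<Map<String, String>>> groupByJoined(
            List<String> columns, List<Map<String, String>> data) {
        return data.stream()
            .collect(Collectors.groupingBy(
                elem -> columns.stream()
                    .map(elem::get)
                    .collect(Collectors.joining())));
    }

    // Variation: keep the column values separate by using a List as the key.
    static Map<List<String>, List<Map<String, String>>> groupByList(
            List<String> columns, List<Map<String, String>> data) {
        return data.stream()
            .collect(Collectors.groupingBy(
                elem -> columns.stream()
                    .map(elem::get)
                    .collect(Collectors.toList())));
    }

    // Hypothetical helper that builds one data row with columns "a" and "b".
    static Map<String, String> row(String a, String b) {
        Map<String, String> m = new HashMap<>();
        m.put("a", a);
        m.put("b", b);
        return m;
    }

    static List<Map<String, String>> sample() {
        return Arrays.asList(row("ab", "c"), row("a", "bc"));
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("a", "b");
        // Both rows collapse into the single key "abc".
        System.out.println(groupByJoined(columns, sample()).size()); // prints 1
        // The list keys [ab, c] and [a, bc] stay distinct.
        System.out.println(groupByList(columns, sample()).size());   // prints 2
    }
}
```

If a String key is required, passing a delimiter that cannot occur in the data to Collectors.joining is another way to reduce the risk of collisions.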

Using groupingBy into a nested Map, but collecting to a different type of object

With a custom collector like so:

private static Collector<Collection<SomeClassB>, ?, ImmutableList<SomeClassB>>
        flatMapToImmutableList() {
    return Collectors.collectingAndThen(Collectors.toList(),
            listOfCollectionsOfB ->
                    listOfCollectionsOfB.stream()
                            .flatMap(Collection::stream)
                            .collect(GuavaCollectors.toImmutableList()));
}

you can achieve what you're after:

Map<String, Map<String, List<SomeClassB>>> someMap =
                someListOfClassA.stream()
                        .filter(...)
                        .collect(Collectors.groupingBy(SomeClassA::getSomeCriteriaA,
                                Collectors.groupingBy(SomeClassA::getSomeCriteriaB,
                                        Collectors.mapping(a -> getSomeClassBsFromSomeClassA(a),
                                                flatMapToImmutableList()))));

Java Streams - group-by and return a Nested Map

Is there any direct or simple way to do this, like use Collectors.toMap API or something else?

If you want to utilize only built-in collectors, you might try a combination of groupingBy() and teeing(). Note that teeing() was added in Java 12.

Collectors.teeing() expects three arguments: two downstream collectors and a merger function. Each element from the stream is passed to both collectors, and when these collectors are done, their results are merged by the function.

In the code below, toMap() is used for both downstream collectors of teeing(). Each of these collectors is responsible for retrieving one kind of value.

The code might look like this:

public static void main(String[] args) {
    List<Unit> list =
        List.of(Unit.of("a", 2021, 10,  11 ),
                Unit.of("a", 2022, 15,  13),
                Unit.of("b", 2021, 20,  25),
                Unit.of("b", 2022, 30,  37));

    Map<String, Map<String, Integer>> map = list.stream()
        .collect(Collectors.groupingBy(Unit::getUnitId,
            Collectors.teeing(
                Collectors.toMap(
                    unit -> unit.getYear() + "_value1",
                    Unit::getValue1),
                Collectors.toMap(
                    unit -> unit.getYear() + "_value2",
                    Unit::getValue2),
                (values1, values2) -> {values1.putAll(values2); return values1;})
        ));

    printMap(map);
}

Output:

a: {2022_value2: 13, 2021_value1: 10, 2022_value1: 15, 2021_value2: 11}
b: {2022_value2: 37, 2021_value1: 20, 2022_value1: 30, 2021_value2: 25}

Note:

If performance is a concern, a custom collector built with Collector.of() would be slightly better, because it avoids the two intermediate maps that teeing() creates and then merges.
For this approach to work correctly (both the code above and the code in the question), each combination of unitId and year must be unique; with the code above, a duplicate key makes toMap() throw an IllegalStateException. Otherwise, consider adding logic for resolving duplicates.
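Both points of the note can be sketched with a single Collector.of() collector that writes the two values of each unit straight into one map. The Unit class below is a hypothetical reconstruction from the getters used above, and summing the values of duplicate (unitId, year) combinations is only one possible resolution strategy, chosen here as an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class UnitCollectorDemo {

    // Hypothetical Unit class mirroring the getters used above.
    static class Unit {
        private final String unitId;
        private final int year;
        private final int value1;
        private final int value2;
        private Unit(String unitId, int year, int value1, int value2) {
            this.unitId = unitId;
            this.year = year;
            this.value1 = value1;
            this.value2 = value2;
        }
        static Unit of(String unitId, int year, int value1, int value2) {
            return new Unit(unitId, year, value1, value2);
        }
        String getUnitId() { return unitId; }
        int getYear() { return year; }
        int getValue1() { return value1; }
        int getValue2() { return value2; }
    }

    // Puts both values of each unit into one map; duplicate keys are summed (an assumption).
    static Collector<Unit, ?, Map<String, Integer>> yearValues() {
        return Collector.of(
            HashMap::new,
            (map, unit) -> {
                map.merge(unit.getYear() + "_value1", unit.getValue1(), Integer::sum);
                map.merge(unit.getYear() + "_value2", unit.getValue2(), Integer::sum);
            },
            (left, right) -> {
                // Only used for parallel streams: fold the right partial map into the left one.
                right.forEach((key, value) -> left.merge(key, value, Integer::sum));
                return left;
            });
    }

    static Map<String, Map<String, Integer>> group(List<Unit> units) {
        return units.stream()
            .collect(Collectors.groupingBy(Unit::getUnitId, yearValues()));
    }

    static List<Unit> sample() {
        return Arrays.asList(
            Unit.of("a", 2021, 10, 11),
            Unit.of("a", 2022, 15, 13),
            Unit.of("b", 2021, 20, 25),
            Unit.of("b", 2022, 30, 37));
    }

    public static void main(String[] args) {
        System.out.println(group(sample()));
    }
}
```

Replacing Integer::sum with a function that throws would restore the strict behavior of the teeing() version for duplicate keys.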
