Java 8 Streams to Find Duplicate Elements

How to check whether any duplicates exist using Java 8 Streams?

Your code would need to iterate over all elements. If you just want to make sure that there are no duplicates, a simple method like

public static <T> boolean areAllUnique(List<T> list) {
    Set<T> set = new HashSet<>();

    for (T t : list) {
        if (!set.add(t))
            return false;
    }

    return true;
}

would be more efficient, since it can return false immediately when the first non-unique element is found.

This method can also be rewritten using Stream#allMatch, which is likewise short-circuiting (it returns false immediately for the first element that doesn't fulfill the provided condition):

(assuming non-parallel streams and thread-safe environment)

public static <T> boolean areAllUnique(List<T> list) {
    Set<T> set = new HashSet<>();
    return list.stream().allMatch(t -> set.add(t));
}

which can be further shortened, as @Holger pointed out in a comment:

public static <T> boolean areAllUnique(List<T> list) {
    return list.stream().allMatch(new HashSet<>()::add);
}
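As a quick sanity check, the short-circuiting version behaves like this (the class name and sample lists below are illustrative):

```java
import java.util.HashSet;
import java.util.List;

public class AreAllUniqueDemo {
    public static <T> boolean areAllUnique(List<T> list) {
        // HashSet#add returns false for an element that is already present,
        // so allMatch stops at the first duplicate
        return list.stream().allMatch(new HashSet<>()::add);
    }

    public static void main(String[] args) {
        System.out.println(areAllUnique(List.of(1, 2, 3)));    // true
        System.out.println(areAllUnique(List.of(1, 2, 2, 3))); // false
    }
}
```

Note that the method reference `new HashSet<>()::add` binds a single set instance that is created once, when the reference is evaluated, which is why this works for a single sequential stream pass.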

Find Duplicated Elements in a List of Integer without using distinct() method

You can do this using Collections.frequency inside a filter. I have not looked at optimization, but this does not use distinct():

import java.util.Set;
import java.util.List;
import java.util.ArrayList;
import java.util.Collections;
import static java.util.stream.Collectors.toSet;

public class DetectDuplicates {
    public static void main(String[] args) {
        List<Integer> companyIds = new ArrayList<>();
        companyIds.add(1);
        companyIds.add(1);
        companyIds.add(2);
        companyIds.add(3);
        companyIds.add(3);
        Set<Integer> duplicateCompanies = companyIds
                .stream()
                .filter(company -> Collections.frequency(companyIds, company) > 1)
                .collect(toSet());
        System.out.println("Duplicate companies " + duplicateCompanies);
    }
}

This will print

Duplicate companies [1, 3]
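Note that Collections.frequency rescans the whole list for every element, so the approach above is O(n²). One possible optimization (not part of the original answer; the class name is illustrative) is to count occurrences in a single pass with groupingBy and counting:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DetectDuplicatesFast {
    // Count occurrences in one pass, then keep the ids whose count exceeds 1
    static Set<Integer> duplicates(List<Integer> ids) {
        Map<Integer, Long> counts = ids.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println("Duplicate companies " + duplicates(List.of(1, 1, 2, 3, 3)));
    }
}
```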

How to remove Duplicated elements from a List based on Two properties using Java 8 streams?

One possibility to do what we want is to

  • collect all results in some Collection<ExampleObject>,
  • construct a TreeSet<ExampleObject> with a suitable Comparator<? super ExampleObject> such that duplicates wrt. our specification compare as "equal" (the set then filters them out automatically), and
  • add all elements from the Collection to the TreeSet.

A stream-based implementation may look like this:

final TreeSet<ExampleObject> deduped = objects.stream()
    .collect(Collectors.toCollection(() -> new TreeSet<>(
        Comparator.comparing(ExampleObject::getName)
                  .thenComparing(ExampleObject::getValue))));


If we do not like the stream-based approach, we can also solve this with a "traditional", imperative approach:

final TreeSet<ExampleObject> deduped = new TreeSet<>(
    Comparator.comparing(ExampleObject::getName)
              .thenComparing(ExampleObject::getValue));
deduped.addAll(objects);



A word on performance:

The deduplication does not come "for free". In the solution provided, we pay for it with execution time. TreeSet is an ordered data structure, thus each insert has time complexity O(log(n)). It then follows that constructing a set of size n has time complexity O(n log(n)).
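Putting it together, here is a self-contained sketch; ExampleObject is a minimal stand-in (the original question's class is not shown) with the two assumed properties:

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class DedupByTwoProperties {
    // Minimal stand-in for the ExampleObject used in the answer
    record ExampleObject(String name, int value) {
        String getName() { return name; }
        int getValue() { return value; }
    }

    static TreeSet<ExampleObject> dedup(List<ExampleObject> objects) {
        // Two objects with equal (name, value) compare as 0 and are kept only once
        return objects.stream()
                .collect(Collectors.toCollection(() -> new TreeSet<>(
                        Comparator.comparing(ExampleObject::getName)
                                  .thenComparing(ExampleObject::getValue))));
    }

    public static void main(String[] args) {
        List<ExampleObject> objects = List.of(
                new ExampleObject("a", 1),
                new ExampleObject("a", 1),   // duplicate wrt. (name, value)
                new ExampleObject("a", 2));
        System.out.println(dedup(objects).size()); // 2
    }
}
```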

Find duplicates in two List<int[]> lists using Streams

Given your two lists:

List<int[]> keys = // ....
List<int[]> phaseKey = //...

You just need to filter to find common arrays in both lists:

List<int[]> duplicates = keys.stream()
    .filter(k -> phaseKey.stream().anyMatch(p -> Arrays.equals(p, k)))
    .collect(Collectors.toList());
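Note that Arrays.equals is needed because int[] does not override equals. The nested anyMatch makes this O(n·m); if the lists are large, one possible alternative (not in the original answer; the class name and sample data are illustrative) is to index one list by a value-based key such as List<Integer>, which does have element-wise equals and hashCode:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class CommonArrays {
    static List<int[]> duplicates(List<int[]> keys, List<int[]> phaseKey) {
        // Box each int[] into a List<Integer> so it can serve as a hash key
        Set<List<Integer>> seen = phaseKey.stream()
                .map(p -> Arrays.stream(p).boxed().collect(Collectors.toList()))
                .collect(Collectors.toSet());
        return keys.stream()
                .filter(k -> seen.contains(Arrays.stream(k).boxed().collect(Collectors.toList())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<int[]> keys = List.of(new int[]{1, 2}, new int[]{3, 4});
        List<int[]> phaseKey = List.of(new int[]{1, 2}, new int[]{5, 6});
        System.out.println(duplicates(keys, phaseKey).size()); // 1
    }
}
```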

How to find duplicate values in a Map within a stream of a list?

Holger, in the comments, raises a point I missed. If your Map contains only the name and age properties, you could simply do:

fileDataList.stream()
    .distinct()
    .collect(Collectors.toList());

And this will be enough. If, on the other hand, you have more properties and want to filter by only some of them, you could use this utility:

fileDataList.stream()
    .filter(distinctByKey(x -> Arrays.asList(x.get("name"), x.get("age"))))
    .collect(Collectors.toList());
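The distinctByKey utility referenced above is not shown in the answer; it is commonly implemented as a stateful predicate that remembers the keys it has already seen (the class name and sample data below are illustrative):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {
    // The first element with a given key passes the filter; later ones do not
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    public static void main(String[] args) {
        List<Map<String, String>> fileDataList = List.of(
                Map.of("name", "a", "age", "1", "city", "x"),
                Map.of("name", "a", "age", "1", "city", "y"),  // same name+age, dropped
                Map.of("name", "b", "age", "2", "city", "x"));

        List<Map<String, String>> unique = fileDataList.stream()
                .filter(distinctByKey(x -> List.of(x.get("name"), x.get("age"))))
                .collect(Collectors.toList());

        System.out.println(unique.size()); // 2
    }
}
```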

Obtaining unique numbers from a list of duplicate integers using Java 8 streams

Your filter rejects the first occurrence of each element and accepts all subsequent occurrences. Therefore, when an element occurs n times, you’ll add it n-1 times.

Since you want to accept all elements which occur more than once, but only accept them a single time, you could use .filter(n -> !setOfNmums.add(n)).distinct(), or you can enhance the set to a map, so that an element is accepted only on its second occurrence:

Map<Integer, Integer> occurrences = new HashMap<>();
List<String> result = Stream.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4)
    .filter(n -> occurrences.merge(n, 1, Integer::sum) == 2)
    .map(String::valueOf)
    .sorted()
    .collect(Collectors.toList());

But generally, using stateful filters with streams is discouraged.

A cleaner solution would be

List<String> result = Stream.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4)
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(String::valueOf, x -> true, (a, b) -> false, TreeMap::new),
        map -> { map.values().removeIf(b -> b); return new ArrayList<>(map.keySet()); }));

Note that this approach doesn’t count the occurrences but only remembers whether an element is unique or has been seen at least a second time. This works by mapping each element to true with the second argument to the toMap collector, x -> true, and resolving multiple occurrences with a merge function of (a, b) -> false. The subsequent map.values().removeIf(b -> b) then removes all unique elements, i.e. those still mapped to true.
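Run end-to-end, the collector above yields the duplicated values as strings, sorted by the TreeMap (the wrapper class below is illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicatesViaToMap {
    static List<String> duplicates(Stream<Integer> stream) {
        return stream.collect(Collectors.collectingAndThen(
                // value true = seen exactly once so far; the merge function
                // flips any repeated key to false
                Collectors.toMap(String::valueOf, x -> true, (a, b) -> false, TreeMap::new),
                map -> {
                    map.values().removeIf(b -> b);        // drop unique elements
                    return new ArrayList<>(map.keySet()); // keys are already sorted
                }));
    }

    public static void main(String[] args) {
        System.out.println(duplicates(Stream.of(5, 6, 7, 7, 7, 6, 2, 4, 2, 4)));
        // prints [2, 4, 6, 7]
    }
}
```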

Extract duplicate objects from a List in Java 8

If you can implement equals and hashCode on Person, you can then use a counting downstream collector with groupingBy to get the distinct elements that are duplicated.

List<Person> duplicates = personList.stream()
    .collect(groupingBy(identity(), counting()))
    .entrySet().stream()
    .filter(n -> n.getValue() > 1)
    .map(n -> n.getKey())
    .collect(toList());

If you would like to keep a list of the sequentially repeated elements, you can expand it back out using Collections.nCopies. This approach ensures that repeated elements are ordered together.

List<Person> duplicates = personList.stream()
    .collect(groupingBy(identity(), counting()))
    .entrySet().stream()
    .filter(n -> n.getValue() > 1)
    .flatMap(n -> nCopies(n.getValue().intValue(), n.getKey()).stream())
    .collect(toList());
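A minimal runnable version, using a record (which supplies the required equals and hashCode) as a stand-in for the question's Person class:

```java
import java.util.List;
import java.util.Map;
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

public class ExtractDuplicates {
    // record gives us equals/hashCode based on name and age
    record Person(String name, int age) {}

    static List<Person> duplicates(List<Person> personList) {
        return personList.stream()
                .collect(groupingBy(identity(), counting()))
                .entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(toList());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Ann", 30), new Person("Ann", 30), new Person("Bob", 25));
        System.out.println(duplicates(people)); // [Person[name=Ann, age=30]]
    }
}
```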

Find duplicate value in map using Java Stream API

You can use Collectors.groupingBy to group by key and Collectors.mapping to map the values and collect them as a list per key:

Map<String, List<String>> idToMap =
    roles.stream()
         .collect(Collectors.groupingBy(e -> e,
                 Collectors.mapping(e -> getName(e), Collectors.toList())));

Also, map is an intermediate (lazy) operation, so the code inside your .map is never executed without a terminal operation. You can use a terminal operation like forEach by refactoring your current code:

roles.forEach(role -> {
    if (idToMap.containsKey(role)) {
        idToMap.get(role).add(getName(role));
    } else {
        idToMap.put(role, new ArrayList<>(Arrays.asList(getName(role))));
    }
});

which can be simplified using Map's merge method:

roles.forEach(role ->
    idToMap.merge(role, new ArrayList<>(Arrays.asList(getName(role))), (a, b) -> {
        a.addAll(b);
        return a;
    }));

Update: If you just want to print the duplicate values, you can use Collectors.counting() to get the frequency of each key, collected as a Map<String, Long>:

roles.stream()
     .collect(Collectors.groupingBy(e -> e, Collectors.counting()))
     .entrySet()
     .forEach(e -> {
         if (e.getValue() > 1) {
             System.out.println("found a key which has duplicate value : " + e.getKey());
         }
     });
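A self-contained sketch of the grouping approach; getName here is a hypothetical stand-in for the asker's lookup method, which is not shown in the question:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateRoles {
    // Stand-in for the asker's getName lookup; it just prefixes the id
    static String getName(String role) { return "name-" + role; }

    static Map<String, List<String>> idToMap(List<String> roles) {
        return roles.stream()
                .collect(Collectors.groupingBy(e -> e,
                        Collectors.mapping(DuplicateRoles::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> idToMap = idToMap(List.of("admin", "user", "admin"));
        idToMap.forEach((role, names) -> {
            if (names.size() > 1) {
                System.out.println("found a key which has duplicate value : " + role);
            }
        });
    }
}
```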

