Java - Best way to compare two Sets by hashes (via ==)
Since the two Set
s are not the same instance, you can't use ==
to compare the Set
s.
Now, if you use Set
's equals
, it (and by "it" I mean the default implementation of AbstractSet
) will validate the two Set
s have the same size and then iterate over the elements of one Set
and check if the other Set
contains each of them.
If, for example, you are using HashSet
s, in order to find if a String
is contained in some HashSet
, hashCode()
will have to be used to find the bucket that may contain the searched String
, but later String
's equals()
will be called to validate the presence of a String
equal to the searched String
. String
's equals()
actually begins with
if (this == anObject) {
return true;
}
so the fact that your two Set
s may contain references to the same String
instance will be helpful in improving the runtime of the Set
s comparison.
Compare two sets of different types
This is what I would do in your case. You have sets. Sets are hard to compare, but on top of that, you want to compare on their id.
I see only one proper solution where you have to normalize the wanted values (extract their id) then sort those ids, then compare them in order, because if you don't sort and compare you can possibly skip pass over duplicates and/or values.
Think about the fact that Java 8 allows you to play lazy with streams. So don't rush over and think that extracting, then sorting then copying is long. Lazyness allows it to be rather fast compared to iterative solutions.
HashSet<ValueObject> valueObjects = new HashSet<>();
valueObjects.add(ValueObject.of(1));
valueObjects.add(ValueObject.of(2));
valueObjects.add(ValueObject.of(3));
HashSet<DTO> dtos = new HashSet<>();
dtos.add(DTO.of(1));
dtos.add(DTO.of(2));
dtos.add(DTO.of(34));
boolean areIdentical = Arrays.equals(
valueObjects.stream()
.mapToInt((v) -> v.id)
.sorted()
.toArray(),
dtos.stream()
.mapToInt((d) -> d.id)
.sorted()
.toArray()
);
You want to generalize the solution? No problem.
public static <T extends Comparable<?>> boolean areIdentical(Collection<ValueObject> vos, Function<ValueObject, T> voKeyExtractor, Collection<DTO> dtos, Function<DTO, T> dtoKeyExtractor) {
return Arrays.equals(
vos.stream()
.map(voKeyExtractor)
.sorted()
.toArray(),
dtos.stream()
.map(dtoKeyExtractor)
.sorted()
.toArray()
);
}
And for a T
that is not comparable:
public static <T> boolean areIdentical(Collection<ValueObject> vos, Function<ValueObject, T> voKeyExtractor, Collection<DTO> dtos, Function<DTO, T> dtoKeyExtractor, Comparator<T> comparator) {
return Arrays.equals(
vos.stream()
.map(voKeyExtractor)
.sorted(comparator)
.toArray(),
dtos.stream()
.map(dtoKeyExtractor)
.sorted(comparator)
.toArray()
);
}
You mention Guava and if you don't have Java 8, you can do the following, using the same algorithm:
List<Integer> voIds = FluentIterables.from(valueObjects)
.transform(valueObjectIdGetter())
.toSortedList(intComparator());
List<Integer> dtoIds = FluentIterables.from(dtos)
.transform(dtoIdGetter())
.toSortedList(intComparator());
return voIds.equals(dtoIds);
Getting the difference between two sets
Try this
test2.removeAll(test1);
Set#removeAll
Removes from this set all of its elements that are contained in the specified collection (optional operation). If the specified collection is also a set, this operation effectively modifies this set so that its value is the asymmetric set difference of the two sets.
How can I (more) easily compare two sets of numbers?
EDIT: I just realized you said these sets are hardcoded at 3 values. This is a super general algorithm for sets of any size.
For a 3-value set, you can do the same dumping and sorting of set elements, and then do:
if(setB.a < setA.a)
if(setB.b < setA.b)
if(setB.c < setA.c)
return true;
return false;
======================================================
A general algorithm:
This is the most efficient way I can immediately think of to do this.
pseudocode (more pythonic than java, sorry-- hopefully the comments explain):
list l1 = set1.items() //get the items out
list l2 = set2.items()
l1 = sort(l1)
l2 = sort(l2) //sort the lists
int set2idx1 = l1[0].find_closest_greater_than_value(l2) //binary search or something
if set2idx1 exists:
l2 = l2[set2idx1+1:] //in python this means l2 is reassigned to a subarray of l2 starting at set2idx1+1 going to the end of l2
else:
return false
for(int i=1; i<l1.len; i++)
int set2idxi = l1[i].find_closest_greater_than_value(l2) //binary search or something
if set2idxi exists:
l2 = l2[set2idxi+1:]
else
return false
return true
comment if anything doesn't make sense
EDIT EDIT:
explanation of the general algorithm for any interested parties:
- dump the set elements into arrays
- sort those arrays
- iterate through the first array, seeing if there is a value in the 2nd array that is greater than the current value. If so, get the index of that value and lop off everything before and including that index and reassign the 2nd array variable to what remains.
- if there ever is no such value (either because it doesn't exist or you've run out of values to test, return false). Otherwise, at the end, return true.
The idea here is that since the arrays are sorted, you know that any element that is greater than the matched element in the 2nd array will be greater than the element you're testing against in the 1st array. So you can just chop off the lower values, and since you don't want to use the same value, you can chop off the one you found, too. If you ever return false you know it's because either there were no greater values, either because the numbers in array1 were all greater than the numbers in array2, or because there weren't enough numbers in array2 greater than the numbers in array1.
How to compare two Sets of objects Not overriding equals/hashCode with Comparator and Streams
A Fix for Comparator
In order to create a Comparator
in this case you need to provide generic type information explicitly, like that: <Book, String>comparing()
, where Book
and String
are the parameter type and return type of the Function
that Comparator.comparing()
expects as a parameter.
Without an explicit declaration, the compiler has not enough data to determine the type of the variable book
, and inside both comparing()
and thenComparing()
its type will be inferred as Object
.
Comparator<Book> comparingBooks =
Comparator.<Book, String>comparing(book -> book.getName())
.thenComparing(book -> book.getType());
If the case if only one of the static methods was used, the type of the book
variable will be correctly inferred by the compiler as Book
based on the type of the locale variable Comparator<Book> comparingBooks
.
Both method lambda expressions could be replaced with method references:
Comparator<Book> comparingBooks =
Comparator.<Book, String>comparing(Book::getName)
.thenComparing(Book::getType);
for information on the syntax of generic methods, take a look at this tutorial
for more information on how to build comparators with Java 8 methods take a look at this tutorial
Stream
should validate any difference of Name and Type by returning true, if both Sets have exactly the same objects, it should return false.
Stream s1.stream().anyMatch(a -> s2.stream().anyMatch(b -> ...))
is currently checking whether there's an element in the first set s1
that is different from ANY of the elements in s2. I.e. the nested anyMatch()
will return true
for the first element encountered in s1 that has not matching element in s2. After the comparison of Book("book 1", BookType.ONE)
(s1) with Book("book 2", BookType.TWO)
(s2) anyMatch()
terminates by returning true
. And enclosing anyMatch()
operation propagates this result.
Precisely the same will happen with the second example. The same pair of elements differ, although you've changed a type, names are not equal. Result is true
.
Note:
- that by convention classes are usually named with singular nouns, i.e.
Book
(not Books),Event
,Person
, etc. - as a rule of thumb, if your objects are intended to be used with collection they must implement
equals/hashCode
contract, your requirement is very unusual.
Related Topics
Java Wait for Thread to Finish
How to Enumerate All Classes in a Package and Add Them to a List
Replace the Last Part of a String
Java 8 Instant.Now() with Nanosecond Resolution
Java Division by Zero Doesnt Throw an Arithmeticexception - Why
Comparing Float/Double Values Using == Operator
Drawing an Object Using Getgraphics() Without Extending Jframe
What Does an Assignment Expression Evaluate to in Java
Running Multiple Launch Configurations at Once
"Int Cannot Be Dereferenced" in Java
Java Date - Insert into Database
Generic Return Type Upper Bound - Interface VS. Class - Surprisingly Valid Code
Dbcp - Validationquery for Different Databases
Webdriver: Check If an Element Exists
How to Intentionally Cause a Custom Java Compiler Warning Message