Apache Commons Equals/Hashcode Builder

The commons/lang builders are great and I have been using them for years without noticeable performance overhead (with and without hibernate). But as Alain writes, the Guava way is even nicer:

Here's a sample Bean:

public class Bean{

    private String name;
    private int length;
    private List<Bean> children;

}

Here's equals() and hashCode() implemented with Commons/Lang:

@Override
public int hashCode(){
    return new HashCodeBuilder()
        .append(name)
        .append(length)
        .append(children)
        .toHashCode();
}

@Override
public boolean equals(final Object obj){
    if(obj instanceof Bean){
        final Bean other = (Bean) obj;
        return new EqualsBuilder()
            .append(name, other.name)
            .append(length, other.length)
            .append(children, other.children)
            .isEquals();
    } else{
        return false;
    }
}

and here with Java 7 or higher (inspired by Guava):

@Override
public int hashCode(){
    return Objects.hash(name, length, children);
}

@Override
public boolean equals(final Object obj){
    if(obj instanceof Bean){
        final Bean other = (Bean) obj;
        return Objects.equals(name, other.name)
            && length == other.length // special handling for primitives
            && Objects.equals(children, other.children);
    } else{
        return false;
    }
}

Note: this code originally referenced Guava, but as comments have pointed out, this functionality has since been introduced in the JDK, so Guava is no longer required.
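For anyone who wants to try the JDK variant, here is a minimal self-contained sketch of the same Bean. The constructor and the main() method are my additions for illustration, and the reference-equality shortcut is optional:

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Self-contained variant of the sample Bean. The constructor and main()
// are additions for illustration and are not part of the original answer.
class Bean {

    private final String name;
    private final int length;
    private final List<Bean> children;

    Bean(String name, int length, List<Bean> children) {
        this.name = name;
        this.length = length;
        this.children = children;
    }

    @Override
    public int hashCode() {
        // Null-safe; note that 'length' is boxed and a varargs array is created.
        return Objects.hash(name, length, children);
    }

    @Override
    public boolean equals(final Object obj) {
        if (obj == this) {
            return true; // optional shortcut for reference equality
        }
        if (!(obj instanceof Bean)) {
            return false; // instanceof is false for null, so this covers null too
        }
        final Bean other = (Bean) obj;
        return Objects.equals(name, other.name)
                && length == other.length
                && Objects.equals(children, other.children);
    }

    public static void main(String[] args) {
        Bean a = new Bean("root", 3, Collections.singletonList(new Bean(null, 0, null)));
        Bean b = new Bean("root", 3, Collections.singletonList(new Bean(null, 0, null)));
        System.out.println(a.equals(b));                  // equal field by field
        System.out.println(a.hashCode() == b.hashCode()); // equal objects, equal hashes
    }
}
```

Null fields are handled by Objects.equals() and Objects.hash(), so no explicit null checks are needed for name or children.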

As you can see, the Guava / JDK version is shorter and avoids superfluous helper objects. In the case of equals(), it even allows short-circuiting the evaluation if an earlier Objects.equals() call returns false (to be fair: commons / lang has an ObjectUtils.equals(obj1, obj2) method with identical semantics, which could be used instead of EqualsBuilder to allow short-circuiting as above).

So: yes, the commons lang builders are far preferable to manually constructed equals() and hashCode() methods (or those awful monsters Eclipse will generate for you), but the Java 7+ / Guava versions are even better.

And a note about Hibernate:

Be careful about using lazy collections in your equals(), hashCode() and toString() implementations. Touching an uninitialized lazy collection will fail miserably if you don't have an open Session (typically with a LazyInitializationException).
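One way to stay out of trouble is to leave the collection out of these methods entirely. The following is only a sketch under assumptions: the class Customer, its customerNumber business key and the orders list are invented for illustration, and the mapping annotations are omitted so that it compiles without Hibernate on the classpath:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical entity; 'Customer', 'customerNumber' and 'orders' are invented
// for this sketch. Mapping annotations are left out on purpose.
class Customer {

    // Assumed to be an immutable business key.
    private final String customerNumber;

    // Stands in for a lazily loaded collection.
    private final List<String> orders = new ArrayList<>();

    Customer(String customerNumber) {
        this.customerNumber = customerNumber;
    }

    List<String> getOrders() {
        return orders;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof Customer)) {
            return false;
        }
        // Only the business key is compared; 'orders' is never touched.
        return Objects.equals(customerNumber, ((Customer) obj).customerNumber);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(customerNumber);
    }

    @Override
    public String toString() {
        // Deliberately does not print 'orders'.
        return "Customer[" + customerNumber + "]";
    }
}
```

Whether a business key is available depends on your model, so treat this as one option rather than a rule.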


Note (about equals()):

a) in both versions of equals() above, you might want to use one or both of these shortcuts also:

@Override
public boolean equals(final Object obj){
    if(obj == this) return true; // test for reference equality
    if(obj == null) return false; // test for null
    // continue as above

b) depending on your interpretation of the equals() contract, you might also change the line(s)

    if(obj instanceof Bean){

to

    // make sure you run a null check before this
    if(obj.getClass() == getClass()){

If you use the second version, you probably also want to call super.equals(obj) inside your equals() method. Opinions differ here; the topic is discussed in this question:

right way to incorporate superclass into a Guava Objects.hashcode() implementation?

(although it's about hashCode(), the same applies to equals())
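To make the difference between the two checks concrete, here is a small sketch. All four classes are hypothetical and only serve to compare the instanceof style with the getClass() style when a subclass adds a field:

```java
import java.util.Objects;

// All four classes are hypothetical and exist only for this comparison.
class Point {
    final int x;
    final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Point)) return false; // subclasses pass this test
        Point other = (Point) obj;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

class ColoredPoint extends Point {
    final String color;

    ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof ColoredPoint)) return false; // a plain Point fails here
        return super.equals(obj) && Objects.equals(color, ((ColoredPoint) obj).color);
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + Objects.hashCode(color); }
}

class StrictPoint {
    final int x;
    final int y;

    StrictPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (obj == null || obj.getClass() != getClass()) return false; // exact class only
        StrictPoint other = (StrictPoint) obj;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

class StrictColoredPoint extends StrictPoint {
    final String color;

    StrictColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }

    @Override
    public boolean equals(Object obj) {
        // super.equals() has already verified that obj has this exact class.
        return super.equals(obj) && Objects.equals(color, ((StrictColoredPoint) obj).color);
    }

    @Override
    public int hashCode() { return 31 * super.hashCode() + Objects.hashCode(color); }
}
```

With the instanceof style, new Point(1, 2).equals(new ColoredPoint(1, 2, "red")) is true while the reverse call is false, which breaks symmetry. With the getClass() style both calls return false, at the price that a subclass instance can never equal a base class instance.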


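Note (inspired by a comment from kayahr):

Objects.hash(...) (Objects.hashCode(...) in Guava), just like the underlying Arrays.hashCode(...), might perform badly if you have many primitive fields, because every primitive is boxed and a varargs array is allocated on each call. In such cases, HashCodeBuilder, which has append() overloads for primitives, may actually be the better solution.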
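If the boxing matters and you don't want a library for it, the same 31-based accumulation can be written by hand. This is a sketch assuming Java 8 or later (for the static hashCode helpers on the wrapper classes); the class Sample and its fields are made up for illustration:

```java
import java.util.Objects;

// Hypothetical class with primitive fields only.
class Sample {
    final int count;
    final long timestamp;
    final double ratio;
    final boolean active;

    Sample(int count, long timestamp, double ratio, boolean active) {
        this.count = count;
        this.timestamp = timestamp;
        this.ratio = ratio;
        this.active = active;
    }

    // Convenient, but boxes every primitive and allocates an Object[] per call.
    int boxedHash() {
        return Objects.hash(count, timestamp, ratio, active);
    }

    // Same accumulation scheme as Arrays.hashCode, without boxing or an array.
    @Override
    public int hashCode() {
        int result = 1;
        result = 31 * result + Integer.hashCode(count);
        result = 31 * result + Long.hashCode(timestamp);
        result = 31 * result + Double.hashCode(ratio);
        result = 31 * result + Boolean.hashCode(active);
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Sample)) return false;
        Sample other = (Sample) obj;
        return count == other.count
                && timestamp == other.timestamp
                && Double.compare(ratio, other.ratio) == 0 // consistent with Double.hashCode
                && active == other.active;
    }
}
```

Both methods produce the same value for the same field contents; only the allocation behaviour differs. Whether that difference is measurable in your application is something to profile rather than assume.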

Guava Vs Apache Commons Hash/Equals builders

I'd call this difference "existence". There are EqualsBuilder and HashCodeBuilder in Apache Commons and there are no such builders in Guava. All you get from Guava are static helpers: Objects.equal() and Objects.hashCode(), plus the utility class MoreObjects (its methods were moved out of Guava's Objects because there's a class of that name in the JDK now).

The advantages of Guava's approach come from the non-existence of the builder:

  • it produces no garbage
  • it's faster

The JIT compiler can possibly eliminate the garbage, and the associated overhead, via escape analysis. In that case both approaches become equally fast, as they do exactly the same work.

I personally find the builders slightly more readable. If you prefer code without them, then Guava is surely the right thing for you. As you can see, the static methods are good enough for the task.

Note also that Guava has a ComparisonChain, which is a sort of Comparable-builder.
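For comparison, the JDK (8 or later) offers a similar chained style through Comparator. This is not ComparisonChain itself, just a JDK-only sketch of the same idea; the class Person and its fields are invented for illustration:

```java
import java.util.Comparator;

// Hypothetical class; 'Person', 'lastName', 'firstName' and 'age' are invented here.
class Person implements Comparable<Person> {

    // Compare by last name, then first name, then age.
    private static final Comparator<Person> ORDER =
            Comparator.comparing((Person p) -> p.lastName)
                    .thenComparing(p -> p.firstName)
                    .thenComparingInt(p -> p.age);

    private final String lastName;
    private final String firstName;
    private final int age;

    Person(String lastName, String firstName, int age) {
        this.lastName = lastName;
        this.firstName = firstName;
        this.age = age;
    }

    @Override
    public int compareTo(Person other) {
        return ORDER.compare(this, other);
    }
}
```

As with the static equals/hashCode helpers, later comparisons are skipped once an earlier one has decided the result. A real class would also override equals() and hashCode() consistently with compareTo(); that is left out here to keep the sketch short.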

HashCodeBuilder and EqualsBuilder usage style

Of course the second option is more elegant and simple. But if you are concerned about performance, you should go for the first approach. The second method also fails if a security manager is running.
I would go for the first option if I were in your situation.

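Also, there is a mistake in the hashCode generation of your first approach:

It should be builder.toHashCode()
instead of builder.hashCode(). The latter is the hashCode() method of the builder object itself; in older Commons Lang versions it returns the builder object's own hash code rather than the computed one (from version 2.5 on it delegates to toHashCode(), but toHashCode() remains the intended call).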

Apache Commons Lang HashCodeBuilder collision

You might be able to more optimally distribute your generated hash codes by adding more parameters when generating the hash code (this is independent of the Apache commons library). With this example, you could pre-compute one or more properties of the Route class and use this property when generating the hash code. For instance, calculate the slope of the line between the two Cell objects:

double slope = (startCell.getEast() - endCell.getEast());
if ( slope == 0 ){//prevent division by 0
    slope = startCell.getSouth() - endCell.getSouth();
}else{
    slope = (startCell.getSouth() - endCell.getSouth()) / slope;
}

return new HashCodeBuilder(43, 59)
    .append(this.startCell)
    .append(this.endCell)
    .append(slope)
    .toHashCode();

This generates 83091911 and 83088489 with your example. Alternatively (or in addition), use the distance between the two Cell objects:

double length = Math.sqrt(
    Math.pow(startCell.getSouth() - endCell.getSouth(), 2)
    + Math.pow(startCell.getEast() - endCell.getEast(), 2));
return new HashCodeBuilder(43, 59)
    .append(this.startCell)
    .append(this.endCell)
    .append(length)
    .toHashCode();

Which used alone with your example results in 83091911 and -486891382.

And to test if this prevents collision:

List<Cell> cells = new ArrayList<Cell>();
for ( int i = 0; i < 50; i++ ){
    for ( int j = 0; j < 50; j++ ){
        Cell c = new Cell(i,j);
        cells.add(c);
    }
}
System.out.println(cells.size() + " cells generated");
System.out.println("Testing " + (cells.size()*cells.size()) + " number of Routes");
Set<Integer> set = new HashSet<Integer>();
int collisions = 0;
for ( int i = 0; i < cells.size(); i++ ){
    for ( int j = 0; j < cells.size(); j++ ){
        Route r = new Route(cells.get(i), cells.get(j));
        if ( set.contains(r.hashCode() ) ){
            collisions++;
        }
        set.add(r.hashCode());
    }
}
System.out.println(collisions);

Amongst 6,250,000 Routes generated:

  1. Without length and slope: 6,155,919 collisions
  2. With length and slope: 873,047 collisions
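If you want to run this kind of measurement yourself, here is a self-contained sketch of the harness. Note the assumptions: the question's Cell and Route classes are not shown in this text, so the versions below are minimal stand-ins, and they hash with Objects.hash instead of HashCodeBuilder. The counts will therefore differ from the figures quoted above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Minimal stand-in for the question's Cell class (hypothetical).
class Cell {
    final int south;
    final int east;

    Cell(int south, int east) { this.south = south; this.east = east; }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Cell)) return false;
        Cell other = (Cell) obj;
        return south == other.south && east == other.east;
    }

    @Override
    public int hashCode() { return Objects.hash(south, east); }
}

// Minimal stand-in for the question's Route class (hypothetical).
class Route {
    final Cell start;
    final Cell end;

    Route(Cell start, Cell end) { this.start = start; this.end = end; }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Route)) return false;
        Route other = (Route) obj;
        return start.equals(other.start) && end.equals(other.end);
    }

    @Override
    public int hashCode() { return Objects.hash(start, end); }
}

class CollisionCount {

    // Counts routes on a size x size grid whose hash code was already seen.
    static int countCollisions(int size) {
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                cells.add(new Cell(i, j));
            }
        }
        Set<Integer> seen = new HashSet<>();
        int collisions = 0;
        for (Cell start : cells) {
            for (Cell end : cells) {
                // Set.add returns false when the value was already present.
                if (!seen.add(new Route(start, end).hashCode())) {
                    collisions++;
                }
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        System.out.println(countCollisions(50));
    }
}
```

Swapping a different hashCode() into Route and re-running countCollisions() gives a quick, if rough, way to compare candidate hash functions on your own data.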

HashCodeBuilder use, and How and Why are Java hashCode computed for objects with fields?

HashCodeBuilder and EqualsBuilder are not part of the JDK; they are features of the Apache Commons Lang project. Java doesn't use reflection to "guess" the right equals and hashCode operations because knowing what the programmer intended is impossible. Every object inherits default hashCode() and equals() methods from Object (which is why you are overriding them): by default, each object has a pseudo-unique hash code and is only equal to itself.

In some programs, it may be correct to have equality and hash codes unique to each individual object, rather than reflectively inspecting the fields at runtime. In other programs, programmers may wish to disregard certain fields and have equality and hash code operate only on a subset of an object's fields. There are many possible combinations of fields that a programmer might or might not intend to use for equality, which is why Java makes the safest assumption and makes each object equal only to itself.
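Both situations can be shown in a few lines. The classes Token and Account below are hypothetical; the first keeps the inherited identity semantics, the second deliberately bases equality on a subset of its fields:

```java
import java.util.Objects;

// Hypothetical class that keeps Object's identity-based equals() and hashCode().
class Token {
    final String value;

    Token(String value) { this.value = value; }
}

// Hypothetical class whose equality uses only a subset of its fields.
class Account {
    final String accountNumber; // defines identity
    long balanceCents;          // deliberately ignored by equals() and hashCode()

    Account(String accountNumber, long balanceCents) {
        this.accountNumber = accountNumber;
        this.balanceCents = balanceCents;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Account)) return false;
        return Objects.equals(accountNumber, ((Account) obj).accountNumber);
    }

    @Override
    public int hashCode() { return Objects.hashCode(accountNumber); }
}
```

Two Token instances with the same value are not equal to each other, while two Account instances with the same accountNumber are equal even if their balances differ. Neither choice is something the language could have guessed.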


