How Should Equals and Hashcode Be Implemented When Using JPA and Hibernate

Hibernate's documentation has a nice, long description of when and how to override equals() / hashCode().

The gist of it is you only need to worry about it if your entity will be part of a Set or if you're going to be detaching / attaching its instances. The latter is not that common. The former is usually best handled via:

Basing equals() / hashCode() on a business key - e.g. a unique combination of attributes that is not going to change during object (or, at least, session) lifetime.
If the above is impossible, base equals() / hashCode() on the primary key IF it's set, and on object identity / System.identityHashCode() otherwise. The important part here is that you need to reload your Set after a new entity has been added to it and persisted; otherwise you may end up with strange behavior (ultimately resulting in errors and / or data corruption) because your entity may be allocated to a bucket that no longer matches its current hashCode().
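To make the bucket problem concrete, here is a minimal sketch in plain Java. The Item class and the demo methods are hypothetical stand-ins (no JPA annotations, and "persisting" is simulated by assigning the id by hand), so treat it as an illustration of the hashing behavior rather than of Hibernate itself.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an entity; plain Java so the sketch runs without JPA.
class Item {
    Long id; // null until the entity is "persisted"

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Item)) return false;
        Item other = (Item) o;
        // Compare by primary key once set; unsaved instances equal only themselves.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Primary key if set, object identity otherwise: the value changes on persist.
        return id != null ? id.hashCode() : System.identityHashCode(this);
    }
}

class StaleBucketDemo {
    // The item was stored under its identity hash, then its id was assigned.
    static boolean foundWithoutReload() {
        Set<Item> items = new HashSet<>();
        Item item = new Item();
        items.add(item);
        item.id = 42L; // simulates the id assigned at persist time
        return items.contains(item);
    }

    // Copying into a new set re-hashes each element under its current hashCode().
    // This only simulates reloading the collection.
    static boolean foundAfterReload() {
        Set<Item> items = new HashSet<>();
        Item item = new Item();
        items.add(item);
        item.id = 42L;
        Set<Item> reloaded = new HashSet<>(items);
        return reloaded.contains(item);
    }
}
```

In this sketch the first lookup fails in practice, because the set searches using the new hash while the element is still filed under the old one. The second lookup succeeds because the copy was built after the id was assigned.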

Hibernate: When is it necessary to implement equals() and hashCode(), and if so, how?

First of all, your original idea, that you should implement equals() and hashCode() only on immutable objects, certainly works, but it's stricter than it needs to be. You just need these two methods to rely on immutable fields. Any field whose value may change is unsuitable for use in those two methods, but the other fields need not be immutable.

Having said that, Hibernate knows they're the same object by comparing their primary keys. This leads many people to write those two methods to rely on the primary key. Hibernate docs recommend you don't do it this way, but many people ignore this advice without much trouble. It means you can't add entities to a Set until after they've been persisted, which is a restriction that's not too hard to live with.

Hibernate docs recommend using a business key. But the business key should rely on fields that uniquely identify an object. The Hibernate docs say "use a business key that is a combination of unique, typically immutable, attributes." I use fields that have a unique constraint on them in the database. So, if your SQL CREATE TABLE statement specifies a constraint as

CONSTRAINT uc_order_num_item UNIQUE (order_num, order_item)

then those two fields can be your business key. That way, if you change one of them, both Hibernate and Java will treat the modified object as a different object. Of course, if you do change one of these "immutable" fields, you mess up any Set they belong to. So I guess you need to document clearly which fields comprise the business key, and write your application with the understanding that fields in the business key should never be changed for persisted objects. I can see why people ignore the advice and just use the primary key. But you could define the primary key like this:

CONSTRAINT pk_order_num_item PRIMARY KEY (order_num, order_item)

And you would still have the same problem.
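As a rough sketch of the business-key approach, assuming the UNIQUE (order_num, order_item) constraint shown earlier: the class name OrderLine, the field types, and the extra id and quantity fields are all assumptions made for illustration, and the JPA mapping annotations are left out so the snippet compiles on its own.

```java
import java.util.Objects;

// Hypothetical entity for a table with UNIQUE (order_num, order_item).
class OrderLine {
    private String orderNum; // business key part; no setter, so it is not changed after construction
    private int orderItem;   // business key part; no setter either
    private Long id;         // generated primary key, deliberately ignored below
    private int quantity;    // ordinary mutable state, deliberately ignored below

    OrderLine(String orderNum, int orderItem) {
        this.orderNum = orderNum;
        this.orderItem = orderItem;
    }

    void setId(Long id) { this.id = id; }
    void setQuantity(int quantity) { this.quantity = quantity; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderLine)) return false;
        OrderLine other = (OrderLine) o;
        // Only the business key takes part in equality.
        return orderItem == other.orderItem
                && Objects.equals(orderNum, other.orderNum);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals(), so the two stay consistent.
        return Objects.hash(orderNum, orderItem);
    }
}
```

Because neither the id nor the quantity participates, the hash stays the same when the entity is persisted or when its non-key state changes.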

Personally, I would like to see an annotation that specifies every field in the business key, and have an IDE inspection that checks if I modify it for persisted objects. Maybe that's asking too much.

Another approach, one that solves all of these problems, is to use a UUID for the primary key, which you generate on the client when you first construct an unpersisted entity. Since you never need to show it to the user, your code is not likely to change its value once you set it. This lets you write hashCode() and equals() methods that always work, and remain consistent with each other.
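A minimal sketch of the UUID idea follows. The Customer class is hypothetical and the JPA annotations are omitted; the only point is that the identifier exists from the moment the object is constructed, so equals() and hashCode() never see a null or changing key.

```java
import java.util.UUID;

// Hypothetical entity whose primary key is generated on the client.
class Customer {
    // Assigned at construction, before the entity is ever persisted.
    // A persistence provider would overwrite it when loading an existing row.
    private UUID id = UUID.randomUUID();

    UUID getId() { return id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() {
        // Stable for the whole lifetime of the object.
        return id.hashCode();
    }
}
```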

One more thing: If you want to avoid the problem of adding an object to a Set that already contains a different (modified) version of it, the only way is to always ask the set if it's already there before adding it. Then you can write code to handle that special case.

Object equality/hashcode vs JPA/Hibernate entity equality/hashcode

Basically, you'd get some problems when you have bi-directional relationships between entities. For example, suppose Entity1 has @OneToMany access to Entity2, Entity2 has @ManyToOne access to Entity1, and both entities use Lombok's @EqualsAndHashCode without specifying fields (i.e., equals and hashCode are generated for all fields, including those for relations). In this case you'd have a circular reference, and hence a StackOverflowError.
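The recursion can be reproduced without Lombok or JPA. The sketch below hand-writes the kind of hashCode() an all-fields generator would roughly produce; the Parent and Child classes are hypothetical, and only hashCode() is shown because it is enough to trigger the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical "one" side of a bi-directional relationship.
class Parent {
    String name;
    List<Child> children = new ArrayList<>();

    @Override
    public int hashCode() {
        // Includes the relation, so it calls hashCode() on every child.
        return Objects.hash(name, children);
    }
}

// Hypothetical "many" side pointing back at its parent.
class Child {
    String name;
    Parent parent;

    @Override
    public int hashCode() {
        // Includes the back-reference, so it calls Parent.hashCode() again.
        return Objects.hash(name, parent);
    }
}

class CircularHashDemo {
    // Returns true if hashing a linked parent/child pair overflows the stack.
    static boolean overflows() {
        Parent parent = new Parent();
        Child child = new Child();
        child.parent = parent;
        parent.children.add(child);
        try {
            parent.hashCode();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```

Excluding the relation fields from both methods breaks the cycle.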

In order to avoid that, you can rely only on the field annotated with @Id for constructing equals and hashCode (there are some examples of this approach in the Hibernate docs). But in this case you'd get another kind of problem: e.g. if you store transient entities with auto-generated ids in a set (as child entities of some parent), it won't work correctly because the id field will be null. You'd probably need to use some other fields in equals and hashCode in this case.

So, there is no correct answer to this question. You need to make a decision every time you construct your entities.

Why Hibernate requires us to implement equals/hashcode methods when I have a private id field?

If the entity defines a natural business key, then you should use that for equals and hashCode. The natural identifier or business key is consistent across all entity state transitions, hence the hashCode will not change when the JPA entity state changes (e.g. from New to Managed to Detached).

In your example, you are using the assigned identifier, which doesn't change when you persist your entity.

However, if you don't have a natural identifier and you have a generated PRIMARY KEY (e.g., IDENTITY, SEQUENCE), then you can implement equals and hashCode like this:

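// Identifiable is not a JPA or Hibernate type; it is presumably an
// application-level interface that declares getId().
@Entity
public class Book implements Identifiable<Long> {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;

        if (!(o instanceof Book))
            return false;

        Book other = (Book) o;

        // An unsaved entity (null id) is equal only to itself.
        return id != null &&
               id.equals(other.getId());
    }

    @Override
    public int hashCode() {
        // Constant per class, so it stays the same when the id is assigned.
        return getClass().hashCode();
    }

    // Getters and setters omitted for brevity
}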

The entity identifier can be used for equals and hashCode, but only if the hashCode returns the same value all the time. This might sound like a terrible thing to do since it defeats the purpose of using multiple buckets in a HashSet or HashMap.

However, for performance reasons, you should always limit the number of entities that are stored in a collection. You should never fetch thousands of entities in a @OneToMany Set because the performance penalty on the database side is multiple orders of magnitude higher than using a single hashed bucket.

The reason why this version of equals and hashCode works is that the hashCode value does not change from one entity state to another, and the identifier is checked only when it's not null.
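A quick way to check that claim is to mimic the state transition in plain Java. PlainBook below is a hypothetical copy of the Book logic with the JPA annotations removed, and the id is assigned by hand to simulate what persisting would do.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical annotation-free copy of the Book equals/hashCode logic.
class PlainBook {
    private Long id;

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainBook)) return false;
        PlainBook other = (PlainBook) o;
        return id != null && id.equals(other.getId());
    }

    @Override
    public int hashCode() {
        // Same value before and after the id is assigned.
        return getClass().hashCode();
    }
}

class StateTransitionDemo {
    // Adds an unsaved book, assigns its id, then looks it up again.
    static boolean foundAfterIdAssigned() {
        Set<PlainBook> books = new HashSet<>();
        PlainBook book = new PlainBook();
        books.add(book);
        book.setId(1L); // simulates the id generated on persist
        return books.contains(book);
    }

    // A second instance with the same id stands in for a reloaded copy of the row.
    static boolean foundByOtherInstance() {
        Set<PlainBook> books = new HashSet<>();
        PlainBook book = new PlainBook();
        books.add(book);
        book.setId(1L);
        PlainBook sameRow = new PlainBook();
        sameRow.setId(1L);
        return books.contains(sameRow);
    }
}
```

Both lookups succeed here, and two unsaved instances remain distinct because a null id never matches.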

Which choice is better for generating equals() and hashCode() methods in Hibernate?

Use the option that generates less code and needs no third-party libraries. I prefer the Java 7+ option. Include just the primary key field, because the important thing is to verify whether two different instances represent the same row in the database. There's no need to verify that all field values are the same.
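For reference, a primary-key-only version written with java.util.Objects (available since Java 7) might look like the sketch below. The Author class is hypothetical and the JPA annotations are omitted. Note that in this form two unsaved instances (both ids null) compare equal, and the hash changes once an id is assigned, which are the caveats raised in the earlier answers.

```java
import java.util.Objects;

// Hypothetical entity compared by primary key only.
class Author {
    private Long id;
    private String name; // not part of equality

    void setId(Long id) { this.id = id; }
    void setName(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Author)) return false;
        // Null-safe comparison of the primary key.
        return Objects.equals(id, ((Author) o).id);
    }

    @Override
    public int hashCode() {
        // Null-safe hash of the primary key.
        return Objects.hashCode(id);
    }
}
```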
