Why Is It Important to Override GetHashCode When Equals Method Is Overridden

Why is it important to override GetHashCode when Equals method is overridden?

Yes, it is important if your item will be used as a key in a dictionary, or HashSet<T>, etc - since this is used (in the absence of a custom IEqualityComparer<T>) to group items into buckets. If the hash-code for two items does not match, they may never be considered equal (Equals will simply never be called).

The GetHashCode() method should reflect the Equals logic; the rules are:

  • if two things are equal (Equals(...) == true) then they must return the same value for GetHashCode()
  • if two things return the same GetHashCode(), they are not necessarily equal; this is a collision, and Equals will be called to see if it is a real equality or not.

In this case, it looks like "return FooId;" is a suitable GetHashCode() implementation. If you are testing multiple properties, it is common to combine them in a way that reduces diagonal collisions (i.e. so that new Foo(3,5) has a different hash-code to new Foo(5,3)).

In modern frameworks, the HashCode type has methods to help you create a hashcode from multiple values; on older frameworks, you'd need to go without, so something like:

unchecked // only needed if you're compiling with arithmetic checks enabled
{         // (the default compiler behaviour is *disabled*, so most folks won't need this)
    int hash = 13;
    hash = (hash * 7) + field1.GetHashCode();
    hash = (hash * 7) + field2.GetHashCode();
    ...
    return hash;
}
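To make the "diagonal collision" point concrete, here is a minimal sketch of the same multiply-and-add arithmetic next to a plain XOR combination. It is written in Java rather than C# (Java int arithmetic wraps silently, so no unchecked block is needed), and the class and method names (HashCombineDemo, combine, xorCombine) are hypothetical, chosen only for illustration.

```java
// Hypothetical helper class; names are illustrative only.
class HashCombineDemo {
    // Order-sensitive combination: same seed (13) and multiplier (7) as above.
    static int combine(int a, int b) {
        int hash = 13;
        hash = (hash * 7) + Integer.hashCode(a);
        hash = (hash * 7) + Integer.hashCode(b);
        return hash;
    }

    // XOR is symmetric, so swapping the two values gives the same result.
    static int xorCombine(int a, int b) {
        return Integer.hashCode(a) ^ Integer.hashCode(b);
    }

    public static void main(String[] args) {
        System.out.println(combine(3, 5));    // 663
        System.out.println(combine(5, 3));    // 675
        System.out.println(xorCombine(3, 5)); // 6
        System.out.println(xorCombine(5, 3)); // 6
    }
}
```

With XOR, (3,5) and (5,3) land on the same hash code; the multiply-and-add version keeps them apart.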

Oh - for convenience, you might also consider providing == and != operators when overriding Equals and GetHashCode.



Why do I need to override the .Equals and GetHashCode in C#

You need to override the two methods for any number of reasons. GetHashCode is used for insertion and lookup in Dictionary and Hashtable, for example. The Equals method is used for any equality tests on the objects. For example:

public partial class myClass
{
    public override bool Equals(object obj)
    {
        return base.Equals(obj);
    }

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

For GetHashCode, I would have done:

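public override int GetHashCode()
{
    // 'override' is required; without it this method only hides Object.GetHashCode()
    // and hash-based collections will not call it.
    return PersonId.GetHashCode() ^
           Name.GetHashCode() ^
           Age.GetHashCode();
}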

If you override the GetHashCode method, you should also override Equals, and vice versa. If your overridden Equals method returns true when two objects are tested for equality, your overridden GetHashCode method must return the same value for the two objects.

Why do I need to override the equals and hashCode methods in Java?

Joshua Bloch says in Effective Java:

You must override hashCode() in every class that overrides equals(). Failure to do so will result in a violation of the general contract for Object.hashCode(), which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.

Let's try to understand it with an example of what would happen if we override equals() without overriding hashCode() and attempt to use a Map.

Say we have a class like this, and two objects of MyClass are equal if their importantField is equal (with hashCode() and equals() generated by Eclipse):

public class MyClass {
    private final String importantField;
    private final String anotherField;

    public MyClass(final String equalField, final String anotherField) {
        this.importantField = equalField;
        this.anotherField = anotherField;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((importantField == null) ? 0 : importantField.hashCode());
        return result;
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        final MyClass other = (MyClass) obj;
        if (importantField == null) {
            if (other.importantField != null)
                return false;
        } else if (!importantField.equals(other.importantField))
            return false;
        return true;
    }
}

Imagine you have this

MyClass first = new MyClass("a","first");
MyClass second = new MyClass("a","second");

Override only equals

If only equals is overridden, then when you call myMap.put(first,someValue), first will hash to some bucket, and when you call myMap.put(second,someOtherValue), second will hash to some other bucket (as they almost certainly have different hashCodes). So, although they are equal, as they don't hash to the same bucket, the map can't realize it and both of them stay in the map.
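The scenario can be sketched as follows. EqualsOnlyKey and EqualsOnlyDemo are hypothetical names introduced for this illustration: a cut-down variant of MyClass that overrides equals() but deliberately inherits Object.hashCode().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type: equals() is overridden, hashCode() deliberately is not.
final class EqualsOnlyKey {
    private final String importantField;

    EqualsOnlyKey(String importantField) {
        this.importantField = importantField;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof EqualsOnlyKey)) return false;
        return importantField.equals(((EqualsOnlyKey) obj).importantField);
    }
    // No hashCode() override: the identity-based Object.hashCode() is inherited.
}

class EqualsOnlyDemo {
    // Puts both keys into a fresh HashMap and returns it.
    static Map<EqualsOnlyKey, String> fill(EqualsOnlyKey first, EqualsOnlyKey second) {
        Map<EqualsOnlyKey, String> myMap = new HashMap<>();
        myMap.put(first, "someValue");
        myMap.put(second, "someOtherValue");
        return myMap;
    }

    public static void main(String[] args) {
        EqualsOnlyKey first = new EqualsOnlyKey("a");
        EqualsOnlyKey second = new EqualsOnlyKey("a");
        System.out.println(first.equals(second));       // true
        System.out.println(fill(first, second).size()); // normally 2, not 1
    }
}
```

The size is 2 whenever the two inherited hash codes differ, which is the case in practice for two distinct objects, so the map ends up holding two "equal" keys.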


Although it is not necessary to override equals() if we override hashCode(), let's see what would happen in this particular case where we know that two objects of MyClass are equal if their importantField is equal but we do not override equals().

Override only hashCode

If you only override hashCode then when you call myMap.put(first,someValue) it takes first, calculates its hashCode and stores it in a given bucket. Then when you call myMap.put(second,someOtherValue) it should replace first with second as per the Map Documentation because they are equal (according to the business requirement).

But the problem is that equals was not redefined, so when the map hashes second and iterates through the bucket looking if there is an object k such that second.equals(k) is true it won't find any as second.equals(first) will be false.
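A minimal sketch of this second case, again with hypothetical names (HashOnlyKey, HashOnlyDemo): hashCode() is overridden, while equals() is inherited from Object and therefore compares references.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type: hashCode() is overridden, equals() deliberately is not.
final class HashOnlyKey {
    private final String importantField;

    HashOnlyKey(String importantField) {
        this.importantField = importantField;
    }

    @Override
    public int hashCode() {
        return 31 + importantField.hashCode();
    }
    // No equals() override: Object.equals() compares references only.
}

class HashOnlyDemo {
    // Puts both keys into a fresh HashMap and returns it.
    static Map<HashOnlyKey, String> fill(HashOnlyKey first, HashOnlyKey second) {
        Map<HashOnlyKey, String> myMap = new HashMap<>();
        myMap.put(first, "someValue");
        myMap.put(second, "someOtherValue");
        return myMap;
    }

    public static void main(String[] args) {
        HashOnlyKey first = new HashOnlyKey("a");
        HashOnlyKey second = new HashOnlyKey("a");
        System.out.println(first.hashCode() == second.hashCode()); // true
        System.out.println(first.equals(second));                  // false
        System.out.println(fill(first, second).size());            // 2
    }
}
```

Both keys land in the same bucket, but since second.equals(first) is false the map keeps them as two separate entries instead of replacing the first value.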

Hope it was clear

Why should GetHashCode implement the same logic as Equals?

When storing a value in a hash table, such as Dictionary<>, the framework will first call GetHashCode() and check if there's already a bucket in the hash table for that hash code. If there is, it will call .Equals() to see if the new value is indeed equal to the existing value. If not (meaning the two objects are different, but result in the same hash code), you have what's known as a collision. In this case, the items in this bucket are stored as a linked list and retrieving a certain value becomes O(n).

If you implemented GetHashCode() but did not implement Equals(), the framework would resort to using reference equality to check for equality which would result in every instance creating a collision.

If you implemented Equals() but did not implement GetHashCode(), you might run into a situation where you had two objects that were equal, but resulted in different hash codes meaning they'd maintain their own separate values in your hash table. This would potentially confuse anyone using your class.

As far as what objects are considered equal, that's up to you. If I create a hash table based on temperature, should I be able to refer to the same item using either its Celsius or Fahrenheit value? If so, they need to result in the same hash value and Equals() needs to return true.

Update:

Let's step back and take a look at the purpose of a hash code in the first place. Within this context, a hash code is used as a quick way to identify if two objects are most likely equal. If we have two objects that have different hash codes, we know for a fact they are not equal. If we have two objects that have the same hash code, we know they are most likely equal. I say most likely because an int can only be used to represent a few billion possible values, and strings can of course contain the complete works of Charles Dickens, or any number of possible values. Much in the .NET framework is based on these truths, and developers that use your code will assume things work in a way that is consistent with the rest of the framework.

If you were to have two instances that have different hash codes, but have an implementation of Equals() that returns true, you're breaking this convention. A developer who compares two objects might then use one of those objects as a key into a hash table and expect to get an existing value out. If all of a sudden the hash code is different, this code might result in a runtime exception instead. Or perhaps return a reference to a completely different object.

Whether 295.15 K and 22 °C are equal within the domain of your program is your choice (in my opinion, they are not). However, whatever you decide, objects that are equal must return the same hash code.

How to implement override of GetHashCode() with logic of overridden Equals()

Firstly, as I think you understand, wherever you implement Equals you MUST also implement GetHashCode. The implementation of GetHashCode must reflect the behaviour of the Equals implementation but it doesn't usually use it.

See http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx - especially the "Notes to Implementers"

So if you take your example of the Item implementation of Equals, you're considering both the values of id and name to affect equality. So both of these must contribute to the GetHashCode implementation.

An example of how you could implement GetHashCode for Item would be along the lines of the following (note you may need to make it resilient to a nullable name field):

public override int GetHashCode()
{
    // XOR the hash codes of the same fields that Equals compares
    return id.GetHashCode() ^ name.GetHashCode();
}

See Eric Lippert's blog post on guidelines for GetHashCode - http://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/

As for whether you need to re-implement GetHashCode in subclasses - Yes if you also override Equals - as per the first (and main) point - the implementation of the two must be consistent - if two items are considered equal by Equals then they must return the same value from GetHashCode.

Side note:
As a performance improvement on your code (avoid multiple casts), instead of:

if (obj is Param) {
    Param p = (Param)(obj);

use:

Param p = obj as Param;
if (p != null) ...

Why should I override hashCode() when I override equals() method?

It works for you because your code does not use any functionality (HashMap, Hashtable) which needs the hashCode() API.

However, you don't know whether your class (presumably not written as a one-off) will later be used by code that does use its objects as hash keys, in which case things will be affected.

As per the documentation for Object class:

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
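A minimal sketch of a class that satisfies both points, assuming a hypothetical ItemKey type whose equality depends on an id and a nullable name. It uses java.util.Objects so that hashCode() is derived from exactly the fields that equals() compares.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type: equals() and hashCode() use the same two fields.
final class ItemKey {
    private final int id;
    private final String name; // may be null

    ItemKey(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof ItemKey)) return false;
        ItemKey other = (ItemKey) obj;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Null-safe and built from the same fields as equals().
        return Objects.hash(id, name);
    }
}

class ContractDemo {
    // Stores a value under one key and looks it up with an equal but distinct key.
    static String lookup() {
        Map<ItemKey, String> map = new HashMap<>();
        map.put(new ItemKey(1, "a"), "stored");
        return map.get(new ItemKey(1, "a"));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // stored
    }
}
```

Because equal keys produce equal hash codes, the lookup with a freshly constructed key finds the existing entry.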

Overriding GetHashCode

When you override GetHashCode() you also need to override Equals(), operator== and operator!=. And be very careful to meet all the requirements for those methods.

The guidelines are here on MSDN. Most important quote:

It is not a good idea to override operator == in mutable types.


