Correct way to override Equals() and GetHashCode()

You can override Equals() and GetHashCode() on your class like this:

public override bool Equals(object obj)
{
    var item = obj as RecommendationDTO;

    if (item == null)
    {
        return false;
    }

    return this.RecommendationId.Equals(item.RecommendationId);
}

public override int GetHashCode()
{
    return this.RecommendationId.GetHashCode();
}

Why is it important to override GetHashCode when Equals method is overridden?

Yes, it is important if your item will be used as a key in a dictionary, or HashSet<T>, etc - since this is used (in the absence of a custom IEqualityComparer<T>) to group items into buckets. If the hash-code for two items does not match, they may never be considered equal (Equals will simply never be called).

The GetHashCode() method should reflect the Equals logic; the rules are:

  • if two things are equal (Equals(...) == true) then they must return the same value for GetHashCode()
  • if the GetHashCode() is equal, it is not necessary for them to be the same; this is a collision, and Equals will be called to see if it is a real equality or not.

In this case, it looks like "return FooId;" is a suitable GetHashCode() implementation. If you are testing multiple properties, it is common to combine them using code like below, to reduce diagonal collisions (i.e. so that new Foo(3,5) has a different hash-code to new Foo(5,3)):

In modern frameworks, the HashCode type has methods to help you create a hashcode from multiple values; on older frameworks, you'd need to go without, so something like:

unchecked // only needed if you're compiling with arithmetic checks enabled
{         // (the default compiler behaviour is *disabled*, so most folks won't need this)
    int hash = 13;
    hash = (hash * 7) + field1.GetHashCode();
    hash = (hash * 7) + field2.GetHashCode();
    ...
    return hash;
}
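For the "modern frameworks" case mentioned above, a minimal sketch using System.HashCode.Combine (available from .NET Core 2.1 and .NET Standard 2.1 onwards). The Foo type and its A and B members are made up here to match the Foo(3,5) example; they are not from the original question.

```csharp
using System;

// Hypothetical two-field type, named after the Foo(3,5) example above.
public sealed class Foo : IEquatable<Foo>
{
    public int A { get; }
    public int B { get; }

    public Foo(int a, int b) { A = a; B = b; }

    public bool Equals(Foo other) => other != null && A == other.A && B == other.B;

    public override bool Equals(object obj) => Equals(obj as Foo);

    // Built from exactly the fields that Equals compares, so equal
    // objects always produce the same hash code.
    public override int GetHashCode() => HashCode.Combine(A, B);
}
```

Note that HashCode.Combine results are only stable within a single process run, so they should not be persisted.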

Oh - for convenience, you might also consider providing == and != operators when overriding Equals and GetHashCode.
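One possible shape for those operators, as a sketch only. The Recommendation type here is a hypothetical stand-in with a single Id property; the operators simply delegate to Equals so the two can never disagree.

```csharp
using System;

// Hypothetical type for illustration; only Id takes part in equality.
public sealed class Recommendation
{
    public int Id { get; }

    public Recommendation(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as Recommendation;
        // ReferenceEquals avoids calling the overloaded == from inside Equals.
        return !ReferenceEquals(other, null) && Id == other.Id;
    }

    public override int GetHashCode() => Id.GetHashCode();

    // The static object.Equals handles null on either side,
    // then calls the instance Equals above.
    public static bool operator ==(Recommendation left, Recommendation right)
        => object.Equals(left, right);

    public static bool operator !=(Recommendation left, Recommendation right)
        => !object.Equals(left, right);
}
```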


A demonstration of what happens when you get this wrong is here.
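A small sketch of that failure mode, assuming a made-up BrokenKey type that overrides Equals but leaves GetHashCode alone:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that overrides Equals but NOT GetHashCode.
// (The compiler reports warning CS0659 for exactly this situation.)
public sealed class BrokenKey
{
    public int Id { get; }

    public BrokenKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as BrokenKey;
        return other != null && Id == other.Id;
    }
    // GetHashCode is inherited from object, so it is unrelated to Id.
}

public static class BrokenKeyDemo
{
    public static bool LookupWithEqualKey()
    {
        var set = new HashSet<BrokenKey> { new BrokenKey(1) };
        // Equals says the two keys match, but the lookup normally misses:
        // the hash codes differ, so Equals is never consulted.
        return set.Contains(new BrokenKey(1));
    }
}
```

LookupWithEqualKey() will almost always return false, even though the two keys are equal according to Equals.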

How to implement override of GetHashCode() with logic of overridden Equals()

Firstly, as I think you understand, wherever you implement Equals you MUST also implement GetHashCode. The implementation of GetHashCode must reflect the behaviour of the Equals implementation but it doesn't usually use it.

See http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx - especially the "Notes to Implementers"

So if you take your example of the Item implementation of Equals, you're considering both the values of id and name to affect equality. So both of these must contribute to the GetHashCode implementation.

An example of how you could implement GetHashCode for Item would be along the lines of the following (note you may need to make it resilient to a nullable name field):

public override int GetHashCode()
{
    return id.GetHashCode() ^ name.GetHashCode();
}
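If name can be null, a null-tolerant variant might look like the sketch below. The Item class is reconstructed here from the description (an int id and a string name); the real class in the question may differ.

```csharp
using System;

// Hypothetical Item with the id and name fields discussed above.
public sealed class Item
{
    private readonly int id;
    private readonly string name; // may be null

    public Item(int id, string name) { this.id = id; this.name = name; }

    public override bool Equals(object obj)
    {
        var other = obj as Item;
        // The static string.Equals copes with null on either side.
        return other != null && id == other.id && string.Equals(name, other.name);
    }

    public override int GetHashCode()
    {
        // A null name contributes 0 instead of throwing NullReferenceException.
        int nameHash = name == null ? 0 : name.GetHashCode();
        return id.GetHashCode() ^ nameHash;
    }
}
```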

See Eric Lippert's blog post on guidelines for GetHashCode - http://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/

As for whether you need to re-implement GetHashCode in subclasses - Yes if you also override Equals - as per the first (and main) point - the implementation of the two must be consistent - if two items are considered equal by Equals then they must return the same value from GetHashCode.

Side note:
As a performance improvement on your code (avoid multiple casts), instead of:

if (obj is Param)
{
    Param p = (Param)obj;
    ...
}

use:

Param p = obj as Param;
if (p != null) ...

Why do I need to override the .Equals and GetHashCode in C#

You need to override the two methods for any number of reasons. GetHashCode is used for insertion and lookup in Dictionary and Hashtable, for example. The Equals method is used for any equality tests on the objects. For example:

public partial class myClass
{
    public override bool Equals(object obj)
    {
        return base.Equals(obj);
    }

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

For GetHashCode, I would have done:

public override int GetHashCode()
{
    return PersonId.GetHashCode() ^
           Name.GetHashCode() ^
           Age.GetHashCode();
}

If you override the GetHashCode method, you should also override Equals, and vice versa. If your overridden Equals method returns true when two objects are tested for equality, your overridden GetHashCode method must return the same value for the two objects.

Best practices for Entity Framework entities override Equals and GetHashCode

Entity Framework uses its own smart methods to detect object equality. This is for instance used if you call SaveChanges: the values of fetched objects are matched with the values of updated objects to detect whether a SQL update is needed or not.

I'm not sure whether your definitions of equality would mess with this equality checking, causing some unchanged items to be updated in the database, or even worse, some changed data not to be updated in the database.

Database equality

Keep in mind that your entity classes (the classes that you put in the DbSet<...>) represent the tables in your database and the relations between the tables.

When should two items extracted from your database be considered to represent the same object? Is it when they have the same values? Can't we have two Persons named "John Doe", born on the 4th of July, in one database?

The only way to detect that two Persons extracted from the database represent the same Person is by checking the Id. The fact that some non-primary-key values differ only tells you that the changed data has not been updated in the database, not that it is a different Person.

Override Equals vs Create EqualityComparer

My advice would be to keep your table representations as simple as possible: only the columns of the table (non-virtual properties) and the relations between the tables (virtual properties). No members, no methods, nothing.

If you need extra functionality, create extension functions of the classes. If you need non-standard equality comparison methods, create a separate equality comparer. Users of your class can decide whether they want to use the default comparison method or your special comparison method.

This is comparable to the various kinds of string comparers: StringComparer.OrdinalIgnoreCase, StringComparer.InvariantCulture, etc.
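To illustrate how the caller picks the notion of equality, here is a short sketch using the built-in string comparers. The ComparerChoiceDemo class and its Lookup method are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerChoiceDemo
{
    public static bool[] Lookup()
    {
        // Same element type, two different notions of equality chosen by the caller.
        var ordinal = new HashSet<string>(StringComparer.Ordinal) { "gpu" };
        var ignoreCase = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "gpu" };

        return new[] { ordinal.Contains("GPU"), ignoreCase.Contains("GPU") };
    }
}
```

Lookup() returns { false, true }: the string type itself is untouched, only the comparer handed to the collection differs.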

Back to your question

It seems to me that you want a Gpu comparer that does not check the value of Id: two items that have different Id, but same values for other properties are considered equal.

class GpuComparer : EqualityComparer<Gpu>
{
    public static IEqualityComparer<Gpu> IgnoreIdComparer { get; } = new GpuComparer();

    public override bool Equals(Gpu x, Gpu y)
    {
        if (x == null) return y == null; // true if both null, false if x null but y not
        if (y == null) return false;     // because x not null
        if (Object.ReferenceEquals(x, y)) return true;
        if (x.GetType() != y.GetType()) return false;

        // if here, we know x and y both not null, and of same type.
        // compare all properties for equality
        return x.Cores == y.Cores;
    }

    public override int GetHashCode(Gpu x)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));

        // note: I want a different Hash for x.Cores == null than x.Cores == 0!
        return x.Cores.HasValue ? x.Cores.Value.GetHashCode() : -78546;
        // -78546 is just a value I expect that is not used often as Cores;
    }
}

Note that I added the test for same type: if y is of a class derived from Gpu and you ignore that the types differ, then Equals(x, y) might be true while Equals(y, x) is not. Symmetry is one of the prerequisites of equality functions.

Usage:

IEqualityComparer<Gpu> gpuIgnoreIdComparer = GpuComparer.IgnoreIdComparer;
Gpu x = new Gpu {Id = 0, Cores = null};
Gpu y = new Gpu {Id = 1, Cores = null};

bool sameExceptForId = gpuIgnoreIdComparer.Equals(x, y);

x and y will be considered equal

HashSet<Gpu> hashSetIgnoringIds = new HashSet<Gpu>(GpuComparer.IgnoreIdComparer);
hashSetIgnoringIds.Add(x);
bool containsY = hashSetIgnoringIds.Contains(y); // expect true

A comparer for Computer will be similar. Apart from the fact that you forgot to check for null and types, I see some other problems in the way you want to do the equality checking:

  • It is possible to assign null to your collection of Gpus. You have to handle this so that it does not throw an exception. Is a Computer with null Gpus equal to a Computer with zero Gpus?
  • Apparently the order of the Gpus is not important to you: [1, 3] is equal to [3, 1]
  • Apparently the number of times that a certain GPU appears is not important: [1, 1, 3] is equal to [1, 3, 3]?
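The last two points can be made concrete with plain integers. This is only a sketch with invented helper names (SameAsSet, SameAsMultiset): the first treats the collections as sets, the second also counts duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetSemanticsDemo
{
    public static bool SameAsSet(IEnumerable<int> x, IEnumerable<int> y)
    {
        // HashSet drops duplicates, and SetEquals ignores order
        // and duplicate entries in the other sequence.
        return new HashSet<int>(x).SetEquals(y);
    }

    public static bool SameAsMultiset(IEnumerable<int> x, IEnumerable<int> y)
    {
        // Sorting keeps duplicates, so element counts matter; order still does not.
        var a = new List<int>(x);
        var b = new List<int>(y);
        a.Sort();
        b.Sort();
        return a.SequenceEqual(b);
    }
}
```

With these, SameAsSet considers [1, 1, 3] and [1, 3, 3] equal, while SameAsMultiset does not; both consider [1, 3] and [3, 1] equal.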


class IgnoreIdComputerComparer : EqualityComparer<Computer>
{
    public static IEqualityComparer<Computer> NoIdComparer { get; } = new IgnoreIdComputerComparer();

    public override bool Equals(Computer x, Computer y)
    {
        if (x == null) return y == null;
        if (y == null) return false;
        if (Object.ReferenceEquals(x, y)) return true;
        if (x.GetType() != y.GetType()) return false;

        // equal if both GPU collections null or empty,
        // or any element in X.Gpu is also in Y.Gpu ignoring duplicates
        // using the Gpu IgnoreIdComparer
        if (x.Gpus == null || x.Gpus.Count == 0)
            return y.Gpus == null || y.Gpus.Count == 0;
        if (y.Gpus == null) return false; // x has Gpus, y has none

        // equal if same elements, ignoring duplicates:
        HashSet<Gpu> xGpus = new HashSet<Gpu>(x.Gpus, GpuComparer.IgnoreIdComparer);
        return xGpus.SetEquals(y.Gpus);
    }

    public override int GetHashCode(Computer x)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));

        if (x.Gpus == null || x.Gpus.Count == 0) return -784120;

        // sum the hashes of the distinct Gpus; addition is order-independent,
        // and unchecked avoids the OverflowException that Enumerable.Sum would throw
        HashSet<Gpu> xGpus = new HashSet<Gpu>(x.Gpus, GpuComparer.IgnoreIdComparer);
        int hash = 0;
        foreach (Gpu gpu in xGpus)
            unchecked { hash += GpuComparer.IgnoreIdComparer.GetHashCode(gpu); }
        return hash;
    }
}

TODO: if you will be using large collections of Gpus, consider a smarter GetHashCode

Override Equals and GetHashCode in class with one field

For one thing you can simplify both your methods:

public override bool Equals(object obj)
{
    if (obj == null || obj.GetType() != GetType())
    {
        return false;
    }

    AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
    return other.LangId == LangId;
}

public override int GetHashCode()
{
    return LangId;
}

But at that point it should be fine. If the two derived classes have other fields, they should override GetHashCode and Equals themselves, first calling base.Equals or base.GetHashCode and then applying their own logic.
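A sketch of what that could look like for a derived class with one extra field. The Code property is made up for illustration, and the base class is repeated so the example stands on its own.

```csharp
using System;

// Base class as in the snippet above, repeated so the example is self-contained.
public abstract class AbstractDictionaryObject
{
    public int LangId { get; set; }

    public override bool Equals(object obj)
    {
        if (obj == null || obj.GetType() != GetType()) return false;
        return ((AbstractDictionaryObject)obj).LangId == LangId;
    }

    public override int GetHashCode() => LangId;
}

// Derived1 with an extra, hypothetical Code field.
public class Derived1 : AbstractDictionaryObject
{
    public string Code { get; set; }

    public override bool Equals(object obj)
    {
        // base.Equals has already checked null, exact type and LangId,
        // so the cast below cannot fail.
        return base.Equals(obj) && string.Equals(Code, ((Derived1)obj).Code);
    }

    public override int GetHashCode()
    {
        unchecked // wrap-around on overflow is fine for a hash
        {
            return base.GetHashCode() * 31 + (Code == null ? 0 : Code.GetHashCode());
        }
    }
}
```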

Two instances of Derived1 with the same LangId will be equivalent as far as AbstractDictionaryObject is concerned, and so will two instances of Derived2 - but they will be different from each other as they have different types.

If you wanted to give them different hash codes you could change GetHashCode() to:

public override int GetHashCode()
{
    int hash = 17;
    hash = hash * 31 + GetType().GetHashCode();
    hash = hash * 31 + LangId;
    return hash;
}

However, hash codes for different objects don't have to be different... it just helps in performance. You may want to do this if you know you will have instances of different types with the same LangId, but otherwise I wouldn't bother.

What's the best strategy for Equals and GetHashCode?

Assuming that the instances are equal because the hash codes are equal is wrong.
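A minimal illustration of why, using a made-up Point type whose XOR-based hash collides whenever the coordinates are swapped:

```csharp
using System;

// Hypothetical type whose XOR-based hash collides for swapped coordinates.
public sealed class Point
{
    public int X { get; }
    public int Y { get; }

    public Point(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj)
    {
        var other = obj as Point;
        return other != null && X == other.X && Y == other.Y;
    }

    // 1 ^ 2 and 2 ^ 1 are both 3, so Point(1, 2) and Point(2, 1) collide.
    public override int GetHashCode() => X ^ Y;
}
```

Here new Point(1, 2) and new Point(2, 1) have the same hash code but are not equal, which is perfectly legal; only Equals decides equality.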

I guess your implementation of GetHashCode is OK, but I usually use things similar to this:

public override int GetHashCode() {
    return object1.GetHashCode() ^ intValue1 ^ (intValue2 << 16);
}

General advice and guidelines on how to properly override object.GetHashCode()


Table of contents

  • When do I override object.GetHashCode?

  • Why do I have to override object.GetHashCode()?

  • What are those magic numbers seen in GetHashCode implementations?


Things that I would like to be covered, but haven't been yet:

  • How to create the integer (How to "convert" an object into an int wasn't very obvious to me anyways).
  • What fields to base the hash code upon.

    • If it should only be on immutable fields, what if there are only mutable ones?
  • How to generate a good random distribution. (MSDN Property #3)

    • Part of this seems to be choosing a good magic prime number (I have seen 17, 23 and 397 being used), but how do you choose it, and what is it for exactly?
  • How to make sure the hash code stays the same all through the object lifetime. (MSDN Property #2)

    • Especially when the equality is based upon mutable fields. (MSDN Property #1)
  • How to deal with fields that are complex types (not among the built-in C# types).

    • Complex objects and structs, arrays, collections, lists, dictionaries, generic types, etc.
    • For example, even though the list or dictionary might be readonly, that doesn't mean the contents of it are.
  • How to deal with inherited classes.

    • Should you somehow incorporate base.GetHashCode() into your hash code?
  • Could you technically just be lazy and return 0? Would heavily break MSDN guideline number #3, but would at least make sure #1 and #2 were always true :P
  • Common pitfalls and gotchas.
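On the "just return 0" point in the list above, a small sketch (with an invented LazyKey type) suggests the answer is yes, technically: hash-based collections stay correct, they just lose their performance advantage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with a constant hash code.
public sealed class LazyKey
{
    public int Id { get; }

    public LazyKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as LazyKey;
        return other != null && Id == other.Id;
    }

    // Legal, because equal objects trivially share the hash code 0, but every
    // instance lands in the same bucket, so lookups degrade to a linear scan.
    public override int GetHashCode() => 0;
}

public static class LazyKeyDemo
{
    public static int CountDistinct(int n)
    {
        var set = new HashSet<LazyKey>();
        for (int i = 0; i < n; i++)
        {
            set.Add(new LazyKey(i % 10)); // at most ten distinct ids
        }
        return set.Count;
    }
}
```

CountDistinct(100) returns 10, so duplicates are still detected correctly; the cost is that every Add has to call Equals against the items already stored.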

