How to use the IEqualityComparer

Your GetHashCode implementation always returns the same value. Distinct relies on a good hash function to work efficiently because it internally builds a hash table.

When implementing an interface, it is important to read the documentation so that you know which contract you’re supposed to fulfil. [1]

In your code, the solution is to forward GetHashCode to Class_reglement.Numf.GetHashCode and implement it appropriately there.

Apart from that, your Equals method is full of unnecessary code. It could be rewritten as follows (same semantics, ¼ of the code, more readable):

public bool Equals(Class_reglement x, Class_reglement y)
{
    return x.Numf == y.Numf;
}
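Putting both fixes together, a complete comparer might look like the sketch below. The shape of Class_reglement is not shown in the question, so the class here is a hypothetical stand-in and Numf is assumed to be an int. The null checks go slightly beyond the one-line Equals above and are optional.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the asker's class; Numf is ASSUMED to be an int.
public class Class_reglement
{
    public int Numf { get; set; }
}

public class ReglementComparer : IEqualityComparer<Class_reglement>
{
    public bool Equals(Class_reglement x, Class_reglement y)
    {
        // Same instance (or both null) counts as equal.
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Numf == y.Numf;
    }

    // Forward to the hash code of the one property that Equals compares,
    // so that equal objects always land in the same hash bucket.
    public int GetHashCode(Class_reglement obj)
    {
        return obj.Numf.GetHashCode();
    }
}
```

With this in place, a call such as items.Distinct(new ReglementComparer()) only invokes Equals for items whose Numf hashes match.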

Lastly, the ToList call is unnecessary and time-consuming: AddRange accepts any IEnumerable, so conversion to a List isn’t required. AsEnumerable is also redundant here, since AddRange enumerates the result anyway.


[1] Writing code without knowing what it actually does is called cargo cult programming. It’s a surprisingly widespread practice. It fundamentally doesn’t work.

When should I use IEqualityComparer? C#

When should I use it?

Some possibilities:

  • When your definition of "equality" is more complicated than just comparing one property
  • When you want to pre-define "equality" for use in many queries
  • When you want to define "equality" outside of LINQ, e.g. when using the class as the key to a hash table
  • When you want to tweak your definition of equality slightly without repeating code (e.g. turn case sensitivity on or off)
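As a rough illustration of the last two points, here is a sketch of a comparer that checks more than one property and lets the caller switch case sensitivity on or off. Customer and CustomerNameComparer are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type used only for illustration.
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class CustomerNameComparer : IEqualityComparer<Customer>
{
    private readonly StringComparer _names;

    public CustomerNameComparer(bool ignoreCase)
    {
        _names = ignoreCase ? StringComparer.OrdinalIgnoreCase : StringComparer.Ordinal;
    }

    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return _names.Equals(x.FirstName, y.FirstName)
            && _names.Equals(x.LastName, y.LastName);
    }

    public int GetHashCode(Customer obj)
    {
        // Hash with the same string comparer that Equals uses, so customers
        // that compare equal always produce equal hash codes.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + _names.GetHashCode(obj.FirstName ?? "");
            hash = hash * 31 + _names.GetHashCode(obj.LastName ?? "");
            return hash;
        }
    }
}
```

The same class can then be passed to Distinct, GroupBy, a HashSet or a Dictionary, with new CustomerNameComparer(true) or new CustomerNameComparer(false) selecting the behaviour.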

Implement IEqualityComparer

Try this:

var distinct = collection.Distinct(new MessageComparer());

Then use distinct for anything after that.

It looks like you're forgetting the immutable nature of IEnumerable<>. None of the LINQ methods actually change the original variable. Rather, they return IEnumerable<T>s which contain the result of the expression. For example, let's consider a simple List<string> original with the contents { "a", "a", "b", "c" }.

Now, let's call original.Add("d");. That method has no return value (it's void). But if we then print out the contents of original, we will see { "a", "a", "b", "c", "d" }.

On the other hand, let's now call original.Skip(1). This method does have a return value, one of type IEnumerable<string>. It is a LINQ expression, and performs no side-effecting actions on the original collection. Thus, if we call that and look at original, we will see { "a", "a", "b", "c", "d" }. However, the result from the method will be { "a", "b", "c", "d" }. As you can see, the result skips one element.

This is because LINQ methods accept IEnumerable<T> as a parameter. Consequently, they have no concept of the implementation of the original list. You could be passing, via extension method, a ReadOnlyCollection and they would still be able to evaluate through it. They cannot, then, alter the original collection, because the original collection could be written in any number of ways.

All that, but in table form. Each line starts with the original { "a", "a", "b", "c" }:

Context      Example function   Immutable?   Returned value        Collection after calling
Collection   Add("d")           No           (void)                { "a", "a", "b", "c", "d" }
LINQ         Skip(1)            Yes          { "a", "b", "c" }     { "a", "a", "b", "c" }
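The two rows of the table can be reproduced with a small sketch like the following. MutationDemo is a hypothetical helper name; Skip is materialized with ToList before Add is called because LINQ queries are evaluated lazily.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MutationDemo
{
    public static (List<string> Original, List<string> Skipped) Run()
    {
        var original = new List<string> { "a", "a", "b", "c" };

        // Skip returns a new sequence and leaves the list untouched.
        // ToList takes a snapshot now, before the list is modified below.
        List<string> skipped = original.Skip(1).ToList();

        // Add mutates the list in place and returns nothing.
        original.Add("d");

        return (original, skipped);
    }
}
```

After Run(), Skipped holds { "a", "b", "c" } while Original holds { "a", "a", "b", "c", "d" }: only Add changed the list.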

Use IEqualityComparer to generate a distinct list based on two properties in C#

Currently you run Distinct() on a collection of FamilySelector instances, which results in comparison by reference equality.

To do it right, you should pass an instance of IEqualityComparer to the Distinct() call:

var source = _mfgOrdersData
    .Select(o => new FamilySelector(o.ItemWrapper.ItemClass, o.ItemWrapper.Family))
    .Distinct(new FamilySelector())
    .OrderBy(f => f.CodeFamily)
    .ToList();

You should add a parameterless constructor to the FamilySelector class so that this code compiles.

I'd also suggest a small refactoring of the FamilySelector class. Currently it both holds the data and performs the comparison. Usually an implementation of IEqualityComparer is a data-less class that only performs the comparison:

class FamilyData
{
    public FamilyData(string code, string family)
    {
        Code = code;
        Family = family;
    }

    public string Code { get; set; }
    public string Family { get; set; }
    public string CodeFamily { get { return string.Format("{0}\t{1}", Code, Family); } }
}

class FamilySelector : IEqualityComparer<FamilyData>
{
    public bool Equals(FamilyData x, FamilyData y)
    {
        return x.Code == y.Code && x.Family == y.Family;
    }

    public int GetHashCode(FamilyData obj)
    {
        return obj.Code.GetHashCode() ^ obj.Family.GetHashCode();
    }
}

var source = _mfgOrdersData
    .Select(o => new FamilyData(o.ItemWrapper.ItemClass, o.ItemWrapper.Family))
    .Distinct(new FamilySelector())
    .OrderBy(f => f.CodeFamily)
    .ToList();

How to use IEqualityComparer<T>.Equals() in ToLookup<T>() Extension

The article that you've linked to is completely misleading (and many of its comments highlight this).

GetHashCode is used where possible because it's fast; if there are hash collisions then Equals is used to disambiguate between the colliding items. So long as you implement Equals and GetHashCode correctly -- whether in the types themselves or a custom IEqualityComparer<T> implementation -- then there won't be any problems.

The problem with your example code is that you're not overriding Equals and GetHashCode at all. This means that the default implementations are used, and the default implementations use reference comparisons for reference types, not value comparisons.

This means that you're not getting hash collisions because the objects you're comparing against are different to the original objects, even though they have the same values. This, in turn, means that Equals just isn't required by your example code. Override Equals and GetHashCode correctly, or set up an IEqualityComparer<T> to do so, and everything will start working as you expect.
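A minimal sketch of the difference, using a hypothetical Point key type and a PointComparer that counts how often Equals is invoked (both names are made up for this example, not taken from the question):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type that does NOT override Equals or GetHashCode.
public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

public class PointComparer : IEqualityComparer<Point>
{
    public int EqualsCalls { get; private set; }

    public bool Equals(Point a, Point b)
    {
        EqualsCalls++;
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.X == b.X && a.Y == b.Y;
    }

    public int GetHashCode(Point p)
    {
        unchecked { return p.X * 397 ^ p.Y; }
    }
}

public static class LookupDemo
{
    public static (int WithoutComparer, int WithComparer, int EqualsCalls) Run()
    {
        var points = new[] { new Point { X = 1, Y = 2 }, new Point { X = 3, Y = 4 } };
        var probe = new Point { X = 1, Y = 2 }; // same values, different instance

        // Default comparison is by reference, so the probe matches nothing.
        var byReference = points.ToLookup(p => p);

        // With a value-based comparer the probe hashes like the stored key,
        // and Equals is then consulted to confirm the match.
        var comparer = new PointComparer();
        var byValue = points.ToLookup(p => p, comparer);

        return (byReference[probe].Count(), byValue[probe].Count(), comparer.EqualsCalls);
    }
}
```

Without the comparer the probe finds no group; with it, the probe finds one element and Equals is called at least once, because the hash codes now match.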

What is the difference between using IEqualityComparer and Equals/GetHashCode Override?

When you override Equals and GetHashCode, you are changing the way the object determines whether it is equal to another. Note that comparing objects with the == operator will not behave the same way as Equals unless you overload the operator as well.

Doing so changes the behavior of a single class. But what if you need the same logic for other classes, that is, a "generic comparison"? That is what IEqualityComparer is for.

Look at this example:

interface ICustom
{
    int Key { get; set; }
}
class Custom : ICustom
{
    public int Key { get; set; }
    public int Value { get; set; }
}
class Another : ICustom
{
    public int Key { get; set; }
}

class DicEqualityComparer : IEqualityComparer<ICustom>
{
    public bool Equals(ICustom x, ICustom y)
    {
        return x.Key == y.Key;
    }

    public int GetHashCode(ICustom obj)
    {
        return obj.Key;
    }
}

I have two different classes, and both can use the same comparer.

var a = new Custom { Key = 1, Value = 2 };
var b = new Custom { Key = 1, Value = 2 };
var c = new Custom { Key = 2, Value = 2 };
var another = new Another { Key = 2 };

var d = new Dictionary<ICustom, string>(new DicEqualityComparer());

d.Add(a, "X");
// d.Add(b, "X"); // same key exception
d.Add(c, "X");
// d.Add(another, "X"); // same key exception

Notice that I didn't have to override Equals or GetHashCode in either of the classes. I can use this comparer with any object that implements ICustom without having to rewrite the comparison logic. I can also write an IEqualityComparer for a "parent class" and use it on classes that inherit from it. I can also have comparers that behave differently; for example, one that compares Value instead of Key.

So IEqualityComparer allows more flexibility and you can implement generic solutions.

Is there a built-in IEqualityComparer<T> for System.Int32?

There is no dedicated, publicly named class that implements IEqualityComparer<int> in .NET Standard, as you can confirm by searching the API reference. The usual way to obtain one is EqualityComparer<int>.Default, which relies on Int32's own IEquatable<int> implementation.

If you intend to implement IEqualityComparer<T> yourself, it might be worth writing a generic implementation with T constrained to IEquatable<T>, so that you cover a multitude of types instead of just int. The Framework Design Guidelines recommend that all value types implement IEquatable<T>.
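A generic implementation along those lines could look like this sketch. EquatableComparer<T> is a hypothetical name, not a framework type, and in most cases EqualityComparer<T>.Default already gives the same behaviour.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; NOT part of the framework.
public sealed class EquatableComparer<T> : IEqualityComparer<T>
    where T : IEquatable<T>
{
    public bool Equals(T x, T y)
    {
        // For value types the null checks are always false and cost nothing.
        if (x == null) return y == null;
        return y != null && x.Equals(y); // calls IEquatable<T>.Equals(T)
    }

    public int GetHashCode(T obj)
    {
        return obj == null ? 0 : obj.GetHashCode();
    }
}
```

It can be used wherever an IEqualityComparer<int> is expected, for example new HashSet<int>(new EquatableComparer<int>()).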

Using GetHashCode of IEqualityComparer the right way

The correct answer:

When using LINQ, GetHashCode is called before Equals, and Equals is called only when GetHashCode returns the same value for two items in the collection.

So GetHashCode must be implemented too:

public int GetHashCode(Car row)
{
    return HashCode.Combine(row.Name, row.Color, row.Size);
}

Using the IEqualityComparer<T> interface and EqualityComparer<T> class in C#

IEqualityComparer<in T> is an interface that will handle equality comparisons for your collection. Your collection will delegate equality comparisons to this interface. You may ask, why not just call the Equals method?

Because there can be several kinds of possible comparisons. Let's take an easy example: are "Abc" and "ABC" equal? It depends. "Abc".Equals("ABC") == false but what if you want case insensitivity?

This is why your collection should delegate the equality comparisons to a different class. By composing classes, you'll respect the single responsibility principle: Your collection knows how to store the items, and the equality comparer knows if they're equal.

An example with sets:

var caseSensitive = new HashSet<string>(StringComparer.Ordinal) // The default anyway
{
    "Abc", "ABC"
};

var caseInsensitive = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "Abc", "ABC"
};

The results will be:

caseSensitive.Count == 2
caseInsensitive.Count == 1

caseSensitive.Contains("aBc") == false
caseInsensitive.Contains("aBc") == true

Here you have two completely different set semantics using the same HashSet class.

Now, what's in an IEqualityComparer<in T>?

  • bool Equals(T x, T y);: This method does just what you'd expect it to: it returns true if x should be considered equal to y. Just like the mathematical equality, it must be:
    • reflexive: Equals(x, x) == true
    • symmetric: Equals(x, y) == Equals(y, x)
    • transitive: if Equals(x, y) && Equals(y, z) then Equals(x, z)
  • int GetHashCode(T obj); this one may be trickier to get right. For each obj, it should return a hash code with the following properties:
    • it should never change
    • if Equals(x, y) then GetHashCode(x) == GetHashCode(y)
    • there should be as few collisions as possible

Note that this does not imply that if GetHashCode(x) == GetHashCode(y) then Equals(x, y). Two objects can have the same hash code but be unequal (there are only 2^32 possible hash codes, after all).

Collections often use the hash code to organize their items. For example, the HashSet will know that if two objects don't have the same hash code, they won't be equal and thus can organize its buckets accordingly. Hash codes are just an optimization.

Now, what's EqualityComparer<T>.Default? It's a convenient shortcut for an IEqualityComparer<T> that will use an object's own Equals and GetHashCode functions. This is a good default value, as it is what you want most of the time: while strings can have multiple natural comparison types, this is not the case for integers, for instance.

EqualityComparer<T>.Default will handle a couple special cases:

  • if T is IEquatable<T>, it will use the IEquatable<T> interface
  • if T is Nullable<U> and U is IEquatable<U> it will handle the case properly
  • it will optimize for a couple special cases: byte[] and int-based Enum.
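The first two cases can be sketched as follows. Money and DefaultComparerDemo are hypothetical names for this example, and HashCode.Combine assumes a runtime that provides System.HashCode (.NET Core 2.1 or later).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type for illustration.
public readonly struct Money : IEquatable<Money>
{
    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public decimal Amount { get; }
    public string Currency { get; }

    // Strongly typed equality; EqualityComparer<Money>.Default uses this
    // instead of the boxing Equals(object) overload.
    public bool Equals(Money other) =>
        Amount == other.Amount && Currency == other.Currency;

    public override bool Equals(object obj) => obj is Money other && Equals(other);

    public override int GetHashCode() => HashCode.Combine(Amount, Currency);
}

public static class DefaultComparerDemo
{
    public static bool AreEqual(Money? a, Money? b)
    {
        // The default comparer for Nullable<Money> copes with null on
        // either side and otherwise compares the underlying values.
        return EqualityComparer<Money?>.Default.Equals(a, b);
    }
}
```

Collections such as HashSet<Money> or Dictionary<Money, TValue> pick up the same default comparer when no comparer is passed to their constructor.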

