How to use the IEqualityComparer
Your GetHashCode implementation always returns the same value. Distinct relies on a good hash function to work efficiently because it internally builds a hash table.
When implementing interfaces of classes it is important to read the documentation, to know which contract you’re supposed to implement. [1]
In your code, the solution is to forward GetHashCode to Class_reglement.Numf.GetHashCode and implement it appropriately there.
Apart from that, your Equals method is full of unnecessary code. It could be rewritten as follows (same semantics, ¼ of the code, more readable):
public bool Equals(Class_reglement x, Class_reglement y)
{
    return x.Numf == y.Numf;
}
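Putting both fixes together, a complete comparer might look like the sketch below. The real Class_reglement is not shown in the question, so the class here is a stand-in with Numf assumed to be an int. The names ReglementComparer and ReglementDemo are hypothetical, and the null checks are an addition that the shortened Equals above does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real class; Numf is assumed to be an int here.
public class Class_reglement
{
    public int Numf { get; set; }
}

// Hypothetical name. Equality and hashing both depend on Numf only.
public class ReglementComparer : IEqualityComparer<Class_reglement>
{
    public bool Equals(Class_reglement x, Class_reglement y)
    {
        if (ReferenceEquals(x, y)) return true;    // same instance, or both null
        if (x is null || y is null) return false;  // exactly one side is null
        return x.Numf == y.Numf;
    }

    // Forward to Numf so that items considered equal always share a hash code.
    public int GetHashCode(Class_reglement obj)
    {
        return obj.Numf.GetHashCode();
    }
}

public static class ReglementDemo
{
    // Counts the distinct items in a small sample list (two share Numf = 1).
    public static int CountDistinct()
    {
        var items = new List<Class_reglement>
        {
            new Class_reglement { Numf = 1 },
            new Class_reglement { Numf = 1 },
            new Class_reglement { Numf = 2 },
        };
        return items.Distinct(new ReglementComparer()).Count();
    }
}
```

With this comparer, ReglementDemo.CountDistinct() treats the two items with Numf = 1 as one, so the sample list collapses to two entries.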
Lastly, the ToList call is unnecessary and time-consuming: AddRange accepts any IEnumerable, so conversion to a List isn’t required. AsEnumerable is also redundant here since processing the result in AddRange will cause this anyway.
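As a small illustration of that last point, the sketch below passes the lazy result of Distinct straight into AddRange. The class name AddRangeDemo and the sample data are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AddRangeDemo
{
    public static List<int> Build()
    {
        var target = new List<int> { 0 };
        var source = new[] { 1, 1, 2, 3, 3 };

        // Distinct() returns a lazy IEnumerable<int>. AddRange enumerates it
        // itself, so no ToList() or AsEnumerable() is needed in between.
        target.AddRange(source.Distinct());
        return target;
    }
}
```

The returned list holds the initial 0 plus the three distinct values from source.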
[1] Writing code without knowing what it actually does is called cargo cult programming. It’s a surprisingly widespread practice. It fundamentally doesn’t work.
When should I use IEqualityComparer? C#
when should I use it?
Some possibilities:
- When your definition of "equality" is more complicated than just comparing one property
- When you want to pre-define "equality" for use in many queries
- When you want to define "equality" outside of Linq, e.g. when using the class as the key to a hash table
- When you want to tweak your definition of equality slightly without repeating code (e.g. turn case sensitivity on or off)
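To make the last bullet concrete, here is a minimal sketch of a comparer whose case sensitivity is chosen at construction time. Person, PersonNameComparer and PersonDemo are hypothetical names invented for this example; only StringComparer and HashSet come from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data class for the example.
public class Person
{
    public string Name { get; set; }
}

// One comparer class, two behaviours: the flag picks the string comparison.
public class PersonNameComparer : IEqualityComparer<Person>
{
    private readonly StringComparer _names;

    public PersonNameComparer(bool ignoreCase)
    {
        _names = ignoreCase ? StringComparer.OrdinalIgnoreCase : StringComparer.Ordinal;
    }

    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return _names.Equals(x.Name, y.Name);
    }

    public int GetHashCode(Person obj)
    {
        // Hash with the same StringComparer so Equals and GetHashCode agree.
        return obj.Name is null ? 0 : _names.GetHashCode(obj.Name);
    }
}

public static class PersonDemo
{
    // Adds "Ann" and "ANN" to a set and reports how many entries survive.
    public static int CountUnique(bool ignoreCase)
    {
        var set = new HashSet<Person>(new PersonNameComparer(ignoreCase));
        set.Add(new Person { Name = "Ann" });
        set.Add(new Person { Name = "ANN" });
        return set.Count;
    }
}
```

PersonDemo.CountUnique(true) keeps one entry, while PersonDemo.CountUnique(false) keeps both, without duplicating any comparison logic.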
Implement IEqualityComparer
Try this:
var distinct = collection.Distinct(new MessageComparer());
Then use distinct for anything after that.
It looks like you're forgetting the immutable nature of IEnumerable<>. None of the LINQ methods actually change the original variable. Rather, they return IEnumerable<T>s which contain the result of the expression. For example, let's consider a simple List<string> original with the contents { "a", "a", "b", "c" }.
Now, let's call original.Add("d");. That method has no return value (it's void). But if we then print out the contents of original, we will see { "a", "a", "b", "c", "d" }.
On the other hand, let's now call original.Skip(1). This method does have a return value, one of type IEnumerable<string>. It is a LINQ expression, and performs no side-effecting actions on the original collection. Thus, if we call that and look at original, we will see { "a", "a", "b", "c", "d" }. However, the result from the method will be { "a", "b", "c", "d" }. As you can see, the result skips one element.
This is because LINQ methods accept IEnumerable<T> as a parameter. Consequently, they have no concept of the implementation of the original list. You could be passing, via extension method, a ReadOnlyCollection and they would still be able to evaluate through it. They cannot, then, alter the original collection, because the original collection could be written in any number of ways.
All that, but in table form. Each line starts with the original { "a", "a", "b", "c" }:
Context    | Example function | Immutable? | Returned value    | Collection after calling
Collection | Add("d")         | No         | (void)            | { "a", "a", "b", "c", "d" }
LINQ       | Skip(1)          | Yes        | { "a", "b", "c" } | { "a", "a", "b", "c" }
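The same comparison can be checked in code. This is only a sketch: the class name SkipDemo is invented, and the Skip result is materialized with ToList so that the later Add does not affect it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SkipDemo
{
    // Returns { skipped count, original count after Skip, original count after Add }.
    public static int[] Run()
    {
        var original = new List<string> { "a", "a", "b", "c" };

        // Skip returns a new sequence and leaves original untouched.
        List<string> skipped = original.Skip(1).ToList();
        int countAfterSkip = original.Count;

        // Add mutates the list in place and returns nothing.
        original.Add("d");
        int countAfterAdd = original.Count;

        return new[] { skipped.Count, countAfterSkip, countAfterAdd };
    }
}
```

Skip yields three elements while original still has four; only Add changes the list itself, bringing it to five.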
Use IEqualityComparer to generate a distinct list based on two properties in C#
Currently you run Distinct() on a collection of FamilySelector instances, which results in comparison by reference equality.
To do it right, you should pass an instance of IEqualityComparer to the Distinct() call:
var source = _mfgOrdersData
    .Select(o => new FamilySelector(o.ItemWrapper.ItemClass, o.ItemWrapper.Family))
    .Distinct(new FamilySelector())
    .OrderBy(f => f.CodeFamily)
    .ToList();
You should add a parameterless constructor to the FamilySelector class so that the code compiles.
I'd also suggest a small refactoring of the FamilySelector class. Currently it both holds the data and performs the comparison. Usually an implementation of IEqualityComparer is a data-less class that just performs a comparison:
class FamilyData
{
    public FamilyData(string code, string family)
    {
        Code = code;
        Family = family;
    }
    public string Code { get; set; }
    public string Family { get; set; }
    public string CodeFamily { get { return string.Format("{0}\t{1}", Code, Family); } }
}
class FamilySelector : IEqualityComparer<FamilyData>
{
    public bool Equals(FamilyData x, FamilyData y)
    {
        return x.Code == y.Code && x.Family == y.Family;
    }
    public int GetHashCode(FamilyData obj)
    {
        return obj.Code.GetHashCode() ^ obj.Family.GetHashCode();
    }
}
var source = _mfgOrdersData
    .Select(o => new FamilyData(o.ItemWrapper.ItemClass, o.ItemWrapper.Family))
    .Distinct(new FamilySelector())
    .OrderBy(f => f.CodeFamily)
    .ToList();
How to use IEqualityComparer<T>.Equals() in ToLookup<T>() Extension
The article that you've linked to is completely misleading (and many of its comments highlight this).
GetHashCode is used where possible because it's fast; if there are hash collisions then Equals is used to disambiguate between the colliding items. So long as you implement Equals and GetHashCode correctly -- whether in the types themselves or a custom IEqualityComparer<T> implementation -- then there won't be any problems.
The problem with your example code is that you're not overriding Equals and GetHashCode at all. This means that the default implementations are used, and the default implementations use reference comparisons for reference types, not value comparisons.
This means that you're not getting hash collisions because the objects you're comparing against are different to the original objects, even though they have the same values. This, in turn, means that Equals just isn't required by your example code. Override Equals and GetHashCode correctly, or set up an IEqualityComparer<T> to do so, and everything will start working as you expect.
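The question's own types are not reproduced here, so the following sketch uses an invented key class, Coord, with an invented CoordComparer, to show the difference a comparer makes in ToLookup. The overload taking an IEqualityComparer<TKey> is part of LINQ; everything else is an assumption for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type that does NOT override Equals/GetHashCode.
public class Coord
{
    public int X { get; set; }
    public int Y { get; set; }
}

// Hypothetical comparer that supplies value semantics for Coord.
public class CoordComparer : IEqualityComparer<Coord>
{
    public bool Equals(Coord a, Coord b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.X == b.X && a.Y == b.Y;
    }

    public int GetHashCode(Coord c)
    {
        unchecked { return (c.X * 397) ^ c.Y; }
    }
}

public static class LookupDemo
{
    // Builds a lookup keyed by Coord and probes it with a fresh, equal-valued key.
    public static int CountAt(bool useComparer)
    {
        var items = new[]
        {
            (Position: new Coord { X = 1, Y = 2 }, Label: "first"),
            (Position: new Coord { X = 1, Y = 2 }, Label: "second"),
        };

        var lookup = useComparer
            ? items.ToLookup(i => i.Position, new CoordComparer())
            : items.ToLookup(i => i.Position);

        // A missing key yields an empty sequence rather than an exception.
        return lookup[new Coord { X = 1, Y = 2 }].Count();
    }
}
```

Without the comparer the probe key is a different reference and matches nothing; with it, both items are grouped under the same key and found.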
What is the difference between using IEqualityComparer and Equals/GetHashCode Override?
When you override Equals and GetHashCode you are changing the way the object determines whether it is equal to another. Note that if you compare objects using the == operator, it will not have the same behavior as Equals unless you overload the operator as well.
By doing that you changed the behavior for a single class. What if you need the same logic for other classes, a "generic comparison"? That is why you have IEqualityComparer.
Look at this example:
interface ICustom
{
    int Key { get; set; }
}
class Custom : ICustom
{
    public int Key { get; set; }
    public int Value { get; set; }
}
class Another : ICustom
{
    public int Key { get; set; }
}
class DicEqualityComparer : IEqualityComparer<ICustom>
{
    public bool Equals(ICustom x, ICustom y)
    {
        return x.Key == y.Key;
    }
    public int GetHashCode(ICustom obj)
    {
        return obj.Key;
    }
}
I have two different classes, both can use the same comparer.
var a = new Custom { Key = 1, Value = 2 };
var b = new Custom { Key = 1, Value = 2 };
var c = new Custom { Key = 2, Value = 2 };
var another = new Another { Key = 2 };
var d = new Dictionary<ICustom, string>(new DicEqualityComparer());
d.Add(a, "X");
// d.Add(b, "X"); // same key exception
d.Add(c, "X");
// d.Add(another, "X"); // same key exception
Notice that I didn't have to override Equals or GetHashCode in either of the classes. I can use this comparer with any object that implements ICustom without having to rewrite the comparison logic. I can also make an IEqualityComparer for a "parent class" and use it on classes that inherit from it. I can have a comparer that behaves in a different way; for instance, I can make one that compares Value instead of Key.
So IEqualityComparer allows more flexibility and you can implement generic solutions.
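As a sketch of that last idea, the two comparers below define equality for the same class in two different ways. Custom is redeclared here (without the interface) so the block stands alone; KeyComparer, ValueComparer and ComparerChoiceDemo are hypothetical names, and the comparers assume non-null arguments to stay short.

```csharp
using System.Collections.Generic;
using System.Linq;

// Same shape as the Custom class above, redeclared so this block compiles alone.
public class Custom
{
    public int Key { get; set; }
    public int Value { get; set; }
}

// Equality by Key (assumes non-null arguments).
public class KeyComparer : IEqualityComparer<Custom>
{
    public bool Equals(Custom x, Custom y) => x.Key == y.Key;
    public int GetHashCode(Custom obj) => obj.Key;
}

// Equality by Value (assumes non-null arguments).
public class ValueComparer : IEqualityComparer<Custom>
{
    public bool Equals(Custom x, Custom y) => x.Value == y.Value;
    public int GetHashCode(Custom obj) => obj.Value;
}

public static class ComparerChoiceDemo
{
    // Counts distinct items in the same sample data under the given comparer.
    public static int CountDistinct(IEqualityComparer<Custom> comparer)
    {
        var items = new[]
        {
            new Custom { Key = 1, Value = 2 },
            new Custom { Key = 1, Value = 2 },
            new Custom { Key = 2, Value = 2 },
        };
        return items.Distinct(comparer).Count();
    }
}
```

The same three objects give two distinct items under KeyComparer and only one under ValueComparer, with no change to the Custom class.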
Is there a built-in IEqualityComparer<T> for System.Int32?
There is no dedicated public class implementing IEqualityComparer<int> in .NET Standard, as you can determine by searching the API reference. EqualityComparer<int>.Default does, however, return a ready-made instance.
If you intend to implement IEqualityComparer<T> yourself, it might be worth making a generic implementation with T constrained to IEquatable<T> so that you cover a multitude of types instead of just int. All value types are recommended to implement IEquatable<T> as per the Framework design guidelines.
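A generic comparer along those lines could be sketched as follows. EquatableComparer<T> and EquatableComparerDemo are hypothetical names; the demo also calls the framework's EqualityComparer<int>.Default for comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generic comparer: works for any T that implements IEquatable<T>.
public sealed class EquatableComparer<T> : IEqualityComparer<T> where T : IEquatable<T>
{
    public bool Equals(T x, T y)
    {
        if (x == null) return y == null;  // only relevant when T is a reference type
        return x.Equals(y);               // resolves to IEquatable<T>.Equals(T)
    }

    public int GetHashCode(T obj)
    {
        return obj == null ? 0 : obj.GetHashCode();
    }
}

public static class EquatableComparerDemo
{
    // Returns { custom(3,3), custom(3,4), builtIn(3,3), custom("a","a") }.
    public static bool[] Run()
    {
        IEqualityComparer<int> custom = new EquatableComparer<int>();
        IEqualityComparer<int> builtIn = EqualityComparer<int>.Default;
        IEqualityComparer<string> strings = new EquatableComparer<string>();

        return new[]
        {
            custom.Equals(3, 3),
            custom.Equals(3, 4),
            builtIn.Equals(3, 3),
            strings.Equals("a", "a"),
        };
    }
}
```

Because int and string both implement IEquatable<T>, the one class serves both without any type-specific code.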
Using GetHashCode of IEqualityComparer the right way
The correct answer:
When using LINQ, GetHashCode is called before Equals, and Equals is called only when GetHashCode returns the same value for two items in the collection.
So GetHashCode must be implemented properly too:
public int GetHashCode(Car row)
{
    return HashCode.Combine(row.Name, row.Color, row.Size);
}
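One way to observe this call order is a comparer that counts how often each method runs. The sketch below is an illustration only: the Car class is reconstructed from the three property names used above (Size is assumed to be an int), CountingCarComparer and CarDemo are hypothetical names, and HashCode.Combine requires a runtime that provides System.HashCode.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reconstructed from the property names above; the types are assumptions.
public class Car
{
    public string Name { get; set; }
    public string Color { get; set; }
    public int Size { get; set; }
}

// Hypothetical comparer that records how often each method is invoked.
public class CountingCarComparer : IEqualityComparer<Car>
{
    public int HashCalls { get; private set; }
    public int EqualsCalls { get; private set; }

    public bool Equals(Car x, Car y)
    {
        EqualsCalls++;
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name && x.Color == y.Color && x.Size == y.Size;
    }

    public int GetHashCode(Car row)
    {
        HashCalls++;
        return HashCode.Combine(row.Name, row.Color, row.Size);
    }
}

public static class CarDemo
{
    // Runs Distinct over three cars (two with identical values) and returns
    // the comparer so its counters can be inspected, plus the distinct count.
    public static (CountingCarComparer Comparer, int Distinct) Run()
    {
        var cars = new[]
        {
            new Car { Name = "A", Color = "Red", Size = 1 },
            new Car { Name = "A", Color = "Red", Size = 1 },
            new Car { Name = "B", Color = "Blue", Size = 2 },
        };
        var comparer = new CountingCarComparer();
        int distinct = cars.Distinct(comparer).Count();
        return (comparer, distinct);
    }
}
```

After CarDemo.Run(), HashCalls is non-zero for every run, and EqualsCalls is non-zero only because two of the cars hash alike and must be confirmed as equal.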
Using of IEqualityComparer<T> interface and EqualityComparer<T> class in C#
IEqualityComparer<in T> is an interface that will handle equality comparisons for your collection. Your collection will delegate equality comparisons to this interface. You may ask, why not just call the Equals method?
Because there can be several kinds of possible comparisons. Let's take an easy example: are "Abc" and "ABC" equal? It depends. "Abc".Equals("ABC") == false, but what if you want case insensitivity?
This is why your collection should delegate the equality comparisons to a different class. By composing classes, you'll respect the single responsibility principle: Your collection knows how to store the items, and the equality comparer knows if they're equal.
An example with sets:
var caseSensitive = new HashSet<string>(StringComparer.Ordinal) // The default anyway
{
"Abc", "ABC"
};
var caseInsensitive = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"Abc", "ABC"
};
The results will be:
caseSensitive.Count == 2
caseInsensitive.Count == 1
caseSensitive.Contains("aBc") == false
caseInsensitive.Contains("aBc") == true
Here you have two completely different set semantics using the same HashSet class.
Now, what's in an IEqualityComparer<in T>?
bool Equals(T x, T y);: This method does just what you'd expect it to: it returns true if x should be considered equal to y. Just like the mathematical equality, it must be:
- reflexive: Equals(x, x) == true
- symmetric: Equals(x, y) == Equals(y, x)
- transitive: if Equals(x, y) && Equals(y, z) then Equals(x, z)
int GetHashCode(T obj);: This one may be trickier to get right. For each obj, it should return a hash code with the following properties:
- it should never change
- if Equals(x, y) then GetHashCode(x) == GetHashCode(y)
- there should be as few collisions as possible
Note that this does not imply that if GetHashCode(x) == GetHashCode(y) then Equals(x, y). Two objects can have the same hash code but be unequal (a 32-bit int allows only 2^32 possible hash codes after all).
Collections often use the hash code to organize their items. For example, the HashSet will know that if two objects don't have the same hash code, they won't be equal and thus can organize its buckets accordingly. Hash codes are just an optimization.
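The "it should never change" rule is the easiest one to break by accident. The sketch below, using the invented names Box, BoxComparer and MutationDemo, shows a HashSet losing track of an item whose hash code changes after it has been added.

```csharp
using System.Collections.Generic;

// Hypothetical mutable type whose identity is its Id.
public class Box
{
    public int Id { get; set; }
}

// Hypothetical comparer: equality and hash code both derive from the mutable Id.
public class BoxComparer : IEqualityComparer<Box>
{
    public bool Equals(Box x, Box y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Box obj) => obj.Id;
}

public static class MutationDemo
{
    // Returns { found before mutation, found after mutation }.
    public static bool[] Run()
    {
        var box = new Box { Id = 1 };
        var set = new HashSet<Box>(new BoxComparer()) { box };

        bool before = set.Contains(box);

        // The set stored the item under the hash code for Id = 1.
        // Changing Id changes the hash code, so the lookup misses the stored entry.
        box.Id = 2;
        bool after = set.Contains(box);

        return new[] { before, after };
    }
}
```

The item is still physically in the set, but it can no longer be found, which is why the fields that feed GetHashCode should not change while an object is used as a key.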
Now, what's EqualityComparer<T>.Default? It's a convenient shortcut for an IEqualityComparer<T> that will use an object's own Equals and GetHashCode functions. This is a good default value as this is what you want to do most of the time: while strings can have multiple natural comparison types, this is not the case for integers for instance.
EqualityComparer<T>.Default will handle a couple of special cases:
- if T is IEquatable<T>, it will use the IEquatable<T> interface
- if T is Nullable<U> and U is IEquatable<U>, it will handle the case properly
- it will optimize for a couple of special cases: byte[] and int-based Enum.
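The first two special cases can be seen with a short sketch. Point and DefaultComparerDemo are hypothetical names made up for the example; EqualityComparer<T>.Default is the framework property discussed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type that implements IEquatable<Point>.
public struct Point : IEquatable<Point>
{
    public int X;
    public int Y;

    public Point(int x, int y) { X = x; Y = y; }

    public bool Equals(Point other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is Point p && Equals(p);
    public override int GetHashCode() => unchecked((X * 397) ^ Y);
}

public static class DefaultComparerDemo
{
    // Returns { equal points, 5 vs 5, null vs 5, null vs null }.
    public static bool[] Run()
    {
        // For an IEquatable<T> type, the default comparer uses Point.Equals(Point).
        EqualityComparer<Point> points = EqualityComparer<Point>.Default;

        // For Nullable<int>, null and non-null values are both handled.
        EqualityComparer<int?> nullable = EqualityComparer<int?>.Default;

        return new[]
        {
            points.Equals(new Point(1, 2), new Point(1, 2)),
            nullable.Equals(5, 5),
            nullable.Equals(null, 5),
            nullable.Equals(null, null),
        };
    }
}
```

The four results are true, true, false and true, with no custom comparer class written for either type.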