IEqualityComparer<T> That Uses ReferenceEquals

IEqualityComparer<T> that uses ReferenceEquals

Just in case there is no default implementation, this is my own:

Edit by 280Z28: Rationale for using RuntimeHelpers.GetHashCode(object), which many of you probably haven't seen before. :) This method has two effects that make it the correct call for this implementation:

It returns 0 when the object is null. Since ReferenceEquals works for null parameters, so should the comparer's implementation of GetHashCode().
It calls Object.GetHashCode() non-virtually. ReferenceEquals specifically ignores any overrides of Equals, so the implementation of GetHashCode() should use a special method that matches the effect of ReferenceEquals, which is exactly what RuntimeHelpers.GetHashCode is for.

[end 280Z28]

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

/// <summary>
/// A generic object comparer that uses only the object's reference,
/// ignoring any <see cref="IEquatable{T}"/> or <see cref="object.Equals(object)"/> overrides.
/// </summary>
public class ObjectReferenceEqualityComparer<T> : EqualityComparer<T>
    where T : class
{
    private static IEqualityComparer<T> _defaultComparer;

    public new static IEqualityComparer<T> Default
    {
        get { return _defaultComparer ?? (_defaultComparer = new ObjectReferenceEqualityComparer<T>()); }
    }

    #region IEqualityComparer<T> Members

    public override bool Equals(T x, T y)
    {
        return ReferenceEquals(x, y);
    }

    public override int GetHashCode(T obj)
    {
        return RuntimeHelpers.GetHashCode(obj);
    }

    #endregion
}
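
The two effects of RuntimeHelpers.GetHashCode described in the edit note can be checked with a small sketch. The Name type and the RuntimeHashDemo class are hypothetical, introduced only for illustration; Name overrides Equals and GetHashCode so that the contrast with reference semantics is visible.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type with value semantics, used only for this illustration.
public sealed class Name
{
    private readonly string _value;
    public Name(string value) { _value = value; }

    public override bool Equals(object obj)
    {
        Name other = obj as Name;
        return other != null && other._value == _value;
    }

    // Deliberately constant, so the override is easy to recognise.
    public override int GetHashCode() { return 42; }
}

public static class RuntimeHashDemo
{
    // Effect 1: a null argument yields 0 instead of throwing.
    public static int HashOfNull()
    {
        return RuntimeHelpers.GetHashCode(null);
    }

    // The identity-based hash is stable for one instance across calls.
    public static bool IsStableForSameInstance(object o)
    {
        return RuntimeHelpers.GetHashCode(o) == RuntimeHelpers.GetHashCode(o);
    }

    public static void Main()
    {
        Name a = new Name("x");
        Name b = new Name("x");

        Console.WriteLine(HashOfNull());                    // 0
        Console.WriteLine(a.Equals(b));                     // True (overridden Equals)
        Console.WriteLine(object.ReferenceEquals(a, b));    // False (distinct instances)
        Console.WriteLine(a.GetHashCode());                 // 42 (overridden GetHashCode)
        Console.WriteLine(IsStableForSameInstance(a));      // True
    }
}
```

Note that the sketch does not claim the identity-based hashes of a and b differ: hash codes are never guaranteed to be unique, only consistent with the equality they accompany.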

Preferring EqualityComparer<T> to IEqualityComparer<T>

Regarding your first question:

The remarks section for the IEqualityComparer<T> interface doesn't really give a reason to prefer deriving from the abstract class over implementing the interface; it reads more like a reason why the equality comparer interface exists in the first place. What it says is practically useless: it basically describes what the default implementation does. If anything, the "reasoning" provided there sounds more like a guideline for what your comparers could do, and is irrelevant to what the class actually does.

Looking at the public/protected interface of the EqualityComparer<T> class, there's only one redeeming quality: it implements the non-generic IEqualityComparer interface. I think what they meant to say is that they recommend deriving from it because EqualityComparer<T> implements the non-generic IEqualityComparer interface, so your class can be used wherever the non-generic comparer is required.

It does make more sense in the remarks section for IComparer<T>:

We recommend that you derive from the Comparer<T> class instead of implementing the IComparer<T> interface, because the Comparer<T> class provides an explicit interface implementation of the IComparer.Compare method and the Default property that gets the default comparer for the object.

I suspect it was supposed to say something similar for IEqualityComparer<T>, but some ideas got mixed up and the result was an incomplete description.
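
As a rough illustration of that one benefit, the sketch below derives a comparer from EqualityComparer<T> and hands the same instance to a non-generic Hashtable. LengthComparer is a hypothetical comparer invented for this example (it treats strings of equal length as equal); only the base class and Hashtable come from the framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical comparer, for illustration only: compares strings by length.
public sealed class LengthComparer : EqualityComparer<string>
{
    public override bool Equals(string x, string y)
    {
        if (x == null || y == null) return x == null && y == null;
        return x.Length == y.Length;
    }

    public override int GetHashCode(string obj)
    {
        return obj == null ? 0 : obj.Length;
    }
}

public static class NonGenericComparerDemo
{
    public static void Main()
    {
        // The base class supplies the non-generic interface,
        // so the same instance can be passed to a non-generic collection.
        IEqualityComparer nonGeneric = new LengthComparer();
        Hashtable table = new Hashtable(nonGeneric);
        table["abc"] = 1;

        Console.WriteLine(table.ContainsKey("xyz")); // True: same length as "abc"
        Console.WriteLine(table.ContainsKey("ab"));  // False: different length
    }
}
```

A class that implemented only IEqualityComparer<string> could not be assigned to the non-generic IEqualityComparer variable without writing that second interface by hand.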

Regarding your second question:

A primary goal for the collections found in the library was to be as flexible as possible. One way to get that is to allow custom ways of comparing objects within them by providing an IComparer<T> or IEqualityComparer<T> to do the comparisons. It is much easier to get an instance of a default comparer when one was not supplied than it is to do the comparisons directly. These comparers in turn can package up the logic necessary to call the appropriate comparisons.

For example, the default comparers can determine whether T implements IEquatable<T> and call IEquatable<T>.Equals on the object, or otherwise use Object.Equals. That logic is better encapsulated in the comparer than potentially repeated throughout the collections code.

Besides, if they wanted to fall back on calling IEquatable<T>.Equals directly, they would have to add a constraint on T that would make this call possible. Doing so makes it less flexible and negates the benefits of providing the comparer in the first place.
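
The dispatch described above can be observed with a small sketch. WithEquatable and DefaultComparerDemo are hypothetical names used only here; the counters exist purely to show which Equals overload the default comparer ends up calling. Note that the generic helper needs no constraint on T.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration; counts calls to each Equals overload.
public sealed class WithEquatable : IEquatable<WithEquatable>
{
    public static int TypedCalls;
    public static int ObjectCalls;
    public int Id;

    public bool Equals(WithEquatable other)
    {
        TypedCalls++;
        return other != null && other.Id == Id;
    }

    public override bool Equals(object obj)
    {
        ObjectCalls++;
        WithEquatable other = obj as WithEquatable;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() { return Id; }
}

public static class DefaultComparerDemo
{
    // No constraint on T: the default comparer decides how to compare.
    public static bool AreEqual<T>(T x, T y)
    {
        return EqualityComparer<T>.Default.Equals(x, y);
    }

    public static void Main()
    {
        var a = new WithEquatable { Id = 1 };
        var b = new WithEquatable { Id = 1 };

        Console.WriteLine(AreEqual(a, b));             // True
        Console.WriteLine(WithEquatable.TypedCalls);   // 1
        Console.WriteLine(WithEquatable.ObjectCalls);  // 0
    }
}
```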

generic reference equality comparer does not work for ValueTuple

Isn't ReferenceEquals the same as object.Equals

No, it is not. object.Equals uses virtual dispatch on the first operand, finding its implementation of Equals for the actual runtime type of the object, and does whatever that type's definition says to do. In the case of ValueTuple, it compares the actual values of the two tuples. ReferenceEquals just compares the references and tells you whether they're equal. In this particular case you have two different references, even though the values those references point to are the same.

isn't ReferenceEquals used on two (object,object) the same as using ReferenceEquals both on the Item1s and Item2s and &&ing the results?

No, it's not. It only tells you whether the two objects you pass in are the same reference to the same object. It won't inspect the actual values of those objects. In this case, you have two different references, so they're not equal.

And GetHashCode analogous?

It is analogous, insofar as the first version uses the ValueTuple implementation, which computes the hash from the values of the items within the tuple, while the second computes the hash entirely from the reference to the object itself. So when you have two different references to two different objects whose internal values are equivalent, the first considers them equal and the second considers them unequal.
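
A short sketch of the difference, assuming C# 7 tuple syntax is available. BoxedTupleDemo is a hypothetical name; the point is that a ValueTuple is a struct, so storing it in an object variable boxes it, and each boxing produces a separate object.

```csharp
using System;

public static class BoxedTupleDemo
{
    public static void Main()
    {
        // Each assignment boxes the tuple into its own heap object.
        object first = (1, "a");
        object second = (1, "a");

        // Value-based: ValueTuple compares Item1 and Item2.
        Console.WriteLine(first.Equals(second));                         // True

        // Reference-based: two distinct boxes, so never equal.
        Console.WriteLine(object.ReferenceEquals(first, second));        // False

        // Value-based hash: equal tuples produce equal hash codes.
        Console.WriteLine(first.GetHashCode() == second.GetHashCode());  // True
    }
}
```

The same reasoning explains why a reference-equality comparer over ValueTuple keys never finds a match: every lookup boxes the key again and so presents a new reference.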

Implement IEqualityComparer

Try this:

var distinct = collection.Distinct(new MessageComparer());

Then use distinct for anything after that.

It looks like you're forgetting the immutable nature of IEnumerable<>. None of the LINQ methods actually change the original variable. Rather, they return IEnumerable<T>s that contain the result of the expression. For example, let's consider a simple List<string> original with the contents { "a", "a", "b", "c" }.

Now, let's call original.Add("d");. That method has no return value (it's void). But if we then print out the contents of original, we will see { "a", "a", "b", "c", "d" }.

On the other hand, let's now call original.Skip(1). This method does have a return value, one of type IEnumerable<string>. It is a LINQ expression, and performs no side-effecting actions on the original collection. Thus, if we call that and look at original, we will see { "a", "a", "b", "c", "d" }. However, the result from the method will be { "a", "b", "c", "d" }. As you can see, the result skips one element.

This is because LINQ methods accept IEnumerable<T> as a parameter. Consequently, they have no concept of the implementation of the original list. You could be passing, via extension method, a ReadOnlyCollection and they would still be able to evaluate through it. They cannot, then, alter the original collection, because the original collection could be written in any number of ways.

All that, but in table form. Each line starts with the original { "a", "a", "b", "c" }:

Context     Example function    Immutable?    Returned value     Collection after calling
Collection  Add("d")            No            (void)             { "a", "a", "b", "c", "d" }
LINQ        Skip(1)             Yes           { "a", "b", "c" }  { "a", "a", "b", "c" }
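
The same contrast as a runnable sketch. LinqDoesNotMutateDemo is a hypothetical name; Distinct is called here without a comparer, but the version that takes one (as in the MessageComparer call above) returns a new sequence in exactly the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqDoesNotMutateDemo
{
    public static void Main()
    {
        var original = new List<string> { "a", "a", "b", "c" };

        // Skip returns a new sequence; 'original' is left untouched.
        List<string> skipped = original.Skip(1).ToList();
        Console.WriteLine(string.Join(", ", skipped));   // a, b, c
        Console.WriteLine(string.Join(", ", original));  // a, a, b, c

        // Distinct behaves the same way: the result has to be captured.
        List<string> distinct = original.Distinct().ToList();
        Console.WriteLine(string.Join(", ", distinct));  // a, b, c

        // Add, by contrast, changes the list in place.
        original.Add("d");
        Console.WriteLine(string.Join(", ", original));  // a, a, b, c, d
    }
}
```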

Is there any kind of ReferenceComparer in .NET?

As far as I know, the BCL doesn't expose any public types that implement IEqualityComparer<T> with reference-equality as of .NET 4.0.

However, there do appear to be a bunch of internal types that do this, such as:

System.Dynamic.Utils.ReferenceEqualityComparer<T>
(in System.Core)
System.Xaml.Schema.ReferenceEqualityComparer<T>
(in System.Xaml).

I took a look at the implementations of these two types with Reflector, and you'll be happy to know that both of them appear to be implemented in a way that is virtually identical to yours, except that they don't use lazy initialization for the static instance (they create it in the static constructor for the type).

The only possible 'issue' I can think of with your implementation is that the lazy-initialization is not thread-safe, but since instances are 'cheap' and aren't holding onto any state, that shouldn't create any bugs or major performance problems. If you want to enforce the singleton-pattern though, you'll have to do it properly.
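
One way to get a single shared instance without any locking is to let the runtime initialize a static read-only field, which is roughly what the internal types mentioned above are described as doing. The sketch below is an assumption about how that could look, not a copy of either internal implementation; ReferenceComparer<T> and ReferenceComparerDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Sketch only: a reference comparer whose shared instance is created eagerly.
public sealed class ReferenceComparer<T> : IEqualityComparer<T>
    where T : class
{
    // Static field initializers run once per closed generic type,
    // and the runtime guarantees that initialization is thread-safe.
    public static readonly ReferenceComparer<T> Instance = new ReferenceComparer<T>();

    // Private constructor: Instance is the only way to obtain the comparer.
    private ReferenceComparer() { }

    public bool Equals(T x, T y) { return object.ReferenceEquals(x, y); }

    public int GetHashCode(T obj) { return RuntimeHelpers.GetHashCode(obj); }
}

public static class ReferenceComparerDemo
{
    public static void Main()
    {
        var set = new HashSet<string>(ReferenceComparer<string>.Instance);

        // new string(...) creates distinct objects with equal contents.
        set.Add(new string('a', 3));
        set.Add(new string('a', 3));

        Console.WriteLine(set.Count); // 2: equal by value, different references
    }
}
```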

Questions about IEqualityComparer<T> / List<T>.Distinct()

So it can use hashcodes to be O(n) as opposed to O(n²)
(A) is an optimization.

(B) is necessary; otherwise, it would throw a NullReferenceException.
If Invoice is a struct, however, they're both unnecessary and slower.
No. Hashcodes are not unique

IEqualityComparer<T> and custom type

It is your job to provide a GetHashCode() whose return value differs for objects that are different (in as many cases as possible; you may still return the same hash code for non-equal objects) and is always the same for objects that may be equal (in all cases; you may not return different hash codes for equal objects).

For instance, if one of the three fields you compare is an int, you can return that field as GetHashCode().

If, however, it's difficult to come up with something clever, you can return a constant, such as 42. This way Equals() will be called for all object pairs, delivering expected results, although in the least performant way.
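
A minimal sketch of those rules, assuming a type with three compared fields, one of which is an int. Item, ItemComparer and HashContractDemo are hypothetical names; the hash code is simply the int field, so equal items always agree on it while unequal items are allowed to collide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type with three compared fields.
public sealed class Item
{
    public int Id;
    public string Name;
    public string Category;
}

public sealed class ItemComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id && x.Name == y.Name && x.Category == y.Category;
    }

    // Equal items always share an Id, so they always share a hash code.
    // Unequal items may collide (same Id, different Name), which is allowed.
    public int GetHashCode(Item obj)
    {
        return obj == null ? 0 : obj.Id;
    }
}

public static class HashContractDemo
{
    public static int CountDistinct(IEnumerable<Item> items)
    {
        return items.Distinct(new ItemComparer()).Count();
    }

    public static void Main()
    {
        var items = new[]
        {
            new Item { Id = 1, Name = "a", Category = "x" },
            new Item { Id = 1, Name = "a", Category = "x" }, // equal to the first
            new Item { Id = 1, Name = "b", Category = "x" }, // same hash, not equal
        };

        Console.WriteLine(CountDistinct(items)); // 2
    }
}
```

Replacing the body of GetHashCode with a constant would give the same count; it would only force Equals to run for every pair.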

What is the proper way to set up an always-false IEqualityComparer<T>?

If the type has not overridden the Equals or GetHashCode methods then their default implementations, from object, do what you want, namely provide equality based on their identity, rather than their value. You can use EqualityComparer<Person>.Default to get an IEqualityComparer that uses those semantics if you want.

If the Equals method has been overridden to provide some sort of value semantics, but you want identity semantics instead, then you can use object.ReferenceEquals in your own implementation:

public class IdentityComparer<T> : IEqualityComparer<T>
{
    public bool Equals(T x, T y)
    {
        return object.ReferenceEquals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(obj);
    }
}
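
The two cases can be compared side by side in a small sketch. PlainPerson and ValuePerson are hypothetical types made up for this example: the first overrides nothing, the second overrides Equals and GetHashCode to compare by name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type without overrides: inherits identity semantics from object.
public class PlainPerson
{
    public string Name;
}

// Hypothetical type with value semantics.
public class ValuePerson
{
    public string Name;

    public override bool Equals(object obj)
    {
        ValuePerson other = obj as ValuePerson;
        return other != null && other.Name == Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class IdentitySemanticsDemo
{
    public static void Main()
    {
        var p1 = new PlainPerson { Name = "Ann" };
        var p2 = new PlainPerson { Name = "Ann" };

        // No overrides: the default comparer already compares by identity.
        Console.WriteLine(EqualityComparer<PlainPerson>.Default.Equals(p1, p2)); // False
        Console.WriteLine(EqualityComparer<PlainPerson>.Default.Equals(p1, p1)); // True

        var v1 = new ValuePerson { Name = "Ann" };
        var v2 = new ValuePerson { Name = "Ann" };

        // Overridden Equals: the default comparer now compares by value...
        Console.WriteLine(EqualityComparer<ValuePerson>.Default.Equals(v1, v2)); // True

        // ...so identity has to be requested explicitly.
        Console.WriteLine(object.ReferenceEquals(v1, v2));                       // False
    }
}
```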

When should I use IEqualityComparer? C#

when should I use it?

Some possibilities:

When your definition of "equality" is more complicated than just comparing one property
When you want to pre-define "equality" for use in many queries
When you want to define "equality" outside of LINQ, e.g. when using the class as the key to a hash table
When you want to tweak your definition of equality slightly without repeating code (e.g. turn on/off case sensitivity)
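
The last two points can be illustrated with the built-in StringComparer, which already implements IEqualityComparer<string>. The sketch below uses it as the key comparer of a Dictionary and toggles case sensitivity with a flag; CaseToggleDemo and ForKeys are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class CaseToggleDemo
{
    // Picks a built-in string comparer; 'ignoreCase' is the only switch.
    public static IEqualityComparer<string> ForKeys(bool ignoreCase)
    {
        return ignoreCase ? StringComparer.OrdinalIgnoreCase : StringComparer.Ordinal;
    }

    public static void Main()
    {
        var insensitive = new Dictionary<string, int>(ForKeys(true));
        insensitive["Key"] = 1;
        Console.WriteLine(insensitive.ContainsKey("KEY")); // True

        var sensitive = new Dictionary<string, int>(ForKeys(false));
        sensitive["Key"] = 1;
        Console.WriteLine(sensitive.ContainsKey("KEY"));   // False
    }
}
```

The lookup code is identical in both cases; only the comparer handed to the collection changes.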

IEqualityComparer for Anonymous Type

Is there a way I can create an IEqualityComparer for my anonymous types?

Sure. You just need to use type inference. For example, you could have something like:

public static class InferredEqualityComparer
{
    public static IEqualityComparer<T> Create<T>(
        IEnumerable<T> example,
        Func<T, T, bool> equalityCheck,
        Func<T, int> hashCodeProvider)
    {
        return new EqualityComparerImpl<T>(equalityCheck, hashCodeProvider);
    }

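    private sealed class EqualityComparerImpl<T> : IEqualityComparer<T>
    {
        // Remember the delegates and call them from the interface methods.
        private readonly Func<T, T, bool> _equalityCheck;
        private readonly Func<T, int> _hashCodeProvider;

        internal EqualityComparerImpl(
            Func<T, T, bool> equalityCheck, Func<T, int> hashCodeProvider)
        {
            _equalityCheck = equalityCheck;
            _hashCodeProvider = hashCodeProvider;
        }

        public bool Equals(T x, T y) { return _equalityCheck(x, y); }
        public int GetHashCode(T obj) { return _hashCodeProvider(obj); }
    }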
}

Then:

var glext = m_dtGLExt.AsEnumerable();
var query = from c in glext
            orderby ...
            select new { ... };
var comparer = InferredEqualityComparer.Create(query,
    (x, y) => { ... },
    o => { ... }
);
var distinct = query.Distinct(comparer);

Basically the first parameter to the method is just used for type inference, so that the compiler can work out what type to use for the lambda expression parameters.

You could create the comparer ahead of time by creating a sample of the anonymous type:

var sample = new[] { new { ... } };
var comparer = InferredEqualityComparer.Create(sample, ...);
var distinct = (... query here ... ).Distinct(comparer);

but then any time you change the query you've got to change the sample too.
