Why Must We Define Both == and != in C#

Why must we define both == and != in C#?

I can't speak for the language designers, but from what I can reason on, it seems like it was intentional, proper design decision.

Looking at this basic F# code, you can compile this into a working library. This is legal code for F#, and only overloads the equality operator, not the inequality:

module Module1

type Foo() =
    let mutable myInternalValue = 0
    member this.Prop
        with get () = myInternalValue
        and set (value) = myInternalValue <- value

    static member op_Equality (left : Foo, right : Foo) = left.Prop = right.Prop
    //static member op_Inequality (left : Foo, right : Foo) = left.Prop <> right.Prop

This does exactly what it looks like. It creates an equality comparer on == only, and checks to see if the internal values of the class are equal.

While you can't create a class like this in C#, you can use one that was compiled for .NET. It's obvious it will use our overloaded operator for == So, what does the runtime use for !=?

The C# EMCA standard has a whole bunch of rules (section 14.9) explaining how to determine which operator to use when evaluating equality. To put it overly-simplified and thus not perfectly accurate, if the types that are being compared are of the same type and there is an overloaded equality operator present, it will use that overload and not the standard reference equality operator inherited from Object. It is no surprise, then, that if only one of the operators is present, it will use the default reference equality operator, that all objects have, there is not an overload for it.¹

Knowing that this is the case, the real question is: Why was this designed in this way and why doesn't the compiler figure it out on its own? A lot people are saying this wasn't a design decision, but I like to think it was thought out this way, especially regarding the fact all objects have a default equality operator.

So, why doesn't the compiler automagically create the != operator? I can't know for sure unless someone from Microsoft confirms this, but this is what I can determine from reasoning on the facts.

To prevent unexpected behavior

Perhaps I want to do a value comparison on == to test equality. However, when it came to != I didn't care at all if the values were equal unless the reference was equal, because for my program to consider them equal, I only care if the references match. After all, this is actually outlined as default behavior of the C# (if both operators were not overloaded, as would be in case of some .net libraries written in another language). If the compiler was adding in code automatically, I could no longer rely on the compiler to output code that should is compliant. The compiler should not write hidden code that changes the behavior of yours, especially when the code you've written is within standards of both C# and the CLI.

In terms of it forcing you to overload it, instead of going to the default behavior, I can only firmly say that it is in the standard (EMCA-334 17.9.2)². The standard does not specify why. I believe this is due to the fact that C# borrows much behavior from C++. See below for more on this.

When you override `!=` and `==`, you do not have to return bool.

This is another likely reason. In C#, this function:

public static int operator ==(MyClass a, MyClass b) { return 0; }

is as valid as this one:

public static bool operator ==(MyClass a, MyClass b) { return true; }

If you're returning something other than bool, the compiler cannot automatically infer an opposite type. Furthermore, in the case where your operator does return bool, it just doesn't make sense for them create generate code that would only exist in that one specific case or, as I said above, code that hides the default behavior of the CLR.

C# borrows much from C++³

When C# was introduced, there was an article in MSDN magazine that wrote, talking about C#:

Many developers wish there was a language that was easy to write, read, and maintain like Visual Basic, but that still provided the power and flexibility of C++.

Yes the design goal for C# was to give nearly the same amount of power as C++, sacrificing only a little for conveniences like rigid type-safety and garbage-collection. C# was strongly modeled after C++.

You may not be surprised to learn that in C++, the equality operators do not have to return bool, as shown in this example program

Now, C++ does not directly require you to overload the complementary operator. If your compiled the code in the example program, you will see it runs with no errors. However, if you tried adding the line:

cout << (a != b);

you will get

compiler error C2678 (MSVC) : binary '!=' : no operator found which takes a left-hand operand of type 'Test' (or there is no acceptable conversion)`.

So, while C++ itself doesn't require you to overload in pairs, it will not let you use an equality operator that you haven't overloaded on a custom class. It's valid in .NET, because all objects have a default one; C++ does not.

^{1. As a side note, the C# standard still requires you to overload the pair of operators if you want to overload either one. This is a part of the standard and not simply the compiler. However, the same rules regarding the determination of which operator to call apply when you're accessing a .net library written in another language that doesn't have the same requirements.}

^{2. EMCA-334 (pdf) (http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf)}

^{3. And Java, but that's really not the point here}

Do I have to define every single operator?

The point of overloading operators is to define how to add to manipulate objects of a custom type using those operators, so if your second field was a string array, how would you expect the ++ operator to be implemented automatically? There is no sensible answer, especially since we don't know the context of the object or it's usage, so the answer is yes, you do have to overload the operators yourself.

For the record, if you really do only need one field, and it's just a double, then don't use a struct in the first place unless you need to overload the operators to perform some other action than they do by default — it's a clear case of over-engineering!

Definition of == operator for Double

In reality, the compiler will turn the == operator into a ceq IL code, and the operator you mention will not be called.

The reason for the operator in the source code is likely so it can be called from languages other than C# that do not translate it into a CEQ call directly (or through reflection). The code within the operator will be compiled to a CEQ, so there is no infinite recursion.

In fact, if you call the operator via reflection, you can see that the operator is called (rather than a CEQ instruction), and obviously is not infinitely recursive (since the program terminates as expected):

double d1 = 1.1;
double d2 = 2.2;

MethodInfo mi = typeof(Double).GetMethod("op_Equality", BindingFlags.Static | BindingFlags.Public );

bool b = (bool)(mi.Invoke(null, new object[] {d1,d2}));

Resulting IL (compiled by LinqPad 4):

IL_0000:  nop         
IL_0001:  ldc.r8      9A 99 99 99 99 99 F1 3F 
IL_000A:  stloc.0     // d1
IL_000B:  ldc.r8      9A 99 99 99 99 99 01 40 
IL_0014:  stloc.1     // d2
IL_0015:  ldtoken     System.Double
IL_001A:  call        System.Type.GetTypeFromHandle
IL_001F:  ldstr       "op_Equality"
IL_0024:  ldc.i4.s    18 
IL_0026:  call        System.Type.GetMethod
IL_002B:  stloc.2     // mi
IL_002C:  ldloc.2     // mi
IL_002D:  ldnull      
IL_002E:  ldc.i4.2    
IL_002F:  newarr      System.Object
IL_0034:  stloc.s     04 // CS$0$0000
IL_0036:  ldloc.s     04 // CS$0$0000
IL_0038:  ldc.i4.0    
IL_0039:  ldloc.0     // d1
IL_003A:  box         System.Double
IL_003F:  stelem.ref  
IL_0040:  ldloc.s     04 // CS$0$0000
IL_0042:  ldc.i4.1    
IL_0043:  ldloc.1     // d2
IL_0044:  box         System.Double
IL_0049:  stelem.ref  
IL_004A:  ldloc.s     04 // CS$0$0000
IL_004C:  callvirt    System.Reflection.MethodBase.Invoke
IL_0051:  unbox.any   System.Boolean
IL_0056:  stloc.3     // b
IL_0057:  ret

Interestingly - the same operators do NOT exist (either in the reference source or via reflection) for integral types, only Single, Double, Decimal, String, and DateTime, which disproves my theory that they exist to be called from other languages. Obviously you can equate two integers in other languages without these operators, so we're back to the question "why do they exist for double"?

When would == be overridden in a different way to .equals?

My question is: why have them both (I realise there must be a very good reason)

If there's a good reason it has yet to be explained to me. Equality comparisons in C# are a godawful mess, and were #9 on my list of things I regret about the design of C#:

http://www.informit.com/articles/article.aspx?p=2425867

Mathematically, equality is the simplest equivalence relation and it should obey the rules: x == x should always be true, x == y should always be the same as y == x, x == y and x != y should always be opposite valued, if x == y and y == z are true then x == z must be true. C#'s == and Equals mechanisms guarantee none of these properties! (Though, thankfully, ReferenceEquals guarantees all of them.)

As Jon notes in his answer, == is dispatched based on the compile-time types of both operands, and .Equals(object) and .Equals(T) from IEquatable<T> are dispatched based on the runtime type of the left operand. Why are either of those dispatch mechanisms correct? Equality is not a predicate that favours its left hand side, so why should some but not all of the implementations do so?

Really what we want for user-defined equality is a multimethod, where the runtime types of both operands have equal weight, but that's not a concept that exists in C#.

Worse, it is incredibly common that Equals and == are given different semantics -- usually that one is reference equality and the other is value equality. There is no reason by which the naive developer would know which was which, or that they were different. This is a considerable source of bugs. And it only gets worse when you realize that GetHashCode and Equals must agree, but == need not.

Were I designing a new language from scratch, and I for some crazy reason wanted operator overloading -- which I don't -- then I would design a system that would be much, much more straightforward. Something like: if you implement IComparable<T> on a type then you automatically get <, <=, ==, !=, and so on, operators defined for you, and they are implemented so that they are consistent. That is x<=y must have the semantics of x<y || x==y and also the semantics of !(x>y), and that x == y is always the same as y == x, and so on.

Now, if your question really is:

How on earth did we get into this godawful mess?

Then I wrote down some thoughts on that back in 2009:

https://blogs.msdn.microsoft.com/ericlippert/2009/04/09/double-your-dispatch-double-your-fun/

The TLDR is: framework designers and language designers have different goals and different constraints, and they sometimes do not take those factors into account in their designs in order to ensure a consistent, logical experience across the platform. It's a failure of the design process.

When would == be overloaded in a different way to how .equals is overridden?

I would never do so unless I had a very unusual, very good reason. When I implement arithmetic types I always implement all of the operators to be consistent with each other.

5 ways for equality check in .net .. why? and which to use?

1 - Reference equals checks if two reference type variables(classes, not structs) are referred to the same memory adress.

2 - The virtual Equals() method checks if two objects are equivalent. Let us say that you have this class:

class TestClass{
    public int Property1{get;set}
    public int Property2{get;set}

    public override bool Equals(object obj)
    {
        if (obj.GetType() != typeof(TestClass))
            return false;

        var convertedObj = (TestClass)obj;

        return (convertedObj.Property1 == this.Property1 && convertedObj.Property2 == this.Property2);
    }
}

and you instantiate 2 objects from that class:

var o1 = new TestClass{property1 = 1, property2 = 2}
var o2 = new TestClass{property1 = 1, property2 = 2}

although the two objects are not the same instance of TestClass, the call to o1.Equals(o2) will return true.

3 - The static Equals method is used to handle problems when there is a null value in the check.
Imagine this, for instance:

TestClass o1 = null;
var o2 = new TestClass{property1 = 1, property2 = 2}

If you try this:

o1.Equals(o2);

you wil get a NullReferenceException, because o1 points to nothing.
To adress this issue, you do this:

Object.Equals(o1,o2);

This method is prepared to handle null references.

4 - The IEquatable interface is provided by .Net so you don't need to do casts inside your Equals method.
If the compiler finds out that you have implemented the interface in a class for the type you are trying to check for equality, it will give that method priority over the Object.Equals(Object) override.
For instance:

class TestClass : IEquatable<TestClass>
{
    public int Property1 { get; set; }
    public int Property2 { get; set; }

    public override bool Equals(object obj)
    {
        if (obj.GetType() != typeof(TestClass))
            return false;

        var convertedObj = (TestClass)obj;

        return (convertedObj.Property1 == this.Property1 && convertedObj.Property2 == this.Property2);
    }

    #region IEquatable<TestClass> Members

    public bool Equals(TestClass other)
    {
        return (other.Property1 == this.Property1 && other.Property2 == this.Property2);
    }

    #endregion
}

now if we do this:

var o1 = new TestClass{property1 = 1, property2 = 2}
var o2 = new TestClass{property1 = 1, property2 = 2}
o1.Equals(o2);

The called method is Equals(TestClass), prior to Equals(Object).

5 - The == operator usually means the same as ReferenceEquals, it checks if two variables point to the same memory adress.
The gotcha is that this operator can be overrided to perform other types of checks.
In strings, for instance, it checks if two different instances are equivalent.

This is a usefull link to understand equalities in .Net better:

CodeProject

Why Must We Define Both == and != in C#