How Does Java's Use-Site Variance Compare to C#'s Declaration Site Variance

What are the kinds of covariance in C#? (Or, covariance: by example)

Here's what I can think of:

Update

After reading the constructive comments and the ton of articles pointed (and written) by Eric Lippert, I improved the answer:

Updated the broken-ness of array covariance
Added "pure" delegate variance
Added more examples from the BCL
Added links to articles that explain the concepts in-depth.
Added a whole new section on higher-order function parameter covariance.

Return type covariance:

Available in Java (>= 5)^[1] and C++^[2], not supported in C# (Eric Lippert explains why not and what you can do about it):

class B {
    B Clone();
}

class D: B {
    D Clone();
}

Interface covariance^[3] - supported in C#

The BCL defines the generic IEnumerable interface to be covariant:

IEnumerable<out T> {...}

Thus the following example is valid:

class Animal {}
class Cat : Animal {}

IEnumerable<Cat> cats = ...
IEnumerable<Animal> animals = cats;

Note that an IEnumerable is by definition "read-only" - you can't add elements to it.

Contrast that to the definition of IList<T> which can be modified e.g. using .Add():

public interface IEnumerable<out T> : ...  //covariant - notice the 'out' keyword
public interface IList<T> : ...            //invariant

Delegate covariance by means of method groups ^[4] - supported in C#

class Animal {}
class Cat : Animal {}

class Prog {
    public delegate Animal AnimalHandler();

    public static Animal GetAnimal(){...}
    public static Cat GetCat(){...}

    AnimalHandler animalHandler = GetAnimal;
    AnimalHandler catHandler = GetCat;        //covariance

}

"Pure" delegate covariance^{[5 - pre-variance-release article]} - supported in C#

The BCL definition of a delegate that takes no parameters and returns something is covariant:

public delegate TResult Func<out TResult>()

This allows the following:

Func<Cat> getCat = () => new Cat();
Func<Animal> getAnimal = getCat;

Array covariance - supported in C#, in a broken way^[6] ^[7]

string[] strArray = new[] {"aa", "bb"};

object[] objArray = strArray;    //covariance: so far, so good
//objArray really is an "alias" for strArray (or a pointer, if you wish)

//i can haz cat?
object cat == new Cat();         //a real cat would object to being... objectified.

//now assign it
objArray[1] = cat                //crash, boom, bang
                                 //throws ArrayTypeMismatchException

And finally - the surprising and somewhat mind-bending

Delegate parameter covariance (yes, that's co-variance) - for higher-order functions.^[8]

The BCL definition of the delegate that takes one parameter and returns nothing is contravariant:

public delegate void Action<in T>(T obj)

Bear with me. Let's define a circus animal trainer - he can be told how to train an animal (by giving him an Action that works with that animal).

delegate void Trainer<out T>(Action<T> trainingAction);

We have the trainer definition, let's get a trainer and put him to work.

Trainer<Cat> catTrainer = (catAction) => catAction(new Cat());

Trainer<Animal> animalTrainer = catTrainer;  
// covariant: Animal > Cat => Trainer<Animal> > Trainer<Cat> 

//define a default training method
Action<Animal> trainAnimal = (animal) => 
   { 
   Console.WriteLine("Training " + animal.GetType().Name + " to ignore you... done!"); 
   };

//work it!
animalTrainer(trainAnimal);

The output proves that this works:

Training Cat to ignore you... done!

In order to understand this, a joke is in order.

A linguistics professor was lecturing to his class one day.

"In English," he said, "a double negative forms a positive.

However," he pointed out, "there is no language wherein a double positive can form a negative."

A voice from the back of the room piped up, "Yeah, right."

What's that got to do with covariance?!

Let me attempt a back-of-the-napkin demonstration.

An Action<T> is contravariant, i.e. it "flips" the types' relationship:

A < B => Action<A> > Action<B> (1)

Change A and B above with Action<A> and Action<B> and get:

Action<A> < Action<B> => Action<Action<A>> > Action<Action<B>>  

or (flip both relationships)

Action<A> > Action<B> => Action<Action<A>> < Action<Action<B>> (2)

Put (1) and (2) together and we have:

,-------------(1)--------------.
 A < B => Action<A> > Action<B> => Action<Action<A>> < Action<Action<B>> (4)
         `-------------------------------(2)----------------------------'

But our Trainer<T> delegate is effectively an Action<Action<T>>:

Trainer<T> == Action<Action<T>> (3)

So we can rewrite (4) as:

A < B => ... => Trainer<A> < Trainer<B>

- which, by definition, means Trainer is covariant.

In short, applying Action twice we get contra-contra-variance, i.e. the relationship between types is flipped twice (see (4) ), so we're back to covariance.

Covariance and Contravariance with Func in generics

I need more information about variance in generics and delegates.

I wrote an extensive series of blog articles on this feature. Though some of it is out of date -- since it was written before the design was finalized -- there's lots of good information there. In particular if you need a formal definition of what variance validity is, you should carefully read this:

https://blogs.msdn.microsoft.com/ericlippert/2009/12/03/exact-rules-for-variance-validity/

See my other articles on my MSDN and WordPress blogs for related topics.

Why the compiler complains about TIn being contravariant and TOut - covariant while the Func expects exactly the same variance?

Let's slightly rewrite your code and see:

public delegate R F<in T, out R> (T arg);
public interface I<in A, out B>{
  B M(F<A, B> f);
}

The compiler must prove that this is safe, but it is not.

We can illustrate that it is not safe by supposing that it is, and then discovering how it can be abused.

Let's suppose we have an Animal hierarchy with the obvious relationships, eg, Mammal is an Animal, Giraffe is a Mammal, and so on. And let's suppose that your variance annotations are legal. We should be able to say:

class C : I<Mammal, Mammal>
{
  public Mammal M(F<Mammal, Mammal> f) {
    return f(new Giraffe());
  }
}

I hope you agree this is a perfectly valid implementation. Now we can do this:

I<Tiger, Animal> i = new C();

C implements I<Mammal, Mammal>, and we've said that the first one can get more specific, and the second can get more general, so we've done that.

Now we can do this:

Func<Tiger, Animal> f = (Tiger t) => new Lizard();

That's a perfectly legal lambda for this delegate, and it matches the signature of:

i.M(f);

And what happens? C.M is expecting a function that takes a giraffe and returns a mammal, but it's been given a function that takes a tiger and returns a lizard, so someone is going to have a very bad day.

Plainly this must not be allowed to happen, but every step along the way was legal. We must conclude that the variance itself was not provably safe, and indeed, it was not. The compiler is right to reject this.

Getting variance right takes more than simply matching the in and out annotations. You've got to do so in a manner that does not allow this sort of defect to exist.

That explains why this is illegal. To explain how it is illegal, the compiler must check that the following is true of B M(F<A, B> f);:

B is valid covariantly. Since it is declared "out", it is.
F<A, B> is valid contravariantly. It is not. The relevant portion of the definition of "valid contravariantly" for a generic delegate is: If the ith type parameter was declared as contravariant, then Ti must be valid covariantly. OK. The first type parameter, T, was declared as contravariant. Therefore the first type argument A must be valid covariantly. But it is not valid covariantly, because it was declared contravariant. And that's the error you're getting. Similarly, B is also bad because it must be valid contravariantly, but B is covariant. The compiler does not go on to find additional errors after it finds the first problem here; I considered it but rejected it as being a too-complex error message.

I note also that you would still have this problem even if the delegate were not variant; nowhere in my counterexample did we use the fact that F is variant in its type parameters. A similar error would be reported if we tried

public delegate R F<T, R> (T arg);

instead.

ref and out parameters in C# and cannot be marked as variant

"out" means, roughly speaking, "only appears in output positions".

"in" means, roughly speaking, "only appears in input positions".

The real story is a bit more complicated than that, but the keywords were chosen because most of the time this is the case.

Consider a method of an interface or the method represented by a delegate:

delegate void Foo</*???*/ T>(ref T item);

Does T appear in an input position? Yes. The caller can pass a value of T in via item; the callee Foo can read that. Therefore T cannot be marked "out".

Does T appear in an output position? Yes. The callee can write a new value to item, which the caller can then read. Therefore T cannot be marked "in".

Therefore if T appears in a "ref" formal parameter, T cannot be marked as either in or out.

Let's look at some real examples of how things go wrong. Suppose this were legal:

delegate void X<out T>(ref T item);
...
X<Dog> x1 = (ref Dog d)=>{ d.Bark(); }
X<Animal> x2 = x1; // covariant;
Animal a = new Cat();
x2(ref a);

Well dog my cats, we just made a cat bark. "out" cannot be legal.

What about "in"?

delegate void X<in T>(ref T item);
...
X<Animal> x1 = (ref Animal a)=>{ a = new Cat(); }
X<Dog> x2 = x1; // contravariant;
Dog d = new Dog();
x2(ref d);

And we just put a cat in a variable that can only hold dogs. T cannot be marked "in" either.

What about an out parameter?

delegate void Foo</*???*/T>(out T item);

? Now T only appears in an output position. Should it be legal to make T marked as "out"?

Unfortunately no. "out" actually is not different than "ref" behind the scenes. The only difference between "out" and "ref" is that the compiler forbids reading from an out parameter before it is assigned by the callee, and that the compiler requires assignment before the callee returns normally. Someone who wrote an implementation of this interface in a .NET language other than C# would be able to read from the item before it was initialized, and therefore it could be used as an input. We therefore forbid marking T as "out" in this case. That's regrettable, but nothing we can do about it; we have to obey the type safety rules of the CLR.

Furthermore, the rule of "out" parameters is that they cannot be used for input before they are written to. There is no rule that they cannot be used for input after they are written to. Suppose we allowed

delegate void X<out T>(out T item);
class C
{
    Animal a;
    void M()
    {
        X<Dog> x1 = (out Dog d) => 
        { 
             d = null; 
             N(); 
             if (d != null) 
               d.Bark(); 
        };
        x<Animal> x2 = x1; // Suppose this were legal covariance.
        x2(out this.a);
    }
    void N() 
    { 
        if (this.a == null) 
            this.a = new Cat(); 
    }
}

Once more we have made a cat bark. We cannot allow T to be "out".

It is very foolish to use out parameters for input in this way, but legal.

UPDATE: C# 7 has added in as a formal parameter declaration, which means that we now have both in and out meaning two things; this is going to create some confusion. Let me clear that up:

in, out and ref on a formal parameter declaration in a parameter list means "this parameter is an alias to a variable supplied by the caller".
ref means "the callee may read or write the aliased variable, and it must be known to be assigned before the call.
out means "the callee must write the aliased variable via the alias before it returns normally". It also means that the callee must not read the aliased variable via the alias before it writes it, because the variable might not be definitely assigned.
in means "the callee may read the aliased variable but does not write to it via the alias". The purpose of in is to solve a rare performance problem, whereby a large struct must be passed "by value" but it is expensive to do so. As an implementation detail, in parameters are typically passed via a pointer-sized value, which is faster than copying by value, but slower on the dereference.
From the CLR's perspective, in, out and ref are all the same thing; the rules about who reads and writes what variables at what times, the CLR does not know or care.
Since it is the CLR that enforces rules about variance, rules that apply to ref also apply to in and out parameters.

In contrast, in and out on type parameter declarations mean "this type parameter must not be used in a covariant manner" and "this type parameter must not be used in a contravariant manner", respectively.

As noted above, we chose in and out for those modifiers because if we see IFoo<in T, out U> then T is used in "input" positions and U is used in "output" positions. Though that is not strictly true, it is true enough in the 99.9% use case that it is a helpful mnemonic.

It is unfortunate that interface IFoo<in T, out U> { void Foo(in T t, out U u); } is illegal because it looks like it ought to work. It cannot work because from the CLR verifier's perspective, those are both ref parameters and therefore read-write.

This is just one of those weird, unintended situations where two features that logically ought to work together do not work well together for implementation detail reasons.

Covariance and contravariance real world example

Let's say you have a class Person and a class that derives from it, Teacher. You have some operations that take an IEnumerable<Person> as the argument. In your School class you have a method that returns an IEnumerable<Teacher>. Covariance allows you to directly use that result for the methods that take an IEnumerable<Person>, substituting a more derived type for a less derived (more generic) type. Contravariance, counter-intuitively, allows you to use a more generic type, where a more derived type is specified.

See also Covariance and Contravariance in Generics on MSDN.

Classes:

public class Person 
{
     public string Name { get; set; }
} 

public class Teacher : Person { } 

public class MailingList
{
    public void Add(IEnumerable<out Person> people) { ... }
}

public class School
{
    public IEnumerable<Teacher> GetTeachers() { ... }
}

public class PersonNameComparer : IComparer<Person>
{
    public int Compare(Person a, Person b) 
    { 
        if (a == null) return b == null ? 0 : -1;
        return b == null ? 1 : Compare(a,b);
    }

    private int Compare(string a, string b)
    {
        if (a == null) return b == null ? 0 : -1;
        return b == null ? 1 : a.CompareTo(b);
    }
}

Usage:

var teachers = school.GetTeachers();
var mailingList = new MailingList();

// Add() is covariant, we can use a more derived type
mailingList.Add(teachers);

// the Set<T> constructor uses a contravariant interface, IComparer<in T>,
// we can use a more generic type than required.
// See https://msdn.microsoft.com/en-us/library/8ehhxeaf.aspx for declaration syntax
var teacherSet = new SortedSet<Teachers>(teachers, new PersonNameComparer());

C# generics - without lower bounds by design?

A complicated question.

First let's consider your fundamental question, "why is this illegal in C#?"

class C<T> where T : Mammal {} // legal
class D<T> where Giraffe : T {} // illegal

That is, a generic type constraint can say "T must be any reference type that could be assigned to a variable of type Mammal", but not "T must be any reference type, a variable of which could be assigned a Giraffe". Why the difference?

I don't know. That was long before my time on the C# team. The trivial answer is "because the CLR doesn't support it", but the team that designed C# generics was the same team that designed CLR generics, so that's really not much of an explanation.

My guess would be simply that as always, to be supported a feature has to be designed, implemented, tested, documented and shipped to customers; no one ever did any of those things for this feature, and therefore it is not in the language. I don't see a large, compelling benefit to the proposed feature; complicated features with no compelling benefits tend to be cut around here.

However, that's a guess. Next time I happen to chat with the guys who worked on generics -- they live in England, so its not like they're just down the hall from me, unfortunately -- I'll ask.

As for your specific example, I think Paul is correct. You do not need lower bound constraints to make that work in C#. You could say:

void Copy<T, U>(Collection<T> src, Collection<U> dst) where T : U 
{ 
    foreach(T item in src) dst.Add(item);
}

That is, put the constraint on T, not on U.

C# return type covariance and Liskov substitution principle

C# can still apply the Liskov substitution principle.

Consider:

public class Base1
{
}

public class Derived1 : Base1
{
}

public class Base2
{
    public virtual Base1 Method()
    {
        return new Base1();
    }
}

public class Derived2 : Base2
{
    public override Base1 Method()
    {
        return new Derived1();
    }
}

If C# supported covariant return types, then the override for Method() in Base2 could be declared thusly:

public class Derived2 : Base2
{
    public override Derived1 Method()
    {
        return new Derived1();
    }
}

C# does not allow this, and you must declare the return type the same as it is in the base class, namely Base1.

However, doing so does not make it violate the Liskov substitution principle.

Consider this:

Base2 test = new Base2();
Base1 item = test.Method();

Compared to:

Base2 test = new Derived2();
Base1 item = test.Method();

We are completely able to replace new Base2() with new Derived2() with no issues. This complies with the Liskov substitution principle.

Comparing double values in C#

It's a standard problem due to how the computer stores floating point values. Search here for "floating point problem" and you'll find tons of information.

In short – a float/double can't store 0.1 precisely. It will always be a little off.

You can try using the decimal type which stores numbers in decimal notation. Thus 0.1 will be representable precisely.

You wanted to know the reason:

Float/double are stored as binary fractions, not decimal fractions. To illustrate:

12.34 in decimal notation (what we use) means

1 * 10¹ + 2 * 10⁰ + 3 * 10^-1 + 4 * 10^-2

The computer stores floating point numbers in the same way, except it uses base 2: 10.01 means

1 * 2¹ + 0 * 2⁰ + 0 * 2^-1 + 1 * 2^-2

Now, you probably know that there are some numbers that cannot be represented fully with our decimal notation. For example, 1/3 in decimal notation is 0.3333333…. The same thing happens in binary notation, except that the numbers that cannot be represented precisely are different. Among them is the number 1/10. In binary notation that is 0.000110011001100….

Since the binary notation cannot store it precisely, it is stored in a rounded-off way. Hence your problem.

C# (.NET) Design Flaws

I agree emphatically with this post (for those poo-pooing the lack of ToString, there is a debugger attribute to provide a custom format for your class).

On top of the above list, I would also add the following reasonable requests:

non-nullable reference types as a complement to nullable value types,
allow overriding a struct's empty constructor,
allow generic type constraints to specify sealed classes,
I agree with another poster here that requested arbitrary constructor signatures when used as constraints, ie. where T : new(string), or where T : new(string, int)
I also agree with another poster here about fixing events, both for empty event lists and in the concurrent setting (though the latter is tricky),
operators should be defined as extension methods, and not as static methods of the class (or not just as static methods at least),
allow static properties and methods for interfaces (Java has this, but C# does not),
allow event initialization in object initializers (only fields and properties are currently allowed),
why is the "object initializer" syntax only usable when creating an object? Why not make it available at any time, ie. var e = new Foo(); e { Bar = baz };
fix quadratic enumerable behaviour,
all collections should have immutable snapshots for iteration (ie. mutating the collection should not invalidate the iterator),
tuples are easy to add, but an efficient closed algebraic type like "Either<T>" is not, so I'd love some way to declare a closed algebraic type and enforce exhaustive pattern matching on it (basically first-class support for the visitor pattern, but far more efficient); so just take enums, extend them with exhaustive pattern matching support, and don't allow invalid cases,
I'd love support for pattern matching in general, but at the very least for object type testing; I also kinda like the switch syntax proposed in another post here,
I agree with another post that the System.IO classes, like Stream, are somewhat poorly designed; any interface that requires some implementations to throw NotSupportedException is a bad design,
IList should be much simpler than it is; in fact, this may be true for many of the concrete collection interfaces, like ICollection,
too many methods throw exceptions, like IDictionary for instance,
I would prefer a form of checked exceptions better than that available in Java (see the research on type and effect systems for how this can be done),
fix various annoying corner cases in generic method overload resolution; for instance, try providing two overloaded extension methods, one that operates on reference types, and the other on nullable struct types, and see how your type inference likes that,
provide a way to safely reflect on field and member names for interfaces like INotifyPropertyChanged, that take the field name as a string; you can do this by using an extension method that takes a lambda with a MemberExpression, ie. () => Foo, but that's not very efficient,
- Update: C# 6.0 added the nameof() operator for single member names, but it doesn't work in generics (nameof(T) == "T" instead of the actual type-argument's name: you still need to do typeof(T).Name)) - nor does it allow you to get a "path" string, e.g. nameof(this.ComplexProperty.Value) == "Value" limiting its possible applications.
allow operators in interfaces, and make all core number types implement IArithmetic; other useful shared operator interfaces are possible as well,
make it harder to mutate object fields/properties, or at the very least, allow annotating immutable fields and make the type checker enforce it (just treat it as getter-only property fer chrissakes, it's not hard!); in fact, unify fields and properties in a more sensible way since there's no point in having both; C# 3.0's automatic properties are a first step in this direction, but they don't go far enough,
- Update: While C# had the readonly keyword, and C# 6.0 added read-only auto-properties, though it isn't as stringent as true language support for immutable types and values.
simplify declaring constructors; I like F#'s approach, but the other post here that requires simply "new" instead of the class name is better at least,

That's enough for now I suppose. These are all irritations I've run into in the past week. I could probably go on for hours if I really put my mind to it. C# 4.0 is already adding named, optional and default arguments, which I emphatically approve of.

Now for one unreasonable request:

it'd be really, really nice if C#/CLR could support type constructor polymorphism, ie. generics over generics,

Pretty please? :-)

Collection of generic types

Have your generic class inherit from a non-generic base, or implement a non-generic interface. Then you can have a collection of this type and cast within whatever code you use to access the collection's contents.

Here's an example.

public abstract class MyClass
{
    public abstract Type Type { get; }
}

public class MyClass<T> : MyClass
{
    public override Type Type
    {
        get { return typeof(T); }
    }

    public T Value { get; set; }
}

// VERY basic illustration of how you might construct a collection
// of MyClass<T> objects.
public class MyClassCollection
{
    private Dictionary<Type, MyClass> _dictionary;

    public MyClassCollection()
    {
        _dictionary = new Dictionary<Type, MyClass>();
    }

    public void Put<T>(MyClass<T> item)
    {
        _dictionary[typeof(T)] = item;
    }

    public MyClass<T> Get<T>()
    {
        return _dictionary[typeof(T)] as MyClass<T>;
    }
}