Why Does C# (4.0) Not Allow Co- and Contravariance in Generic Class Types

Why does C# (4.0) not allow co- and contravariance in generic class types?

First off, as Tomas says, it is not supported in the CLR.

Second, how would that work? Suppose you have

class C<out T>
{ ... how are you planning on using T in here? ... }

T can only be used in output positions. As you note, the class cannot have any field of type T because the field could be written to. The class cannot have any methods that take a T, because those are logically writes. Suppose you had this feature -- how would you take advantage of it?

This would be useful for immutable classes if we could, say, make it legal to have a readonly field of type T; that way we'd massively cut down on the likelihood that it be improperly written to. But it's quite difficult to come up with other scenarios that permit variance in a typesafe manner.

If you have such a scenario, I'd love to see it. That would be points towards someday getting this implemented in the CLR.

UPDATE: See "Why isn't there generic variance for classes in C# 4.0?" for more on this question.

Why isn't there generic variance for classes in C# 4.0?

Suppose you had a class C<T> that was covariant in T. What might its implementation look like? T has to be out only. That means that C<T> cannot have any method that takes a T, any property of type T with a setter, or any field of type T, because fields are logically the same as property setters; T goes in.

Pretty much the only useful thing you could build with a covariant class is something immutable as far as T is concerned. Now, I think it would be awesome to have covariant immutable lists and stacks and whatnot that were class types. But that feature is not so obviously awesome that it would clearly justify the massive expenditure in making the type system natively support covariant immutable class types.

A comment above asked for an example of where this would be useful. Consider the following sketch:

sealed class Stack<out T>
{
    private readonly T head;
    private readonly Stack<T> tail;
    public T Peek() { return head; }
    public Stack<T> Pop() { return tail; }
    public Stack(T head, Stack<T> tail)
    {
        this.tail = tail;
        this.head = head;
    }
}

static class StackExtensions
{
    public static Stack<T> Push<T>(this Stack<T> tail, T head)
    {
        return new Stack<T>(head, tail);
    }
    public static bool IsEmpty<T>(this Stack<T> stack)
    {
        return stack == null;
    }
}

Suppose you had covariant classes. Now you can say

Stack<string> strings = null;
strings = strings.Push("hello");
strings = strings.Push("goodbye");
Stack<object> objects = strings;
objects = objects.Push(123);

And hey, we just pushed an integer onto a stack of strings, but everything worked out just fine! There's no reason why this couldn't be typesafe. An operation which would violate type safety on a mutable data structure can be safely covariant on an immutable data structure.
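For comparison, here is a minimal sketch of how a similar immutable stack can be written with the variance that C# 4.0 does support, by moving the variance annotation onto an interface. The names IStack<out T>, Node<T> and StackDemo are hypothetical, chosen only for this illustration, and null again stands for the empty stack.

```csharp
using System;

// Hypothetical interface: T appears only in output positions, so "out" is legal.
public interface IStack<out T>
{
    T Peek();
    IStack<T> Pop();
}

// Hypothetical implementation class; the class itself stays invariant.
public sealed class Node<T> : IStack<T>
{
    private readonly T head;
    private readonly IStack<T> tail;

    public Node(T head, IStack<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }

    public T Peek() { return head; }
    public IStack<T> Pop() { return tail; }
}

public static class StackExtensions
{
    // Push lives outside the interface, so T never appears as an input there.
    public static IStack<T> Push<T>(this IStack<T> tail, T head)
    {
        return new Node<T>(head, tail);
    }

    public static bool IsEmpty<T>(this IStack<T> stack)
    {
        return stack == null;
    }
}

public static class StackDemo
{
    public static IStack<object> Build()
    {
        IStack<string> strings = null;
        strings = strings.Push("hello");
        strings = strings.Push("goodbye");

        IStack<object> objects = strings;  // covariant reference conversion
        return objects.Push(123);          // T is inferred as object
    }

    public static void Main()
    {
        IStack<object> objects = Build();
        Console.WriteLine(objects.Peek());        // 123
        Console.WriteLine(objects.Pop().Peek());  // goodbye
    }
}
```

The existing string nodes are never modified; pushing 123 only allocates a new Node<object> whose tail happens to be a stack of strings, which is why the conversion stays typesafe.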

Why is C# 4.0's covariance/contravariance limited to parameterized interface and delegate types?

Simple answer: it's a CLR limitation.

(I haven't seen a good, concrete explanation for this anywhere... I don't remember seeing one in Eric's blog series about it, although I may well have missed it somewhere.)

One thing I would say is that both delegates and interfaces already form "layers of indirection" over the real types; views on methods or classes, if you will. Changing from one view to another view is fairly reasonable. The actual class feels like a more concrete representation to me - and shifting from one concrete representation to another feels less reasonable. This is a very touchy-feely explanation rather than a genuine technical limitation though.

How is Generic Covariance & Contra-variance Implemented in C# 4.0?

Variance will only be supported in a safe way - in fact, using the abilities that the CLR already has. So the examples I give in the book of trying to use a List<Banana> as a List<Fruit> (or whatever it was) still won't work - but a few other scenarios will.

Firstly, it will only be supported for interfaces and delegates.

Secondly, it requires the author of the interface/delegate to decorate the type parameters as in (for contravariance) or out (for covariance). The most obvious example is IEnumerable<T> which only ever lets you take values "out" of it - it doesn't let you add new ones. That will become IEnumerable<out T>. That doesn't hurt type safety at all, but lets you return an IEnumerable<string> from a method declared to return IEnumerable<object> for instance.
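As a small illustration of that last point, the following sketch returns a list of strings from a method declared to return IEnumerable<object>. The class and method names (EnumerableDemo, GetNames) are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableDemo
{
    // Declared to return IEnumerable<object>, but hands back a List<string>.
    public static IEnumerable<object> GetNames()
    {
        IEnumerable<string> names = new List<string> { "Ada", "Grace" };
        return names;  // implicit covariant conversion; nothing is copied
    }

    public static void Main()
    {
        foreach (object name in GetNames())
        {
            Console.WriteLine(name);
        }

        // The class List<T> is still invariant, so this would not compile:
        // List<object> bad = new List<string>();
    }
}
```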

Contravariance is harder to give concrete examples for using interfaces, but it's easy with a delegate. Consider Action<T> - that just represents a method which takes a T parameter. It would be nice to be able to seamlessly use an Action<object> as an Action<string> - any method which takes an object parameter is going to be fine when it's presented with a string instead. Of course, C# 2 already has covariance and contravariance of delegates to some extent, but via an actual conversion from one delegate type to another (creating a new instance) - see P141-144 for examples. C# 4 will make this more generic, and (I believe) will avoid creating a new instance for the conversion. (It'll be a reference conversion instead.)
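A short sketch of the Action<object> to Action<string> case, using only the built-in Action<in T> delegate. The helper names (ActionDemo, PrintObject, ConversionPreservesIdentity) are invented for the example; the ReferenceEquals check is there to show that the variant conversion does not create a new delegate instance.

```csharp
using System;

public static class ActionDemo
{
    public static void PrintObject(object o)
    {
        Console.WriteLine("Got: " + o);
    }

    public static bool ConversionPreservesIdentity()
    {
        Action<object> printAny = PrintObject;
        Action<string> printString = printAny;  // contravariant conversion
        printString("hello");

        // Same object, viewed through a different delegate type.
        return object.ReferenceEquals(printAny, printString);
    }

    public static void Main()
    {
        Console.WriteLine(ConversionPreservesIdentity());  // True
    }
}
```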

Hope this clears it up a bit - please let me know if it doesn't make sense!

Why doesn't C# support variant generic classes?

One reason would be:

class Foo<out T>
{
    T _store;
    public T Get()
    {
        _store = default(T);
        return _store;
    }
}

This class contains a feature that is not covariant: it has a field, and fields can be set to values. It is nevertheless used in a covariant way, because the field is only ever assigned the default value, and that is only ever going to be null in any case where covariance is actually used (variance applies only to reference types).

As such it's not clear if we could allow it. Not allowing it would irritate users (it does, after all, match the same potential rules you suggest), but allowing it is difficult (the analysis has gotten slightly tricky already, and we haven't even begun to hunt for really tricky cases).

On the other hand, the analysis of this is much simpler:

void Main()
{
    IFoo<object> foo = new Foo<string>();
    Console.WriteLine(foo.Get());
}

interface IFoo<out T>
{
    T Get();
}

class Foo<T> : IFoo<T>
{
    T _store;
    public T Get()
    {
        _store = default(T);
        return _store;
    }
}

It's easy to determine that none of the implementation of IFoo<T> breaks the covariance, because it hasn't got any. All that's necessary is to make sure that there is no use of T as a parameter (including that of a setter method) and it's done.

The fact that, for similar reasons, the potential restriction is far more arduous on a class than on an interface also reduces the degree to which covariant classes would be useful. They certainly wouldn't be useless, but the ratio of how useful they would be to how much work it would take to specify and implement the rules about what they would be allowed to do is much lower than the corresponding ratio for covariant interfaces.

Certainly, the difference is enough that it's past the point of "well, if you're going to allow X it would be silly to not allow Y…".

Why do covariance and contravariance not support value types?

Basically, variance applies when the CLR can ensure that it doesn't need to make any representational change to the values. References all look the same - so you can use an IEnumerable<string> as an IEnumerable<object> without any change in representation; the native code itself doesn't need to know what you're doing with the values at all, so long as the infrastructure has guaranteed that it will definitely be valid.

For value types, that doesn't work - to treat an IEnumerable<int> as an IEnumerable<object>, the code using the sequence would have to know whether to perform a boxing conversion or not.

You might want to read Eric Lippert's blog post on representation and identity for more on this topic in general.

EDIT: Having reread Eric's blog post myself, it's at least as much about identity as representation, although the two are linked. In particular:

This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
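The difference between reference and value type arguments can be observed directly. The sketch below is only an illustration with made-up names (ValueTypeDemo, IsObjectSequence, BoxAll); it checks at run time whether a sequence is usable as IEnumerable<object>, and shows one explicit way to get such a sequence from integers by boxing each element with the LINQ Cast<object>() operator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueTypeDemo
{
    // Run-time check: is this object usable as a sequence of object references?
    public static bool IsObjectSequence(object candidate)
    {
        return candidate is IEnumerable<object>;
    }

    // Explicit alternative for value types: box every element into a new list.
    public static List<object> BoxAll(IEnumerable<int> numbers)
    {
        return numbers.Cast<object>().ToList();
    }

    public static void Main()
    {
        Console.WriteLine(IsObjectSequence(new List<string> { "a" }));  // True
        Console.WriteLine(IsObjectSequence(new List<int> { 1, 2 }));    // False

        // The implicit conversion is rejected at compile time:
        // IEnumerable<object> bad = new List<int> { 1, 2 };

        Console.WriteLine(BoxAll(new List<int> { 1, 2 }).Count);        // 2
    }
}
```

Note that BoxAll produces new boxed objects, so unlike the variant conversion it is not identity-preserving.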

Still confused about covariance and contravariance & in/out

Both covariance and contravariance in C# 4.0 refer to the ability to substitute a derived class for a base class (or the reverse) in a generic type argument. The in/out keywords are annotations, checked by the compiler, that indicate whether a type parameter is used only for input (in) or only for output (out).

Covariance

Covariance in C# 4.0 is enabled by the out keyword, and it means that a generic type constructed with a class derived from the expected out type argument is OK. Hence:

IEnumerable<Fruit> fruit = new List<Apple>();

Since Apple is a Fruit, List<Apple> can be safely used as IEnumerable<Fruit>.

Contravariance

Contravariance is expressed with the in keyword, which denotes input types, usually in delegates. The principle is the same: a delegate that accepts a base class can be used where a delegate accepting a more derived class is expected.

public delegate void Func<in T>(T param);

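This means that if we have a Func<Fruit>, it can be converted to Func<Apple>. (The Func here is the custom delegate declared above, which has the same shape as the built-in Action<in T>; it is not the built-in System.Func.)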

Func<Fruit> fruitFunc = (fruit)=>{};
Func<Apple> appleFunc = fruitFunc;

Why are they called co/contravariance if they are basically the same thing?

Because even though the principle is the same (a safe cast from derived to base), the direction of the generic conversion is reversed when the type parameter is used for input: we can safely convert a less derived type argument (Func<Fruit>) to a more derived one (Func<Apple>). This makes sense, since any function that takes a Fruit can also take an Apple.
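Contravariance is not limited to delegates. As a sketch, the built-in interface IComparer<in T> lets a comparer written for Fruit be used to sort a list of Apple. The Fruit and Apple classes, the Weight field and the FruitByWeight comparer are all hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public class Fruit
{
    public readonly int Weight;
    public Fruit(int weight) { Weight = weight; }
}

public class Apple : Fruit
{
    public Apple(int weight) : base(weight) { }
}

// Knows how to compare any two fruits, so it can certainly compare two apples.
public class FruitByWeight : IComparer<Fruit>
{
    public int Compare(Fruit x, Fruit y)
    {
        return x.Weight.CompareTo(y.Weight);
    }
}

public static class ComparerDemo
{
    public static List<Apple> SortApples()
    {
        List<Apple> apples = new List<Apple>
        {
            new Apple(300), new Apple(120), new Apple(210)
        };

        // Contravariant conversion: IComparer<Fruit> used as IComparer<Apple>.
        IComparer<Apple> appleComparer = new FruitByWeight();
        apples.Sort(appleComparer);
        return apples;
    }

    public static void Main()
    {
        foreach (Apple apple in SortApples())
        {
            Console.WriteLine(apple.Weight);  // 120, then 210, then 300
        }
    }
}
```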

Why is dynamic not covariant and contravariant with respect to all types when used as a generic type parameter?

I am wondering if dynamic is semantically equivalent to object when used as a generic type parameter.

Your conjecture is completely correct.

"dynamic" as a type is nothing more than "object" with a funny hat on, a hat that says "rather than doing static type checking for this expression of type object, generate code that does the type checking at runtime". In all other respects, dynamic is just object, end of story.

I am curious why this limitation exists since the two are different when assigning values to variables or formal parameters.

Think about it from the compiler's perspective and then from the IL verifier's perspective.

When you're assigning a value to a variable, the compiler basically says "I need to generate code that does an implicit conversion from a value of such and such a type to the exact type of the variable". The compiler generates code that does that, and the IL verifier verifies its correctness.

That is, the compiler generates:

Frob x = (Frob)whatever;

But limits the conversions to implicit conversions, not explicit conversions.

When the value is dynamic, the compiler basically says "I need to generate code that interrogates this object at runtime, determines its type, starts up the compiler again, and spits out a small chunk of IL that converts whatever this object is to the type of this variable, runs that code, and assigns the result to this variable. And if any of that fails, throw."

That is, the compiler generates the moral equivalent of:

Frob x = MakeMeAConversionFunctionAtRuntime<Frob>((object)whatever);

The verifier doesn't even blink at that. The verifier sees a method that returns a Frob. That method might throw an exception if it is unable to turn "whatever" into a Frob; either way, nothing but a Frob ever gets written into x.
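The run-time behaviour described above can be observed with a small sketch. The names DynamicConversionDemo and TryConvertToInt are invented for the example, and it assumes the project references the Microsoft.CSharp assembly, which the dynamic feature requires.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicConversionDemo
{
    public static string TryConvertToInt(object value)
    {
        dynamic whatever = value;
        try
        {
            // The conversion is chosen at run time from the actual type of the value.
            int x = whatever;
            return "converted: " + x;
        }
        catch (RuntimeBinderException)
        {
            // No implicit conversion exists from the run-time type to int.
            return "conversion failed at run time";
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryConvertToInt(42));       // converted: 42
        Console.WriteLine(TryConvertToInt("hello"));  // conversion failed at run time
    }
}
```

Either way, the only thing ever stored in x is an int, which is all the verifier needs to know.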

Now think about your covariance situation. From the CLR's perspective, there is no such thing as "dynamic". Everywhere that you have a type argument that is "dynamic", the compiler simply generates "object" as a type argument. "dynamic" is a C# language feature, not a Common Language Runtime feature. If covariance or contravariance on "object" isn't legal, then it isn't legal on "dynamic" either. There's no IL that the compiler can generate to make the CLR's type system work differently.

This then explains why it is that you observe that there is a conversion from, say, List<dynamic> to and from List<object>; the compiler knows that they are the same type. The specification actually calls out that these two types have an identity conversion between them; they are identical types.
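A brief sketch of both observations, with hypothetical names (DynamicIdentityDemo and its methods): List<dynamic> and List<object> are the same type at run time, and IEnumerable<dynamic> is covariant exactly where IEnumerable<object> is.

```csharp
using System;
using System.Collections.Generic;

public static class DynamicIdentityDemo
{
    public static bool SameRuntimeType()
    {
        List<dynamic> dynamics = new List<dynamic>();
        List<object> objects = dynamics;  // identity conversion, no cast needed
        return objects.GetType() == typeof(List<object>);
    }

    public static int CountViaCovariance()
    {
        // Legal for the same reason IEnumerable<object> would be legal here.
        IEnumerable<dynamic> items = new List<string> { "a", "b" };
        IEnumerable<object> sameItems = items;  // identity conversion again

        int count = 0;
        foreach (object item in sameItems)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(SameRuntimeType());     // True
        Console.WriteLine(CountViaCovariance());  // 2

        // Not legal, just as with object, because int is a value type:
        // IEnumerable<dynamic> bad = new List<int>();
    }
}
```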

Does that all make sense? You seem very interested in the design principles that underlie dynamic; rather than trying to deduce them from first principles and experiments yourself, you could save yourself the bother and read Chris Burrows' blog articles on the subject. He did most of the implementation and a fair amount of the design of the feature.

Co- and Contravariance bugs in .NET 4.0

Long story short: delegate combining is all messed up with respect to variance. We discovered this late in the cycle. We're working with the CLR team to see if we can come up with some way to make all the common scenarios work without breaking backwards compatibility, and so on, but whatever we come up with will probably not make it into the 4.0 release. Hopefully we'll get it all sorted out in some service pack. I apologize for the inconvenience.
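The following sketch shows the kind of code that runs into this problem, as far as I understand it: combining a delegate whose run-time type is Action<object> (obtained through variance) with a genuine Action<string>. All names here (CombineDemo, TryCombine, CombineWithWrapper) are hypothetical. On runtimes where the issue is present, the combination is expected to fail with an ArgumentException; the sketch simply reports what happens rather than assuming one outcome. Wrapping the variant delegate in a new delegate of the exact type is one commonly used workaround.

```csharp
using System;

public static class CombineDemo
{
    public static string TryCombine()
    {
        Action<string> exact = delegate(string s) { };
        Action<object> general = delegate(object o) { };
        Action<string> viaVariance = general;  // run-time type is still Action<object>

        try
        {
            // Delegate.Combine is handed two delegates of different run-time types.
            Action<string> combined = exact + viaVariance;
            combined("hello");
            return "combined without error";
        }
        catch (ArgumentException ex)
        {
            return "combine failed: " + ex.GetType().Name;
        }
    }

    public static int CombineWithWrapper()
    {
        int calls = 0;
        Action<string> exact = delegate(string s) { calls++; };
        Action<object> general = delegate(object o) { calls++; };

        // Creates a new delegate whose run-time type really is Action<string>.
        Action<string> wrapped = new Action<string>(general);
        Action<string> combined = exact + wrapped;
        combined("hello");
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(TryCombine());
        Console.WriteLine(CombineWithWrapper());  // 2
    }
}
```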


