Why Does the Linq Cast<> Helper Not Work with the Implicit Cast Operator

Why does the Linq Cast helper not work with the implicit cast operator?

The short answer would be simply: the Cast<T> method doesn't support custom conversion operators.

In the first example:

B b = a;
B b2 = (B)a;

the compiler can see this B(A a) operator during static analysis; the compiler interprets this as a static call to your custom operator method. In the second example:

foreach (object obj in source)
    yield return (T)obj;

that has no knowledge of the operator; this is implemented via unbox.any (which is the same as castclass if T is a ref-type).

There is also a third option: if you went via dynamic, the runtime implementation tries to mimic compiler rules, so this will find the operator ... but not as part of the C#-to-IL compile step:

dynamic b = a; // note that `dynamic` here is *almost* the same as `object`
B b2 = b;
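The difference between the cases can be seen side by side in a minimal sketch. The classes A and B, the Value property, and the CastDemo helper are hypothetical stand-ins for the types in the question, not code from it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's A and B.
public class A
{
    public int Value { get; set; }
}

public class B
{
    public int Value { get; set; }

    // User-defined conversion: the compiler binds to this only when it
    // knows the static types on both sides.
    public static implicit operator B(A a) => new B { Value = a.Value };
}

public static class CastDemo
{
    // Cast<B>() does a plain runtime cast on each element, so the
    // operator is never consulted and enumeration throws.
    public static bool CastThrows(IEnumerable<A> source)
    {
        try
        {
            source.Cast<B>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Going through dynamic defers binding to run time, where the
    // binder looks up the operator much as the compiler would.
    public static List<B> ViaDynamic(IEnumerable<A> source) =>
        source.Select(a => (B)(dynamic)a).ToList();

    public static void Main()
    {
        var source = new List<A> { new A { Value = 1 }, new A { Value = 2 } };
        Console.WriteLine(CastThrows(source));        // True
        Console.WriteLine(ViaDynamic(source).Count);  // 2
    }
}
```

The dynamic route pays for a runtime binder lookup on every element, so it is probably best kept for cases where the static type really is unknown.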

Why does a Linq Cast<T> operation fail when I have an implicit cast defined?

Because, looking at the code via Reflector, Cast doesn't attempt to take any implicit cast operators into account (the LINQ Cast code is heavily optimised for special cases of all kinds, but nothing in that direction), just as many .NET languages won't.

Without getting into reflection and other things, generics don't offer any out-of-the-box way to take such extra stuff into account in any case.

EDIT: In general, more complex facilities like implicit/explicit conversion operators, equality operators etc. are not handled by generic facilities like LINQ.

Implicit Conversion over a Collection

A question much like this gets asked almost every day on SO. You can't do this because doing so violates type safety:

List<Giraffe> g = new List<Giraffe>();
List<Animal> a = g; // Should this be legal?
a.Add(new Tiger()); // Nope; we just added a tiger to a list of giraffes.

In C# 4.0 you can implicitly convert from IEnumerable<Giraffe> to IEnumerable<Animal> because there is no "Add" method to screw things up. But you can never do a "covariant" conversion like that if the conversion of the element types is user-defined. It has to be a reference or identity conversion.

You'll need to create a second list and copy them over one at a time. Or use the LINQ helper methods like Select and ToList to do that work for you.
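A minimal sketch of both situations follows. Animal and Giraffe mirror the example above; Celsius, Fahrenheit, and the helper names are hypothetical, introduced only to have a pair of types linked by a user-defined conversion:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Animal { }
public class Giraffe : Animal { }

// Hypothetical unrelated types linked only by a user-defined conversion.
public class Celsius
{
    public double Degrees { get; set; }
}

public class Fahrenheit
{
    public double Degrees { get; set; }

    public static implicit operator Fahrenheit(Celsius c) =>
        new Fahrenheit { Degrees = c.Degrees * 9 / 5 + 32 };
}

public static class CollectionConversionDemo
{
    // Reference conversion between element types: IEnumerable<T> is
    // covariant, so no copy is needed.
    public static IEnumerable<Animal> AsAnimals(List<Giraffe> giraffes) => giraffes;

    // User-defined conversion between element types: covariance does not
    // apply, so each element is converted into a second list.
    public static List<Fahrenheit> ToFahrenheit(List<Celsius> readings) =>
        readings.Select(c => (Fahrenheit)c).ToList();

    public static void Main()
    {
        var giraffes = new List<Giraffe> { new Giraffe(), new Giraffe() };
        Console.WriteLine(AsAnimals(giraffes).Count());  // 2

        var readings = new List<Celsius> { new Celsius { Degrees = 100 } };
        Console.WriteLine(ToFahrenheit(readings)[0].Degrees);  // 212
    }
}
```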

The name of the type system concept you want is "covariance"; a covariant relationship is one where you reason "Giraffes are animals therefore sequences of giraffes are sequences of animals". If this subject interests you then you can read about how we added covariance (and contravariance) to C# 4.0 here:

http://blogs.msdn.com/b/ericlippert/archive/tags/covariance+and+contravariance/default.aspx

Start from the bottom.

What is the difference between directly casting an array or using System.Linq.Cast?

Your two examples, while different, are both invalid.

You cannot cast an array of one object type to another, even if there exists a conversion operator between them (explicit or implicit). The compiler rightly prevents such a cast. The exception to this rule is if there exists an inheritance relationship; thanks to array covariance you can upcast to an array of a base type (for reference types). The following works:

class A {} 
class B : A {}

B[] bs = new[] { new B() };
A[] result = (A[])bs; // valid

See SharpLab

The same principles hold true for the Cast<T> method in LINQ--unless the types match, an exception will be thrown at runtime upon enumeration. The answer below is incorrect. You cannot, for example, Cast an array of double to int. Of course, if you don't enumerate the result (such as in the example) then no exception occurs. However upon actually enumerating (foreach, ToList, ToArray) an InvalidCastException will be thrown.

var array = new double[2];

array[0] = 10;
array[1] = 20;

var temp = array.Cast<int>(); // OK, not enumerated
var converted = temp.ToList(); // bam! InvalidCastException

Notice the temp variable--as in the answer below it doesn't throw thanks to LINQ's deferred execution. It's once you enumerate it that it fails. See SharpLab.

The Cast method was designed to bridge the gap with pre-generic collections where values were stored internally as an array of object and the collections themselves only implement IEnumerable. Cast allows one to convert to IEnumerable<T>, however no casting/converting other than from object to the original type is allowed.
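As an illustration of that intended use, here is a minimal sketch that turns a non-generic ArrayList into a typed list. The LegacyCollectionDemo class and its method name are made up for the example:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class LegacyCollectionDemo
{
    // ArrayList only implements the non-generic IEnumerable, so most LINQ
    // operators are unavailable until Cast<T>() supplies the element type.
    public static List<int> FromArrayList(ArrayList legacy) =>
        legacy.Cast<int>().ToList();

    public static void Main()
    {
        // Every element is a boxed int, so unboxing to int succeeds.
        var legacy = new ArrayList { 1, 2, 3 };
        List<int> typed = FromArrayList(legacy);
        Console.WriteLine(typed.Sum());  // 6
    }
}
```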

For structs this is obvious--a boxed double can only be unboxed to a double; it cannot be unboxed to an int. Take the simple, non-array case:

double d = 1.5;
object o = d;
int iOk = (int)(double)o; // ok
int iBad = (int)o; // fails

See SharpLab

It makes sense then, that Cast<int> will fail as the method only inserts the single cast to int, and not the intermediate cast to double that would otherwise be required.

For classes, again Cast will only insert the direct cast. The method is generic and does not/cannot account for any user defined operators. So when you say you "have two classes that can be cast to each other" this still would not matter. In other words, the following will fail:

class A {} 
class B {
    public static implicit operator A(B b) => new A();
}

B[] bs = new[] { new B() };
var temp = bs.Cast<A>(); // OK, not yet enumerated
A[] result = temp.ToArray(); // throws InvalidCastException

See SharpLab

Again (as above), the exception to this rule is if there exists an inheritance relationship between the two classes. You can upcast from the derived class to its base:

class A {} 
class B : A {}

B[] bs = new[] { new B() };
A[] result = bs.Cast<A>().ToArray(); // valid

See SharpLab

One alternative is to use LINQ's Select to project your original collection, applying the conversion operators you desire:

class A {} 
class B {
public static implicit operator A(B b) => new A();
}

B[] bs = new[] { new B() };
A[] result = bs.Select(b => (A)b).ToArray(); // valid!

See SharpLab. This would also work in the case of the double/int:

var array = new double[] { 10.2, 20.4 };
int[] result = array.Select(d => (int)d).ToArray();

See SharpLab

Shorter syntax for casting from a List<X> to a List<Y>?

If X can really be cast to Y you should be able to use

List<Y> listOfY = listOfX.Cast<Y>().ToList();

Some things to be aware of (H/T to commenters!)

  • You must include using System.Linq; to get this extension method
  • This casts each item in the list - not the list itself. A new List<Y> will be created by the call to ToList().
  • This method does not support custom conversion operators. (See Why does the Linq Cast<> helper not work with the implicit cast operator?)
  • This method does not work for an object that has an explicit operator method (framework 4.0)

C#: implicit operator and extension methods

This is not specific to extension methods. C# won't implicitly cast an object to another type unless there is a clue about the target type. Assume the following:

class A {
    public static implicit operator B(A obj) { ... }
    public static implicit operator C(A obj) { ... }
}

class B {
    public void Foo() { ... }
}

class C {
    public void Foo() { ... }
}

Which method would you expect to be called in the following statement?

new A().Foo(); // B.Foo? C.Foo? 
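The ambiguity goes away once the target type is stated. The sketch below fills the elided bodies with placeholder implementations (the returned strings and the TargetTypeDemo class are assumptions made for illustration) and shows two ways of giving the compiler that clue:

```csharp
using System;

public class B
{
    public string Foo() => "B.Foo";  // placeholder body
}

public class C
{
    public string Foo() => "C.Foo";  // placeholder body
}

public class A
{
    public static implicit operator B(A obj) => new B();
    public static implicit operator C(A obj) => new C();
}

public static class TargetTypeDemo
{
    // The declared type of the variable tells the compiler which
    // conversion operator to apply.
    public static string CallViaB()
    {
        B b = new A();
        return b.Foo();
    }

    // An explicit cast supplies the same information inline.
    public static string CallViaC() => ((C)new A()).Foo();

    public static void Main()
    {
        // new A().Foo() does not compile: A has no Foo, and member lookup
        // never considers conversions of the receiver.
        Console.WriteLine(CallViaB());  // B.Foo
        Console.WriteLine(CallViaC());  // C.Foo
    }
}
```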

IEnumerable.Cast to ValueTuple InvalidCastException

When you are defining your operators, you are defining custom conversion operators, not cast operators. Custom conversion operators are only applied at compile time from one static type to another (which generally means they can't be used in generic methods). The "Cast" enumerable method is only applying cast operations (as the name implies).

It's not possible to create custom casting operators (casting is a thing only the language/runtime can mess around with).

If you change points.Cast<(double x, double y)>() to points.Select(p => ((double x, double y))p), that will instead invoke your custom conversion operator (it looks pretty crazy with all those parentheses, but they are needed to convert to the named tuple).
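A self-contained sketch of the Select version follows. The question's own type is not reproduced here, so the Point struct below is a hypothetical stand-in that defines an implicit conversion to a named tuple:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical point type with a user-defined conversion to a named tuple.
public struct Point
{
    public double X { get; }
    public double Y { get; }

    public Point(double x, double y)
    {
        X = x;
        Y = y;
    }

    public static implicit operator (double x, double y)(Point p) => (p.X, p.Y);
}

public static class TupleConversionDemo
{
    // The lambda parameter is statically typed as Point, so the compiler
    // can bind the cast to the conversion operator above.
    public static List<(double x, double y)> ToTuples(IEnumerable<Point> points) =>
        points.Select(p => ((double x, double y))p).ToList();

    public static void Main()
    {
        var points = new[] { new Point(1, 2), new Point(3, 4) };
        var tuples = ToTuples(points);
        Console.WriteLine(tuples[1].x + tuples[1].y);  // 7
    }
}
```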

Casting vs using the 'as' keyword in the CLR

The answer below the line was written in 2008.

C# 7 introduced pattern matching, which has largely replaced the as operator, as you can now write:

if (randomObject is TargetType tt)
{
    // Use tt here
}

Note that tt is still in scope after this, but not definitely assigned. (It is definitely assigned within the if body.) That's slightly annoying in some cases, so if you really care about introducing the smallest number of variables possible in every scope, you might still want to use is followed by a cast.


I don't think any of the answers so far (at the time of starting this answer!) have really explained where it's worth using which.

  • Don't do this:

    // Bad code - checks type twice for no reason
    if (randomObject is TargetType)
    {
        TargetType foo = (TargetType) randomObject;
        // Do something with foo
    }

    Not only is this checking twice, but it may be checking different things, if randomObject is a field rather than a local variable. It's possible for the "if" to pass but then the cast to fail, if another thread changes the value of randomObject between the two.

  • If randomObject really should be an instance of TargetType, i.e. if it's not, that means there's a bug, then casting is the right solution. That throws an exception immediately, which means that no more work is done under incorrect assumptions, and the exception correctly shows the type of bug.

    // This will throw an exception if randomObject is non-null and
    // refers to an object of an incompatible type. The cast is
    // the best code if that's the behaviour you want.
    TargetType convertedRandomObject = (TargetType) randomObject;

  • If randomObject might be an instance of TargetType and TargetType is a reference type, then use code like this:

    TargetType convertedRandomObject = randomObject as TargetType;
    if (convertedRandomObject != null)
    {
        // Do stuff with convertedRandomObject
    }

  • If randomObject might be an instance of TargetType and TargetType is a value type, then we can't use as with TargetType itself, but we can use a nullable type:

    TargetType? convertedRandomObject = randomObject as TargetType?;
    if (convertedRandomObject != null)
    {
        // Do stuff with convertedRandomObject.Value
    }

    (Note: currently this is actually slower than is + cast. I think it's more elegant and consistent, but there we go.)

  • If you really don't need the converted value, but you just need to know whether it is an instance of TargetType, then the is operator is your friend. In this case it doesn't matter whether TargetType is a reference type or a value type.

  • There may be other cases involving generics where is is useful (because you may not know whether T is a reference type or not, so you can't use as) but they're relatively obscure.

  • I've almost certainly used is for the value type case before now, not having thought of using a nullable type and as together :)
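The three recommended shapes from the list above can be put into one small runnable sketch. The ConversionChoiceDemo class and its method names are invented for the example, with string and int standing in for TargetType:

```csharp
using System;

public static class ConversionChoiceDemo
{
    // The value must be a string: a direct cast fails fast if it is not.
    public static string MustBeString(object value) => (string)value;

    // The value might be a string (reference type): as plus a null check.
    public static int LengthOrZero(object value)
    {
        string text = value as string;
        return text != null ? text.Length : 0;
    }

    // The value might be an int (value type): as with the nullable form.
    public static int IntOrFallback(object value, int fallback)
    {
        int? number = value as int?;
        return number ?? fallback;
    }

    public static void Main()
    {
        Console.WriteLine(MustBeString("abc"));     // abc
        Console.WriteLine(LengthOrZero("abc"));     // 3
        Console.WriteLine(LengthOrZero(42));        // 0
        Console.WriteLine(IntOrFallback(5, -1));    // 5
        Console.WriteLine(IntOrFallback("x", -1));  // -1
    }
}
```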


EDIT: Note that none of the above talks about performance, other than the value type case, where I've noted that unboxing to a nullable value type is actually slower - but consistent.

As per naasking's answer, is-and-cast or is-and-as are both as fast as as-and-null-check with modern JITs, as shown by the code below:

using System;
using System.Diagnostics;
using System.Linq;

class Test
{
    const int Size = 30000000;

    static void Main()
    {
        object[] values = new object[Size];
        for (int i = 0; i < Size - 2; i += 3)
        {
            values[i] = null;
            values[i + 1] = "x";
            values[i + 2] = new object();
        }
        FindLengthWithIsAndCast(values);
        FindLengthWithIsAndAs(values);
        FindLengthWithAsAndNullCheck(values);
    }

    static void FindLengthWithIsAndCast(object[] values)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int len = 0;
        foreach (object o in values)
        {
            if (o is string)
            {
                string a = (string) o;
                len += a.Length;
            }
        }
        sw.Stop();
        Console.WriteLine("Is and Cast: {0} : {1}", len,
                          (long)sw.ElapsedMilliseconds);
    }

    static void FindLengthWithIsAndAs(object[] values)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int len = 0;
        foreach (object o in values)
        {
            if (o is string)
            {
                string a = o as string;
                len += a.Length;
            }
        }
        sw.Stop();
        Console.WriteLine("Is and As: {0} : {1}", len,
                          (long)sw.ElapsedMilliseconds);
    }

    static void FindLengthWithAsAndNullCheck(object[] values)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int len = 0;
        foreach (object o in values)
        {
            string a = o as string;
            if (a != null)
            {
                len += a.Length;
            }
        }
        sw.Stop();
        Console.WriteLine("As and null check: {0} : {1}", len,
                          (long)sw.ElapsedMilliseconds);
    }
}

On my laptop, these all execute in about 60ms. Two things to note:

  • There's no significant difference between them. (In fact, there are situations in which the as-plus-null-check definitely is slower. The above code actually makes the type check easy because it's for a sealed class; if you're checking for an interface, the balance tips slightly in favour of as-plus-null-check.)
  • They're all insanely fast. This simply will not be the bottleneck in your code unless you really aren't going to do anything with the values afterwards.

So let's not worry about the performance. Let's worry about correctness and consistency.

I maintain that is-and-cast (or is-and-as) are both unsafe when dealing with variables, as the type of the value it refers to may change due to another thread between the test and the cast. That would be a pretty rare situation - but I'd rather have a convention which I can use consistently.

I also maintain that the as-then-null-check gives a better separation of concerns. We have one statement which attempts a conversion, and then one statement which uses the result. The is-and-cast or is-and-as performs a test and then another attempt to convert the value.

To put it another way, would anyone ever write:

int value;
if (int.TryParse(text, out value))
{
    value = int.Parse(text);
    // Use value
}

That's sort of what is-and-cast is doing - although obviously in a rather cheaper way.

Static implicit operator

This is a conversion operator. It means that you can write this code:

XmlBase myBase = new XmlBase();
XElement myElement = myBase;

And the compiler won't complain! At runtime, the conversion operator will be executed - passing myBase in as the argument, and returning a valid XElement as the result.
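For reference, a class supporting that assignment might look roughly like the sketch below. The body of XmlBase is not shown in the question, so the Name property and the way the XElement is built are assumptions made purely for illustration:

```csharp
using System;
using System.Xml.Linq;

// Hypothetical XmlBase: only the operator signature matters here.
public class XmlBase
{
    // Assumed property, used to have something to put in the element.
    public string Name { get; set; } = "root";

    // Runs whenever an XmlBase is assigned or passed where an XElement
    // is expected.
    public static implicit operator XElement(XmlBase xmlBase) =>
        new XElement(xmlBase.Name);
}

public static class ImplicitOperatorDemo
{
    public static XElement Convert(XmlBase myBase)
    {
        XElement myElement = myBase;  // the compiler inserts the operator call
        return myElement;
    }

    public static void Main()
    {
        XElement element = Convert(new XmlBase());
        Console.WriteLine(element.Name.LocalName);  // root
    }
}
```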

It's a way for you as a developer to tell the compiler:

"even though these look like two totally unrelated types, there is actually a way to convert from one to the other; just let me handle the logic for how to do it."


