When to Use Cast() and OfType() in LINQ

OfType - returns only the elements that can safely be cast to type x.

Cast - tries to cast every element to type x. If any element is not of that type, you get an InvalidCastException.

For example:

object[] objs = new object[] { "12345", 12 };
objs.Cast<string>().ToArray(); // throws InvalidCastException
objs.OfType<string>().ToArray(); // returns { "12345" }

Why is OfType faster than Cast?

My benchmarking does not agree with your benchmarking.

I ran an identical benchmark to Alex's and got the opposite result. I then tweaked the benchmark somewhat and again observed Cast being faster than OfType.

There's not much in it, but I believe that Cast does have the edge, as it should because its iterator is simpler. (No is check.)

Edit: Actually after some further tweaking I managed to get Cast to be 50x faster than OfType.

Below is the code of the benchmark that gives the biggest discrepancy I've found so far:

using System;
using System.Diagnostics;
using System.Linq;

Stopwatch sw1 = new Stopwatch(); // accumulates OfType time
Stopwatch sw2 = new Stopwatch(); // accumulates Cast time

// Build the input once, up front, so it is not part of the timings.
var ma = Enumerable.Range(1, 100000).Select(i => i.ToString()).ToArray();

// Warm-up: run each operation once before timing starts.
var x = ma.OfType<string>().ToArray();
var y = ma.Cast<string>().ToArray();

for (int i = 0; i < 1000; i++)
{
    // Alternate the order in case it makes a difference.
    if (i % 2 == 0)
    {
        sw1.Start();
        var arr = ma.OfType<string>().ToArray();
        sw1.Stop();
        sw2.Start();
        var arr2 = ma.Cast<string>().ToArray();
        sw2.Stop();
    }
    else
    {
        sw2.Start();
        var arr2 = ma.Cast<string>().ToArray();
        sw2.Stop();
        sw1.Start();
        var arr = ma.OfType<string>().ToArray();
        sw1.Stop();
    }
}
Console.WriteLine("OfType: " + sw1.ElapsedMilliseconds.ToString());
Console.WriteLine("Cast: " + sw2.ElapsedMilliseconds.ToString());
Console.ReadLine();

Tweaks I've made:

  • Perform the "generate a list of strings" work once, at the start, and "crystallize" it.
  • Perform one of each operation before starting timing - I'm not sure if this is necessary but I think it means the JITter generates code beforehand rather than while we're timing?
  • Perform each operation multiple times, not just once.
  • Alternate the order in case this makes a difference.

On my machine this results in ~350ms for Cast and ~18000ms for OfType.

I think the biggest difference is that we're no longer timing how long MatchCollection takes to find the next match. (Or, in my code, how long int.ToString() takes.) Including that work in the timings drastically reduced the signal-to-noise ratio.

Edit: As sixlettervariables pointed out, the reason for this massive difference is that Cast will short-circuit and not bother casting individual items if it can cast the whole IEnumerable. When I switched from using Regex.Matches to an array to avoid measuring the regex processing time, I also switched to using something castable to IEnumerable<string> and thus activated this short-circuiting. When I altered my benchmark to disable this short-circuiting, I get a slight advantage to Cast rather than a massive one.
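
The short-circuit described above can be sketched roughly as follows. This is a simplified approximation of the behaviour, not the actual framework source; SimpleCast and SimpleOfType are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CastSketch
{
    // Hypothetical simplified stand-in for Enumerable.Cast<T>.
    public static IEnumerable<T> SimpleCast<T>(IEnumerable source)
    {
        // Short-circuit: if the whole sequence is already an IEnumerable<T>,
        // hand it back untouched and never cast individual items.
        if (source is IEnumerable<T> typed)
            return typed;
        return CastIterator<T>(source);
    }

    private static IEnumerable<T> CastIterator<T>(IEnumerable source)
    {
        foreach (object item in source)
            yield return (T)item; // throws InvalidCastException on a mismatch
    }

    // Hypothetical simplified stand-in for Enumerable.OfType<T>.
    public static IEnumerable<T> SimpleOfType<T>(IEnumerable source)
    {
        foreach (object item in source)
            if (item is T) // the extra type check that Cast does not need
                yield return (T)item;
    }

    public static void Main()
    {
        string[] strings = { "1", "2", "3" };
        object[] objects = { "1", "2", "3" };

        // string[] is already an IEnumerable<string>, so the same instance comes back.
        Console.WriteLine(object.ReferenceEquals(strings, SimpleCast<string>(strings))); // True
        // object[] is not an IEnumerable<string>, so each item is cast by the iterator.
        Console.WriteLine(object.ReferenceEquals(objects, SimpleCast<string>(objects))); // False
        // The OfType sketch always goes through its iterator.
        Console.WriteLine(object.ReferenceEquals(strings, SimpleOfType<string>(strings))); // False
    }
}
```

With a string[] as input, the Cast side of the benchmark ends up doing little more than ToArray() on the original array, which would explain the size of the gap.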

Only able to call Cast and OfType on IEnumerable

As you correctly conjectured, the problem arises because of ambiguities discovered during type inference. When you say:

class Ark : IEnumerable<Turtle>, IEnumerable<Giraffe> 
{ ... }

and then you say

Ark ark = whatever;
ark.Select(x => whatever);

somehow the compiler has to know whether you meant ark.Select<Turtle, Result> or ark.Select<Giraffe, Result>. Under no circumstances will C# try to guess that you meant ark.Select<Animal, Result> or ark.Select<object, Result> because that was not one of the choices provided. C# only makes choices from available types, and the available types are IEnumerable<Turtle> and IEnumerable<Giraffe>.

If there is no basis upon which to make a decision then type inference fails. Since we only got to extension methods after all other overload resolution attempts failed, we're probably going to fail overload resolution at this point.

There are many ways to make this work, but all of them involve somehow giving C# a hint about what you meant. The easiest way is

ark.Select<Turtle, Result>(x => whatever);

But you can also do

ark.Select((Turtle x) => whatever);

Or

((IEnumerable<Turtle>)ark).Select(x => whatever);

Or

(ark as IEnumerable<Turtle>).Select(x => whatever);

Those are all good. You deduced that this compiles:

ark.Cast<Turtle>().Select(x => whatever); 
// NEVER DO THIS IN THIS SCENARIO
// USE ANY OF THE OTHER TECHNIQUES, NEVER THIS ONE

Do you see why it is dangerous? Do not proceed until you understand why this is probably wrong. Reason it through.


In general, it's a dangerous practice to implement a type that implements two of "the same" generic interfaces because very weird things can happen. The language and the runtime were not designed to handle this sort of unification elegantly. Consider for example how covariance works; what happens if we cast ark to IEnumerable<Animal>? See if you can figure it out; then try it and see if you were right.

Unfortunately, you are in an even worse position; what if you instantiate GenericCollection<TCollection, TItem> such that TItem is GeneralObject? Now you have implemented IEnumerable<GeneralObject> twice! That's really confusing to users and the CLR does not like that at all.

The better practice is to make Ark implement neither interface, but rather expose two methods, one which returns turtles and one which returns giraffes. You should strongly consider doing the same in your class. A better design would be to make GenericCollection<TCollection, TItem> not implement IEnumerable<TItem> but rather to have a property IEnumerable<TItem> Items { get { ... } }.

With that design, you can then do collection.Select to get general objects, or collection.Items.Select to get items, and the problem goes away.
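
A minimal sketch of that design for the Ark case, assuming hypothetical Animal, Turtle and Giraffe classes and an Ark that keeps each kind in its own list (none of these details come from the original question; the two members are written here as properties):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical animal types, for illustration only.
public class Animal { public string Name = ""; }
public class Turtle : Animal { }
public class Giraffe : Animal { }

// Ark implements neither IEnumerable<Turtle> nor IEnumerable<Giraffe>;
// each sequence is exposed under its own name instead.
public class Ark
{
    private readonly List<Turtle> turtles = new List<Turtle>();
    private readonly List<Giraffe> giraffes = new List<Giraffe>();

    public IEnumerable<Turtle> Turtles { get { return turtles; } }
    public IEnumerable<Giraffe> Giraffes { get { return giraffes; } }

    public void Add(Turtle t) { turtles.Add(t); }
    public void Add(Giraffe g) { giraffes.Add(g); }
}

public static class ArkDemo
{
    public static void Main()
    {
        var ark = new Ark();
        ark.Add(new Turtle { Name = "Shelly" });
        ark.Add(new Giraffe { Name = "Stretch" });

        // Type inference has exactly one candidate for each call.
        var turtleNames = ark.Turtles.Select(t => t.Name).ToList();
        var giraffeNames = ark.Giraffes.Select(g => g.Name).ToList();

        Console.WriteLine(string.Join(",", turtleNames));  // Shelly
        Console.WriteLine(string.Join(",", giraffeNames)); // Stretch
    }
}
```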

How to cast C#'s linq WHERE statement?

You would be better off doing:

literals.OfType<Tag>().ToList();

This gives you a List<Tag>.

You can also do:

var asList = new List<Tag>(literals.OfType<Tag>());

Casting simply does not work because LINQ works in terms of either IEnumerable<T> or IQueryable<T>, neither of which uses List as a backing implementation for its results. The second method I posted uses a constructor overload of List<T> that takes an IEnumerable<T> as its initial collection of objects. Also, in this scenario the OfType<T> method from LINQ is a much cleaner, shorter form of essentially filtering a list with Where(x => x is T).

Also, OfType<T> in this scenario is a much better idea because the result is an IEnumerable<T> of your target type, whereas Where(x => x is T) returns an IEnumerable<T> of the original source's type. That is why (List<Tag>)literals.Where(x => x is Tag).ToList() produces an invalid-cast error.

More information on ToList

More information on OfType

What is the difference between directly casting an array or using System.Linq.Cast?

Your two examples, while different, are both invalid.

You cannot cast an array of one object type to another, even if there exists a conversion operator between them (explicit or implicit). The compiler rightly prevents such a cast. The exception to this rule is if there exists an inheritance relationship; thanks to array covariance you can upcast to a base type (for reference types). The following works:

class A {} 
class B : A {}

B[] bs = new[] { new B() };
A[] result = (A[])bs; // valid

See SharpLab

The same principles hold true for the Cast<T> method in LINQ--unless the types match, an exception will be thrown at runtime upon enumeration. The answer below is incorrect. You cannot, for example, Cast an array of double to int. Of course, if you don't enumerate the result (such as in the example) then no exception occurs. However upon actually enumerating (foreach, ToList, ToArray) an InvalidCastException will be thrown.

var array = new double[2];

array[0] = 10;
array[1] = 20;

var temp = array.Cast<int>(); // OK, not enumerated
var converted = temp.ToList(); // bam! InvalidCastException

Notice the temp variable--as in the answer below it doesn't throw thanks to LINQ's deferred execution. It's once you enumerate it that it fails. See SharpLab.

The Cast method was designed to bridge the gap with pre-generic collections where values were stored internally as an array of object and the collections themselves only implement IEnumerable. Cast allows one to convert to IEnumerable<T>, however no casting/converting other than from object to the original type is allowed.
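
As a rough illustration of that bridging role, here is a small sketch using ArrayList, a pre-generic collection. The LegacyDemo class and its Upper helper are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class LegacyDemo
{
    // Hypothetical helper: upper-cases every string stored in a pre-generic ArrayList.
    public static List<string> Upper(ArrayList legacy)
    {
        // ArrayList only implements the non-generic IEnumerable, so operators
        // such as Select are not available until Cast<T> has been applied.
        return legacy.Cast<string>().Select(s => s.ToUpperInvariant()).ToList();
    }

    public static void Main()
    {
        var legacy = new ArrayList { "alpha", "beta" };
        Console.WriteLine(string.Join(",", Upper(legacy))); // ALPHA,BETA
    }
}
```

Every element here really is a string, so the cast from object back to the original type succeeds.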

For structs this is obvious--a boxed double can only be unboxed to a double; it cannot be unboxed to an int. Take the simple, non-array case:

double d = 1.5;
object o = d;
int iOk = (int)(double)o; // ok
int iBad = (int)o; // fails

See SharpLab

It makes sense then, that Cast<int> will fail as the method only inserts the single cast to int, and not the intermediate cast to double that would otherwise be required.

For classes, again Cast will only insert the direct cast. The method is generic and does not/cannot account for any user defined operators. So when you say you "have two classes that can be cast to each other" this still would not matter. In other words, the following will fail:

class A {} 
class B {
    public static implicit operator A(B b) => new A();
}

B[] bs = new[] { new B() };
var temp = bs.Cast<A>(); // OK, not yet enumerated
A[] result = temp.ToArray(); // throws InvalidCastException

See SharpLab

Again (as above), the exception to this rule is if there exists an inheritance relationship between the two classes. You can upcast from the derived class to its base:

class A {} 
class B : A {}

B[] bs = new[] { new B() };
A[] result = bs.Cast<A>().ToArray(); // valid

See SharpLab

One alternative is to use LINQ's Select to project your original collection, applying the conversion operators you desire:

class A {} 
class B {
    public static implicit operator A(B b) => new A();
}

B[] bs = new[] { new B() };
A[] result = bs.Select(b => (A)b).ToArray(); // valid!

See SharpLab. This would also work in the case of the double/int:

var array = new double[] { 10.2, 20.4 };
int[] result = array.Select(d => (int)d).ToArray();

See SharpLab

Linq query and casting in c#

You need to use OfType():

public List<DateViewModel> DateCustomViewModels
{
    get
    {
        return CustomFieldViewModels.OfType<DateViewModel>().ToList();
    }
}

From what I understand, the .Where will return a List of whatever I
asked, with a filter (here, my enum)

No, it will return IEnumerable<BaseViewModel>. The criterion you specify in Where() doesn't change the return type; it only determines which of the BaseViewModel objects are included.

I don't understand why there is trouble casting since all my childs
inherit from the parent class, and I'm not using a child-specific
element in my filtering.

Even though DateViewModel inherits from BaseViewModel, you cannot explicitly cast from List<DateViewModel> to List<BaseViewModel> because List<T> is invariant.

Also, even if the type of the main list is of Base, none of its
elements are of the Base type, so there shouldn't be any casting to do
in the first place.

You're right, there's no casting needed. Use OfType<DateViewModel>() that will only return the objects that are DateViewModel. Also, the returned set is now IEnumerable<DateViewModel> (it's no longer List<BaseViewModel>) and the compiler can verify that it's compatible with the returned type of the DateCustomViewModels property.

See MSDN
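
To make the difference in element types concrete, here is a self-contained sketch. BaseViewModel, DateViewModel and CustomFieldViewModels follow the names in the question; TextViewModel and FormViewModel are hypothetical additions so the example can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class BaseViewModel { }
public class DateViewModel : BaseViewModel { }
public class TextViewModel : BaseViewModel { } // hypothetical sibling type

public class FormViewModel // hypothetical container class
{
    public List<BaseViewModel> CustomFieldViewModels = new List<BaseViewModel>();

    public List<DateViewModel> DateCustomViewModels
    {
        get { return CustomFieldViewModels.OfType<DateViewModel>().ToList(); }
    }
}

public static class ViewModelDemo
{
    public static void Main()
    {
        var form = new FormViewModel();
        form.CustomFieldViewModels.Add(new DateViewModel());
        form.CustomFieldViewModels.Add(new TextViewModel());
        form.CustomFieldViewModels.Add(new DateViewModel());

        // Where keeps the declared element type, so this is still a list of the base type.
        List<BaseViewModel> stillBase =
            form.CustomFieldViewModels.Where(vm => vm is DateViewModel).ToList();

        Console.WriteLine(stillBase.Count);                 // 2
        Console.WriteLine(form.DateCustomViewModels.Count); // 2
    }
}
```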

Difference between OfType() and checking type in Where() extension

Let us compare three methods (pay attention to generic arguments):

  1. listOfItems.Where(t => t is T) called on IEnumerable<X> will still return IEnumerable<X>, just filtered to contain only elements of type T.

  2. listOfItems.OfType<T>() called on IEnumerable<X> will return IEnumerable<T> containing the elements that can be cast to type T.

  3. listOfItems.Cast<T>() called on IEnumerable<X> will return IEnumerable<T> containing the elements cast to type T, or throw an exception if any of the elements cannot be converted.

And listOfItems.Where(d => d is T).Cast<T>() is basically doing the same thing twice: Where filters all elements that are T but still leaves the type as IEnumerable<X>, and then Cast casts them to T again, this time returning IEnumerable<T>.
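
The three cases above can be seen side by side in a short sketch. The list contents are made up for illustration, with X being object and T being string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompareDemo
{
    public static void Main()
    {
        var items = new List<object> { "a", 1, "b" };

        // 1. Where keeps the source element type: still IEnumerable<object>.
        IEnumerable<object> filtered = items.Where(x => x is string);

        // 2. OfType filters and changes the element type in one step.
        IEnumerable<string> typed = items.OfType<string>();

        // 3. Cast changes the element type but does not filter, so the int
        //    causes an InvalidCastException once the sequence is enumerated.
        IEnumerable<string> cast = items.Cast<string>();

        Console.WriteLine(filtered.Count()); // 2
        Console.WriteLine(typed.Count());    // 2
        try
        {
            cast.ToList();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cast threw InvalidCastException");
        }
    }
}
```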

C# Cast object to IEnumerable T and use Linq extensions when the enumerable type is a value type e.g. enum

The array StringComparison[] implements the IEnumerable and IEnumerable<StringComparison> interfaces, but does not implement IEnumerable<object>.

Unfortunately, you can't use generic covariance to let you assign an object of type IEnumerable<StringComparison> to a variable of type IEnumerable<object>, as StringComparison isn't a reference type: you can't just access any old element of the StringComparison[] as if it was an object, because it doesn't have the normal object header etc.

What you can do is:

IEnumerable<object> items = ((IEnumerable)value).Cast<object>();

This works because arrays implement IEnumerable, and Enumerable.Cast is an extension method on IEnumerable, not IEnumerable<T>.

Every time you fetch an element from items, the Cast<object>() method gets involved and boxes the StringComparison value type into an object.

This means that iterating the list multiple times (as calling lots of linq methods will do) could be expensive, as it will allocate new objects each time. You might want to cache the boxed objects into a list to avoid this cost:

List<object> items = ((IEnumerable)value).Cast<object>().ToList();
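
Putting the pieces together, a sketch of a helper that accepts an arbitrary object might look like this. ToObjectList is a hypothetical name, and the sketch assumes value always implements the non-generic IEnumerable; anything else would fail at the cast to IEnumerable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class BoxingDemo
{
    // Hypothetical helper: turns any enumerable object into a cached List<object>.
    public static List<object> ToObjectList(object value)
    {
        // Reference-type sequences can be used as IEnumerable<object> directly ...
        if (value is IEnumerable<object> alreadyObjects)
            return alreadyObjects.ToList();

        // ... but a value-type array cannot, so fall back to the non-generic
        // IEnumerable. ToList boxes each element once and caches the result.
        return ((IEnumerable)value).Cast<object>().ToList();
    }

    public static void Main()
    {
        object value = new[] { StringComparison.Ordinal, StringComparison.OrdinalIgnoreCase };

        Console.WriteLine(value is IEnumerable<object>); // False
        List<object> items = ToObjectList(value);
        Console.WriteLine(items.Count);                  // 2
    }
}
```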

