Difference between IEnumerable Count() and Length
By calling Count on IEnumerable<T> I'm assuming you're referring to the extension method Count on System.Linq.Enumerable. Length is not a method on IEnumerable<T> but rather a property on array types in .NET such as int[].
The difference is performance. The Length property is guaranteed to be an O(1) operation. The complexity of the Count extension method differs based on the runtime type of the object. It will attempt to cast to several types which support O(1) length lookup, like ICollection<T>, via a Count property. If none are available then it will enumerate all items and count them, which has a complexity of O(N).
For example
int[] list = CreateSomeList();
Console.WriteLine(list.Length); // O(1)
IEnumerable<int> e1 = list;
Console.WriteLine(e1.Count()); // O(1): arrays implement ICollection<int>
IEnumerable<int> e2 = list.Where(x => x != 42);
Console.WriteLine(e2.Count()); // O(N): the filtered sequence must be enumerated
The value e2 is implemented as a C# iterator which does not support O(1) counting and hence the method Count must enumerate the entire collection to determine how long it is.
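The enumeration cost described above can be made visible with a small sketch. The class name CountCostDemo, the Numbers iterator and the Yielded counter are all hypothetical names introduced for illustration; the counter simply records how many elements the iterator had to produce while Count() ran.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountCostDemo
{
    // Number of elements the iterator below has produced so far.
    public static int Yielded;

    // Hypothetical iterator: it exposes no Count property,
    // so Count() has no choice but to walk it element by element.
    public static IEnumerable<int> Numbers(int n)
    {
        for (var i = 0; i < n; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3, 4, 5 };
        IEnumerable<int> viaInterface = array;
        Console.WriteLine(viaInterface.Count()); // 5

        Yielded = 0;
        Console.WriteLine(Numbers(5).Count());   // 5
        Console.WriteLine(Yielded);              // 5: every element was produced
    }
}
```

The counter ends up equal to the sequence length, which is the O(N) behaviour in question; for the array no such walk is needed.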
How should I get the length of an IEnumerable?
If you need to read the number of items in an IEnumerable<T> you have to call the extension method Count, which in general (see Matthew's comment) iterates through the elements of the sequence internally and returns the number of items in it. There isn't any other more immediate way.
If you know that your sequence is an array, you could cast it and read the number of items using the Length property.
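As a minimal sketch of the cast mentioned above, the helper below checks for an array first and falls back to Count() otherwise. SequenceLength and LengthOf are hypothetical names, not part of the framework, and since Count() already has its own fast path for collections, the explicit check is mainly illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceLength
{
    // Hypothetical helper: read Length when the sequence is really an array,
    // otherwise fall back to the LINQ Count() extension method.
    public static int LengthOf<T>(IEnumerable<T> source)
    {
        if (source is T[] array)
        {
            return array.Length;
        }
        return source.Count();
    }

    public static void Main()
    {
        IEnumerable<int> fromArray = new[] { 1, 2, 3 };
        IEnumerable<int> filtered = new[] { 1, 2, 3 }.Where(x => x > 1);

        Console.WriteLine(LengthOf(fromArray)); // 3, read from Length
        Console.WriteLine(LengthOf(filtered));  // 2, counted by enumerating
    }
}
```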
No, in later versions there isn't any such method.
For implementation details of the Count method, please have a look here.
.NET array - difference between Length, Count() and Rank
Length is the property of an array object and using it is the most efficient way to determine the count of elements in the array (Array.Length in MSDN documentation).
Count() is a LINQ extension method that does effectively the same. It applies to arrays because arrays are enumerable objects. It's preferred to use Length, because Count() is likely to be more expensive (see this question for further discussion and MSDN documentation on Count for reference).
Rank is the property that returns the number of dimensions (a different thing entirely). When you declare an array int[,] myArray = new int[5,10]; the Rank of it will be 2, but it will hold a total of 50 elements (MSDN on Rank property).
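The distinction between Rank and Length for the 5 by 10 array above can be checked with a short sketch. RankDemo and CreateGrid are hypothetical names used only for this illustration; Rank, Length and GetLength are the standard System.Array members.

```csharp
using System;

public static class RankDemo
{
    // Hypothetical factory for the two-dimensional array from the text.
    public static int[,] CreateGrid()
    {
        return new int[5, 10];
    }

    public static void Main()
    {
        int[,] myArray = CreateGrid();

        Console.WriteLine(myArray.Rank);         // 2: number of dimensions
        Console.WriteLine(myArray.Length);       // 50: total number of elements
        Console.WriteLine(myArray.GetLength(0)); // 5: size of the first dimension
        Console.WriteLine(myArray.GetLength(1)); // 10: size of the second dimension
    }
}
```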
What's the difference between String.Count and String.Length?
On the surface they would seem functionally identical, but the main difference is:
Length is a property that is defined on strings and is the usual way to find the length of a string.
Count() is implemented as an extension method. That is, what string.Count() really does is call Enumerable.Count(this IEnumerable<char>), a System.Linq extension method, given that string is really a sequence of chars.
Performance concerns of LINQ enumerable methods notwithstanding, use Length instead, as it's built right into strings.
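A brief sketch of the two calls side by side follows. StringCountDemo and CountOf are hypothetical names; the predicate overload of Count() is shown only because it is the one case where the LINQ method does something Length cannot.

```csharp
using System;
using System.Linq;

public static class StringCountDemo
{
    // Hypothetical helper: counts occurrences of one character
    // using the predicate overload of Enumerable.Count.
    public static int CountOf(string text, char target)
    {
        return text.Count(c => c == target);
    }

    public static void Main()
    {
        string s = "hello";

        Console.WriteLine(s.Length);        // 5: property on System.String
        Console.WriteLine(s.Count());       // 5: LINQ extension over IEnumerable<char>
        Console.WriteLine(CountOf(s, 'l')); // 2: only matching characters
    }
}
```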
IEnumerable.Count() or ToList().Count
You asked:
I wonder, what would be faster.
Whenever you ask that you should actually time it and find out.
I set out to test all of these variants of obtaining a count:
var enumerable = Enumerable.Range(0, 1000000);
var list = enumerable.ToList();
var methods = new Func<int>[]
{
    () => list.Count,
    () => enumerable.Count(),
    () => list.Count(),
    () => enumerable.ToList().Count(),
    () => list.ToList().Count(),
    () => enumerable.Select(x => x).Count(),
    () => list.Select(x => x).Count(),
    () => enumerable.Select(x => x).ToList().Count(),
    () => list.Select(x => x).ToList().Count(),
    () => enumerable.Where(x => x % 2 == 0).Count(),
    () => list.Where(x => x % 2 == 0).Count(),
    () => enumerable.Where(x => x % 2 == 0).ToList().Count(),
    () => list.Where(x => x % 2 == 0).ToList().Count(),
};
My testing code explicitly runs each method 1,000 times, measures each execution time with a Stopwatch, and ignores all results where garbage collection occurred. It then gets an average execution time per method.
// One list of timings (in milliseconds) per method, keyed by method index.
var measurements =
    methods
        .Select((m, i) => i)
        .ToDictionary(i => i, i => new List<double>());
for (var run = 0; run < 1000; run++)
{
    for (var i = 0; i < methods.Length; i++)
    {
        var sw = Stopwatch.StartNew();
        var gccc0 = GC.CollectionCount(0);
        var r = methods[i]();
        var gccc1 = GC.CollectionCount(0);
        sw.Stop();
        // Discard the sample if a gen-0 collection happened during the call.
        if (gccc1 == gccc0)
        {
            measurements[i].Add(sw.Elapsed.TotalMilliseconds);
        }
    }
}
var results =
    measurements
        .Select(x => new
        {
            index = x.Key,
            count = x.Value.Count(),
            average = x.Value.Average().ToString("0.000")
        });
Here are the results (ordered from slowest to fastest):
+---------+-----------------------------------------------------------+
| average | method |
+---------+-----------------------------------------------------------+
| 14.879 | () => enumerable.Select(x => x).ToList().Count(), |
| 14.188 | () => list.Select(x => x).ToList().Count(), |
| 10.849 | () => enumerable.Where(x => x % 2 == 0).ToList().Count(), |
| 10.080 | () => enumerable.ToList().Count(), |
| 9.562 | () => enumerable.Select(x => x).Count(), |
| 8.799 | () => list.Where(x => x % 2 == 0).ToList().Count(), |
| 8.350 | () => enumerable.Where(x => x % 2 == 0).Count(), |
| 8.046 | () => list.Select(x => x).Count(), |
| 5.910 | () => list.Where(x => x % 2 == 0).Count(), |
| 4.085 | () => enumerable.Count(), |
| 1.133 | () => list.ToList().Count(), |
| 0.000 | () => list.Count, |
| 0.000 | () => list.Count(), |
+---------+-----------------------------------------------------------+
Two things come out that are significant here.
One, any method with a .ToList() inline is significantly slower than the equivalent without it.
Two, LINQ operators take advantage of the underlying type of the enumerable, where possible, to short-cut computations. The enumerable.Count() and list.Count() methods show this.
There is no measurable difference between the list.Count and list.Count() calls. So the key comparison is between the enumerable.Where(x => x % 2 == 0).Count() and enumerable.Where(x => x % 2 == 0).ToList().Count() calls. Since the latter contains an extra operation we would expect it to take longer. It's almost 2.5 milliseconds longer.
I don't know why you say that you're going to call the counting code twice, but if you do, it is better to build the list. If not, just do the plain .Count() call after your query.
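The advice about counting twice can be illustrated with a rough sketch that counts predicate evaluations instead of measuring time. MaterializeDemo, IsEven, Evaluations and the two CountTwice methods are hypothetical names; the sequence of ten integers is an arbitrary choice for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // How many times the predicate has been evaluated.
    public static int Evaluations;

    public static bool IsEven(int x)
    {
        Evaluations++;
        return x % 2 == 0;
    }

    // Counts a lazy query twice; each Count() re-runs the whole filter.
    public static int CountTwiceLazily()
    {
        Evaluations = 0;
        IEnumerable<int> query = Enumerable.Range(0, 10).Where(x => IsEven(x));
        var first = query.Count();
        var second = query.Count();
        return Evaluations;
    }

    // Builds the list once; both reads of Count are simple property lookups.
    public static int CountTwiceMaterialized()
    {
        Evaluations = 0;
        List<int> list = Enumerable.Range(0, 10).Where(x => IsEven(x)).ToList();
        var first = list.Count;
        var second = list.Count;
        return Evaluations;
    }

    public static void Main()
    {
        Console.WriteLine(CountTwiceLazily());       // 20: filter ran twice
        Console.WriteLine(CountTwiceMaterialized()); // 10: filter ran once
    }
}
```

Whether the saved evaluations outweigh the cost of allocating the list depends on the query, so timing the real workload remains the deciding step.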
Differences between Array.Length and Array.Count()
array.Count() is actually a call to the Enumerable.Count<T>(IEnumerable<T>) extension method.
Since this method takes an IEnumerable<T> (as opposed to ICollection<T>, which has a Count property), you might expect it to loop through the entire sequence to figure out how big it is.
However, it actually checks whether the parameter implements ICollection<T> (which arrays do), and, if so, returns Count directly.
Therefore, calling .Count() on an array isn't much slower than .Length, although it will involve an extra typecast.
When should I use .Count() and .Count in the context of an IEnumerable<T>
The extension method works on any IEnumerable<T> but it is costly because it counts the sequence by iterating it. There is an optimization if the sequence is ICollection<T>, meaning that the length of the collection is known. Then the Count property is used, but that is an implementation detail.
The best advice is to use the Count property if available for performance reasons.
Is .Count() predominately better saved for queryable collections that are yet to be executed, and therefore don't have an enumeration yet?
If your collection is IQueryable<T> and not IEnumerable<T> then the query provider may be able to return the count in some efficient manner. In that case you will not suffer a performance penalty, but it depends on the query provider.
An IQueryable<T> will not have a Count property, so there is no choice between using the extension method and the property. However, if your query provider does not provide an efficient way of computing Count() you might consider using .ToList() to pull the collection to the client side. It really depends on how you intend to use it.
count vs length vs size in a collection
Length()
tends to refer to contiguous elements - a string has a length for example.
Count()
tends to refer to the number of elements in a looser collection.
Size()
tends to refer to the size of the collection, often this can be different from the length in cases like vectors (or strings), there may be 10 characters in a string, but storage is reserved for 20. It also may refer to number of elements - check source/documentation.
Capacity()
- used to specifically refer to allocated space in collection and not number of valid elements in it. If type has both "capacity" and "size" defined then "size" usually refers to number of actual elements.
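The gap between allocated space and actual element count can be seen directly on List<T>, which exposes both. This is a small sketch: CapacityDemo and CreateHalfFilledList are hypothetical names, and the sizes 20 and 10 are chosen only to mirror the figures in the text.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: reserves room for 20 items but stores only 10.
    public static List<int> CreateHalfFilledList()
    {
        var list = new List<int>(20);
        for (var i = 0; i < 10; i++)
        {
            list.Add(i);
        }
        return list;
    }

    public static void Main()
    {
        List<int> list = CreateHalfFilledList();

        Console.WriteLine(list.Count);    // 10: elements actually stored
        Console.WriteLine(list.Capacity); // 20: slots currently allocated
    }
}
```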
I think the main point is down to human language and idioms, the size of a string doesn't seem very obvious, whilst the length of a set is equally confusing even though they might be used to refer to the same thing (number of elements) in a collection of data.
Count property vs Count() method?
Decompiling the source for the Count() extension method reveals that it tests whether the object is an ICollection (generic or otherwise) and if so simply returns the underlying Count property (see the decompiled listing below).
So, if your code accesses Count instead of calling Count(), you can bypass the type checking - a theoretical performance benefit but I doubt it would be a noticeable one!
// System.Linq.Enumerable
public static int Count<TSource>(this IEnumerable<TSource> source)
{
    checked
    {
        if (source == null)
        {
            throw Error.ArgumentNull("source");
        }
        ICollection<TSource> collection = source as ICollection<TSource>;
        if (collection != null)
        {
            return collection.Count;
        }
        ICollection collection2 = source as ICollection;
        if (collection2 != null)
        {
            return collection2.Count;
        }
        int num = 0;
        using (IEnumerator<TSource> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                num++;
            }
        }
        return num;
    }
}