For vs. Linq - Performance vs. Future

The best practice depends on what you need:

  1. Development speed and maintainability: LINQ
  2. Performance (according to profiling tools): manual code

LINQ really does slow things down with all the indirection. Don't worry about it, though: 99% of your code does not affect end-user performance.

I started with C++ and really learnt how to optimize a piece of code. LINQ is not suited to getting the most out of your CPU. So if profiling shows a LINQ query to be a problem, just ditch it. But only then.

For your code sample I'd estimate a 3x slowdown. The allocations (and subsequent GC!) and indirections through the lambdas really hurt.

Where does this LINQ performance come from?

Here are some reasons there's a difference between non-LINQ and LINQ code performance:

  1. Every method call has some performance overhead. Information has to be pushed onto the stack, the CPU has to jump to a different instruction address, etc. In the LINQ version you are calling into Select and FirstOrDefault, which you are not doing in the non-LINQ version.
  2. There is overhead, both time and memory, when you create a Func<> to pass into the Select method. The memory overhead, when multiplied many times as you do in your benchmark, can lead to the need to run the garbage collector more often, which can be slow.
  3. The Select LINQ method you are calling produces an object that represents its return value. This also adds a little memory consumption.
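
The three costs listed above can be seen side by side in a small sketch. The question's own code is not shown here, so the class OverheadSketch and both method names are hypothetical stand-ins, not the asker's benchmark:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverheadSketch
{
    // LINQ version: two extra method calls (Select, FirstOrDefault),
    // a delegate for the lambda (the compiler typically caches it when
    // the lambda captures no variables), and a Select iterator object
    // allocated on every call.
    public static int FirstSquareLinq(List<int> values)
    {
        return values.Select(v => v * v).FirstOrDefault();
    }

    // Manual version: no delegate, no iterator object, no LINQ calls.
    // Returns 0 for an empty list, matching FirstOrDefault for int.
    public static int FirstSquareManual(List<int> values)
    {
        return values.Count > 0 ? values[0] * values[0] : 0;
    }

    public static void Main()
    {
        var values = new List<int> { 3, 4, 5 };
        Console.WriteLine(FirstSquareLinq(values));   // prints 9
        Console.WriteLine(FirstSquareManual(values)); // prints 9
    }
}
```

Both methods return the same value; the difference is only in how much work and allocation happens along the way, which is why it tends to show up only when the call is repeated many times.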

Why is the difference so large?

It's actually not that large. True, LINQ takes 50% longer, but honestly you're talking about only being able to complete this entire recursive operation 400 times in a millisecond. That's not slow, and you're unlikely to ever notice the difference unless this is an operation you're doing all the time in a high-performance application like a video game.

LINQ Where vs For Loop implementation

Not much of an answer, but since I played with it a little I may as well share.

I did not spend much time looking at the GroupBy comparison because the types used are different enough that they may be the bottleneck, and I'm not familiar enough with IGrouping to create a new test right now.

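I found that if you use the List.Count property instead of the Count() LINQ extension method in the loop condition, it saved enough time (iterating over 1000000 items) to make the manual code faster than LINQ. The condition is evaluated on every iteration, and Count() has to type-check the collection before it can return the list's Count. Additionally, a few more milliseconds were saved by removing the assignment var student = students[i];: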

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Student
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class Program
{
    public static List<Student> Students = new List<Student>();

    public static void CreateStudents()
    {
        for (var i = 0; i < 1000000; i++)
        {
            Students.Add(new Student {Name = $"Student{i}", Age = i});
        }
    }

    // Original manual filter: calls the Count() extension method on every iteration.
    public static List<Student> WhereManualOriginal(List<Student> students)
    {
        var filteredList = new List<Student>();

        for (var i = 0; i < students.Count(); i++)
        {
            var student = students[i];

            if (student.Age == 32)
            {
                filteredList.Add(student);
            }
        }

        return filteredList;
    }

    // Revised manual filter: uses the Count property and drops the local variable.
    public static List<Student> WhereManualNew(List<Student> students)
    {
        var filteredList = new List<Student>();

        for (var i = 0; i < students.Count; i++)
        {
            if (students[i].Age == 32)
            {
                filteredList.Add(students[i]);
            }
        }

        return filteredList;
    }

    // Times the LINQ query; the empty foreach forces the deferred query to run.
    public static long LinqWhere()
    {
        var sw = Stopwatch.StartNew();
        var items = Students.Where(s => s.Age == 32);
        foreach (var item in items) { }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static long ManualWhere()
    {
        var sw = Stopwatch.StartNew();
        var items = WhereManualOriginal(Students);
        foreach (var item in items) { }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static long NewManualWhere()
    {
        var sw = Stopwatch.StartNew();
        var items = WhereManualNew(Students);
        foreach (var item in items) { }
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        // Warm up so JIT compilation is not included in the timings
        CreateStudents();
        WhereManualOriginal(Students);
        WhereManualNew(Students);
        Students.Where(s => s.Age == 32).ToList();
        var linqResults = new List<long>();
        var manualResults = new List<long>();
        var newManualResults = new List<long>();

        for (int i = 0; i < 100; i++)
        {
            newManualResults.Add(NewManualWhere());
            manualResults.Add(ManualWhere());
            linqResults.Add(LinqWhere());
        }

        // Average elapsed Stopwatch ticks over the 100 runs
        Console.WriteLine("Linq where ......... " + linqResults.Average());
        Console.WriteLine("Manual where ....... " + manualResults.Average());
        Console.WriteLine("New Manual where ... " + newManualResults.Average());

        // The original called a helper that is not shown, GetKeyFromUser(...);
        // these two lines are a plain stand-in for it.
        Console.WriteLine("\nDone! Press any key to exit...");
        Console.ReadKey();
    }
}

Linq First or Single have drastic effect on performance

In the first code example you are not executing the query, only creating it.

First(), Single(), ToArray() and some other methods trigger query execution (enumeration).
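
A minimal sketch of that deferred execution, counting how often the predicate actually runs. The class and method names are made up for illustration and are not taken from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns how many times the predicate has run after each stage.
    public static (int afterWhere, int afterFirst, int afterToArray) CountPredicateCalls()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };
        var calls = 0;

        // Where only builds the query; the predicate has not run yet.
        var query = numbers.Where(n => { calls++; return n > 2; });
        var afterWhere = calls;

        // First enumerates until the first match (1, 2, 3), then stops.
        var first = query.First();
        var afterFirst = calls;

        // ToArray enumerates the whole source again from the start.
        var all = query.ToArray();
        var afterToArray = calls;

        return (afterWhere, afterFirst, afterToArray);
    }

    public static void Main()
    {
        var counts = CountPredicateCalls();
        Console.WriteLine(counts.afterWhere);   // prints 0
        Console.WriteLine(counts.afterFirst);   // prints 3
        Console.WriteLine(counts.afterToArray); // prints 8
    }
}
```

So a benchmark that stops at Where is timing the creation of the query object, not the filtering itself.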

Which is faster? Loop over object properties or linq query?

Technically, the first case may allocate more memory and do more processing to generate the final result than the second case because of the intermediate data and LINQ abstractions. But the amount of time and memory is so negligible in the grand scope of things that, for this scenario, you're far better off making your code as readable as possible rather than as efficient as possible. It's probably a case of premature optimization.

Here are some references why the first may be slightly slower:

  1. http://www.schnieds.com/2009/03/linq-vs-foreach-vs-for-loop-performance.html
  2. http://ox.no/posts/linq-vs-loop-a-performance-test
  3. http://geekswithblogs.net/BlackRabbitCoder/archive/2010/04/23/c-linq-vs-foreach---round-1.aspx

In .NET, which loop runs faster, 'for' or 'foreach'?

Patrick Smacchia blogged about this last month, with the following conclusions:

  • for loops on List are a bit more than 2 times cheaper than foreach
    loops on List.
  • Looping on array is around 2 times cheaper than looping on List.
  • As a consequence, looping on array using for is 5 times cheaper
    than looping on List using foreach
    (which I believe, is what we all do).
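
To check those ratios on your own machine, a rough timing sketch along these lines could be used. The class LoopTimingSketch and its methods are hypothetical, not the code from the blog post, and the numbers will vary with runtime version, build configuration and hardware:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LoopTimingSketch
{
    public static long SumForList(List<int> list)
    {
        long sum = 0;
        for (var i = 0; i < list.Count; i++) sum += list[i];
        return sum;
    }

    public static long SumForeachList(List<int> list)
    {
        long sum = 0;
        foreach (var value in list) sum += value;
        return sum;
    }

    public static long SumForArray(int[] array)
    {
        long sum = 0;
        for (var i = 0; i < array.Length; i++) sum += array[i];
        return sum;
    }

    public static long SumForeachArray(int[] array)
    {
        long sum = 0;
        foreach (var value in array) sum += value;
        return sum;
    }

    // Runs the action once to warm up the JIT, then times a second run.
    public static long TimeTicks(Func<long> action)
    {
        action();
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        var array = new int[1000000];
        for (var i = 0; i < array.Length; i++) array[i] = i;
        var list = new List<int>(array);

        Console.WriteLine("for on List ........ " + TimeTicks(() => SumForList(list)));
        Console.WriteLine("foreach on List .... " + TimeTicks(() => SumForeachList(list)));
        Console.WriteLine("for on array ....... " + TimeTicks(() => SumForArray(array)));
        Console.WriteLine("foreach on array ... " + TimeTicks(() => SumForeachArray(array)));
    }
}
```

A single timed run per variant is noisy; averaging many runs, as the Where benchmark earlier on this page does, gives steadier numbers.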

