Why Is It Bad to Use an Iteration Variable in a Lambda Expression

Consider this code:

List<Action> actions = new List<Action>();

for (int i = 0; i < 10; i++)
{
    actions.Add(() => Console.WriteLine(i));
}

foreach (Action action in actions)
{
    action();
}

What would you expect this to print? The obvious answer is 0...9 - but actually it prints 10, ten times. That's because there is just one variable i, which is captured by all the delegates; by the time they run, the loop has finished and i is 10. It's this kind of behaviour which is unexpected.

EDIT: I've just seen that you're talking about VB.NET rather than C#. I believe VB.NET has even more complicated rules, due to the way variables maintain their values across iterations. This post by Jared Parsons gives some information about the kind of difficulties involved - although it's back from 2007, so the actual behaviour may have changed since then.

Can anyone explain to me why using iteration variable in a lambda expression is a bad idea

The problem occurs because lambda expressions do not execute when they are constructed but rather when they are invoked.

See the link below:
http://blogs.msdn.com/b/vbteam/archive/2007/07/26/closures-in-vb-part-5-looping.aspx

Hope it helps.

Warning: Using the iteration variable in a lambda expression may have unexpected results

As the message says, it "may have" undesired effects. In your case the .ToList() makes it safe but that is hard for the compiler to verify.

I would suggest adopting the copy to a local variable (Dim exc = excel) as standard 'best practice'.

Why do I get: iteration variable in a lambda expression may have unexpected results

The lambda is bound to the variable, not to the value the variable had when the lambda was turned into a delegate. As the loop variable updates, every lambda created bound to that variable also sees the changed value of the variable, which you might not want. Creating a new variable every time you iterate the loop binds each lambda over a new, different, unchanging variable.
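
To make the variable-versus-value distinction concrete, here is a minimal sketch in Java rather than C# or VB. Java refuses to capture a mutable local, so a one-element array stands in for the single shared loop variable; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class LoopCaptureSketch {

    // Every lambda closes over the same one-element array, which stands in
    // for a single shared loop variable.
    public static List<Integer> sharedVariable() {
        List<IntSupplier> suppliers = new ArrayList<>();
        int[] counter = {0};
        for (counter[0] = 0; counter[0] < 3; counter[0]++) {
            suppliers.add(() -> counter[0]);
        }
        return invokeAll(suppliers);
    }

    // Each iteration declares a fresh variable, so each lambda captures
    // its own unchanging value.
    public static List<Integer> copyPerIteration() {
        List<IntSupplier> suppliers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            int copy = i; // effectively final: never reassigned
            suppliers.add(() -> copy);
        }
        return invokeAll(suppliers);
    }

    // Runs the lambdas only after the loop that created them has finished.
    private static List<Integer> invokeAll(List<IntSupplier> suppliers) {
        List<Integer> results = new ArrayList<>();
        for (IntSupplier supplier : suppliers) {
            results.add(supplier.getAsInt());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(sharedVariable());   // [3, 3, 3]
        System.out.println(copyPerIteration()); // [0, 1, 2]
    }
}
```

The first method reproduces the surprising behaviour (every lambda sees the final value), while the second shows the fix of creating a new variable per iteration.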

This is a major pain point in C# and VB. In C# 5 and VB 11 we are changing the loop closure semantics to mitigate this problem.

For more information see

Is there a reason for C#'s reuse of the variable in a foreach?

and the last few paragraphs of Tim's article:

http://msdn.microsoft.com/en-us/magazine/cc163362.aspx

and my article:

http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/

How to suppress VB's "Iteration variable shouldn't be used in lambda expression" warning

In this particular case, where the lambda is evaluated immediately, you can safely eliminate the warning by moving the declaration of the iteration variable outside the for loop.

Dim i = 0
For i = 0 To 10
...

I do want to stress though that this only works if the lambda does not escape the for loop (true for your scenario).

Also here is a detailed link to an article I wrote on this warning (why it exists, how to avoid it, etc ...)

  • http://blogs.msdn.com/b/jaredpar/archive/2007/07/26/closures-in-vb-part-5-looping.aspx

Why should variables inside a forEach loop not be changed?

The Java Memory Model has a very important property: it guarantees that local variables and method parameters are never writable by another thread. This makes multithreaded programming much safer. However, when you create a lambda (or an anonymous class), nobody knows how it will be used. It can be passed to another thread for execution (for example, if you use parallelStream().forEach(...)). Were it possible to modify the local variable, that important property would be violated. That is not something the Java language developers would sacrifice.

Usually when you are using lambdas, you are trying to program in a functional way. But in functional programming mutable variables are considered bad practice: it's better to assign every variable only once. So trying to modify the local variable is a code smell. Use the various stream reduction methods instead of forEach to produce good functional code.
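
As a rough illustration of replacing a mutating forEach with a reduction, the sketch below sums and multiplies a list without touching any captured variable. The class name and sample data are assumptions made for the example.

```java
import java.util.List;

class ReductionSketch {

    // This would not compile if uncommented: 'total' is modified inside the
    // lambda, so it is not effectively final.
    //
    //   int total = 0;
    //   numbers.forEach(n -> total += n);

    // A reduction expresses the same sum without mutating captured state.
    public static int sum(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    // The general form: an identity value plus an associative combining function.
    public static int product(List<Integer> numbers) {
        return numbers.stream().reduce(1, (a, b) -> a * b);
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4);
        System.out.println(sum(numbers));     // 10
        System.out.println(product(numbers)); // 24
    }
}
```

Because neither method writes to shared state, both would remain correct if the stream were switched to parallelStream().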

Enhanced 'for' loop and lambda expressions

Lambda expressions work like callbacks. The moment they are passed in the code, they 'store' any external values (or references) they require to operate, as if these values were passed as arguments in a function call; this is just hidden from the developer. In your first example, you could work around the problem by storing k in a separate variable, such as d:

for (int k = 0; k < 10; k++) {
    final int d = k; // fresh copy for each iteration
    new Thread(() -> System.out.println(d)).start();
}

Effectively final means that, in the above example, you can leave the 'final' keyword out: d is effectively final because it is never changed within its scope.

for loops operate differently. They are iterative code (as opposed to a callback). They work within their respective scope and can use all variables on their own stack. This means that the for loop's code block is part of the external code block.

As to your highlighted question:

An enhanced for loop does not operate with a regular index-counter, at least not directly. Enhanced for loops (over non-arrays) create a hidden Iterator. You can test this the following way:

Collection<String> mySet = new HashSet<>();
mySet.addAll(Arrays.asList("A", "B", "C"));
for (String myString : mySet) {
    if (myString.equals("B")) {
        mySet.remove(myString);
    }
}

The above example will cause a ConcurrentModificationException. This is due to the iterator noticing that the underlying collection has changed during the iteration. However, in your example, the enhanced for loop declares a fresh variable arg on each iteration; it is effectively final, so it can be referenced within the lambda expression.
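
To show the hidden Iterator at work, here is a sketch of roughly what an enhanced for loop over a collection expands to, with the removal routed through the iterator so that no ConcurrentModificationException is thrown. The class and method names are invented for the example.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;

class IteratorRemovalSketch {

    // Roughly the explicit form of "for (String s : items)", with the removal
    // done through the iterator so it stays consistent with the collection.
    public static Collection<String> removeWithIterator(Collection<String> items, String unwanted) {
        for (Iterator<String> it = items.iterator(); it.hasNext(); ) {
            String s = it.next();
            if (s.equals(unwanted)) {
                it.remove(); // permitted: the iterator performs the removal itself
            }
        }
        return items;
    }

    // Collection.removeIf (Java 8+) does the same job in one call.
    public static Collection<String> removeWithPredicate(Collection<String> items, String unwanted) {
        items.removeIf(s -> s.equals(unwanted));
        return items;
    }

    public static void main(String[] args) {
        System.out.println(removeWithIterator(new HashSet<>(List.of("A", "B", "C")), "B"));
        System.out.println(removeWithPredicate(new HashSet<>(List.of("A", "B", "C")), "B"));
    }
}
```

Note that the lambda in removeWithPredicate captures the parameter unwanted, which is allowed because it is never reassigned and is therefore effectively final.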

The prevention of the capture of 'non-effectively-final' values is more or less just a precaution in Java, because in other languages (JavaScript, for example) this works differently.

So the compiler could theoretically translate your code, capture the value, and continue, but it would have to store that value differently, and you would probably get unexpected results. Therefore the team developing lambdas for Java 8 correctly excluded this scenario by rejecting it with a compile-time error.

If you ever need to change values of external variables within lambda expressions, you can either declare a one-element array:

String[] myStringRef = { "before" };
someCallingMethod(() -> myStringRef[0] = "after" );
System.out.println(myStringRef[0]);

Or use AtomicReference<T> to make it thread-safe. However, with your example, this would probably print "before", since the callback would most likely be executed after println runs.
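
A minimal sketch of the AtomicReference variant follows. The helper runAndWait is a hypothetical stand-in for someCallingMethod; it is assumed to run the callback on another thread and wait for it to finish, which is what makes the updated value visible to the caller.

```java
import java.util.concurrent.atomic.AtomicReference;

class AtomicHolderSketch {

    // Hypothetical stand-in for someCallingMethod: runs the callback on a
    // separate thread and blocks until that thread has finished.
    static void runAndWait(Runnable callback) {
        Thread worker = new Thread(callback);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for callback", e);
        }
    }

    public static String update() {
        // The reference itself is effectively final; only its contents change.
        AtomicReference<String> ref = new AtomicReference<>("before");
        runAndWait(() -> ref.set("after"));
        return ref.get();
    }

    public static void main(String[] args) {
        System.out.println(update()); // after
    }
}
```

If the helper did not wait for the callback, the read could happen first and yield "before", which is the situation described above.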

Which would be better in terms of performance Lambda or simple loop?

My advice would be:

  1. Use the style that you and your coworkers agree is most maintainable.

  2. If you and your colleagues are not yet comfortable with lambdas, keep learning.

  3. Don't obsess over performance. It is often not the most important thing.

Generally speaking, lambdas and streams provide a more concise and (once everyone is up to speed) more readable way of expressing this kind of algorithm. Performance is not the primary goal.

If performance does become an issue, then the standard advice is to code, test, benchmark, profile and optimize. And do it in that order! You can easily waste a lot of time by optimizing at the coding stage, or by optimizing code that has minimal impact on overall application performance.

  • Let the application benchmarks tell you if you need to optimize at all.
  • Let the profiler point out the parts of your code that are worthy of the effort of optimization.

In this specific example, the performance difference is going to be too small to measure. And if you scaled up to a list of millions of elements, the performance will be dominated by the time taken to build the list and write the numbers. The different ways of iteration will only contribute a small part to the overall performance.


And for folks who (despite all of the above) still want to know whether it is faster to use a lambda or a conventional loop, the best general answer is:

"It depends on all sorts of factors that 1) are not well understood, and 2) are liable to change as Java compiler technology evolves."

We could give you an answer for a specific example with a specific Java major/minor/patch release, but it would be unwise to generalize.

C# - For loop and the lambda expressions

It is hard to tell from your example, because it seems to be perfectly formed and correctly capturing. Here is my expanded version, and it works fine:

public void Main()
{
    List<string> list = new List<string>(3)
        { "a", "b", "c" };
    for (int i = 0; i < list.Count; i++)
    {
        int yy = i;
        AFunctionWithLambda(() => Console.WriteLine(list[yy]));
    }

    Thread.Sleep(1000);
    Console.WriteLine("all done, probably");
    Console.ReadLine();
}

private void AFunctionWithLambda(Action action)
{ // runs it asynchronously, for giggles
    ThreadPool.QueueUserWorkItem(o => {
        Thread.Sleep(500); // delay, to let the loop finish
        action();
    });
}

If you change it to:

for (int i = 0; i < list.Count; i++)
{
    AFunctionWithLambda(() => Console.WriteLine(list[i]));
}

then it fails in the way described separately.

We will need a better example ;p


