Using the Iterator Variable of a Foreach Loop in a Lambda Expression: Why Does It Fail?

Welcome to the world of closures and captured variables :)

Eric Lippert has an in-depth explanation of this behaviour:

  • Closing over the loop variable considered harmful
  • Closing over the loop variable, part two

Basically, it's the loop variable that is captured, not its value.
To get what you think you should get, do this:

foreach (var type in types)
{
    var newType = type;
    var sayHello =
        new PrintHelloType(greeting => SayGreetingToType(newType, greeting));
    helloMethods.Add(sayHello);
}

Why is it bad to use an iteration variable in a lambda expression

Consider this code:

List<Action> actions = new List<Action>();

for (int i = 0; i < 10; i++)
{
    actions.Add(() => Console.WriteLine(i));
}

foreach (Action action in actions)
{
    action();
}

What would you expect this to print? The obvious answer is 0...9 - but actually it prints 10, ten times. It's because there's just one variable which is captured by all the delegates. It's this kind of behaviour which is unexpected.
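For comparison, here is a rough Java sketch of the same effect. Java refuses to capture the counter of a classic for loop directly, so the one-element array below is only a stand-in for the single shared variable; the class and method names are made up for illustration and are not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class SharedVariableCapture {
    // Every lambda reads the same array slot, which plays the role of
    // the one loop variable that all the delegates capture.
    static List<Integer> sharedVariable() {
        List<IntSupplier> actions = new ArrayList<>();
        int[] i = {0}; // hypothetical stand-in for the shared loop variable
        for (i[0] = 0; i[0] < 10; i[0]++) {
            actions.add(() -> i[0]);
        }
        return runAll(actions);
    }

    // Each lambda captures its own per-iteration copy instead.
    static List<Integer> freshCopyPerIteration() {
        List<IntSupplier> actions = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            int copy = i; // a new variable on every pass through the loop
            actions.add(() -> copy);
        }
        return runAll(actions);
    }

    // Invokes the lambdas only after the loop that created them has finished.
    private static List<Integer> runAll(List<IntSupplier> actions) {
        List<Integer> seen = new ArrayList<>();
        for (IntSupplier action : actions) {
            seen.add(action.getAsInt());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(sharedVariable());
        System.out.println(freshCopyPerIteration());
    }
}
```

The first line printed is ten 10s, the second is 0 through 9, because the lambdas only run once the loop is over and the shared slot already holds its final value.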

EDIT: I've just seen that you're talking about VB.NET rather than C#. I believe VB.NET has even more complicated rules, due to the way variables maintain their values across iterations. This post by Jared Parsons gives some information about the kind of difficulties involved - although it's back from 2007, so the actual behaviour may have changed since then.

In Java, why can't I use a lambda as an enhanced for loop's Expression?

This is not just about lambda expressions; it's about all poly expressions that require target typing.

One thing for sure is that this is not an oversight; the case was considered and rejected.

To quote an early spec:

http://cr.openjdk.java.net/~dlsmith/jsr335-0.9.3/D.html

Deciding what contexts are allowed to support poly expressions is driven in large part by the practical need for such features:

The expression in an enhanced for loop is not in a poly context because, as the construct is currently defined, it is as if the expression were a receiver: exp.iterator() (or, in the array case, exp[i]). It is plausible that an Iterator could be wrapped as an Iterable in a for loop via a lambda expression (for (String s : () -> stringIterator)), but this doesn't mesh very well with the semantics of Iterable.

My take is that each invocation of Iterable.iterator() must return a new, independent iterator, positioned at the beginning. Yet the lambda expression in the example (and in your example) returns the same iterator every time. This does not conform to the semantics of Iterable.
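A small sketch of what goes wrong, assuming the iterator is wrapped through an ordinary method return so that the lambda has a target type. The names SingleUseIterable, wrap and drain are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SingleUseIterable {
    // The return type gives the lambda its target type; every call to
    // iterator() on the result hands back the very same Iterator object.
    static Iterable<String> wrap(Iterator<String> iterator) {
        return () -> iterator;
    }

    // Runs one full for-each pass and collects what it saw.
    static List<String> drain(Iterable<String> source) {
        List<String> out = new ArrayList<>();
        for (String s : source) {
            out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        Iterable<String> wrapped = wrap(list.iterator());

        System.out.println(drain(wrapped)); // first pass sees every element
        System.out.println(drain(wrapped)); // second pass reuses the exhausted iterator
        System.out.println(drain(list));    // a real Iterable starts over each time
    }
}
```

This prints [a, b, c], then [], then [a, b, c]: the wrapped object behaves like an Iterable only once.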


In any case, it seems unnecessary work to support target typing in for-each loop. If you already have the iterator, you can simply do

    iterator.forEachRemaining( element->{ ... } )

Or, if you prefer old-school:

    while (iterator.hasNext()) {
        Foo element = iterator.next();
        ...
    }

Neither is too bad; it's not worth complicating the language spec even more. (If we do want for-each to provide target typing, remember that it needs to work for other poly expressions as well, like ?:; then for-each can become too difficult to understand in some cases. And in general, there are two possible target types, Iterable<? extends X> | X[], which is very difficult for the type inference system.)
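Both alternatives, written out as a runnable sketch. String stands in for the Foo element type, and the class and method names are placeholders of mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ConsumeIterator {
    // Variant 1: hand a lambda to Iterator.forEachRemaining.
    static List<String> withForEachRemaining(Iterator<String> iterator) {
        List<String> seen = new ArrayList<>();
        iterator.forEachRemaining(element -> seen.add(element));
        return seen;
    }

    // Variant 2: the classic hasNext()/next() loop.
    static List<String> withWhileLoop(Iterator<String> iterator) {
        List<String> seen = new ArrayList<>();
        while (iterator.hasNext()) {
            String element = iterator.next();
            seen.add(element);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("x", "y", "z");
        System.out.println(withForEachRemaining(source.iterator()));
        System.out.println(withWhileLoop(source.iterator()));
    }
}
```

Each call gets its own fresh iterator from the list, so both lines print [x, y, z].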


The for-each construct could be considered syntactic sugar that exists because lambdas weren't available at the time. If the language had already had lambda expressions, a special language construct for for-each would have been unnecessary; it could have been done by library APIs.

Why do I get: iteration variable in a lambda expression may have unexpected results

The lambda is bound to the variable, not to the value the variable had when the lambda was turned into a delegate. As the loop variable updates, every lambda created bound to that variable also sees the changed value of the variable, which you might not want. Creating a new variable every time you iterate the loop binds each lambda over a new, different, unchanging variable.

This is a major pain point in C# and VB. In C# 5 and VB 11 we are changing the loop closure semantics to mitigate this problem.

For more information see

Is there a reason for C#'s reuse of the variable in a foreach?

and the last few paragraphs of Tim's article:

http://msdn.microsoft.com/en-us/magazine/cc163362.aspx

and my article:

http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/

Lambda capture problem with iterators?

You asked for a reference to the specification; the relevant location is section 8.8.4, which states that a "foreach" loop is equivalent to:

    V v;
    while (e.MoveNext()) {
        v = (V)(T)e.Current;
        embedded-statement
    }

Note that the variable v is declared outside the while loop, and therefore there is a single loop variable. That is then closed over by the lambda.

UPDATE

Because so many people run into this problem, the C# design and compiler team changed C# 5 to have these semantics:

    while (e.MoveNext()) {
        V v = (V)(T)e.Current;
        embedded-statement
    }

Which then has the expected behaviour -- you close over a different variable every time. Technically that is a breaking change, but the number of people who depend on the weird behaviour you are experiencing is hopefully very small.

Be aware that C# 2, 3, and 4 are now incompatible with C# 5 in this regard. Also note that the change only applies to foreach, not to for loops.

See http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/ for details.


Commenter abergmeier states:

C# is the only language that has this strange behavior.

This statement is categorically false. Consider the following JavaScript:

var funcs = [];
var results = [];
for (prop in { a : 10, b : 20 })
{
    funcs.push(function() { return prop; });
    results.push(funcs[0]());
}

abergmeier, would you care to take a guess as to what are the contents of results?

Why should variables inside a forEach loop not be changed?

The Java Memory Model has a very important property: it guarantees that local variables and method parameters are never writable by another thread. This adds a lot of safety to multithreaded programming. However, when you create a lambda (or an anonymous class), nobody knows how it will be used. It can be passed to another thread for execution (for example, if you use parallelStream().forEach(...)). Were it possible to modify the local variable, that important property would be violated. That is not something the Java language developers would sacrifice.

Usually when you are using lambdas, you are trying to program in a functional way. But in functional programming mutable variables are considered bad practice: it's better to assign every variable only once. So trying to modify the local variable is a code smell. Use the various stream reduction methods instead of forEach to produce good functional code.
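As an illustration, here is a minimal sketch that totals string lengths with a reduction rather than by mutating a local inside forEach. The class name, the method names and the sample words are my own.

```java
import java.util.Arrays;
import java.util.List;

class ReduceInsteadOfMutate {
    static int totalLength(List<String> words) {
        // The tempting version does not compile, because total would have
        // to be effectively final to be used inside the lambda:
        //
        //     int total = 0;
        //     words.forEach(w -> total += w.length());
        //
        // A reduction needs no mutable local at all.
        return words.stream().mapToInt(String::length).sum();
    }

    static int totalLengthParallel(List<String> words) {
        // The same reduction stays correct when the stream runs in parallel.
        return words.parallelStream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("closures", "capture", "variables");
        System.out.println(totalLength(words));         // 8 + 7 + 9 = 24
        System.out.println(totalLengthParallel(words)); // 24 as well
    }
}
```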

Why is the loop variable effectively final when using for-each?

s is never reassigned (there is no s = ... anywhere). So the compiler says "yeah, we could theoretically mark this as final". That is what is meant by effectively final: you did not mark it final, but you could, and it would still compile.

In case you are wondering about the enhanced for-loop:

for (String s : arr)

The variable does not live outside of the scope of the for and does not get re-assigned. I.e. it is not:

String s = null;
for (int i = 0; i < arr.length; i++) {
    s = arr[i];
    ...
}

The variable is created inside the loop, so its scope is limited to the loop. It is not re-used but thrown away and re-created each iteration:

for (int i = 0; i < arr.length; i++) {
    String s = arr[i];
    ...
}

Take a close look at the two examples. In the first, you could not write final String s = null;, because we are re-assigning it during the loop s = arr[i];. But in the second example we can, because the variable s is only known within one iteration and then thrown away again. So final String s = arr[i]; is fine.

As a side note, this also explains why you cannot use s after the loop. It is unknown and already destroyed; its scope is limited to the loop.
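To see the consequence for lambdas, here is a short sketch in which each pass of an enhanced for loop captures its own s. ForEachCapture and captureEach are names picked for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class ForEachCapture {
    static List<String> captureEach(String[] arr) {
        List<Supplier<String>> suppliers = new ArrayList<>();
        for (String s : arr) {
            // s is a fresh, effectively final variable on every iteration,
            // so each lambda keeps the element it was created with.
            suppliers.add(() -> s);
        }

        // The indexed form would be rejected by the compiler, because i++
        // reassigns i and i is therefore not effectively final:
        //
        //     for (int i = 0; i < arr.length; i++) {
        //         suppliers.add(() -> arr[i]);
        //     }

        List<String> results = new ArrayList<>();
        for (Supplier<String> supplier : suppliers) {
            results.add(supplier.get());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(captureEach(new String[] {"a", "b", "c"}));
    }
}
```

This prints [a, b, c]: every supplier still returns its own element after the loop has ended.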

C# - For loop and the lambda expressions

It is hard to tell from your example, because it seems to be perfectly formed and correctly capturing. Here is my expanded version, and it works fine:

    public void Main()
    {
        List<string> list = new List<string>(3)
            { "a", "b", "c" };
        for (int i = 0; i < list.Count; i++)
        {
            int yy = i;
            AFunctionWithLambda(() => Console.WriteLine(list[yy]));
        }

        Thread.Sleep(1000);
        Console.WriteLine("all done, probably");
        Console.ReadLine();
    }

    private void AFunctionWithLambda(Action action)
    { // runs it asynchronously, for giggles
        ThreadPool.QueueUserWorkItem(o => {
            Thread.Sleep(500); // delay, to let the loop finish
            action();
        });
    }

If you change it to:

    for (int i = 0; i < list.Count; i++)
    {
        AFunctionWithLambda(() => Console.WriteLine(list[i]));
    }

then it fails in the way described separately.

We will need a better example ;p


