Closures Behaving Differently in for and Foreach Loops

In C# 5 and beyond, the foreach loop declares a separate i variable for each iteration of the loop. So each closure captures a separate variable, and you see the expected results.

In the for loop, you only have a single i variable, which is captured by all the closures, and modified as the loop progresses - so by the time you call the delegates, you see the final value of that single variable.

In C# 2, 3 and 4, the foreach loop behaved that way as well, which was basically never the desired behaviour, so it was fixed in C# 5.

You can achieve the same effect in the for loop if you introduce a new variable within the scope of the loop body:

for (int i = 3; i <= 4; i++)
{
    int copy = i;
    actions.Add(() => Console.WriteLine(copy));
}

For more details, read Eric Lippert's blog posts, "Closing over the loop variable considered harmful" - part 1, part 2.

How do closures differ between foreach and list.ForEach()?

Before C# 5, foreach introduces only one variable for the whole loop, whereas the lambda parameter passed to list.ForEach() is a "fresh" variable each time the lambda is invoked.

Compare with:

foreach (var v1 in values) // v1 *same* variable each loop, value changed
{
    var freshV1 = v1; // freshV1 is *new* variable each loop
    funcs.Add(() => freshV1);
}

foreach (var func in funcs)
{
    Console.WriteLine(func()); // prints 123,432,768
}

That is,

foreach (T v in ...) { }

can be thought of as:

T v;
foreach(v in ...) {}

Happy coding.

JavaScript closure inside loops – simple practical example

Well, the problem is that the variable i, within each of your anonymous functions, is bound to the same variable outside of the function.
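As a minimal sketch of the problem being described (the question's own code is not shown here, so the funcs array and the var-based loop below are an assumed reconstruction), every function ends up reading the same i after the loop has finished:

```javascript
// Assumed reconstruction of the problematic pattern: one shared `i`.
var funcs = [];

for (var i = 0; i < 3; i++) {
  // Each function closes over the single variable `i`, not its current value.
  funcs[i] = function () {
    return "My value: " + i;
  };
}

// The loop has already finished, so `i` is 3 for every function.
var results = funcs.map(function (f) {
  return f();
});
console.log(results); // ["My value: 3", "My value: 3", "My value: 3"]
```

The solutions that follow all work by giving each function its own variable to close over.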

ES6 solution: let

ECMAScript 6 (ES6) introduces new let and const keywords that are scoped differently than var-based variables. For example, in a loop with a let-based index, each iteration through the loop will have a new variable i with loop scope, so your code would work as you expect. There are many resources, but I'd recommend 2ality's block-scoping post as a great source of information.

for (let i = 0; i < 3; i++) {
    funcs[i] = function() {
        console.log("My value: " + i);
    };
}

Beware, though, that IE9-IE11 and Edge prior to Edge 14 support let but get the above wrong (they don't create a new i each time, so all the functions above would log 3 like they would if we used var). Edge 14 finally gets it right.



ES5.1 solution: forEach

With the relatively widespread availability of the Array.prototype.forEach function (in 2015), it's worth noting that in those situations involving iteration primarily over an array of values, .forEach() provides a clean, natural way to get a distinct closure for every iteration. That is, assuming you've got some sort of array containing values (DOM references, objects, whatever), and the problem arises of setting up callbacks specific to each element, you can do this:

var someArray = [ /* whatever */ ];
// ...
someArray.forEach(function(arrayElement) {
    // ... code code code for this one element
    someAsynchronousFunction(arrayElement, function() {
        arrayElement.doSomething();
    });
});

The idea is that each invocation of the callback function used with the .forEach loop will be its own closure. The parameter passed in to that handler is the array element specific to that particular step of the iteration. If it's used in an asynchronous callback, it won't collide with any of the other callbacks established at other steps of the iteration.
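A runnable sketch of that idea, using hypothetical names (labels, callbacks) and stored functions in place of a real asynchronous API, might look like this:

```javascript
// Illustrative data only; `labels` and `callbacks` are not from the original answer.
const labels = ["a", "b", "c"];
const callbacks = [];

labels.forEach(function (label, index) {
  // `label` and `index` are parameters of this one invocation, so the
  // function stored below closes over its own pair of values.
  callbacks.push(function () {
    return index + ":" + label;
  });
});

// Run the stored callbacks later, after the iteration has finished.
const outputs = callbacks.map((cb) => cb());
console.log(outputs); // ["0:a", "1:b", "2:c"]
```

Each stored callback reports the element it was created for, even though all of them run after forEach has returned.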

If you happen to be working in jQuery, the $.each() function gives you a similar capability.



Classic solution: Closures

What you want to do is bind the variable within each function to a separate, unchanging value outside of the function:

var funcs = [];

function createfunc(i) {
    return function() {
        console.log("My value: " + i);
    };
}

for (var i = 0; i < 3; i++) {
    funcs[i] = createfunc(i);
}

for (var j = 0; j < 3; j++) {
    // and now let's run each one to see
    funcs[j]();
}

Array.forEach() and closure

I realize there is a need to remove closures in the code to optimize the memory use.

For memory use, only the closures that you store somewhere count. When you get memory problems, you should check whether there are classes with lots of instances which each have their own closure instances. It does not mean that you should avoid closures in general.
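To illustrate what "instances which each have their own closure instances" can look like, here is a small sketch with hypothetical names (makeCounterWithClosure, Counter). It only shows that the function objects are distinct or shared; it does not measure memory:

```javascript
// Per-instance closure: every call creates a new function object
// that is stored on the returned object.
function makeCounterWithClosure() {
  let count = 0;
  return {
    increment: function () {
      return ++count;
    },
  };
}

// Shared method: one function object on the prototype, reused by all instances.
class Counter {
  constructor() {
    this.count = 0;
  }
  increment() {
    return ++this.count;
  }
}

const a = makeCounterWithClosure();
const b = makeCounterWithClosure();
const c = new Counter();
const d = new Counter();

const closuresAreDistinct = a.increment !== b.increment; // true
const methodIsShared = c.increment === d.increment; // true
console.log(closuresAreDistinct, methodIsShared);
```

With many instances, the first pattern keeps one stored closure per instance alive, which is the case worth checking when memory is a concern.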

One code pattern of mine is to use Array.forEach() as much as possible

Don't. Given that you are using ES6, you should be using for … of as much as possible (for imperative loops).
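A brief sketch of the for … of style, with made-up values, shows that it supports continue (and break) directly and still gives each iteration its own binding for closures:

```javascript
// Illustrative values only.
const values = [10, 20, 30];

// Imperative loop with normal control flow: `continue` skips one element.
let total = 0;
for (const value of values) {
  if (value === 20) continue;
  total += value;
}

// `const value` is a fresh binding on every iteration, so closures
// created in the loop body each see their own element.
const getters = [];
for (const value of values) {
  getters.push(() => value);
}

const seen = getters.map((g) => g());
console.log(total, seen); // 40 [10, 20, 30]
```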

Apparently the callback function used in the Array.forEach() creates closures

Yes, but in the examples you've shown they're not avoidable (cannot be moved out into static functions). And given they last only as long as the forEach call and will be garbage-collected immediately after, there's no memory strain either.

However, as the article you linked explains, closures are still costly to create (compared to not creating them at all).

Should I go back to for loop in performance-sensitive projects?

Yes, definitely (at least in really performance-sensitive locations - no whole project will be). However, the reason is not the cost of closures; it is the call overhead of functions in general, which forEach cannot completely optimise away.

Why would for loop and forEach work different?

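In a for loop, a return statement ends the loop just as break would (it also exits the enclosing function). A for loop can be broken out of; forEach cannot, because a return inside its callback only ends that one invocation and iteration continues with the next element.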
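A small sketch of the difference, using hypothetical helper names (findWithFor, countVisitedWithForEach):

```javascript
// In a for loop, `return` leaves the loop and the enclosing function at once.
function findWithFor(items, target) {
  for (let i = 0; i < items.length; i++) {
    if (items[i] === target) {
      return i;
    }
  }
  return -1;
}

// In forEach, `return` only ends the current callback invocation,
// so every element is still visited.
function countVisitedWithForEach(items, target) {
  let visited = 0;
  items.forEach(function (item) {
    visited++;
    if (item === target) {
      return; // does not stop the iteration
    }
  });
  return visited;
}

const forResult = findWithFor(["a", "b", "c"], "b");
const forEachVisited = countVisitedWithForEach(["a", "b", "c"], "a");
console.log(forResult, forEachVisited); // 1 3
```

When early exit is needed, a plain for or for … of loop is the straightforward choice.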

Access to foreach variable in closure warning

There are two parts to this warning. The first is...

Access to foreach variable in closure

...which is not invalid per se but it is counter-intuitive at first glance. It's also very hard to do right. (So much so that the article I link to below describes this as "harmful".)

Take your query, noting that the code you've excerpted is basically an expanded form of what the C# compiler (before C# 5) generates for foreach [1]:

I [don't] understand why [the following is] not valid:

string s; while (enumerator.MoveNext()) { s = enumerator.Current; ...

Well, it is valid syntactically. And if all you're doing in your loop is using the value of s then everything is good. But closing over s will lead to counter-intuitive behaviour. Take a look at the following code:

var countingActions = new List<Action>();

var numbers = from n in Enumerable.Range(1, 5)
              select n.ToString(CultureInfo.InvariantCulture);

using (var enumerator = numbers.GetEnumerator())
{
    string s;

    while (enumerator.MoveNext())
    {
        s = enumerator.Current;

        Console.WriteLine("Creating an action where s == {0}", s);
        Action action = () => Console.WriteLine("s == {0}", s);

        countingActions.Add(action);
    }
}

If you run this code, you'll get the following console output:

Creating an action where s == 1
Creating an action where s == 2
Creating an action where s == 3
Creating an action where s == 4
Creating an action where s == 5

This is what you expect.

To see something you probably don't expect, run the following code immediately after the above code:

foreach (var action in countingActions)
    action();

You'll get the following console output:

s == 5
s == 5
s == 5
s == 5
s == 5

Why? Because we created five functions that all do the exact same thing: print the value of s (which we've closed over). In reality, they're the same function ("Print s", "Print s", "Print s"...).

At the point at which we go to use them, they do exactly what we ask: print the value of s. If you look at the last known value of s, you'll see that it's 5. So we get s == 5 printed five times to the console.

Which is exactly what we asked for, but probably not what we want.

The second part of the warning...

May have different behaviour when compiled with different versions of compiler.

...is what it is. Starting with C# 5, the compiler generates different code that "prevents" this from happening via foreach.

Thus the following code will produce different results under different versions of the compiler:

foreach (var n in numbers)
{
    Action action = () => Console.WriteLine("n == {0}", n);
    countingActions.Add(action);
}

Consequently, it will also produce the R# warning :)

My first code snippet, above, will exhibit the same behaviour in all versions of the compiler, since I'm not using foreach (rather, I've expanded it out the way pre-C# 5 compilers do).

Is this for CLR version?

I'm not quite sure what you're asking here.

Eric Lippert's post says the change happens "in C# 5". So presumably you have to target .NET 4.5 or later with a C# 5 or later compiler to get the new behaviour, and everything before that gets the old behaviour.

But to be clear, it's a function of the compiler and not the .NET Framework version.

Is there relevance with IL?

Different code produces different IL, so in that sense there are consequences for the IL generated.

[1] foreach is a much more common construct than the code you've posted in your comment. The issue typically arises through use of foreach, not through manual enumeration. That's why the changes to foreach in C# 5 help prevent this issue, but not completely.

Is a Foreach loop a type of block?

This is most definitely not language agnostic (as it is tagged).

For example, in Ruby, you are passing a closure or block to the .each method. The semantics are defined by the language.

In C#, foreach compiles to a call to IEnumerable.GetEnumerator and the use of the IEnumerator.MoveNext method. It just depends on the language.

EDIT:

Edit: This question is confusing, yet it got some attention, so I can't delete it. Anyway... by Blocks I mean "closures" and not just code blocks. The reason I tagged this as language agnostic is that I'm not interested in how a for-each is implemented under the hood. I'm more interested in whether, from a programmer's perspective, a for-each could be considered a freebie closure in languages that don't have them, like Java.

No, you are thinking of only one use case for a closure. You can't pass a control structure to a method as you can a closure in some languages. A control structure is not a first-class data type like a closure is in some languages. In the case of "run this code on each object in a collection", yes, they are semantically similar. However, that is only one example of what you can do with a closure.

You say that you w

In a forEach closure loop with two iteration variables, what does each variable represent?

When using enumerated.forEach you get the offset and the element.

Note, as per Martin's comment: the offset is different from the index, so it may not necessarily be the actual index.

The foreach identifier and closures

Edit: this all changes in C# 5, with a change to where the variable is defined (in the eyes of the compiler). From C# 5 onwards, they are the same.


Before C# 5

The second is safe; the first isn't.

With foreach, the variable is declared outside the loop - i.e.

Foo f;
while(iterator.MoveNext())
{
    f = iterator.Current;
    // do something with f
}

This means that there is only 1 f in terms of the closure scope, and the threads might very likely get confused - calling the method multiple times on some instances and not at all on others. You can fix this with a second variable declaration inside the loop:

foreach(Foo f in ...) {
    Foo tmp = f;
    // do something with tmp
}

This then has a separate tmp in each closure scope, so there is no risk of this issue.

Here's a simple proof of the problem:

static void Main()
{
    int[] data = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    foreach (int i in data)
    {
        new Thread(() => Console.WriteLine(i)).Start();
    }
    Console.ReadLine();
}

Outputs (at random):

1
3
4
4
5
7
7
8
9
9

Add a temp variable and it works:

foreach (int i in data)
{
    int j = i;
    new Thread(() => Console.WriteLine(j)).Start();
}

(each number once, but of course the order isn't guaranteed)

Is LINQifying my code worth accessing a foreach variable in a closure?

You have two different issues here: one is LINQ versus foreach, and the other is a separate case.

Regarding ReSharper informing you of "Access to foreach variable in closure..." when the code is LINQified - I just never take my chances, and leave it as a foreach loop. In most cases it is also more readable and maintainable, and shortening the code really isn't that big a deal.

Regarding the second case - you'll need to lose the using statement, since the db object will be disposed too soon. You should close and dispose it in the "old school fashion" INSIDE the RunInTransaction lambda expression, at the end of it.
