Captured Variable in a Loop in C#

Captured variable in a loop in C#

Yes - take a copy of the variable inside the loop:

while (variable < 5)
{
    int copy = variable;
    actions.Add(() => copy * 2);
    ++variable;
}

You can think of it as if the C# compiler creates a "new" local variable every time it hits the variable declaration. In fact it'll create appropriate new closure objects, and it gets complicated (in terms of implementation) if you refer to variables in multiple scopes, but it works :)
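A minimal sketch of the difference, assuming actions is a List<Func<int>> (its declaration is not shown in the snippet above, and the class and method names here are purely illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CopyInLoopDemo
{
    // Captures the shared variable: every delegate sees its final value (5).
    public static List<int> WithoutCopy()
    {
        var actions = new List<Func<int>>();
        int variable = 0;
        while (variable < 5)
        {
            actions.Add(() => variable * 2);
            ++variable;
        }
        return actions.Select(a => a()).ToList();
    }

    // Captures a copy declared inside the loop body: one variable per iteration.
    public static List<int> WithCopy()
    {
        var actions = new List<Func<int>>();
        int variable = 0;
        while (variable < 5)
        {
            int copy = variable;
            actions.Add(() => copy * 2);
            ++variable;
        }
        return actions.Select(a => a()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", WithoutCopy())); // 10 10 10 10 10
        Console.WriteLine(string.Join(" ", WithCopy()));    // 0 2 4 6 8
    }
}
```

The delegates are only invoked after the loop has finished, which is why the first version reports the final value five times.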

Note that a more common occurrence of this problem is using for or foreach:

for (int i=0; i < 10; i++) // Just one variable
foreach (string x in foo) // And again, despite how it reads out loud

See section 7.14.4.2 of the C# 3.0 spec for more details of this, and my article on closures has more examples too.

Note that as of the C# 5 compiler and beyond (even when specifying an earlier version of C#), the behavior of foreach changed so you no longer need to make a local copy. See this answer for more details.
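As a rough illustration, assuming a C# 5 or later compiler (the names below are made up for the example), the same lambda behaves differently in the two kinds of loop:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ForVersusForeachDemo
{
    public static List<int> CaptureInFor()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            actions.Add(() => i); // one i shared by all iterations
        }
        return actions.Select(a => a()).ToList();
    }

    public static List<int> CaptureInForeach()
    {
        var actions = new List<Func<int>>();
        foreach (int x in new[] { 0, 1, 2 })
        {
            actions.Add(() => x); // a fresh x per iteration (C# 5 compiler and later)
        }
        return actions.Select(a => a()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", CaptureInFor()));     // 3 3 3
        Console.WriteLine(string.Join(" ", CaptureInForeach())); // 0 1 2
    }
}
```

With a compiler older than C# 5, the foreach version would also report the last element three times, so the explicit copy is still needed there.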

Understanding Local Function Capturing of Variables inside Loops

The critical bit of code that you've moved, string output = iFn.ToString();, turns iFn into a string at the moment that line is run.

In the first example, it is run after the loop has finished (but "by magic" a single iFn is still available). Of course, after the loop finishes, iFn is 4 (because that's how the loop stopped). Why is it run after the loop finishes? Because you "created a method and stored it in a variable", and the call to turn iFn into a string is inside this method. You didn't run the method you created while you were in the loop, so iFn was never turned into a string while you were in the loop. The method-in-a-variable was only run afterwards, and because the method contains the code that turns iFn into a string, it accesses iFn at whatever current value it has, which is 4.

In the second example iFn is turned into a string as the loop is executing, so the value is 0, then 1, then 2, then 3. This is stored in a new variable (called output) that the method you created in the loop has access to (again, let's say the method can still access output, even though it looks like it's out of scope, "by magic"), but the value of output is generated on each pass of the loop and then given to the method. In essence, each of your 4 methods-as-a-variable stored in the list has access to a different variable called output: the first method's output has a value of 0, the second method's output has a value of 1, and so on.

You could conceive that when you create a method that you will later invoke, the "environment variables" the method had access to at the time it was created are packaged up with it, so that it has its own little environment to execute in. In the first case, each of the 4 methods has access to iFn at whatever value it is now, whereas in the second they have access to output at whatever value it had when it was in the loop. Just because the variable is named the same on each pass of the loop doesn't mean it's reusing the same memory location to hold the data.
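The question's code is not reproduced here, so the following is only a hypothetical reconstruction of the two variants (a loop over 0 to 3, with lambdas standing in for the local functions). It sketches where the ToString call runs in each case:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MovedLineDemo
{
    // First variant: ToString is inside the stored method, so it runs after the loop.
    public static List<string> ConvertInsideMethod()
    {
        var methods = new List<Func<string>>();
        for (int iFn = 0; iFn < 4; iFn++)
        {
            methods.Add(() => iFn.ToString());
        }
        return methods.Select(m => m()).ToList();
    }

    // Second variant: ToString runs during the loop; each method gets its own output.
    public static List<string> ConvertInsideLoop()
    {
        var methods = new List<Func<string>>();
        for (int iFn = 0; iFn < 4; iFn++)
        {
            string output = iFn.ToString();
            methods.Add(() => output);
        }
        return methods.Select(m => m()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", ConvertInsideMethod())); // 4 4 4 4
        Console.WriteLine(string.Join(" ", ConvertInsideLoop()));   // 0 1 2 3
    }
}
```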

Captured Closure (Loop Variable) in C# 5.0

What is the reasoning behind this?

I'm going to assume you mean "why wasn't it changed for for loops as well?"

The answer is that for for loops, the existing behaviour makes perfect sense. If you break a for loop into:

  • initializer
  • condition
  • iterator
  • body

... then the loop is roughly:

{
    initializer;
    while (condition)
    {
        body;
        iterator;
    }
}

(Except that the iterator is executed at the end of a continue; statement as well, of course.)

The initialization part logically only happens once, so it's entirely logical that there's only one "variable instantiation". Furthermore, there's no natural "initial" value of the variable on each iteration of the loop - there's nothing to say that a for loop has to be of a form declaring a variable in the initializer, testing it in the condition and modifying it in the iterator. What would you expect a loop like this to do:

for (int i = 0, j = 10; i < j; i++)
{
    if (someCondition)
    {
        j++;
    }
    // A format string is needed: Console.WriteLine has no (int, int) overload.
    actions.Add(() => Console.WriteLine("{0}, {1}", i, j));
}
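To make that question concrete, here is a small sketch in which someCondition is replaced by a hypothetical stand-in (i is even) and the delegates return a string instead of printing. Under those assumptions every delegate ends up seeing the same final i and j:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ForTwoVariablesDemo
{
    public static List<string> Run()
    {
        var actions = new List<Func<string>>();
        for (int i = 0, j = 10; i < j; i++)
        {
            if (i % 2 == 0) // hypothetical stand-in for someCondition
            {
                j++;
            }
            actions.Add(() => $"{i}, {j}"); // captures the single i and the single j
        }
        return actions.Select(a => a()).ToList();
    }

    static void Main()
    {
        List<string> results = Run();
        Console.WriteLine(results.Count); // 20
        Console.WriteLine(results[0]);    // 20, 20
    }
}
```

There is no obvious "per-iteration" value of j to hand to each delegate here, which is the point of the example.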

Compare that with a foreach loop which looks like you're declaring a separate variable for every iteration. Heck, the variable is read-only, making it even more odd to think of it being one variable which changes between iterations. It makes perfect sense to think of a foreach loop as declaring a new read-only variable on each iteration with its value taken from the iterator.

Captured variable-like error in Parallel.For loop

Why your broken version is broken

The problem appears to be two-fold:

First, you have a variable called finalQuery in an outer scope which you also use in a closure, specifically the one passed in as the body delegate of your Parallel.For. It is therefore the same variable in all iterations of your Parallel.For.

Second, you both read and write this finalQuery variable in that same Parallel.For body, notably with the code:

finalQuery = GetAdaptedBaseQueryWithResearchItemsInserted(finalQuery, ...)

...where you'll see you pass the current value of finalQuery as your base query.

The order in which the various iterations of that loop reach that line of code can change and depends on system architecture and processor load, causing a race condition. Access to your variable is also not governed by a lock.

Why the other version worked

In your working version, finalQuery is a variable that is declared within, and is therefore entirely local to, the Parallel.For body function. This prevents any iteration from seeing values of finalQuery from other iterations. More importantly, each finalQuery is constructed from a common, invariant base query (query.BaseQuery) with this code:

var finalQuery = GetCorrectedQuery(query.BaseQuery, ...)

And although you further adjust the value of finalQuery in the line below:

finalQuery = GetCorrectedQueryWithResearchItems(finalQuery, ...)

...this is fine because this finalQuery variable is local to your lambda function and its value is based solely on the previous line, not on varying values written by other iterations of the Parallel.For, as was the case in your race condition.
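A minimal sketch of the working pattern follows. The two helper methods are hypothetical stand-ins for the query functions in the question (their real signatures are not shown), and the results array is an assumption added so the outcome can be inspected:

```csharp
using System;
using System.Threading.Tasks;

static class ParallelLocalDemo
{
    // Hypothetical stand-ins for the query helpers used in the question.
    static string Correct(string baseQuery, int i) => baseQuery + " /*corrected " + i + "*/";
    static string AddItems(string query, int i) => query + " /*items " + i + "*/";

    public static string[] BuildQueries(string baseQuery, int count)
    {
        var results = new string[count];
        Parallel.For(0, count, i =>
        {
            // Declared inside the body: each iteration has its own finalQuery,
            // built from the invariant baseQuery.
            var finalQuery = Correct(baseQuery, i);
            finalQuery = AddItems(finalQuery, i);
            results[i] = finalQuery; // each iteration writes only its own slot
        });
        return results;
    }

    static void Main()
    {
        foreach (string q in BuildQueries("SELECT 1", 4))
        {
            Console.WriteLine(q);
        }
    }
}
```

Because no iteration reads a value written by another, the result for each index is the same regardless of scheduling order.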

Lambda variable capture in loop - what happens here?

In this line

 listActions.Add(() => Console.WriteLine(i));

the variable i is captured or, if you wish, a pointer to the memory location of that variable is created. That means that every delegate gets a pointer to that same memory location. After this loop executes:

foreach (int i in Enumerable.Range(1, 10))
{
    listActions.Add(() => Console.WriteLine(i));
}

for obvious reasons i is 10, so the memory location that the pointers in all the Action delegates refer to holds 10.

In other words, i is captured.

By the way, I should note that, according to Eric Lippert, this "strange" behaviour would be resolved in C# 5.0.

So in C# 5.0 your program would print as expected:

1,2,3,4,5...10
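One way to picture the change is to write the foreach out by hand with the loop variable declared either outside or inside the loop. This is only an approximation of what the compiler generates, and the delegates here return the value instead of printing it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ForeachExpansionDemo
{
    // Roughly the old behaviour: one variable declared outside the loop body.
    public static List<int> VariableOutsideLoop()
    {
        var listActions = new List<Func<int>>();
        using (var e = Enumerable.Range(1, 10).GetEnumerator())
        {
            int i; // a single variable shared by every delegate
            while (e.MoveNext())
            {
                i = e.Current;
                listActions.Add(() => i);
            }
        }
        return listActions.Select(a => a()).ToList();
    }

    // Roughly the C# 5 behaviour: a new variable on each iteration.
    public static List<int> VariableInsideLoop()
    {
        var listActions = new List<Func<int>>();
        using (var e = Enumerable.Range(1, 10).GetEnumerator())
        {
            while (e.MoveNext())
            {
                int i = e.Current; // a fresh variable per iteration
                listActions.Add(() => i);
            }
        }
        return listActions.Select(a => a()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", VariableOutsideLoop())); // 10,10,10,10,10,10,10,10,10,10
        Console.WriteLine(string.Join(",", VariableInsideLoop()));  // 1,2,3,4,5,6,7,8,9,10
    }
}
```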

EDIT:

I cannot find Eric Lippert's post on the subject, but here is another one:

Closure in a Loop Revisited

Captured variables in a thread in a loop in C#, what is the solution?

came across this example that demonstrates the case of Captured Variables within a Thread and a loop

Note that C# has since made a functional change here (in the C# 5 compiler, for foreach loops): each iteration now gets its own variable to capture (because that is what you almost always want).

yet the sequence is still not ordered,

Of course it isn't. You cannot control the order in which threads are scheduled.

, is there a trick or a practice I'm missing here to show a complete correct sequence

You need to reorder the results as the threads complete or, if the processing is small, not use threads at all. (Threads are, on Win32, quite expensive things to create; only use them if you are going to do substantive work, and even then the thread pool or the Task Parallel Library, TPL, are better options.)
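One possible way to get an ordered sequence, sketched here with the TPL instead of raw threads (the squaring work and all names are placeholders): let the tasks run in whatever order the scheduler picks, keep them in an array indexed by their input, and read the results in index order once they have all finished.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class OrderedResultsDemo
{
    public static int[] SquaresInOrder(int count)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
        {
            int n = i; // per-iteration copy so each closure sees its own value
            tasks[n] = Task.Run(() => n * n); // execution order is up to the scheduler
        }
        Task.WaitAll(tasks); // block until every task has completed
        return tasks.Select(t => t.Result).ToArray(); // read back in index order
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", SquaresInOrder(5))); // 0 1 4 9 16
    }
}
```

The work itself still runs in an unpredictable order; only the collection of results is ordered.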

C# Closures, why is the loopvariable captured by reference?

Well, that's just how C# works. The lambda expression in your statement constructs a lexical closure, which stores a single reference to i that persists even after the loop has concluded.

To remedy it, you can do just the thing that you did.

Feel free to read more on this particular issue all around the Web; my choice would be Eric Lippert's discussion here.

C# Incorrect arguments supplied when initializing tasks from a for loop

The problem is that the task has a reference to the integer. So it will use the value the integer has when the task actually runs, not the value it had when the task was created.

To fix it assign the integer to a local variable just before the task is created.

for (int i = 1; i <= x; i++)
{
    var localValue = i;
    taskList.Add(Task.Factory.StartNew(() => newMethodForThreads(localValue)));
}
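A self-contained sketch of that fix is shown below, with a hypothetical Work method standing in for newMethodForThreads. It also shows an alternative that is not part of the answer above: passing the value through the state parameter of Task.Factory.StartNew, which fixes the value at creation time without a captured local.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class TaskArgumentDemo
{
    // Hypothetical stand-in for newMethodForThreads.
    static int Work(int value) => value * 10;

    // The fix from the answer: copy i into a local declared inside the loop body.
    public static int[] WithLocalCopy(int x)
    {
        var tasks = new Task<int>[x];
        for (int i = 1; i <= x; i++)
        {
            var localValue = i;
            tasks[i - 1] = Task.Factory.StartNew(() => Work(localValue));
        }
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    // Alternative: hand the value over as the task's state object.
    public static int[] WithStateArgument(int x)
    {
        var tasks = new Task<int>[x];
        for (int i = 1; i <= x; i++)
        {
            // i is boxed and stored when StartNew is called, not when the task runs.
            tasks[i - 1] = Task.Factory.StartNew(state => Work((int)state), i);
        }
        Task.WaitAll(tasks);
        return tasks.Select(t => t.Result).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", WithLocalCopy(3)));     // 10 20 30
        Console.WriteLine(string.Join(" ", WithStateArgument(3))); // 10 20 30
    }
}
```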

