Declaring a Variable Inside or Outside a Foreach Loop: Which Is Faster/Better

Declaring a variable inside or outside a foreach loop: which is faster/better?

Performance-wise both examples are compiled to the same IL, so there's no difference.

The second is better, because it more clearly expresses your intent if u is only used inside the loop.

Is it better to declare a variable inside or outside a loop?

Performance-wise, let's try concrete examples:

public void Method1()
{
    foreach (int i in Enumerable.Range(0, 10))
    {
        int x = i * i;
        StringBuilder sb = new StringBuilder();
        sb.Append(x);
        Console.WriteLine(sb);
    }
}

public void Method2()
{
    int x;
    StringBuilder sb;
    foreach (int i in Enumerable.Range(0, 10))
    {
        x = i * i;
        sb = new StringBuilder();
        sb.Append(x);
        Console.WriteLine(sb);
    }
}

I deliberately picked both a value-type and a reference-type in case that affects things. Now, the IL for them:

.method public hidebysig instance void Method1() cil managed
{
.maxstack 2
.locals init (
[0] int32 i,
[1] int32 x,
[2] class [mscorlib]System.Text.StringBuilder sb,
[3] class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> enumerator)
L_0000: ldc.i4.0
L_0001: ldc.i4.s 10
L_0003: call class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> [System.Core]System.Linq.Enumerable::Range(int32, int32)
L_0008: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
L_000d: stloc.3
L_000e: br.s L_002f
L_0010: ldloc.3
L_0011: callvirt instance !0 [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
L_0016: stloc.0
L_0017: ldloc.0
L_0018: ldloc.0
L_0019: mul
L_001a: stloc.1
L_001b: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor()
L_0020: stloc.2
L_0021: ldloc.2
L_0022: ldloc.1
L_0023: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(int32)
L_0028: pop
L_0029: ldloc.2
L_002a: call void [mscorlib]System.Console::WriteLine(object)
L_002f: ldloc.3
L_0030: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
L_0035: brtrue.s L_0010
L_0037: leave.s L_0043
L_0039: ldloc.3
L_003a: brfalse.s L_0042
L_003c: ldloc.3
L_003d: callvirt instance void [mscorlib]System.IDisposable::Dispose()
L_0042: endfinally
L_0043: ret
.try L_000e to L_0039 finally handler L_0039 to L_0043
}

.method public hidebysig instance void Method2() cil managed
{
.maxstack 2
.locals init (
[0] int32 x,
[1] class [mscorlib]System.Text.StringBuilder sb,
[2] int32 i,
[3] class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> enumerator)
L_0000: ldc.i4.0
L_0001: ldc.i4.s 10
L_0003: call class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> [System.Core]System.Linq.Enumerable::Range(int32, int32)
L_0008: callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
L_000d: stloc.3
L_000e: br.s L_002f
L_0010: ldloc.3
L_0011: callvirt instance !0 [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
L_0016: stloc.2
L_0017: ldloc.2
L_0018: ldloc.2
L_0019: mul
L_001a: stloc.0
L_001b: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor()
L_0020: stloc.1
L_0021: ldloc.1
L_0022: ldloc.0
L_0023: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(int32)
L_0028: pop
L_0029: ldloc.1
L_002a: call void [mscorlib]System.Console::WriteLine(object)
L_002f: ldloc.3
L_0030: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
L_0035: brtrue.s L_0010
L_0037: leave.s L_0043
L_0039: ldloc.3
L_003a: brfalse.s L_0042
L_003c: ldloc.3
L_003d: callvirt instance void [mscorlib]System.IDisposable::Dispose()
L_0042: endfinally
L_0043: ret
.try L_000e to L_0039 finally handler L_0039 to L_0043
}

As you can see, apart from the order of the locals on the stack that the compiler happened to choose (which could just as well have been a different order), it had absolutely no effect. In turn, neither version gives the jitter anything useful that the other doesn't.

Other than that, there is one sort-of difference.

In my Method1(), x and sb are scoped to the foreach, and cannot be accessed either deliberately or accidentally outside of it.

In my Method2(), x and sb are not known at compile-time to be reliably assigned a value within the foreach (the compiler doesn't know the foreach will perform at least one iteration), so reading them after the loop without assigning them first is forbidden.

So far, no real difference.

I can however assign and use x and/or sb outside of the foreach. As a rule I would say that this is probably poor scoping most of the time, so I'd favour Method1, but I might have some sensible reason to want to refer to them (more realistically if they weren't possibly unassigned), in which case I'd go for Method2.

Still, that's a matter of how each version of the code can or cannot be extended, not a difference in the code as written. Really, there's no difference.

Difference between declaring variables before or in loop?

Which is better, a or b?

From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).

From a maintenance perspective, b is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.

Declaring variables inside or outside of a loop

The scope of local variables should always be the smallest possible.

In your example I presume str is not used outside of the while loop, otherwise you would not be asking the question, because declaring it inside the while loop would not be an option, since it would not compile.

So, since str is not used outside the loop, the smallest possible scope for str is within the while loop.

So, the answer is emphatically that str absolutely ought to be declared within the while loop. No ifs, no ands, no buts.

The only case where this rule might be violated is if for some reason it is of vital importance that every clock cycle be squeezed out of the code, in which case you might want to consider instantiating something in an outer scope and reusing it instead of re-instantiating it on every iteration of an inner scope. However, this does not apply to your example, due to the immutability of strings in Java: a new instance of str will always be created at the beginning of your loop and it will have to be thrown away at the end of it, so there is no possibility to optimize there.

EDIT: (injecting my comment below in the answer)

In any case, the right way to do things is to write all your code properly, establish a performance requirement for your product, measure your final product against this requirement, and if it does not satisfy it, then go optimize things. What usually ends up happening is that you find some nice, formal algorithmic optimizations in just a couple of places that make your program meet its performance requirements, instead of having to go all over your entire code base tweaking and hacking things in order to squeeze clock cycles here and there.

Faster to declare variables inside a loop or outside a loop?

I agree with Kevin's answer: define variables where they have meaning. Worry about optimizations if and when they present themselves and you know that a variable declaration is the issue. However, consider the following two pieces of code:

void Test1()
{
    foreach (int i in Enumerable.Range(0, 10))
    {
        string s = GetString();
        Console.WriteLine(s);
    }
}

void Test2()
{
    string s;
    foreach (int i in Enumerable.Range(0, 10))
    {
        s = GetString();
        Console.WriteLine(s);
    }
}

And their generated IL:

Test1:
IL_0000: ldc.i4.0
IL_0001: ldc.i4.s 0A
IL_0003: call System.Linq.Enumerable.Range
IL_0008: callvirt System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator
IL_000D: stloc.1
IL_000E: br.s IL_0024
IL_0010: ldloc.1
IL_0011: callvirt System.Collections.Generic.IEnumerator<System.Int32>.get_Current
IL_0016: pop
IL_0017: ldarg.0
IL_0018: call UserQuery.GetString
IL_001D: stloc.0
IL_001E: ldloc.0
IL_001F: call System.Console.WriteLine
IL_0024: ldloc.1
IL_0025: callvirt System.Collections.IEnumerator.MoveNext
IL_002A: brtrue.s IL_0010
IL_002C: leave.s IL_0038
IL_002E: ldloc.1
IL_002F: brfalse.s IL_0037
IL_0031: ldloc.1
IL_0032: callvirt System.IDisposable.Dispose
IL_0037: endfinally
IL_0038: ret

Test2:
IL_0000: ldc.i4.0
IL_0001: ldc.i4.s 0A
IL_0003: call System.Linq.Enumerable.Range
IL_0008: callvirt System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator
IL_000D: stloc.1
IL_000E: br.s IL_0024
IL_0010: ldloc.1
IL_0011: callvirt System.Collections.Generic.IEnumerator<System.Int32>.get_Current
IL_0016: pop
IL_0017: ldarg.0
IL_0018: call UserQuery.GetString
IL_001D: stloc.0
IL_001E: ldloc.0
IL_001F: call System.Console.WriteLine
IL_0024: ldloc.1
IL_0025: callvirt System.Collections.IEnumerator.MoveNext
IL_002A: brtrue.s IL_0010
IL_002C: leave.s IL_0038
IL_002E: ldloc.1
IL_002F: brfalse.s IL_0037
IL_0031: ldloc.1
IL_0032: callvirt System.IDisposable.Dispose
IL_0037: endfinally
IL_0038: ret

See any difference? Those compiler guys, they're smart.

JavaScript variables declare outside or inside loop?

There is absolutely no difference in meaning or performance, in JavaScript or ActionScript.

var is a directive for the parser, and not a command executed at run-time. If a particular identifier has been declared var once or more anywhere in a function body(*), then every use of that identifier in that function will refer to the local variable. It makes no difference whether value is declared to be var inside the loop, outside the loop, or both.
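A minimal sketch of the point above, runnable in Node.js or a browser console. The function names and sample data are made up for illustration and are not from the original question. Both functions return the same result, and value is still visible after the loop in both, because var is scoped to the function rather than to the loop body.

```javascript
// Hypothetical helpers: the only difference is where `var value` appears.
function declaredInside(items) {
  var out = [];
  for (var i = 0; i < items.length; i++) {
    var value = items[i] * 2; // declared inside the loop body
    out.push(value);
  }
  // `value` is function-scoped, so it is still in scope here.
  return [out, typeof value];
}

function declaredOutside(items) {
  var out = [];
  var value; // declared before the loop
  for (var i = 0; i < items.length; i++) {
    value = items[i] * 2;
    out.push(value);
  }
  return [out, typeof value];
}
```

Called with [1, 2, 3], both return the doubled values together with "number". Called with an empty array, typeof value is "undefined" in both rather than an error, since the declaration is hoisted even when the loop body never runs.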

Consequently you should write whichever you find most readable. I disagree with Crockford that putting all the vars at the top of a function is always the best thing. For the case where a variable is used temporarily in a section of code, it's better to declare var in that section, so the section stands alone and can be copy-pasted. Otherwise, copy-paste a few lines of code to a new function during refactoring, without separately picking out and moving the associated var, and you've got yourself an accidental global.

In particular:

for (var i = 0; i < 100; i++)
    do something;

for (var i = 0; i < 100; i++)
    do something else;

Crockford will recommend you remove the second var (or remove both vars and do var i; above), and jslint will whinge at you for this. But IMO it's more maintainable to keep both vars, keeping all the related code together, instead of having an extra, easily-forgotten bit of code at the top of the function.

Personally I tend to declare as var the first assignment of a variable in an independent section of code, whether or not there's another separate usage of the same variable name in some other part of the same function. For me, having to declare var at all is an undesirable JS wart (it would have been better to have variables default to local); I don't see it as my duty to duplicate the limitations of [an old revision of] ANSI C in JavaScript as well.

(*: other than in nested function bodies)
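The accidental global described earlier in this answer can be sketched as follows. All names here are hypothetical. The Function constructor is used only so that the "pasted" body runs in non-strict mode however the file is loaded; it is an assumption of this sketch, not something the answer relies on. The strict-mode variant is included to show that the same mistake then fails loudly instead of leaking.

```javascript
// Original function: the loop section carries its own `var`.
function sumOriginal(items) {
  var total = 0;
  for (var k = 0; k < items.length; k++) {
    total += items[k];
  }
  return total;
}

// Hypothetical refactoring: the loop was pasted into a new function,
// but the `var` for the index lived elsewhere and was left behind.
// In non-strict code the assignment silently creates a global.
var sumPastedSloppy = new Function("items",
  "var total = 0;" +
  "for (leakedIndex = 0; leakedIndex < items.length; leakedIndex++) {" +
  "  total += items[leakedIndex];" +
  "}" +
  "return total;");

// The same mistake in a strict-mode function throws a ReferenceError
// on the first assignment to the undeclared identifier.
function sumPastedStrict(items) {
  "use strict";
  var total = 0;
  for (strictIndex = 0; strictIndex < items.length; strictIndex++) {
    total += items[strictIndex];
  }
  return total;
}
```

After calling sumPastedSloppy([1, 2, 3]), the result is still correct, but globalThis.leakedIndex now exists. Keeping var next to the loop it belongs to, as argued above, means the pasted code brings its declaration along.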

in-loop vs out-of-loop variable declaration

If you are only going to use that variable inside of the loop, it is better to declare it inside. That way, when you move on to the next iteration of the loop, the memory used can be cleared up. Otherwise you would have to wait until the end of the method it was declared in, or until the Object it is a member of becomes eligible for garbage collection. This is subtly different for primitive variables (as opposed to your String object), which will always be cleared up after the method ends anyway.

In other words, the scope of a variable should always be as small as practically possible to conserve memory (as well as other reasons).

See this answer for more details.

As for speed, there shouldn't be any difference between declaring it inside the loop or outside, as confirmed by bytecode analysis here and a comprehensive logical analysis here.

I hope this helps.

Is it better coding practice to define variables outside a foreach even though more verbose?

The second form is no more wasteful - it's simply better.

There's no advantage to declaring the variables outside the loop, unless you want to maintain their values between iterations.

(Note that usually this makes no behavioural difference, but that's not true if the variables are being captured by a lambda expression or anonymous method.)

Is it better to declare a variable inside the loop or outside the loop?

I like to limit variable scope as much as possible. The first option scopes the variable to the entire containing function, while the latter limits it to just within the loop. Therefore, I prefer the latter unless I explicitly need access to the variable after the loop completes.


