Why Can't I Put a Variable Declaration in the Test Portion of a While Loop

Why can't I put a variable declaration in the test portion of a while loop?

The grammar for a condition in the C++03 standard is defined as follows:

condition:
  expression
  type-specifier-seq declarator = assignment-expression

The above will therefore only allow conditions such as:

if ( i && j && k ) {}
if ( (i = j) == 0 ) {}
if ( int i = j ) {}

The standard does allow a condition to declare a variable. However, it does so through a separate grammar rule called 'condition', which is either an expression or a declaration with an initializer. As a result, being inside the condition of an if, for, while, or switch does not mean that you can declare a variable in the middle of an expression.
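As a minimal sketch of what the 'condition' rule permits: the declaration has to be the entire condition and cannot be nested inside a larger expression. The helper names next_value and sum_until_zero are hypothetical, chosen only for illustration.

```cpp
#include <vector>

// Hypothetical helper: pops values off the back of the list and
// returns 0 once the list is empty.
int next_value(std::vector<int>& pending) {
    if (pending.empty()) return 0;
    int v = pending.back();
    pending.pop_back();
    return v;
}

// Sums values until next_value() yields 0.
int sum_until_zero(std::vector<int> pending) {
    int total = 0;

    // OK: the whole condition is a declaration with an initializer.
    // v is converted to bool on each pass; the loop ends when v == 0.
    while (int v = next_value(pending)) {
        total += v;
    }

    // Ill-formed: a declaration cannot appear inside an expression.
    // while ((int v = next_value(pending)) != 0) { total += v; }

    return total;
}
```

Calling sum_until_zero({1, 2, 3}) consumes 3, 2, 1 and stops when the helper returns 0.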

Why can't you declare a variable inside the expression portion of a do while loop?

Scoping seems to be the issue: what would be the scope of i if it were declared in the while portion of a do while statement? It would be rather unnatural to have a variable available within the loop body when its declaration actually appears below the body. The other loops don't have this issue, since the declaration comes before the body of the loop.

If we look at the draft C++ standard, section [stmt.while]p2 says that for the while statement:

while (T t = x) statement

is equivalent to:

label:
{ // start of condition scope
    T t = x;
    if (t) {
        statement
        goto label;
    }
} // end of condition scope

and:

The variable created in a condition is destroyed and created with each iteration of the loop.
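The per-iteration lifetime can be observed with a small counting type. This is only a sketch: Counted and run_loop are hypothetical names, and the brace initializer in the condition assumes C++11 or later.

```cpp
// Hypothetical type: counts constructions and destructions, and converts
// to true while its value is positive.
struct Counted {
    static int constructed;
    static int destroyed;
    int value;
    explicit Counted(int v) : value(v) { ++constructed; }
    ~Counted() { ++destroyed; }
    explicit operator bool() const { return value > 0; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

// Runs a loop whose condition declares a Counted and returns how many
// times the body executed.
int run_loop(int start) {
    int remaining = start;
    int iterations = 0;
    // c is created before every test (including the final failing one)
    // and destroyed at the end of that iteration.
    while (Counted c{remaining}) {
        ++iterations;
        --remaining;
    }
    return iterations;
}
```

With start equal to 3 the body runs three times, while the object is constructed and destroyed four times, because the final failing test also creates one.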

How would we formulate this for the do while case?

And as cdhowie points out, section [stmt.do]p2 says:

In the do statement the substatement is executed repeatedly until the
value of the expression becomes false. The test takes place after each
execution of the statement.

which means the body of the loop is evaluated before we would even reach the declaration.

We could create an exception for this case, but the benefits are unclear, and it would violate our intuitive sense that, in general, the point of declaration for a name comes after we have seen the complete declaration (with some exceptions, for example class member variables). Point of declaration is covered in section 3.3.2.

Why can't I define a variable inside a for loop in this way?

The first way is not legal because the compiler can see that you could never use the j you declared there: without braces, the for loop has no room for another statement. The newly declared variable would go out of scope at the very next statement, so the declaration would accomplish nothing.

In the second case, the loop body is enclosed in braces, which create a new scope in which you can use the variable.

Declaring variables inside or outside of a loop

The scope of local variables should always be the smallest possible.

In your example I presume str is not used outside of the while loop, otherwise you would not be asking the question, because declaring it inside the while loop would not be an option, since it would not compile.

So, since str is not used outside the loop, the smallest possible scope for str is within the while loop.

So, the answer is emphatically that str absolutely ought to be declared within the while loop. No ifs, no ands, no buts.

The only case where this rule might be violated is if for some reason it is of vital importance that every clock cycle be squeezed out of the code, in which case you might want to consider instantiating something in an outer scope and reusing it instead of re-instantiating it on every iteration of an inner scope. However, this does not apply to your example, due to the immutability of strings in Java: a new instance of str will always be created at the beginning of your loop and will have to be thrown away at the end of it, so there is nothing to optimize there.

EDIT: (injecting my comment below in the answer)

In any case, the right way to do things is to write all your code properly, establish a performance requirement for your product, measure your final product against this requirement, and, if it does not satisfy it, then go optimize things. What usually ends up happening is that you find ways to provide some nice and formal algorithmic optimizations in just a couple of places which make your program meet its performance requirements, instead of having to go all over your entire code base and tweak and hack things in order to squeeze clock cycles here and there.

Use variables declared inside do-while loop in the condition

There is a reason why this can't be possible. It is due to the limitation of "statement-scope".

Your variables i and j have been declared with "local scope", that is, inside {} braces. You actually wanted j to be declared with "statement scope", but this is not possible.

Statement scope applies to variables declared as part of 'for', 'while', 'if' or 'switch' statements. It does not cover do-while statements, though, which is why you cannot do this.

You have basically exposed a language drawback of using do-while.

It would be better if the language offered:

do {
.
.
.
} while (int j < 100);

but it does not offer this.
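A common workaround in C++ (a sketch, not something the language provides as special syntax) is to declare the variable just before the do statement and wrap both in an extra block, so the name stays visible to the condition without leaking into the rest of the function. The function count_doublings and its parameters are hypothetical, and the sketch assumes start is positive.

```cpp
// Hypothetical example: doubles a value until it reaches limit and
// returns how many doublings were needed. Assumes start > 0.
int count_doublings(int start, int limit) {
    int steps = 0;
    {   // extra block: j and x are not visible after the loop
        int j;          // declared before the loop so the condition can see it
        int x = start;
        do {
            x *= 2;
            j = x;      // assigned in the body, tested in the condition
            ++steps;
        } while (j < limit);
    }
    return steps;
}
```

The body always runs at least once, so count_doublings(200, 100) still performs one doubling before the test fails.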

declaration for variable in while condition in javascript

The question is a little dated, but I think the answers all miss an important distinction. That is, a while loop expects an expression that evaluates to a conditional, i.e., a boolean or value that can be converted to a boolean. See Mozilla docs for details.

A pure assignment (without a declaration) is an expression that evaluates to the assigned value (the right-hand side), which is then coerced to a boolean.

A var (or let or const) declaration is a statement that allows an optional assignment, but the console reports its result as undefined.

You can easily test this in your console:

var foo = 42; // undefined
bar = 42      // 42

The return values alone don't answer the question, but since undefined is falsy, they do show that even if JS let you put a var in a conditional, it would simply always evaluate to false.

Others have mentioned for statements and that they allow declaration and instantiation of variables. This is true, but the documentation explains that for expects a statement or assignment.

Opinions may vary, but for me all this adds up to an understandable consistency, not a quirk in behavior with regard to loops. A while loop is better thought of as a looping version of an if statement than as akin to a for loop. If there is quirkiness in all of this, it's the for statement's wholesale divergence from the language's normal syntax.

Why do I not have to define the variable in a for loop using range(), but I do have to in a while loop in Python?

I'd like to approach this question from a slightly different perspective.

If we look at the official Python grammar specification, we can see that (approximately speaking), a while statement takes a test, while a for statement takes an exprlist and testlist.

Conceptually, then, we can understand that a while statement needs one thing: an expression that it can repeatedly evaluate.

On the other hand, a for statement needs two: a collection of expressions to be evaluated, as well as a number of names to bind the results of those evaluations to.

With this in mind, it makes sense that a while statement would not automatically create a temporary variable, since it can accept literals too. Conversely, a for statement must bind to some names.

(Strictly speaking, it is valid, in terms of Python grammar, to put a literal where you would expect a name in a for statement, but contextually that wouldn't make sense, so the language prohibits it.)

Declaring variables inside loops, good practice or bad practice?

This is excellent practice.

By creating variables inside loops, you ensure their scope is restricted to the inside of the loop. They cannot be referenced or called outside of the loop.

This way:

If the name of the variable is a bit "generic" (like "i"), there is no risk of mixing it up with another variable of the same name somewhere later in your code (this can also be mitigated using the -Wshadow warning option in GCC).
The compiler knows that the variable's scope is limited to the inside of the loop, and will therefore issue a proper error message if the variable is referenced elsewhere by mistake.
Last but not least, some dedicated optimizations can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, there is no need to store the result for later re-use.

In short, you are right to do it.

Note, however, that a variable declared inside the loop is not supposed to retain its value from one iteration to the next, so you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one iteration to another. This typically includes the loop counter itself.

{
    int i, retainValue;
    for (i=0; i<N; i++)
    {
       int tmpValue;
       /* tmpValue is uninitialized */
       /* retainValue still has its previous value from previous loop */

       /* Do some stuff here */
    }
    /* Here, retainValue is still valid; tmpValue no longer */
}

For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated at all, and just re-uses some free slot (from another variable whose scope has ended).

With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with fewer states (i.e. variables) to worry about when reading other parts of the code.

This is true even outside of an if(){...} block. Typically, instead of:

    int result;
    /* ... */
    result = f1();
    if (result) { /* ... */ }
    /* ... */
    result = f2();
    if (result) { /* ... */ }

it's safer to write:

    /* ... */
    {
        int const result = f1();
        if (result) { /* ... */ }
    }
    /* ... */
    {
        int const result = f2();
        if (result) { /* ... */ }
    }

The difference may seem minor, especially on such a small example.
But on a larger code base, it helps: there is no longer any risk of carrying a result value from the f1() block over to the f2() block. Each result is strictly limited to its own scope, which makes its role clearer. From a reviewer's perspective, it's much nicer, since there are fewer long-range state variables to worry about and track.

The compiler can help more, too. Suppose that, in the future, after some erroneous change to the code, result is no longer properly initialized with f2(). The second version will simply refuse to compile, with a clear error message at compile time (way better than at run time). The first version will not spot anything: the result of f1() will simply be tested a second time and mistaken for the result of f2().

Complementary information

The open-source tool Cppcheck (a static analysis tool for C/C++ code) provides some excellent hints regarding the optimal scope of variables.

In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.

For standard types and structures, the size of the variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated on the stack (without any initialization) when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.

However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler should be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
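For a class with a user-defined constructor, the per-iteration initialization can be made visible with a counter. The following is a rough sketch; Tracker and the two functions are hypothetical names used only for illustration.

```cpp
// Hypothetical type: counts how many times its constructor runs.
struct Tracker {
    static int constructions;
    Tracker() { ++constructions; }
};
int Tracker::constructions = 0;

// Declares the object inside the loop body.
int constructions_inside(int n) {
    Tracker::constructions = 0;
    for (int i = 0; i < n; ++i) {
        Tracker t;   // constructor runs on every iteration
        (void)t;     // silence unused-variable warnings
    }
    return Tracker::constructions;
}

// Declares the object once, before the loop.
int constructions_outside(int n) {
    Tracker::constructions = 0;
    Tracker t;       // constructor runs exactly once
    for (int i = 0; i < n; ++i) {
        (void)t;
    }
    return Tracker::constructions;
}
```

Whether this matters depends on how expensive the constructor is; for a trivial type the compiler is free to optimize both forms to the same code.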

declare variable inside or outside a loop, does it make big difference?

Yes. The scope of variable i is different in the two cases.

In the first case, variable i is declared in a block or function, so you can access it anywhere in that block or function.

In the second case, variable i is declared in a while loop, so you can access it in the while loop only.

Does it make big difference in terms of performance?

No, it will not matter performance-wise where you declare it.

Example 1:

int main()
{
    int i, bigNumber;

    while(bigNumber--) {
        i = 0;
    }
}

Assembly:

main:
  push rbp
  mov rbp, rsp
.L3:
  mov eax, DWORD PTR [rbp-4]
  lea edx, [rax-1]
  mov DWORD PTR [rbp-4], edx
  test eax, eax
  setne al
  test al, al
  je .L2
  mov DWORD PTR [rbp-8], 0
  jmp .L3
.L2:
  mov eax, 0
  pop rbp
  ret

Example 2:

int main()
{
    int bigNumber;

    while(bigNumber--) {
        int i;
        i = 0;
    }
}

Assembly:

main:
  push rbp
  mov rbp, rsp
.L3:
  mov eax, DWORD PTR [rbp-4]
  lea edx, [rax-1]
  mov DWORD PTR [rbp-4], edx
  test eax, eax
  setne al
  test al, al
  je .L2
  mov DWORD PTR [rbp-8], 0
  jmp .L3
.L2:
  mov eax, 0
  pop rbp
  ret

Both generate the same assembly code.

Difference between declaring variables before or in loop?

Which is better, a or b?

From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).

From a maintenance perspective, b is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.
