Order of Evaluation of Assignment Statement in C++

C order of evaluation of assignment statement

The semantics for evaluation of an = expression include that

The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

(C2011, 6.5.16/3; emphasis added)

The emphasized provision explicitly permits your observed difference in the behavior of the program when compiled by different compilers. Moreover, unsequenced means, among other things, that it is permissible for the evaluations to occur in different order even in different runs of the very same build of the program. If the function in which the unsequenced evaluations appear were called more than once, then it would be permissible for the evaluations to occur in different order during different calls within the same execution of the program.

That already answers the question, but it's important to see the bigger picture. Modifying an object or calling a function that does so is a side effect (C2011, 5.1.2.3/2). This key provision therefore comes into play:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

(C2011, 6.5/2)

The called function has the side effect of modifying the value stored in main()'s variable pmm, evaluation of the left-hand operand of the assignment involves a value computation using the value of pmm, and these are unsequenced, therefore the behavior is undefined.

Undefined behavior is to be avoided at all costs. Because your program's behavior is undefined, it is not limited to the two alternatives you observed (in case that wasn't bad enough). The C standard places no limitations whatever on what it may do. It might instead crash, zero out your hard drive's partition table, or, if you have suitable hardware, summon nasal demons. Or anything else. Most of these are unlikely, but the best viewpoint is that if your program has undefined behavior then your program is wrong.

Order of evaluation of assignment statement in C++

Yes, this is covered by the standard and it is unspecified behavior. This particular case is covered in a recent C++ standards proposal: N4228: Refining Expression Evaluation Order for Idiomatic C++ which seeks to refine the order of evaluation rules to make it well specified for certain cases.

It describes this problem as follows:

Expression evaluation order is a recurring discussion topic in the C++ community. In a nutshell, given an expression such as f(a, b, c), the order in which the sub-expressions f, a, b, c are evaluated is left unspecified by the standard. If any two of these sub-expressions happen to modify the same object without intervening sequence points, the behavior of the program is undefined. For instance, the expression f(i++, i) where i is an integer variable leads to undefined behavior, as does v[i] = i++. Even when the behavior is not undefined, the result of evaluating an expression can still be anybody’s guess. Consider the following program fragment:

#include <map>

int main() {
std::map<int, int> m;
m[0] = m.size(); // #1
}

What should the map object m look like after evaluation of the
statement marked #1? { {0, 0 } } or {{0, 1 } } ?

We know that unless specified the evaluations of sub-expressions are unsequenced, this is from the draft C++11 standard section 1.9 Program execution which says:

Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced.[...]

and all the section 5.17 Assignment and compound assignment operators [expr.ass] says is:

[...]In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.[...]

So this section does not nail down the order of evaluation, but we know this is not undefined behavior, since both operator[] and size() are function calls and section 1.9 tells us (emphasis mine):

[...]When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.[...]

Note, I cover the second interesting example from the N4228 proposal in the question Does this code from “The C++ Programming Language” 4th edition section 36.3.6 have well-defined behavior?.

Update

It seems that a revised version of N4228 was accepted by the Evolution Working Group at the last WG21 meeting, but the paper (P0145R0) is not yet available. So this may no longer be unspecified in C++17.

Update 2

Revision 3 of P0145 made this specified and updated [expr.ass]p1:

The assignment operator (=) and the compound assignment operators all group right-to-left.
All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
The result in all cases is a bit-field if the left operand is a bit-field.
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand. ...
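The effect of the added sentence can be observed with two operands that log when they are evaluated. This is only a sketch: lhs_slot, rhs_value, eval_log and assign_and_trace are hypothetical names, not taken from the proposal. Under C++17 rules the log should read rhs, then lhs; under earlier standards either order is allowed.

```cpp
#include <string>
#include <vector>

// Hypothetical helpers: each records its name when it is evaluated.
std::vector<std::string> eval_log;
int slot = 0;

int& lhs_slot()  { eval_log.push_back("lhs"); return slot; }
int  rhs_value() { eval_log.push_back("rhs"); return 42; }

// Performs one assignment and returns the order in which the operands ran.
std::vector<std::string> assign_and_trace() {
    eval_log.clear();
    lhs_slot() = rhs_value();  // C++17: right operand first; before that: either order
    return eval_log;
}
```

Whatever the order, slot ends up holding 42; only the order of the log entries depends on the language version.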

Evaluation order of side effects for assignment operator in C++11

First of all, note that C++17 introduced quite some changes to expression evaluation order.

Let's first see what the current standard draft has to say. I guess relevant here should be [intro.execution]/7

[…] Evaluation of an expression (or a subexpression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. […]

and [intro.execution]/10

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. […] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. […]

and finally [expr.ass]/1

[…] In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand. […]

Based on this, I would conclude that in

while (*tgt++ = *src++);

the evaluation of *src is sequenced before the evaluation of *tgt while the side-effects of each increment as well as the assignment are all unsequenced with respect to each other. Since the condition in a while loop is a full-expression, all evaluations and side effects occurring in one iteration of the loop are sequenced before the evaluations and side effects of the next iteration.

As far as I can see, in C++ 11, the evaluations of *src and *tgt were unsequenced with respect to each other but sequenced before the side effect of assignment. The side effects of the increments and the assignment were also unsequenced with respect to each other.
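Written one step per statement, the loop does the following. This is a sketch only: copy_cstr is a hypothetical name, and the explicit statements pick one particular order that both the C++11 and the C++17 rules permit.

```cpp
#include <cassert>

// Hypothetical helper: copies a NUL-terminated string, terminator included.
// Same effect as: while (*tgt++ = *src++);
char* copy_cstr(char* tgt, const char* src) {
    assert(tgt != nullptr && src != nullptr);
    char* const start = tgt;
    for (;;) {
        const char c = *src;   // value computation of the right operand
        ++src;                 // side effect of src++
        *tgt = c;              // the assignment itself
        ++tgt;                 // side effect of tgt++
        if (c == '\0') break;  // the loop condition is the assigned value
    }
    return start;
}
```

Because every step is its own statement, nothing here depends on how the operands of = are sequenced.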

What is the proper evaluation order when assigning a value in a map?

The evaluation order of A = B was not specified before C++17; since C++17, B is guaranteed to be evaluated before A. See https://en.cppreference.com/w/cpp/language/eval_order rule 20.

The behaviour of valMap[val] = valMap.size(); is therefore unspecified in C++14. You should use:

auto size = valMap.size();
valMap[val] = size;

Or avoid the problem by using emplace which is more explicit than relying on [] to automatically insert a value if it doesn't already exist:

valMap.emplace(val, size);
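Put together as a compilable unit, the same idea might look like the sketch below. The name id_for and the key/value types are assumptions made for illustration; the original valMap may be declared differently.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: gives each new key the next free index.
std::size_t id_for(std::map<std::string, std::size_t>& ids, const std::string& key) {
    const std::size_t next = ids.size();          // read the size before touching the map
    return ids.emplace(key, next).first->second;  // no insertion if the key already exists
}
```

Reading the size into a named variable first gives the same result under every language version, and emplace leaves an existing entry untouched.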

order of evaluation in C with assigning a variable with a function that changes a variable in the same assignment

In both these expressions

sum1 = (i / 2) + fun(&i);
sum2 = fun(&j) + (j / 2);

you have unspecified behavior, because the order of evaluation of the subexpressions of + is unspecified. The compiler is free to evaluate (i/2) and (j/2) before or after the call, as it sees fit.

Unlike && and || operators that force evaluation of their left side before their right side, +, -, *, /, %, and all the bitwise operators allow the compiler to pick the most convenient order of evaluation, depending on its optimization strategy. This means that different compilers may decide to evaluate the sides of + differently.

What this means from the practical perspective is that you should use expressions with side effects no more than once in the same expression. You can rewrite both your expressions to force the order of evaluation that you want, for example

sum1 = i/2;
sum1 += fun(&i);

or

sum2 = fun(&j);
sum2 += j/2;
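To see that the two rewrites really do pin down different results, here is a small sketch. The body of fun is not shown in the question, so the version below (add 10 to the pointee, return 10) is purely an assumption, as are the names sum_read_first and sum_call_first.

```cpp
// Hypothetical stand-in for the question's fun(): bumps *p and returns 10.
int fun(int* p) {
    *p += 10;
    return 10;
}

// Reads i before the call, as in the first rewrite.
int sum_read_first(int i) {
    int sum = i / 2;    // uses the original i
    sum += fun(&i);     // the call modifies i afterwards
    return sum;
}

// Calls fun before reading j, as in the second rewrite.
int sum_call_first(int j) {
    int sum = fun(&j);  // the call modifies j first
    sum += j / 2;       // uses the updated j
    return sum;
}
```

With this fun and a starting value of 10, the first form yields 15 and the second yields 20, which is exactly the difference the single-expression versions leave up to the compiler.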

Order of Evaluation in C Operators

The result of a post-increment expression is the value of the operand before it is incremented. Thus, even though the ++ in *p++ does, indeed, have higher precedence than the *, the latter is applied to the result of the p++ expression which is, as just mentioned, the initial value of p.
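A short sketch makes this concrete; read_and_advance is a hypothetical helper name.

```cpp
// Hypothetical helper: returns the element p points at, then leaves p advanced.
int read_and_advance(const int*& p) {
    return *p++;  // parsed as *(p++): dereferences the value p had before the increment
}
```

Called with p pointing at the first element of an array, it returns that first element and leaves p pointing at the second.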

Is the order of evaluation with comma operator & assignment in C predictable?

The order is defined, because there is a sequence point between them. See ISO/IEC 9899 6.5.17:

The left operand of a comma operator is evaluated as a void expression; there is a sequence point after its evaluation. Then the right operand is evaluated; the result has its type and value. If an attempt is made to modify the result of a comma operator or to access it after the next sequence point, the behavior is undefined.

They then give an explicit example:

In the function call

f(a, (t=3, t+2), c)

the function has three arguments, the second of which has the value 5.

I'm not entirely sure why CppCheck is flagging it.
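The standard's example can be checked with a few lines. The function names below are hypothetical, and the first and third arguments are arbitrary placeholder values.

```cpp
// Hypothetical function that reports its second argument.
int second_of(int, int b, int) { return b; }

int comma_demo() {
    int t = 0;
    // The parentheses make this a single argument: t = 3 is evaluated first,
    // then t + 2 supplies the argument's value.
    return second_of(1, (t = 3, t + 2), 3);
}
```

comma_demo() returns 5, matching the value the standard gives for the second argument.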

Precedence of Evaluation and Assignment operators in printf parameters

printf("%d %d\n", k==35, k=50);

The order in which k==35 and k=50 are evaluated is not specified, and the read of k is unsequenced relative to its modification, so this code has undefined behavior.
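One way to get a defined result is to do the comparison and the assignment in separate statements before formatting. The sketch below assumes the comparison is meant to see the old value of k; compare_then_assign is a hypothetical name, and std::snprintf is used only so the result can be inspected as a string.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: compares k with 35, then assigns 50, then formats both results.
std::string compare_then_assign(int& k) {
    const int was_35 = (k == 35);    // read k first
    const int assigned = (k = 50);   // then modify it
    char buf[32];
    std::snprintf(buf, sizeof buf, "%d %d\n", was_35, assigned);
    return std::string(buf);
}
```

Starting from k equal to 35, this produces "1 50" followed by a newline, regardless of compiler.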

What are the evaluation order guarantees introduced by C++17?

Some common cases where the evaluation order has so far been unspecified, are specified and valid with C++17. Some undefined behaviour is now instead unspecified.

i = 1;
f(i++, i)

was undefined, but it is now unspecified. Specifically, what is not specified is the order in which each argument to f is evaluated relative to the others. i++ might be evaluated before i, or vice-versa. Indeed, it might evaluate a second call in a different order, despite being under the same compiler.

However, the evaluation of each argument is required to execute completely, with all side-effects, before the execution of any other argument. So you might get f(1, 1) (second argument evaluated first) or f(1, 2) (first argument evaluated first). But you will never get f(2, 2) or anything else of that nature.
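If one specific outcome is wanted, the increment can be hoisted into its own statement. The sketch below uses a hypothetical f that simply returns its arguments as a pair.

```cpp
#include <utility>

// Hypothetical f: reports the arguments it received.
std::pair<int, int> f(int a, int b) { return std::make_pair(a, b); }

std::pair<int, int> call_in_fixed_order() {
    int i = 1;
    const int first = i++;  // i becomes 2 here, in its own full-expression
    return f(first, i);     // always f(1, 2), whatever the argument evaluation order
}
```

This selects the "first argument evaluated first" outcome explicitly instead of leaving the choice to the implementation.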

std::cout << f() << f() << f();

was unspecified, but it will become compatible with operator precedence so that the first evaluation of f will come first in the stream (examples below).

f(g(), h(), j());

still has unspecified evaluation order of g, h, and j. Note that for getf()(g(),h(),j()), the rules state that getf() will be evaluated before g, h, j.

Also note the following example from the proposal text:

std::string s = "but I have heard it works even if you don't believe in it";
s.replace(0, 4, "").replace(s.find("even"), 4, "only")
.replace(s.find(" don't"), 6, "");

The example comes from The C++ Programming Language, 4th edition, Stroustrup, and used to be unspecified behaviour, but with C++17 it will work as expected. There were similar issues with resumable functions (.then( . . . )).
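For code that must also behave on pre-C++17 compilers, the chain can be split so that each find runs on the string as modified so far. This is a sketch; the function name rewrite_step_by_step is hypothetical.

```cpp
#include <string>

// Hypothetical helper: performs the three replacements as separate statements.
std::string rewrite_step_by_step() {
    std::string s = "but I have heard it works even if you don't believe in it";
    s.replace(0, 4, "");                   // drop the leading "but "
    s.replace(s.find("even"), 4, "only");  // find runs after the first replace
    s.replace(s.find(" don't"), 6, "");    // find runs after the second replace
    return s;
}
```

The result is "I have heard it works only if you believe in it", which is the outcome the chained form is meant to produce.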

As another example, consider the following:

#include <iostream>
#include <string>
#include <vector>
#include <cassert>

struct Speaker{
int i =0;
Speaker(std::vector<std::string> words) :words(words) {}
std::vector<std::string> words;
std::string operator()(){
assert(words.size()>0);
if(i==words.size()) i=0;
// Pre-C++17 version:
auto word = words[i] + (i+1==words.size()?"\n":",");
++i;
return word;
// Still not possible with C++17:
// return words[i++] + (i==words.size()?"\n":",");

}
};

int main() {
auto spk = Speaker{{"All", "Work", "and", "no", "play"}};
std::cout << spk() << spk() << spk() << spk() << spk() ;
}

With C++14 and before we may (and will) get results such as

play
no,and,Work,All,

instead of

All,Work,and,no,play

Note that the above is in effect the same as

(((((std::cout << spk()) << spk()) << spk()) << spk()) << spk()) ;

But still, before C++17 there was no guarantee that the first calls would come first into the stream.
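On older compilers the order can be forced by making each call its own statement, for instance in a loop. The sketch below is generic over any callable that returns something appendable to a std::string; join_calls is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: calls gen() n times, strictly in order, and concatenates the results.
template <class Gen>
std::string join_calls(Gen& gen, int n) {
    assert(n >= 0);
    std::string out;
    for (int k = 0; k < n; ++k) {
        out += gen();  // one call per statement, so calls cannot be reordered
    }
    return out;
}
```

With the Speaker object above, std::cout << join_calls(spk, 5); would stream the words in their declared order under any language version.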

References: From the accepted proposal:

Postfix expressions are evaluated from left to right. This includes
functions calls and member selection expressions.

Assignment expressions are evaluated from right to left. This
includes compound assignments.

Operands to shift operators are evaluated from left to right. In
summary, the following expressions are evaluated in the order a, then
b, then c, then d:

  1. a.b
  2. a->b
  3. a->*b
  4. a(b1, b2, b3)
  5. b @= a
  6. a[b]
  7. a << b
  8. a >> b

Furthermore, we suggest the following additional rule: the order of
evaluation of an expression involving an overloaded operator is
determined by the order associated with the corresponding built-in
operator, not the rules for function calls.

Edit note: My original answer misinterpreted a(b1, b2, b3). The order of b1, b2, b3 is still unspecified. (thank you @KABoissonneault, all commenters.)

However (as @Yakk points out), and this is important: even when b1, b2, b3 are non-trivial expressions, each of them is completely evaluated and tied to its respective function parameter before evaluation of any of the others starts. The standard states this as follows:

§5.2.2 - Function call 5.2.2.4:

. . .
The postfix-expression is sequenced before each expression in the
expression-list and any default argument. Every value computation and
side effect associated with the initialization of a parameter, and the
initialization itself, is sequenced before every value computation and
side effect associated with the initialization of any subsequent
parameter.

However, one of these new sentences is missing from the GitHub draft:

Every value computation and side effect associated with the
initialization of a parameter, and the initialization itself, is
sequenced before every value computation and side effect associated
with the initialization of any subsequent parameter.

The example is there. It solves a decades-old problems (as explained by Herb Sutter) with exception safety where things like

f(std::unique_ptr<A> a, std::unique_ptr<B> b);

f(get_raw_a(), get_raw_a());

would leak if one of the calls to get_raw_a() threw before the other raw pointer was tied to its smart pointer parameter.

As pointed out by T.C., the example is flawed since unique_ptr construction from raw pointer is explicit, preventing this from compiling.
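A variant that does compile, and that does not rely on argument sequencing at all, builds each smart pointer with std::make_unique (C++14). This is a sketch under assumptions: the types A and B, their members, and the body of f are hypothetical placeholders.

```cpp
#include <memory>

// Hypothetical payload types.
struct A { int v = 1; };
struct B { int v = 2; };

// Hypothetical f: takes ownership of both objects and uses them.
int f(std::unique_ptr<A> a, std::unique_ptr<B> b) {
    return a->v + b->v;
}

int call_safely() {
    // Each make_unique call allocates and wraps in a single function call,
    // so no raw pointer is ever left unowned if the other argument throws.
    return f(std::make_unique<A>(), std::make_unique<B>());
}
```

Because each make_unique call returns an owning pointer, an exception thrown while evaluating one argument cannot leak the object created for the other.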

Also note this classical question (tagged C, not C++):

int x=0;
x++ + ++x;

is still undefined.


