Why Does the Expression a = a + B - ( B = a ) Give a Sequence Point Warning in C++

Why does the expression a = a + b - ( b = a ) give a sequence point warning in c++?

There is a difference between an expression being evaluated and completing its side effects.

The b = a assignment expression will be evaluated ahead of subtraction due to higher precedence of the parentheses. It will provide the value of a as the result of the evaluation. The writing of that value into b, however, may not complete until the next sequence point, which in this case is the end of the full expression. The end result of the overall expression is therefore undefined, because the subtraction may take the value of b before or after the assignment.

Why is a = (a+b) - (b=a) a bad choice for swapping two integers?

No. This is not acceptable. This code invokes Undefined behavior. This is because of the operation on b is not defined. In the expression

a=(a+b)-(b=a);  

it is not certain whether b gets modified first or its value gets used in the expression (a+b) because of the lack of the sequence point.

See what standard syas:

C11: 6.5 Expressions:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar
object
, the behavior is undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an unsequenced side
effect occurs in any of the orderings.84)
1.

Read C-faq- 3.8 and this answer for more detailed explanation of sequence point and undefined behavior.


1. Emphasis is mine.

Why does A = A + B - (B = A) swap values in C?

Actually it invokes undefined behaviour according to standards-

C99 §6.5: “2. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.”

Assignment and sequence points: how is this ambiguous?

The rules of undefined behavior for sequence point violations do not make an exception for situations when "the value cannot change". Nobody cares whether the value changes or not. What matters is that when you are making any sort of write access to the variable, you are modifying that variable. Even if you are assigning the variable a value that it already holds, you are still performing a modification of that variable. And if multiple modifications are not separated by sequence points, the behavior is undefined.

One can probably argue that such "non-modifying modifications" should not cause any problems. But the language specification does not concern itself with such details. In language terminology, again, every time you are writing something into a variable, you are modifying it.

Moreover, the fact that you use the word "ambiguous" in your question seems to imply that you believe the behavior is unspecified. I.e. as in "the resultant value of the variable is (or isn't) ambiguous". However, in sequence point violations the language specification does not restrict itself to stating that the result is unspecified. It goes much further and declares the behavior undefined. This means that the rationale behind these rules takes into consideration more than just an unpredictable final value of some variable. For example, on some imaginary hardware platform non-sequenced modification might result in invalid code being generated by the compiler, or something like that.

Why are there no sequence point rule violations in this code?

Per the ISO C11 Standard, there exists a sequence point between the evaluation of the ternary/conditional operator's predicate and either of its alternatives, whichever one is evaluated. Therefore, the two ++a are sequenced with respect to each other by the intervening sequence point.

§6.5.15 Conditional operator


Syntax


1.

  conditional-expression:
logical-OR-expression
logical-OR-expression ? expression : conditional-expression

[...]

Semantics



  1. The first operand is evaluated; there is a sequence point between its evaluation and the evaluation of the second or third operand (whichever is evaluated). The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0; the result is the value of the second or third operand (whichever is evaluated), converted to the type described below.

Undefined behavior and sequence points

C++98 and C++03

This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.


Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.

Pre-requisites : An elementary knowledge of C++ Standard



What are Sequence Points?

The Standard says

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)

Side effects? What are side effects?

Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).

For example:

int x = y++; //where y is also an int

In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.

So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit:

Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.



What are the common sequence points listed in the C++ Standard?

Those are:

  • at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1

    Example :

    int a = 5; // ; is a sequence point here
  • in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2

    • a && b (§5.14)
    • a || b (§5.15)
    • a ? b : c (§5.16)
    • a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
  • at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
    takes place before execution of any expressions or statements in the function body (§1.9/17).

1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument

2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.



What is Undefined Behaviour?

The Standard defines Undefined Behaviour in Section §1.3.12 as

behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.

Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.

3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.



What is the relation between Undefined Behaviour and Sequence Points?

Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.

You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.

For example:

int x = 5, y = 6;

int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.

Another example here.


Now the Standard in §5/4 says


    1. Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

What does it mean?

Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.

From the above sentence the following expressions invoke Undefined Behaviour:

i++ * ++i;   // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))

i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)

But the following expressions are fine:

i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined




    1. Furthermore, the prior value shall be accessed only to determine the value to be stored.

What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.

For example in i = i + 1 all the access of i (in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.

This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.

Example 1:

std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2

Example 2:

a[i] = i++ // or a[++i] = i or a[i++] = ++i etc

is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.

Example 3 :

int x = i + i++ ;// Similar to above

Follow up answer for C++11 here.

Where do sequence points come from?

Basically there is a C++03 sequence point between each statement. For more information see the SO C++ FAQ. For even more information do consult the C++ standard, and keep in mind that in the C++11 standard sequence points were replaced with sequenced before and sequenced after relations.

To avoid problems, simply don't try to be too clever about doing a lot in each expression.

Don't try to do the compiler's job: leave that to the compiler. Your job is to write code that other humans can easily understand, i.e. clear code. Multiple updates and needless use of operators with side effects is not compatible with that.

Tip: sprinkle const just about everywhere possible.

That constrains the possible state changes that a reader must take into account.

sequence-point warning using pack expansion when assigning to tuple

The warning is wrong, see rule (10) here:

In list-initialization, every value computation and side effect of a given initializer clause is sequenced before every value computation and side effect associated with any initializer clause that follows it in the brace-enclosed comma-separated list of initalizers.

The rule should equally affect brace initiailization of tuples and vectors.


Note that this is not a fold expression, just a pack expansion.

An example about operator precedences

The fact that || is short-circuiting and the lowest precedence operator explains the result. Because ++ is the higher precedence than && and && is higher precedence than ||, the expression tree for ++a || ++b && ++c is:

       ||   -- left:   ++a
-- right: && --left: ++b
--right: ++c

So, to evaluate the expression, C first considers the rules for evaluating ||, given by 6.5.14 in the C11 standard. Specifically:

6.5.14.4: Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; if the second operand is evaluated, there is a sequence point between the evaluations of the first and second operands. If the first operand compares unequal to 0, the second operand is not evaluated.

The fact that || is short-circuiting means that it evaluates its left operand first and only evaluates its right operand if the left is zero. So to evaluate the expression ++a || ++b && ++c, C requires the following:

  1. Evaluate ++a (a is incremented by one and the expression is equal to its incremented value). If it is non-zero, then || expression is equal to 1 and the right side is never evaluated.
  2. Otherwise, evaluate ++b and ++c, in left-to-right order. If ++b is zero, then ++c is never evaluated. If both are non-zero, then the expression is equal to 1.

Because ++a evaluates to 1, the right side of the || is never evaluated. This is why you have a==1, b==0, and c==1.

There's one additional problem with the statement c = ++a || ++b && ++c, which is that there's a potential for the statement to invoke undefined behavior. If ++a is false and ++b is true, then ++c has to be evaluated. However, there is no sequence point between c = ... and ++c. Since the expressions both modify c with no sequence point in between, the behavior is undefined. For a further explanation of this see, for example, https://stackoverflow.com/a/3575375/1430833

Why I got operation may be undefined in Statement Expression in C++?

According to the C++ grammar, expressions (apart from lambda expressions perhaps, but that's a different story) cannot contain statements - including block statements. Therefore, I would say your code is ill-formed, and if GCC compiles it, it means this is a (strange) compiler extension.

You should consult the compiler's reference to figure out what semantics it is given (or not given, as the error message seems to suggest) to it.

EDIT:

As pointed out by Shafik Yaghmour in the comments, this appears to be a GNU extension. According to the documentation, the value of this "statement expression" is supposed to be the value of the last statement in the block, which should be an expression statement:

The last thing in the compound statement should be an expression followed by a semicolon; the value of this subexpression serves as the value of the entire construct. (If you use some other kind of statement last within the braces, the construct has type void, and thus effectively no value.)

Since the block in your example does not contain an expression statement as the last statement, GCC does not know how to evaluate that "statement expression" (not to be confused with "expression statement" - that's what should appear last in a statement expression).

To prevent GCC from complaining, therefore, you should do something like:

({if (a) a=0; a;});
// ^^

But honestly, I do not understand why one would ever need this thing in C++.



Related Topics



Leave a reply



Submit