Has C++ Standard Changed With Respect to the Use of Indeterminate Values and Undefined Behavior in C++14

Has C++ standard changed with respect to the use of indeterminate values and undefined behavior in C++14?

Yes. The change makes it undefined behavior for an evaluation to produce an indeterminate value, with some exceptions for unsigned narrow character types.

Defect report 1787, whose proposed text can be found in N3914 (see footnote 1), was accepted in 2014 and is incorporated in the working draft N3936.

The most interesting change with respect to indeterminate values would be to section 8.5 paragraph 12 which goes from:

If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. [ Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. — end note ]

to:

If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17 [expr.ass]). [Note: Objects with static or thread storage
duration are zero-initialized, see 3.6.2 [basic.start.init]. —end
note] If an indeterminate value is produced by an evaluation, the
behavior is undefined except in the following cases:

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of:

    • the second or third operand of a conditional expression (5.16 [expr.cond]),

    • the right operand of a comma (5.18 [expr.comma]),

    • the operand of a cast or conversion to an unsigned narrow character type (4.7 [conv.integral], 5.2.3 [expr.type.conv], 5.2.9
      [expr.static.cast], 5.4 [expr.cast]), or

    • a discarded-value expression (Clause 5 [expr]),


    then the result of the operation is an indeterminate value.

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the right
    operand of a simple assignment operator (5.17 [expr.ass]) whose first
    operand is an lvalue of unsigned narrow character type, an
    indeterminate value replaces the value of the object referred to by
    the left operand.

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the
    initialization expression when initializing an object of unsigned
    narrow character type, that object is initialized to an indeterminate
    value.

and included the following example:

[ Example:

int f(bool b) {
    unsigned char c;
    unsigned char d = c; // OK, d has an indeterminate value
    int e = d;           // undefined behavior
    return b ? d : 0;    // undefined behavior if b is true
}

end example ]
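As a contrast to the example above, here is a minimal sketch of the same function with every object initialized before it is read. The name f_defined and the value 42 are illustrative assumptions and are not part of the standard's text.

```cpp
#include <cassert>

// Hypothetical counterpart of the standard's f(): no indeterminate value
// is ever produced, so every evaluation below is well defined.
int f_defined(bool b) {
    unsigned char c = 42;   // initialized, so c holds a determinate value
    unsigned char d = c;    // d is a copy of that determinate value
    int e = d;              // converting a determinate value to int is fine
    assert(e == 42);
    return b ? d : 0;       // well defined for both values of b
}
```

Calling f_defined(true) returns the stored value, and f_defined(false) returns 0. Neither call depends on what happened to be in memory.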

This text appears in N3936, the current working draft; N3937 is the C++14 DIS.

Prior to C++1y

It is interesting to note that, prior to this draft, C++ used the term indeterminate value without even defining it (assuming we cannot borrow the definition from C99). C, by contrast, has always had a well-specified notion of which uses of indeterminate values are undefined. See also defect report 616. We had to rely on the underspecified lvalue-to-rvalue conversion, which the draft C++11 standard covers in section 4.1 Lvalue-to-rvalue conversion, paragraph 1:

[...]if the object is uninitialized, a program that necessitates this conversion has undefined behavior.[...]


Footnotes:

  1. Defect report 1787 is a revision of defect report 616; that information can be found in N3903.

Use of variable in own initializer

This was opened as an editorial issue. It was forwarded to CWG for (internal) discussion. Approximately 24 hours later, the person who forwarded the issue created a pull request which modifies the example to make it clear that this is UB:

Here, the initialization of the second \tcode{x} has undefined behavior, because the initializer accesses the second \tcode{x} outside its lifetime\iref{basic.life}.

That PR has since been added and the issue closed. So it seems clear that the obvious interpretation (UB due to accessing an object whose lifetime has not started) is the intended interpretation. It appears that the intent of the committee is to make these constructs non-functional, and the standard's non-normative text has been updated to reflect this.
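To make the distinction concrete, here is a minimal sketch of the shadowing situation that the quoted sentence describes, written so that no object is read outside its lifetime. The function name outer_value_copy and the variable name inner are hypothetical.

```cpp
#include <cassert>

int outer_value_copy() {
    unsigned char x = 12;
    {
        // Writing `unsigned char x = x;` here would name the inner x in its
        // own initializer, which is the undefined case described above.
        unsigned char inner = x;   // reads the outer x, whose lifetime has begun
        assert(inner == 12);
        return inner;
    }
}
```

Giving the inner variable a different name is the simplest way to make the initializer refer to the outer object.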

The segmentation fault error that I cannot understand [duplicate]

Here's a step-by-step breakdown of what your code is doing:




int *a;
int *b;

This declares two pointers to int named a and b. It does not initialize them. That means that their values are indeterminate, and you should expect them to be complete garbage. You can think of them as "wild" pointers at this moment, which is to say that they don't point to valid objects, and dereferencing them will cause Undefined Behavior and introduce a plethora of weird bugs, if not a simple crash.




int c=12;

This creates a simple local variable c of type int, which is initialized, with a value of 12. If you hadn't initialized it, as in int c; then it would also be full of garbage.




a=&c;

This snippet sets the pointer a to point to c, which is to say that the address of c is assigned to a. Now a is no longer uninitialized, and points to a well-defined location. After this, you can safely dereference a and be assured that there is a valid int at the other end.




*b=*b;

Here, you are dereferencing b, which means that you are reaching into your program's memory to grab whatever is pointed to by b. But b is uninitialized; it is garbage. What is it pointing to? Who knows? Reading from the address it points to is like Russian roulette: the Operating System or runtime environment might notice you doing something that's obviously wrong and kill your program immediately. But you also might get away with it, only for weird and unpredictable bugs to emerge later. This weirdness and unpredictability is why a good C++ programmer avoids Undefined Behavior at all costs, ensures that variables are initialized before they are used, and makes sure that pointers are pointing to valid objects before dereferencing them.
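The steps above can be rewritten so that every pointer has a known state before it is dereferenced. This is only a sketch: the helper value_or and the function pointer_demo are hypothetical names, and initializing b to nullptr is one possible convention rather than the only fix.

```cpp
#include <cassert>

// Returns the pointed-to value, or `fallback` when p is null.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}

int pointer_demo() {
    int c = 12;
    int* a = &c;        // a points at a live object
    int* b = nullptr;   // b starts in a known "points nowhere" state

    assert(value_or(a, -1) == 12);
    assert(value_or(b, -1) == -1);

    b = a;              // now b also points at c
    *b = *b + 1;        // safe: reads and writes c through b
    return c;
}
```

Note that a null check only helps when the pointer was initialized in the first place. A wild pointer cannot be detected by comparing it against nullptr.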

Why does there appear to be a difference depending on a=&c;?

As to why your program apparently crashes or doesn't crash depending on how you initialize the other pointer, the answer is that it doesn't matter. In both cases, you're causing Undefined Behavior; you are breaking the language's rules and you should not expect the language to behave correctly for you thereafter, and all bets are off.

Does reading values of an uninitialized object yield Undefined Behavior? [duplicate]

Where does this reasoning fail?

One failure is here:

Since the representation of int is never a trap

int can have trap representations.

The only type that can't have trap representations is unsigned char.

But there is also this part in the standard describing undefined behavior (from draft n1570):

J.2 Undefined behavior

...

An lvalue designating an object of automatic storage duration that could have been
declared with the register storage class is used in a context that requires the value
of the designated object, but the object is uninitialized. (6.3.2.1).

How can I get gcc to warn me about int i = i;

For GCC compiling C programs, you need to add the compiler flag -Winit-self. (You also need -Wall or -Wuninitialized, see below.) For GCC compiling C++ programs, this flag is implied by -Wall, but for C it needs to be specified explicitly; it is not part of -Wextra either.

For Clang, the situation is slightly more interesting. In the snippet in the OP, Clang does not produce any diagnostic. However, with the slightly different snippet supplied in the GCC manual below, a diagnostic is provided:

int f() {
    int i = i;
    return i;
}

The difference is that in the above snippet, the (uninitialized) value of i is actually used. Apparently, in the original code Clang detected that the variable was useless and eliminated it as dead code before applying the diagnostic.

In Clang, the diagnostic is triggered by -Wuninitialized, which is enabled by -Wall as in GCC.


Here's an excerpt from the GCC manual:

-Winit-self (C, C++, Objective-C and Objective-C++ only)

Warn about uninitialized variables that are initialized with themselves. Note this option can only be used with the -Wuninitialized option.

For example, GCC warns about i being uninitialized in the following snippet only when -Winit-self has been specified:

int f()
{
    int i = i;
    return i;
}

This warning is enabled by -Wall in C++.

As the excerpt indicates, -Wuninitialized is also required. In both C and C++, -Wall implies -Wuninitialized. However, note that many uninitialized uses will not be detected unless some optimization level is also requested. (That doesn't apply to -Winit-self, as far as I know. It can be detected without optimization.)


Irritatingly, when you unmark a question as a duplicate, the previously-marked duplicates disappear. I unmarked it because none of the duplicates actually answered the question in the body; I also edited the title.

For reference, here are the original duplicates, which may be of interest:

  • Why does the compiler allow initializing a variable with itself?

  • gcc failing to warn of uninitialized variable

  • Why is this initialization accepted by the c++ compiler? static int x = x;

  • Has C++ standard changed with respect to the use of indeterminate values and undefined behavior in C++14?

Is reading an indeterminate value undefined behavior?

yes, formally an rvalue conversion of indeterminate value is UB (except for unsigned char, originally i wrote "and variants" but as i recall the formal caters to 1's complement signed char where possibly minus 0 could be used as trap value)

i'm too lazy to do the standard paragraph lookup for you, and also too lazy to care about downvotes for that

however, in practice only a problem on (1) archaic architectures, and perhaps (2) 64-bit systems.

EDIT: oops, i now seem to recall a blog posting and associated Defect Report about formal UB for accessing indeterminate char. so perhaps i'll have to actually check the standard, + search DRs. argh, it will have to be later then, now coffee!

EDIT2: Johannes Schaub was kind enough to provide this link to SO question where that UB for accessing char was discussed. So, that's where I remembered it from! Thanks, Johannes.

cheers & hth.,

Is it undefined behavior to not have a return statement in a non-void function in which control can never flow off the end?

The two statements are in no way contradictory.

The first statement is about what happens when control flow exits a non-void function without executing a return statement. The second statement is about what happens when control flow does not exit the function at all. Calls to functions like exit or std::terminate do not ever have control flow proceed past the point when those functions are called.

But that has nothing to do with the nature of the return value.

The behavior of the program when a non-void function runs out of stuff to do without an explicit return statement (or throw. Or co_return these days) is governed by [stmt.return]/2:

Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function.
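As an illustration of the rule quoted above, here is a sketch of a value-returning function in which control can never flow off the end, because every path either returns or throws. The enum Color and the functions code_of, rejects_unknown and self_check are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <stdexcept>

enum class Color { Red, Green };

int code_of(Color c) {
    switch (c) {
        case Color::Red:   return 1;
        case Color::Green: return 2;
    }
    // Reached only for a value outside the enumerators; throwing here keeps
    // control from flowing off the end of a value-returning function.
    throw std::invalid_argument("unknown Color");
}

// Reports whether code_of throws for a value that matches no enumerator.
bool rejects_unknown() {
    try {
        code_of(static_cast<Color>(7));
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

void self_check() {
    assert(code_of(Color::Red) == 1);
    assert(rejects_unknown());
}
```

Without the final throw, the function would still compile, but passing a value that matches no case label would reach the closing brace and trigger the undefined behavior described in [stmt.return]/2.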

Is the following C union access pattern undefined behavior?

Defect report 283: Accessing a non-current union member ("type punning") covers this and tells us there is undefined behavior if the result is a trap representation.

The defect report asked:

In the paragraph corresponding to 6.5.2.3#5, C89 contained this
sentence:

With one exception, if a member of a union object is accessed after a value has been stored in a different member of the object, the
behavior is implementation-defined.


Associated with that sentence was this footnote:

The "byte orders" for scalar types are invisible to isolated programs that do not indulge in type punning (for example, by
assigning to one member of a union and inspecting the storage by
accessing another member that is an appropriately sized array of
character type), but must be accounted for when conforming to
externally imposed storage layouts.


The only corresponding verbiage in C99 is 6.2.6.1#7:

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that
member but do correspond to other members take unspecified values, but
the value of the union object shall not thereby become a trap
representation.


It is not perfectly clear that the C99 words have the same
implications as the C89 words.

The defect report added the following footnote:

Attach a new footnote 78a to the words "named member" in 6.5.2.3#3:

78a If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

C11 6.2.6.1 General tells us:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.
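The text above concerns C unions. In C++, reading a union member other than the one last written is generally not permitted, so the sketch below swaps in a different technique: it copies the object representation with std::memcpy. The function names bits_of, float_from and round_trip_check are hypothetical, and the code assumes that float is 32 bits wide, which the static_assert checks.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copies the object representation of f into an unsigned integer.
std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Copies the bits of u back into a float object.
float float_from(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

void round_trip_check() {
    assert(float_from(bits_of(1.5f)) == 1.5f);
}
```

std::uint32_t has no padding bits, so every bit pattern copied into it is a valid value. The reverse direction carries no such guarantee for arbitrary bit patterns, which is the same concern the footnote raises about trap representations.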


