Is Reading an Indeterminate Value Undefined Behavior

Is reading an indeterminate value undefined behavior?

yes, formally an rvalue conversion of indeterminate value is UB (except for unsigned char, originally i wrote "and variants" but as i recall the formal caters to 1's complement signed char where possibly minus 0 could be used as trap value)

i'm too lazy to do the standard paragraph lookup for you, and also to lazy to care about downvotes for that

however, in practice only a problem on (1) archaic architectures, and perhaps (2) 64-bit systems.

EDIT: oops, i now seem to recall a blog posting and associated Defect Report about formal UB for accessing indeterminate char. so perhaps i'll have to actually check the standard, + search DRs. argh, it will have to be later then, now coffee!

EDIT2: Johannes Schaub was kind enough to provide this link to SO question where that UB for accessing char was discussed. So, that's where I remembered it from! Thanks, Johannes.

cheers & hth.,

Is reading values of unitialized object yields Undefined Behavior

Where does this reasoning fail?

One failure is here

Since the representation of int is never a trap

int can have trap representations.

The only type that can't have trap representations is unsigned char

But there is also this part in the standard describing undefined behavior (from draft n1570):

J.2 Undefined behavior

...

An lvalue designating an object of automatic storage duration that could have been
declared with the register storage class is used in a context that requires the value
of the designated object, but the object is uninitialized. (6.3.2.1).

What is Indeterminate value?

The differentiation of the two (indeterminate values and trap representations) is fundamental. In one case you have no known value. In the other you have a known-invalid value.

Simplest example of an indeterminate value I can muster:

int a;
int b = a;

There is no concept of determinate 'value' associated with a. It has something (as it is occupying memory) but the "what" it has is not defined, thus indeterminate. Overall, the concept is as simple as it sounds: Unless it has been decided what something is, it cannot be used in any evaluation (think r-value if it helps) with deterministic results.

The actual value depends on the language, compiler, and memory management policies. For instance, in most implementations of C, an uninitialized scope variable or the memory pointed to by the pointer returned by a call to malloc will contain whatever value happened to be stored at that address previously. On the other hand, most scripting languages will initialize variables to some default value (0, "", etc).

Regarding Trap Representation, it is essentially any value that is outside the restricted domain of the allowable values pertaining to the underlying formal definition. A hopefully non-confusing example follows. :

enum FooBar { foo=0, bar=1 };
enum FooBar fb = (enum FooBar)2;

In general it is any bit pattern that falls within the space allowed by the underlying storage representation (in enums that is likely an int) but is NOT considered a valid "value" for the restricted domain of its formal definition. An outstanding description on trap representations and their roots can be found at this answer. The above is just a representative of what a very simple known-invalid representation may appear as. In reality it is practiced in hardware for detection of values that trigger invalid-state. I think of them as "panic" values. Again, the above code is solely idealistic in demonstrating the concept of a "value" this is not "valid", but is, in fact, known.

What is indeterminate behavior in C++ ? How is it different from undefined behavior?

EDIT 1: The last drafts of C11 and C++11 are available online here: C11 draft N1570 and C++11 draft n3242 if you don't have a copy of the final standards and wonderful what they look like. (Other adjustments to text appearance and some wording/grammar edits have been done.)

EDIT 2: Fixed all occurrences of "behaviour" to be "behavior" to match the standard.

Searching the C++11 and C11 standards there are no matches for indeterminate rule or undefined rule. There are terms like indeterminate value, indeterminately sequenced, indeterminate uninitialized, etc.

If talk of traps and exceptions seems weird in Norman Gray's answer, know that those terms do reflect the relevant definitions in Section 3 in the C11 standard.

C++ relies on C's definitions. Many useful definitions concerning types of behaviour can be found in C11's Section 3 (in C11). For example, indeterminate value is defined in 3.19.2. Do take note that C11's Section 2 (Normative References) provides other sources for additional terminology interpretation and Section 4 defines when cases such as undefined behavior occur as a result of not complying with the standard.

C11's section 3.4 defines behavior, 3.4.1 defines implementation-defined behavior, 3.4.2 defines locale-specific behavior, 3.4.3 defines undefined behavior, 3.4.4 defines unspecified behavior. For value (Section 3.19), there are implementation-defined value, indeterminate value, and unspecified value.

Loosely speaking, the term indeterminate refers to an unspecified/unknown state that by itself doesn't result in undefined behavior. For example, this C++ code involves an indeterminate value: { int x = x; }. (This is actually an example in the C++11 standard.) Here x is defined to be an integer first but at that point it does not have a well-defined value --then it is initialized to whatever (indeterminate/unknown) value it has!

The well-known term undefined behavior is defined in 3.4.3 in C11 and refers to any situation of a

nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

In other words undefined behavior is some error (in logic or state) and whatever happens next is unknown! So one could make an undefined [behavior] rule that states: avoid undefined behavior when writing C/C++ code! :-)

An indeterminate [behavior] rule would be to state: avoid writing indeterminate code unless it is needed and it does not affect program correctness or portability. So unlike undefined behavior, indeterminate behavior does not necessarily imply that code/data is erroneous, however, its subsequent use may or may not be erroneous --so care is required to ensure program correctness is maintained.

Other terms like indeterminately sequenced are in the body text (e.g., C11 5.1.2.3 para 3; C++11, section 1.9 para. 13; i.e., in [intro.executation]). (As you might guess, it refers an unspecified order of operational steps.)

IMO if one is interested in all of these nuances, acquiring both the C++11 and C11 standards is a must. This will permit one to explore to the desired level-of-detail needed with definitions, etc. If you don't have such the links provided herein will help you explore such with the last published draft C11 and C++11 standards.

Is reading an uninitialized value always an undefined behaviour? Or are there exceptions to it?

Each of the three lines triggers undefined behavior. The key part of the C standard, that explains this, is section 6.3.2.1p2 regarding Conversions:

Except when it is the operand of the sizeof operator, the
_Alignof operator, the unary & operator, the ++ operator, the
-- operator, or the left operand of the . operator or an
assignment operator, an lvalue that does not have array type
is converted to the value stored in the designated object
(and is no longer an lvalue); this is called lvalue
conversion
. If the lvalue has qualified type, the value has
the unqualified version of the type of the lvalue; additionally,
if the lvalue has atomic type, the value has the non-atomic version
of the type of the lvalue; otherwise, the value has the
type of the lvalue. If the lvalue has an incomplete type and does
not have array type, the behavior is undefined. If the lvalue
designates an object of automatic storage duration that could
have been declared with the register storage class (never had its
address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been
performed prior to use), the behavior is undefined.

In each of the three cases, an uninitialized variable is used as the right-hand side of an assignment or initialization (which for this purpose is equivalent to an assignment) and undergoes lvalue to rvalue conversion. The part in bold applies here as the objects in question have not been initialized.

This also applies to the int i = i; case as the lvalue on the right side has not (yet) been initialized.

There was debate in a related question that the right side of int i = i; is UB because the lifetime of i has not yet begun. However, that is not the case. From section 6.2.4 p5 and p6:

5 An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic
storage duration
, as do some compound literals. The result of
attempting to indirectly access an object with automatic storage
duration from a thread other than the one with which the object is
associated is implementation-defined.

6 For such an object that does not have a variable length array type, its lifetime extends from entry into the block
with which it is associated until execution of that block ends in any
way.
(Entering an enclosed block or calling a function
suspends, but does not end,execution of the current block.) If
the block is entered recursively, a new instance of the object is
created each time. The initial value of the object is
indeterminate. If an initialization is specified for the
object, it is performed each time the declaration or compound
literal is reached in the execution of the block; otherwise,
the value becomes indeterminate each time the declaration is reached

So in this case the lifetime of i begins before the declaration in encountered. So int i = i; is still undefined behavior, but not for this reason.

The bolded part of 6.3.2.1p2 does however open the door for use of an uninitialized variable not being undefined behavior, and that is if the variable in question had it's address taken. For example:

int a;
printf("%p\n", (void *)&a);
printf("%d\n", a);

In this case it is not undefined behavior if:

  • The implementation does not have trap representations for the given type, OR
  • The value chosen for a happens to not be a trap representation.

In which case the value of a is unspecified. In particular, this will be the case with GCC and Microsoft Visual C++ (MSVC) in this example as these implementations do not have trap representations for integer types.

Has C++ standard changed with respect to the use of indeterminate values and undefined behavior in C++14?

Yes, this change was driven by changes in the language which makes it undefined behavior if an indeterminate value is produced by an evaluation but with some exceptions for unsigned narrow characters.

Defect report 1787 whose proposed text can be found in N39141 was recently accepted in 2014 and is incorporated in the latest working draft N3936:

The most interesting change with respect to indeterminate values would be to section 8.5 paragraph 12 which goes from:

If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. [ Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. — end note ]

to (emphasis mine):

If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value
, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17 [expr.ass]). [Note: Objects with static or thread storage
duration are zero-initialized, see 3.6.2 [basic.start.init]. —end
note] If an indeterminate value is produced by an evaluation, the
behavior is undefined except in the following cases
:

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of:

    • the second or third operand of a conditional expression (5.16 [expr.cond]),

    • the right operand of a comma (5.18 [expr.comma]),

    • the operand of a cast or conversion to an unsigned narrow character type (4.7 [conv.integral], 5.2.3 [expr.type.conv], 5.2.9
      [expr.static.cast], 5.4 [expr.cast]), or

    • a discarded-value expression (Clause 5 [expr]),


    then the result of the operation is an indeterminate value.

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the right
    operand of a simple assignment operator (5.17 [expr.ass]) whose first
    operand is an lvalue of unsigned narrow character type, an
    indeterminate value replaces the value of the object referred to by
    the left operand.

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the
    initialization expression when initializing an object of unsigned
    narrow character type, that object is initialized to an indeterminate
    value.

and included the following example:

[ Example:

int f(bool b) {
unsigned char c;
unsigned char d = c; // OK, d has an indeterminate value
int e = d; // undefined behavior
return b ? d : 0; // undefined behavior if b is true
}

end example ]

We can find this text in N3936 which is the current working draft and N3937 is the C++14 DIS.

Prior to C++1y

It is interesting to note that prior to this draft unlike C which has always had a well specified notion of what uses of indeterminate values were undefined C++ used the term indeterminate value without even defining it (assuming we can not borrow definition from C99) and also see defect report 616. We had to rely on the underspecified lvalue-to-rvalue conversion which in draft C++11 standard is covered in section 4.1 Lvalue-to-rvalue conversion paragraph 1 which says:

[...]if the object is uninitialized, a program that necessitates this conversion has undefined behavior.[...]


Footnotes:

  1. 1787 is a revision of defect report 616, we can find that information in N3903

What happens to uninitialized variables in C/C++?

Q.1) What happens if an uninitialized variable is used in say an operation? Will it crash/ will the code fail to compile?

Many compilers try to warn you about code that improperly uses the value of an uninitialized variable. Many compilers have an option that says "treat warnings as errors". So depending on the compiler you're using and the option flags you invoke it with and how obvious it is that a variable is uninitialized, the code might fail to compile, although we can't say that it will fail to compile.

If the code does compile, and you try to run it, it's obviously impossible to predict what will happen. In most cases the variable will start out containing an "indeterminate" value. Whether that indeterminate value will cause your program to work correctly, or work incorrectly, or crash, is anyone's guess. If the variable is an integer and you try to do some math on it, you'll probably just get a weird answer. But if the variable is a pointer and you try to indirect on it, you're quite likely to get a crash.

It's often said that uninitialized local variables start out containing "random garbage", but that can be misleading, as evidenced by the number of people who post questions here pointing out that, in their program where they tried it, the value wasn't random, but was always 0 or was always the same. So I like to say that uninitialized local variables never start out holding what you expect. If you expected them to be random, you'll find that (at least on any given day) they're repeatable and predictable. But if you expect them to be predictable (and, god help you, if you write code that depends on it), then by jingo, you'll find that they're quite random.

Whether use of an uninitialized variable makes your program formally undefined turns out to be a complicated question. But you might as well assume that it does, because it's a case you want to avoid just as assiduously as you avoid any other dangerous, imperfectly-defined behavior.

See this old question and this other old question for more (much more!) information on the fine distinctions between undefined and indeterminate behavior in this case.

Q.2) Will C and C++ standards differ in how they treat an uninitialized variable?

They might differ. As I alluded to above, and at least in C, it turns out that not all uses of uninitialized local variables are formally undefined. (Some are merely "indeterminate".) But the passages quoted from the C++ standards by other answers here make it sound like it's undefined there all the time. Again, for practical purposes, the question probably doesn't matter, because as I said, you'll want to avoid it no matter what.

Q.3) Regarding similar queries, how and where can I find an 'official' answer? Is it practical for an amateur to look up the C and C++ standards?

It is not always easy to obtain copies of the standards (let alone official ones, which often cost money), and the standards can be difficult to read and to properly interpret, but yes, given effort, anyone can obtain, read, and attempt to answer questions using the standards. You might not always make the correct interpretation the first time (and you may therefore need to ask for help), but I wouldn't say that's a reason not to try. (For one thing, anyone can read any document and end up not making the correct interpretation the first time; this phenomenon is not limited to amateur programmers reading complex language standards documents!)



Related Topics



Leave a reply



Submit