Is Uninitialized Data Behavior Well Specified

Is uninitialized data behavior well specified?

So is this just a case of 'uninitialized data leads to unspecified behavior'

Yes...

Sometimes if you call malloc (or new, which calls malloc) you will get data that is filled with zeroes because it is in a fresh page from the kernel. Other times it will be full of junk. If you put something on the stack (i.e., auto storage), you will almost certainly get garbage — but it can be hard to debug, because on your system that garbage might happen to be somewhat predictable. And with objects on the stack, you'll find that changing code in a completely different source file can change the values you see in an uninitialized data structure.

About POD: Whether or not something is POD is really a red herring here. I only explained it because the question mentioned POD, and the conversation derailed from there. The two relevant concepts are storage duration and constructors. POD objects don't have constructors, but not everything without a constructor is POD. (Technically, POD objects don't have non-trivial constructors nor members with non-trivial constructors.)

Storage duration: There are three kinds. Static duration is for globals, automatic is for local variables, and dynamic is for objects on the heap. (This is a simplification and not exactly correct, but you can read the C++ standard yourself if you need something exactly correct.)

Anything with static storage duration gets initialized to zero. So if you make a global instance of BaseClass, then its pub member will be zero (at first). Since you put it on the stack and the heap, this rule does not apply — and you don't do anything else to initialize it, so it is uninitialized. It happens to contain whatever junk was left in memory by the last piece of code to use it.

As a rule, any POD on the heap or the stack will be uninitialized unless you initialize it yourself, and the value will be undefined, possible changing when you recompile or run the program again. As a rule, any global POD will get initialized to zero unless you initialize it to something else.

Detecting uninitialized values: Try using Valgrind's memcheck tool, it will help you find where you use uninitialized values — these are usually errors.

What are the dangers of uninitialised variables?

These variables could contain any value if you don't initialize them and reading them in an uninitialized stated is undefined behavior. (except if they are zero initalized)

And if you forgot to initialize one of them, and reading from it by accident results in the value you expect it should have on your current system configuration (due to undefined behavior), then your program might behave unpredictable/unexpected after a system update, on a different system or when you do changes in your code.

And these kinds of errors are hard to debug. So even if you set them at runtime it is suggested to initialize them to known values so that you have a controlled environment with predictable behavior.

There are a few exceptions, e.g. if you set the variable right after you declared it and you can't set it directly, like if you set its value using a streaming operator.

What happens to uninitialized variables in C/C++?

Q.1) What happens if an uninitialized variable is used in say an operation? Will it crash/ will the code fail to compile?

Many compilers try to warn you about code that improperly uses the value of an uninitialized variable. Many compilers have an option that says "treat warnings as errors". So depending on the compiler you're using and the option flags you invoke it with and how obvious it is that a variable is uninitialized, the code might fail to compile, although we can't say that it will fail to compile.

If the code does compile, and you try to run it, it's obviously impossible to predict what will happen. In most cases the variable will start out containing an "indeterminate" value. Whether that indeterminate value will cause your program to work correctly, or work incorrectly, or crash, is anyone's guess. If the variable is an integer and you try to do some math on it, you'll probably just get a weird answer. But if the variable is a pointer and you try to indirect on it, you're quite likely to get a crash.

It's often said that uninitialized local variables start out containing "random garbage", but that can be misleading, as evidenced by the number of people who post questions here pointing out that, in their program where they tried it, the value wasn't random, but was always 0 or was always the same. So I like to say that uninitialized local variables never start out holding what you expect. If you expected them to be random, you'll find that (at least on any given day) they're repeatable and predictable. But if you expect them to be predictable (and, god help you, if you write code that depends on it), then by jingo, you'll find that they're quite random.

Whether use of an uninitialized variable makes your program formally undefined turns out to be a complicated question. But you might as well assume that it does, because it's a case you want to avoid just as assiduously as you avoid any other dangerous, imperfectly-defined behavior.

See this old question and this other old question for more (much more!) information on the fine distinctions between undefined and indeterminate behavior in this case.

Q.2) Will C and C++ standards differ in how they treat an uninitialized variable?

They might differ. As I alluded to above, and at least in C, it turns out that not all uses of uninitialized local variables are formally undefined. (Some are merely "indeterminate".) But the passages quoted from the C++ standards by other answers here make it sound like it's undefined there all the time. Again, for practical purposes, the question probably doesn't matter, because as I said, you'll want to avoid it no matter what.

Q.3) Regarding similar queries, how and where can I find an 'official' answer? Is it practical for an amateur to look up the C and C++ standards?

It is not always easy to obtain copies of the standards (let alone official ones, which often cost money), and the standards can be difficult to read and to properly interpret, but yes, given effort, anyone can obtain, read, and attempt to answer questions using the standards. You might not always make the correct interpretation the first time (and you may therefore need to ask for help), but I wouldn't say that's a reason not to try. (For one thing, anyone can read any document and end up not making the correct interpretation the first time; this phenomenon is not limited to amateur programmers reading complex language standards documents!)

What will be the value of uninitialized variable?

Technically, the value of an uninitialized non static local variable is Indeterminate[Ref 1].

In short it can be anything. Accessing such a uninitialized variable leads to an Undefined Behavior.[Ref 2]

[Ref 1]
C99 section 6.7.8 Initialization:

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

[Ref 2]

C99 section 3.18 Undefined behavior:

behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements.

Note: Emphasis mine.

What happens to a declared, uninitialized variable in C? Does it have a value?

Static variables (file scope and function static) are initialized to zero:

int x; // zero
int y = 0; // also zero

void foo() {
static int x; // also zero
}

Non-static variables (local variables) are indeterminate. Reading them prior to assigning a value results in undefined behavior.

void foo() {
int x;
printf("%d", x); // the compiler is free to crash here
}

In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.

As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.

Attempting to work with a value with these flag bits set can result in a CPU exception in an operation that really shouldn't fail (eg, integer addition, or assigning to another variable). And if you go and leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set - meaning touching that uninitialized variable may be deadly.

Any guarantees for uninitialised variables?

In C all of them are undefined behavior, but for a reason that probably not comes directly to mind. Accessing an object with indeterminate value has undefined behavior if it is
"memoryless" that is 6.3.2.1 p2

If the lvalue designates an object of automatic storage duration that
could have been declared with the register storage class (never had
its address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been performed prior
to use), the behavior is undefined.

Otherwise, if the address is taken, the interpretation of what indeterminate means concretely in this case is not unanimous. There are people that expect such a value to be fixed once it is first read, others speak of something like "woobly" (or so) values that can be different at each access.

In summary, don't do it. (But that you probably knew already.)

(And not talking about the error using "%d" for an unsigned.)

(Why) is using an uninitialized variable undefined behavior?

Yes this behavior is undefined but for different reasons than most people are aware of.

First, using an unitialized value is by itself not undefined behavior, but the value is simply indeterminate. Accessing this then is UB if the value happens to be a trap representation for the type. Unsigned types rarely have trap representations, so you would be relatively safe on that side.

What makes the behavior undefined is an additional property of your variable, namely that it "could have been declared with register" that is its address is never taken. Such variables are treated specially because there are architectures that have real CPU registers that have a sort of extra state that is "uninitialized" and that doesn't correspond to a value in the type domain.

Edit: The relevant phrase of the standard is 6.3.2.1p2:

If the lvalue designates an object of automatic storage duration that
could have been declared with the register storage class (never had
its address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been performed prior
to use), the behavior is undefined.

And to make it clearer, the following code is legal under all circumstances:

unsigned char a, b;
memcpy(&a, &b, 1);
a -= a;
  • Here the addresses of a and b are taken, so their value is just
    indeterminate.
  • Since unsigned char never has trap representations
    that indeterminate value is just unspecified, any value of unsigned char could
    happen.
  • At the end a must hold the value 0.

Edit2: a and b have unspecified values:

3.19.3 unspecified value

valid value of the relevant type where this International Standard imposes no requirements on which value
is chosen in any instance

Edit3: Some of this will be clarified in C23, where the term "indeterminate value" is replaced by the term "indeterminate representation" and the term "trap representation" is replaced by "non-value representation". Note also that all of this is different between C and C++, which has a different object model.

What happens to uninitialized class members in c++?

What happens to uninitialized class members in c++?

They are default initialized, which in case of fundamental types like int means that it will have an indeterminate value.

what should I take care of?

You should take care to never read an indeterminate value.

Is this C++ member initialization behavior well defined?

Yes, you are right that it is UB, but for different reasons than just storing a reference to an object that hasn't been constructed.

Construction of class members happens in order of their appearance in the class. Although the address of B is not going to change and technically you can store a reference to it, as @StoryTeller pointed out, calling b.printMember() in the constructor with b that hasn't been constructed yet is definitely UB.



Related Topics



Leave a reply



Submit