What's the Behavior of an Uninitialized Variable Used as Its Own Initializer


Because i is uninitialized when it is used to initialize itself (as in int i = i;), it has an indeterminate value at that point. An indeterminate value is either an unspecified value or a trap representation.

If your implementation supports padding bits in integer types and if the indeterminate value in question happens to be a trap representation, then using it results in undefined behavior.

If your implementation does not have padding in integers, then the value is simply unspecified and there is no undefined behavior.

EDIT:

To elaborate further, the behavior is still undefined if i never has its address taken. This is detailed in section 6.3.2.1p2 of the C11 standard:

If the lvalue designates an object of automatic storage
duration that could have been declared with the register storage
class (never had its address taken), and that object is uninitialized
(not declared with an initializer and no assignment to it
has been performed prior to use), the behavior is undefined.

So if you never take the address of i, then you have undefined behavior. Otherwise, the statements above apply.

What happens to a declared, uninitialized variable in C? Does it have a value?

Static variables (file scope and function static) are initialized to zero:

int x; // zero
int y = 0; // also zero

void foo() {
    static int x; // also zero
}

Non-static local variables have indeterminate values. Reading them prior to assigning a value results in undefined behavior.

#include <stdio.h>

void foo() {
    int x;
    printf("%d", x); // undefined behavior: the program is free to crash here
}

In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.

As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.

Attempting to work with a value that has these flag bits set can result in a CPU exception in an operation that really shouldn't fail (e.g., integer addition, or assigning to another variable). And if you leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set, meaning that touching that uninitialized variable may be deadly.

What happens to uninitialized variables in C/C++?

Q.1) What happens if an uninitialized variable is used in say an operation? Will it crash/ will the code fail to compile?

Many compilers will warn you about code that improperly uses the value of an uninitialized variable. Many compilers have an option that says "treat warnings as errors". So depending on the compiler you're using and the option flags you invoke it with, the code might fail to compile, although we can't say that it will fail to compile.

If the code does compile, and you try to run it, it's obviously impossible to predict what will happen. In most cases the variable will start out containing an "indeterminate" value. Whether that indeterminate value will cause your program to work correctly, or work incorrectly, or crash, is anyone's guess. If the variable is an integer and you try to do some math on it, you'll probably just get a weird answer. But if the variable is a pointer and you try to indirect on it, you're quite likely to get a crash.

It's often said that uninitialized local variables start out containing "random garbage", but that can be misleading, as evidenced by the number of people who post questions here pointing out that, in their program where they tried it, the value wasn't random, but was always 0 or was always the same. So I like to say that uninitialized local variables never start out holding what you expect. If you expected them to be random, you'll find that (at least on any given day) they're repeatable and predictable. But if you expect them to be predictable (and, god help you, if you write code that depends on it), then by jingo, you'll find that they're quite random.

Whether use of an uninitialized variable makes your program formally undefined turns out to be a complicated question. But you might as well assume that it does, because it's a case you want to avoid just as assiduously as you avoid any other dangerous, undefined behavior.

See this old question and this other old question for more (much more!) information on the fine distinctions between undefined and indeterminate behavior in this case.

Q.2) Will C and C++ standards differ in how they treat an uninitialized variable?

They might differ. As I alluded to above, and at least in C, it turns out that not all uses of uninitialized local variables are formally undefined. (Some are merely "indeterminate".) But the passages quoted from the C++ standards by other answers here make it sound like it's undefined there all the time. Again, for practical purposes, the question probably doesn't matter, because as I said, you'll want to avoid it no matter what.

Q.3) Regarding similar queries, how and where can I find an 'official' answer? Is it practical for an amateur to look up the C and C++ standards?

It is not always easy to obtain copies of the standards (let alone official ones, which often cost money), and the standards can be difficult to read and to properly interpret, but yes, given effort, anyone can obtain, read, and attempt to answer questions using the standards. You might not always make the correct interpretation the first time (and you may therefore need to ask for help), but I wouldn't say that's a reason not to try. (For one thing, anyone can read any document and end up not making the correct interpretation the first time; this phenomenon is not limited to amateur programmers reading complex language standards documents!)

What are the dangers of uninitialised variables?

These variables could contain any value if you don't initialize them, and reading them in an uninitialized state is undefined behavior (except if they are zero-initialized).

If you forget to initialize one of them, and reading from it by accident happens to yield the value you expect on your current system configuration (due to undefined behavior), then your program might behave unpredictably after a system update, on a different system, or when you change your code.

These kinds of errors are hard to debug. So even if you set the variables later at run time, it is advisable to initialize them to known values so that you have a controlled environment with predictable behavior.

There are a few exceptions, e.g. if you set the variable right after you declare it and you can't set it directly, such as when you set its value using a streaming operator.

(Why) is using an uninitialized variable undefined behavior?

Yes, this behavior is undefined, but for different reasons than most people are aware of.

First, using an uninitialized value is not by itself undefined behavior; the value is simply indeterminate. Accessing it is then UB if the value happens to be a trap representation for the type. Unsigned types rarely have trap representations, so you would be relatively safe on that side.

What makes the behavior undefined is an additional property of your variable, namely that it "could have been declared with register", that is, its address is never taken. Such variables are treated specially because there are architectures whose real CPU registers have a sort of extra state that means "uninitialized" and that doesn't correspond to a value in the type's domain.

Edit: The relevant phrase of the standard is 6.3.2.1p2:

If the lvalue designates an object of automatic storage duration that
could have been declared with the register storage class (never had
its address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been performed prior
to use), the behavior is undefined.

And to make it clearer, the following code is legal under all circumstances:

unsigned char a, b;
memcpy(&a, &b, 1);
a -= a;

  • Here the addresses of a and b are taken, so their values are just indeterminate.
  • Since unsigned char never has trap representations, that indeterminate value is just unspecified: any value of unsigned char could occur.
  • At the end, a must hold the value 0.

Edit2: a and b have unspecified values:

3.19.3 unspecified value

valid value of the relevant type where this International Standard imposes no requirements on which value
is chosen in any instance

Uninitialized variable behaviour in C++

How's this possible when the program always assigns a free memory
location to a variable? How could it be something other than zero?

Let's take a look at an example practical implementation.

Let's say it uses the stack to keep local variables.

#include <stdio.h>

void
foo(void)
{
    int foo_var = 42;
}

void
bar(void)
{
    int bar_var;
    printf("%d\n", bar_var);
}

int
main(void)
{
    bar();
    foo();
    bar();
}

The totally broken code above illustrates the point. After we call foo, the location on the stack where foo_var was placed holds 42. When we then call bar, bar_var can occupy that exact location. On such an implementation, executing the code prints 0 and then 42, showing that the value of bar_var cannot be relied upon unless it is initialized.

Now it should be clear that local variable initialisation is required. But could main be an exception? Is there anything which could play with the stack and in result give us a non-zero value?

Yes. main is not the first function executed in your program. In fact, a great deal of work is required to set everything up. Any of this work could have used the stack and left some non-zero values on it. Not only can you not expect the same value on different operating systems, it may very well suddenly change on the very system you are using right now. Interested parties can search for "dynamic linker".

Finally, the C language standard does not even use the term stack. Where local variables live is left to the compiler. A variable could even pick up whatever garbage happened to be in a given register. It really can be anything. In fact, once undefined behaviour is triggered, the compiler has the freedom to do whatever it feels like.


