Zero Initialization and Static Initialization of Local Scope Static Variable

I read several posts on C++ initialization from Google, some of which directed me here to StackOverflow. The concepts I picked up from those posts are as follows:

  • The order of initialization of C++ is:
    1. Zero Initialization;
    2. Static Initialization;
    3. Dynamic Initialization.

Yes, indeed there are 3 phases (in the Standard). Let us clarify them before continuing:

  • Zero Initialization: the memory is filled with 0s at the byte level.
  • Constant Initialization: a pre-computed (compile-time) byte pattern is copied at the memory location of the object
  • Static Initialization: Zero Initialization followed by Constant Initialization
  • Dynamic Initialization: a function is executed to initialize the memory

A simple example:

int const i = 5;     // constant initialization
int const j = foo(); // dynamic initialization

  • Static objects (variables included) are first Zero-initialized, and then Static-initialized.

Yes and no.

The Standard mandates that the objects be first zero-initialized and then they are:

  • constant initialized if possible
  • dynamically initialized otherwise (the compiler could not compute the memory content at compile-time)

Note: in the case of constant initialization, the compiler may skip zero-initializing the memory first, under the as-if rule.
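To make the "zero first, then dynamic" sequence visible, here is a minimal sketch. All names (seed, compute, peek_late, early, late) are hypothetical, and the volatile read is only there on the assumption that it keeps the compiler from turning the initializer of late into a constant.

```cpp
// Illustrative sketch only; every name here is made up for the example.
volatile int seed = 40;        // read at run time, so compute() is not a constant

int compute() { return seed + 2; }

extern int late;               // defined further down in this translation unit

int peek_late() { return late; }

int early = peek_late();       // dynamic init, runs first: late is still zero-initialized
int late  = compute();         // dynamic init, runs second: late becomes 42
```

With mainstream compilers, early should end up holding the zero that late contained before its own dynamic initializer ran.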

I have several inquiries as to the initialization issue (storage class issue may be related as well):

  • Global objects (defined without static keyword) are also static objects, right?

Yes. At file scope, the static keyword is only about the visibility of the symbol. A global object can be referred to, by name, from another source file, whilst the name of a static object is completely local to the current source file.

The confusion stems from the reuse of the word static in many different situations :(

  • Global objects are also initialized like static objects by two steps like above, right?

Yes, as are local static objects in fact.

  • What is the Static Initialization? Does it refer to initializing static objects (defined with static keyword)?

No. As explained above, it refers to initializing objects without executing a user-defined function, by copying a pre-computed byte pattern over the object's memory instead. Note that for objects that will later be dynamically initialized, this just means zeroing the memory.

  • I also read that objects defined within block (i.e. in a function) with static keyword is initialized when the execution thread first enters the block! This means that local static objects are not initialized before main function execution. This means they are not initialized as the two steps mentioned above, right?

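They go through the same two-step process, but the timing is subtly different: zero-initialization (and constant initialization, where it applies) can still be performed before the block is ever entered, while dynamic initialization runs only the first time execution passes through their definition.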

In practice though, if their initialization is static (i.e., the memory pattern is a compile-time pattern) and their address is never taken, they might be optimized away.

Note that in the case of dynamic initialization, if the initialization fails (the function supposed to initialize them throws an exception), it will be re-attempted the next time control passes through their definition.
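The retry behaviour can be sketched as follows. The names attempts, make_value and get_value are hypothetical; the initializer is written to fail exactly once so that the second pass through the definition is the one that succeeds.

```cpp
#include <stdexcept>

int attempts = 0;                      // counts how often the initializer ran

int make_value() {
    ++attempts;
    if (attempts == 1) {
        throw std::runtime_error("first attempt fails");
    }
    return 42;
}

int get_value() {
    // If make_value() throws, value is still considered uninitialized,
    // so the next call to get_value() runs the initializer again.
    static int value = make_value();
    return value;
}
```

A caller that catches the exception from the first get_value() call and calls it again gets the value from the second attempt; later calls skip the initializer entirely.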

  • Dynamic initialization refers to initialization of objects created by new operator, right? It might refer to initialization like myClass obj = myClass(100); or myClass obj = foo();

Not at all. It refers to initialization that requires the execution of a user-defined function (note: std::string has a user-defined constructor as far as the C++ language is concerned).

EDIT: My thanks to Zach who pointed to me I erroneously called Static Initialization what the C++11 Standard calls Constant Initialization; this error should now be fixed.

c++ initialization static variable in local scope

A variable with static storage duration whose initial value is known at compile time, that is, one that is zero-initialized or constant-initialized, is statically initialized, and this happens before everything else (with zero initialization preceding constant initialization in C++11 and older revisions of the Standard). After all, if the compiler and linker know exactly what value is going into that variable at compile time, why wouldn't they just put the value there?

Statically initialized variables are not a problem. If you're assigning a constant or a zero, the variable doesn't depend on anything else. That leaves dynamic initialization, and if I'm reading [basic.start.static] in the C++ Standard correctly, all static initialization happens before any dynamic initialization in any translation unit. The problem is with dynamically initialized variables with static storage duration interacting across multiple translation units. Within a translation unit, the order of initialization is guaranteed to follow the order of definition, but the order in which translation units are initialized is not guaranteed.
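As a small illustration of the within-one-translation-unit guarantee, consider the sketch below. The names next_ticket, first, second and third are hypothetical.

```cpp
int next_ticket() {
    static int counter = 0;   // constant-initialized, so ready before any dynamic init
    return ++counter;
}

// All three are dynamically initialized, in the order they are defined.
int first  = next_ticket();
int second = next_ticket();
int third  = first + second;  // may rely on first and second: they are defined earlier
```

If first and second lived in another translation unit instead, third would no longer have that guarantee.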

Is that also true of a 'local' static variable declared in a function?

No. Static local variables have a well-defined initialization order. Dynamic initialization occurs on first use, and the static locals of one function can't be split across translation units, which eliminates the ambiguity found in non-local initialization ordering.

And if I have several static variables inside the same local scope, is it safe to use an earlier-declared one to initialize a later-declared one?

Yes. Again, we have a well-defined initialization order. Dynamically initialized variables are initialized in order, and statically initialized variables are already initialized. You can mess this up with threads, but C++11 and later ensure that one thread cannot interrupt another thread's initialization of a static variable. If a thread interrupts between the initialization of two static variables, it's on you whether that's safe or not, but the first variable will still be initialized before the second.
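A minimal sketch of one local static feeding another (the function greeting and its variables are hypothetical names):

```cpp
#include <string>

std::string greeting() {
    static std::string name = "world";           // initialized first
    static std::string text = "hello, " + name;  // safe: name is already initialized
    return text;
}
```

Both initializers run once, on the first call, in the order the declarations appear; later calls just return the stored text.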

Is it safe to use a globally declared static variable to initialize a 'local' static variable?

Not always. Non-local variables are allocated and initialized before main, so normally they are initialized and ready for use before you get a chance to call the function containing the static local variable.

But what about initializing with a function that contains a static variable that depends on a variable from a different translation unit?

Say in A.cpp we have

int A_variable = something_dynamic();

and in B.cpp we have

int func()
{
    static int local_static = A_variable;
    return local_static;
}
int B_variable = func();

Initialization of B_variable may happen before initialization of A_variable. This will call func and set local_static to the not-yet-initialized value of A_variable (at that point it has only been zero-initialized). Ooops.
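One common way around this, not part of the answer above, is the construct-on-first-use idiom: replace the global with a function that owns a local static. The sketch below puts everything in one file for brevity and assumes a stand-in body for something_dynamic(); the accessor name A_variable() is likewise an assumption.

```cpp
// Stand-in for the something_dynamic() used above (hypothetical body).
int something_dynamic() { return 42; }

// Would live in A.cpp: the value is initialized the first time it is requested,
// no matter which translation unit asks first.
int& A_variable() {
    static int value = something_dynamic();
    return value;
}

// Would live in B.cpp.
int func() {
    static int local_static = A_variable();
    return local_static;
}

int B_variable = func();
```

Because A_variable() initializes its static on demand, B_variable no longer depends on the order in which the two translation units are initialized.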

Why file scope static variables have to be zero-initialized?

The behaviour on the Operating Systems where C was developed has shaped these Standard stipulations. As applications load, the OS loader provides some memory for the BSS. It's desirable to clear it to zeros because if some other process had been using that memory earlier, the program you're starting could snoop on the prior process's memory contents, potentially seeing passwords, conversations or other data. Not every early or simple OS cares about this, but most do, so on most the initialisation is effectively "free" as it's a task the OS will do anyway.

Having this default of 0 makes it easy for the implementation to refer to flags set during dynamic initialisation, as there will be no uninitialised memory read and consequent undefined behaviour. For example, given...

void f() { static int n = g(); }

...the compiler/implementation may implicitly add something like a static bool __f_statics_initialised variable too - which "luckily" defaults to 0 / false due to the zeroing behaviour - along with initialisation code akin to (a possibly thread safe version of)...

if (!__f_statics_initialised)
{
    n = g();
    __f_statics_initialised = true;
}

For the above scenario the initialisation is done on first call, but for global variables it's done in an unspecified per-object ordering, sometime before main() is invoked. In that scenario, having some object-specific initialisation code and dynamic initialisation able to differentiate statics in uninitialised state from those they know need to be set to non-zero values makes it easier to write robust start-up code. For example, functions can check if a non-local static pointer is still 0, and new an object for it if so.
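The pointer check mentioned above might look like the sketch below. The names g_names, names and add_name are hypothetical, and the code is not thread-safe; it only relies on the pointer being zero before any dynamic initialisation runs.

```cpp
#include <string>
#include <vector>

// No initializer: zero-initialized, so it is a null pointer during start-up.
static std::vector<std::string>* g_names;

std::vector<std::string>& names() {
    if (g_names == nullptr) {
        g_names = new std::vector<std::string>;  // created on first use, never deleted
    }
    return *g_names;
}

int add_name(const char* name) {
    names().push_back(name);
    return static_cast<int>(names().size());
}

// Dynamic initializers of other globals can call names() safely.
int a_index = add_name("alpha");
int b_index = add_name("beta");
```

Had g_names held an arbitrary value at start-up, the nullptr test in names() would not be reliable.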

It's also noteworthy that many CPUs have highly efficient instructions to zero out large swathes of memory.

How to force the initialization of a static local variable before main?

I suppose you want to achieve a syntax similar to:

DEFINE_FUNC(double, foo, (double x)) {
    return x;
}

... and have the boilerplate autogenerated. That's actually very simple to do if you bring the Register above the function, with the help of a declaration:

#define DEFINE_FUNC(Ret, name, args)   \
    Ret name args;                     \
    Register register_##name##_([] {   \
        return reg(&name, #name, ...); \
    });                                \
    Ret name args
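For a version that compiles on its own, here is a sketch under stated assumptions: Register is taken to be a type whose constructor simply runs the callable it is given, and reg is reduced to a hypothetical one-argument function that records the name (the real reg takes further arguments that are elided above). registered_names() is also made up for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical registry; a local static avoids start-up ordering problems.
std::vector<std::string>& registered_names() {
    static std::vector<std::string> names;
    return names;
}

// Hypothetical stand-in for reg(): only records the function name.
int reg(const char* name) {
    registered_names().push_back(name);
    return 0;
}

// Assumed shape of Register: its constructor runs the callable once.
struct Register {
    template <class Init>
    explicit Register(Init init) { init(); }
};

#define DEFINE_FUNC(Ret, name, args)          \
    Ret name args;                            \
    static Register register_##name##_([] {   \
        return reg(#name);                    \
    });                                       \
    Ret name args

DEFINE_FUNC(double, foo, (double x)) {
    return x;
}

DEFINE_FUNC(int, add, (int a, int b)) {
    return a + b;
}
```

Each register_..._ object is a dynamically initialized namespace-scope variable, so its lambda runs before main and the names are already recorded when main starts.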

Is the initialization of global variable the same as the initialization of static variable inside a function

But is it still initialized with two methods: first zero
initialization, then default initialization, just like the first s?

Yes, assuming the "as-if" rule effectively forces the compiler to consider the existence of this variable (that is, the variable is not optimized away). Excerpts from the standard:

Zero initialization is performed in the following situations:

  1. For every named variable with static or thread-local storage
    duration that is not subject to constant initialization, before any
    other initialization.

Default initialization is performed in three situations:

  1. when a variable with automatic, static, or thread-local storage
    duration is declared with no initializer;

As molbdnilo said in the comments, if you are only interested in the behavior the standard specifies, you should avoid thinking in terms of binary files, the kernel and segments altogether here.
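The effect of the two quoted rules can be sketched like this. Plain, global_plain and local_plain are hypothetical names; the point is that default initialization does nothing for these members, so the zeros from zero initialization are what remains.

```cpp
struct Plain {
    int   number;    // no initializer, no constructor
    void* pointer;
};

Plain global_plain;             // namespace scope, no initializer

const Plain& local_plain() {
    static Plain instance;      // block-scope static, no initializer
    return instance;
}
```

The same struct declared as an ordinary (automatic) local variable would have indeterminate members instead.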

C++ static variables initialization order

The first scenario is well-defined in [basic.start.init]/2:

Variables with static storage duration (3.7.1) or thread storage duration (3.7.2) shall be zero-initialized (8.5) before any other initialization takes place.

Constant initialization is performed:

  • if each full-expression (including implicit conversions) that appears in the initializer of a reference with static or thread storage duration is a constant expression (5.19) and the reference is bound to an lvalue designating an object with static storage duration or to a temporary (see 12.2);
  • if an object with static or thread storage duration is initialized by a constructor call, if the constructor is a constexpr constructor, if all constructor arguments are constant expressions (including conversions), and if, after function invocation substitution (7.1.5), every constructor call and full-expression in the mem-initializers and in the brace-or-equal-initializers for non-static data members is a constant expression;
  • if an object with static or thread storage duration is not initialized by a constructor call and if every full-expression that appears in its initializer is a constant expression.

Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place. (...)

(Emphasis mine)

The upshot of this fairly lengthy paragraph is that

int n = 2;

is static initialization, while

int k = n;

is dynamic initialization (because n is not a constant expression), and therefore n is initialized before k even if it appears later in the code.
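Putting the two declarations in the "wrong" order shows the effect. This is a small sketch; the extern declaration is added only so that k can name n before n is defined.

```cpp
extern int n;   // declared here, defined below

int k = n;      // dynamic initialization: n is not a constant expression
int n = 2;      // constant initialization: performed before any dynamic initialization
```

Even though k is defined first, it ends up as 2 rather than 0, because n already has its final value when k's initializer runs.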

The same logic applies in the case of the Base::static_constructor example -- because the constructor of Base::static_constructor is not constexpr, Base::constructor is dynamically initialized, whereas Base::i is statically initialized. The initialization of Base::i therefore takes place before the initialization of Base::constructor.

On the other hand, the second case with

int n = func();

puts you squarely in the territory of unspecified behavior, and it is quite explicitly mentioned in [basic.start.init]/3:

An implementation is permitted to perform the initialization of a non-local variable with static storage duration as a static initialization even if such initialization is not required to be done statically, provided that

  • the dynamic version of the initialization does not change the value of any other object of namespace scope prior to its initialization, and
  • the static version of the initialization produces the same value in the initialized variable as would be produced by the dynamic initialization if all variables not required to be initialized statically were initialized dynamically.

[Note: As a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope potentially requiring dynamic initialization and defined later in the same translation unit, it is unspecified whether the value of obj2 used will be the value of the fully initialized obj2 (because obj2 was statically initialized) or will be the value of obj2 merely zero-initialized. For example,

inline double fd() { return 1.0; }
extern double d1;
double d2 = d1;   // unspecified:
                  // may be statically initialized to 0.0 or
                  // dynamically initialized to 0.0 if d1 is
                  // dynamically initialized, or 1.0 otherwise
double d1 = fd(); // may be initialized statically or dynamically to 1.0

-- end note]

Why are static variables auto-initialized to zero?

It is required by the C standard (C99 §6.7.8/10).

There's no technical reason it would have to be this way, but it's been that way for long enough that the standard committee made it a requirement.

Leaving out this requirement would make working with static variables somewhat more difficult in many (most?) cases. In particular, you often have some one-time initialization to do, and need a dependable starting state so you know whether a particular variable has been initialized yet or not. For example:

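#include <stddef.h>

int foo() {
    static int *ptr;   /* static storage: starts out as a null pointer */

    if (NULL == ptr) {
        /* initialize it */
    }
    return 0;          /* placeholder return value */
}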

If ptr could contain an arbitrary value at startup, you'd have to explicitly initialize it to NULL to be able to recognize whether you'd done your one-time initialization yet or not.

Static function variable initialization order in the same function

Quoting from cppreference (emphasis mine):

Variables declared at block scope with the specifier static have static storage duration but are initialized the first time control passes through their declaration (unless their initialization is zero- or constant-initialization, which can be performed before the block is first entered). On all further calls, the declaration is skipped.

Since control flows top-to-bottom, the variables will indeed be initialized in declaration order.
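"The first time control passes through their declaration" also means a static inside a branch is not initialized until that branch is actually taken. A sketch, with hypothetical names init_log, note and visit:

```cpp
#include <string>
#include <vector>

std::vector<std::string> init_log;   // records the order in which initializers run

int note(const char* name) {
    init_log.push_back(name);
    return static_cast<int>(init_log.size());
}

void visit(bool take_branch) {
    static int a = note("a");
    if (take_branch) {
        static int b = note("b");    // initialized only once this branch is reached
        (void)b;
    }
    static int c = note("c");
    (void)a;
    (void)c;
}
```

Calling visit(false) and then visit(true) logs a, c, b: the first call initializes a and c in declaration order, and the second call skips both and initializes only b.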


