How Is Static Variable Initialization Implemented by the Compiler

How is static variable initialization implemented by the compiler?

In the compiler output I have seen, function local static variables are initialized exactly as you imagine.

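Note that before C++11 this was in general not done in a thread-safe manner. Since C++11 the standard requires the initialization of a function-local static to be thread-safe, but with older compilers, or when that support is switched off, you should take this into account if functions with static locals like that might be called from multiple threads. Calling the function once in the main thread before any other threads are started will usually do the trick.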

I should add that if the initialization of the local static is by a simple constant like in your example, the compiler doesn't need to go through these gyrations - it can just initialize the variable in the image or before main() like a regular static initialization (because your program wouldn't be able to tell the difference). But if you initialize it with a function's return value, then the compiler pretty much has to test a flag indicating if the initialization has been done or something equivalent.
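To make the distinction concrete, here is a minimal sketch of the two cases. The names expensive_default, g_init_calls, constant_case and runtime_case are made up for illustration; the counter is only there so you can observe when the initializer actually runs.

```cpp
// Counts how often the (hypothetical) initializer function is executed.
int g_init_calls = 0;

int expensive_default() {
    ++g_init_calls;
    return 42;
}

int constant_case() {
    // Constant initializer: the compiler may simply store 7 in the image,
    // no flag test is needed on entry.
    static int n = 7;
    return n++;
}

int runtime_case() {
    // Initializer is a function call: it runs the first time control
    // reaches this line, and never again.
    static int n = expensive_default();
    return n++;
}
```

Calling runtime_case() twice should return 42 and then 43, with expensive_default() having run exactly once in between.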

What happens to initialization of static variable inside a function

Although the standard does not dictate how compilers must implement this behavior, most compilers do a much less sophisticated thing: they place c into a static memory segment and tell the loader to place zero at c's address. This way f comes straight to a pre-initialized c, and proceeds to printing and incrementing as if the declaration line were not there.

In C++ the compiler may also add code that initializes c to a static initialization function, which initializes all static variables. In this case, no call is required.

In essence, this amounts to c starting its lifetime before the first call to f. You can think of c's behavior as if it were a static variable outside f() with its visibility constrained to f()'s scope.

Static variable initialization?

Why are static variables deterministically initialized when local variables aren't?

See how the static variables are implemented. The memory for them is allocated at link time, and the initial value for them is also provided at link time. There is no runtime overhead.

On the other hand, the memory for local variables is allocated at run time. The stack has to grow. You don't know what was there before. If you want, you can clear that memory (zero it), but that would incur a runtime overhead. The C++ philosophy is "you don't pay for things you don't use", so it doesn't zero that memory by default.

OK, but why are static variables initialized to zero, and not some other value?

Well, you generally want to do something with that variable. But then how do you know if it has been initialized? You could create a static boolean variable. But then it also has to be reliably initialized to something (preferably false). How about a pointer? You'd rather want it initialized to NULL than some random garbage. How about a struct/record? It has some other data members inside. It makes sense to initialize all of them to their default values. But for simplicity, if you use the "initialize to 0" strategy, you don't have to inspect the individual members and check their types. You can just initialize the entire memory area to 0.

This is not really a technical requirement. The semantics of initialization could still be considered sane if the default value were something other than 0, as long as it was deterministic. But then, what should that value be? You can quite easily explain why 0 is used (although indeed it sounds slightly arbitrary), but explaining -1 or 1024 seems to be even harder (especially since the variable may not be large enough to hold that value, etc).

And you can always initialize the variable explicitly.

And there is always paragraph 8.5/6 of the C++ standard, which says "Every object of static storage duration shall be zero-initialized at program startup".
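As a small illustration of that guarantee, the sketch below declares a few objects with static storage duration and no initializer, then checks that each one starts out as zero. Record, the g_ variables and statics_start_zeroed are names invented for this example.

```cpp
// A plain aggregate with members of different types.
struct Record {
    int id;
    double weight;
    const char* label;
};

// No initializers anywhere: static storage duration means zero-initialization.
static int g_count;
static bool g_ready;
static const char* g_name;
static Record g_record;

// Returns true if every object above still holds its zero value.
bool statics_start_zeroed() {
    return g_count == 0 && !g_ready && g_name == nullptr &&
           g_record.id == 0 && g_record.weight == 0.0 &&
           g_record.label == nullptr;
}
```

The same declarations as local (automatic) variables would hold indeterminate values, so reading them before assignment would be undefined behavior.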

For more info, please refer to these other questions:

Is global memory initialized in C++?
What do the following phrases mean in C++: zero-, default- and value-initialization?

When are static and global variables initialized?

By static and global objects, I presume you mean objects with
static lifetime defined at namespace scope. When such objects
are defined with local scope, the rules are slightly different.

Formally, C++ initializes such variables in three phases:
1. Zero initialization
2. Static initialization
3. Dynamic initialization
The language also distinguishes between variables which require
dynamic initialization, and those which require static
initialization: all static objects (objects with static
lifetime) are first zero initialized, then objects with static
initialization are initialized, and then dynamic initialization
occurs.

As a simple first approximation, dynamic initialization means
that some code must be executed; typically, static
initialization doesn't. Thus:

extern int f();

int g1 = 42;    //  static initialization
int g2 = f();   //  dynamic initialization

Another approximation would be that static initialization is
what C supports (for variables with static lifetime), dynamic
everything else.

How the compiler does this depends, of course, on the
initialization, but on disk based systems, where the executable
is loaded into memory from disk, the values for static
initialization are part of the image on disk, and loaded
directly by the system from the disk. On a classical Unix
system, global variables would be divided into three "segments":

text:: The code, loaded into a write protected area. Static
variables with `const` types would also be placed here.
data:: Static variables with static initializers.
bss:: Static variables with no initializer (C and C++) or with dynamic
initialization (C++). The executable contains no image for this
segment, and the system simply sets it all to `0` before
starting your code.
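
A rough sketch of how that maps onto source code is shown below. The placement comments describe what typically happens on ELF-style toolchains and are an assumption, not a guarantee (an optimizer may, for instance, fold the last initializer and move the variable into data). All names are made up for the example.

```cpp
// Read-only table: typically ends up in a write-protected area.
const int kTable[3] = {1, 2, 3};

// Static initializer: the value 42 is stored in the executable image (data).
int g_initialized = 42;

// No initializer: no bytes in the image, zeroed by the system (bss).
int g_zeroed;

// Hypothetical function used as a run-time initializer.
int compute() { return g_initialized * 2; }

// Dynamic initialization: typically starts as a zero in bss and is
// filled in by start-up code before main() runs.
int g_dynamic = compute();

// Uses the table so that it is not discarded as unused.
int table_sum() { return kTable[0] + kTable[1] + kTable[2]; }
```

On a system with binutils you can usually check where each symbol really landed with nm or size on the compiled object file.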

I suspect that a lot of modern systems still use something
similar.

EDIT:

One additional remark: the above refers to C++03. For existing
programs, C++11 probably doesn't change anything, but it does
add constexpr (which means that some user defined functions
can still be static initialization) and thread local variables,
which opens up a whole new can of worms.
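
To illustrate the constexpr point, here is a minimal C++11 sketch. square and read_input_stub are hypothetical functions; the only difference that matters is whether the initializer is a constant expression.

```cpp
// constexpr function: a call with constant arguments is a constant expression.
constexpr int square(int x) { return x * x; }

// Constant initializer, so this is static initialization even though
// a user-defined function is involved.
int g_from_constexpr = square(7);

// Ordinary function: the call is not a constant expression.
int read_input_stub() { return 7; }

// Formally dynamic initialization (a compiler may still optimize it).
int g_from_call = read_input_stub() * 7;
```

Both variables end up holding 49; what differs is when the standard allows, or requires, the value to be in place.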

What makes a static variable initialize only once?

Yes, it does normally translate into an implicit if statement with an internal boolean flag. So, in the most basic implementation your declaration normally translates into something like

void go( int x ) {
  static int j;
  static bool j_initialized;

  if (!j_initialized) {
    j = x;
    j_initialized = true;
  }

  ...
}

On top of that, if your static object has a non-trivial destructor, the language has to obey another rule: such static objects have to be destructed in the reverse order of their construction. Since the construction order is only known at run-time, the destruction order becomes defined at run-time as well. So, every time you construct a local static object with non-trivial destructor, the program has to register it in some kind of linear container, which it will later use to destruct these objects in proper order.
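
The bookkeeping can be modeled by hand. The sketch below is only a simulation of the idea, not what any particular compiler or runtime actually emits: construct stands in for "a local static was just constructed", g_cleanup is the hypothetical linear container, and run_cleanup plays the role of program exit.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> g_log;                  // records what happened, in order
std::vector<std::function<void()>> g_cleanup;    // hypothetical destructor registry

// Called at the moment an object is constructed: log it and register
// the matching "destructor" at the end of the list.
void construct(const std::string& name) {
    g_log.push_back("construct " + name);
    g_cleanup.push_back([name] { g_log.push_back("destroy " + name); });
}

// Called once at shutdown: walk the list backwards so that the object
// constructed last is destroyed first.
void run_cleanup() {
    while (!g_cleanup.empty()) {
        std::function<void()> fn = std::move(g_cleanup.back());
        g_cleanup.pop_back();
        fn();
    }
}
```

After construct("a"), construct("b") and run_cleanup(), the log reads: construct a, construct b, destroy b, destroy a.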

Needless to say, the actual details depend on implementation.

It is worth adding that when it comes to static objects of "primitive" types (like int in your example) initialized with compile-time constants, the compiler is free to initialize that object at startup. You will never notice the difference. However, if you take a more complicated example with a "non-primitive" object

void go( int x ) {
  static std::string s = "Hello World!";
  ...
}

then the above approach with if is what you should expect to find in the generated code even when the object is initialized with a compile-time constant.

In your case the initializer is not known at compile time, which means that the compiler has to delay the initialization and use that implicit if.
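
Here is a compilable sketch of both forms side by side, assuming go is changed to return j so that the effect is visible. go_expanded is a hand-written approximation of what the compiler generates; it ignores thread safety.

```cpp
// The language feature: j is initialized from the argument of the
// first call only; later arguments are ignored.
int go(int x) {
    static int j = x;
    return j;
}

// Roughly what the compiler does behind the scenes.
int go_expanded(int x) {
    static int j;
    static bool j_initialized;   // zero-initialized, so it starts out false

    if (!j_initialized) {
        j = x;
        j_initialized = true;
    }
    return j;
}
```

Calling either function with 5 and then with 9 returns 5 both times.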

When do function-level static variables get allocated/initialized?

I was curious about this so I wrote the following test program and compiled it with g++ version 4.1.2.

#include <iostream>
#include <string>

using namespace std;

class test
{
public:
        test(const char *name)
                : _name(name)
        {
                cout << _name << " created" << endl;
        }

        ~test()
        {
                cout << _name << " destroyed" << endl;
        }

        string _name;
};

test t("global variable");

void f()
{
        static test t("static variable");

        test t2("Local variable");

        cout << "Function executed" << endl;
}

int main()
{
        test t("local to main");

        cout << "Program start" << endl;

        f();

        cout << "Program end" << endl;
        return 0;
}

The results were not what I expected. The constructor for the static object was not called until the first time the function was called. Here is the output:

global variable created
local to main created
Program start
static variable created
Local variable created
Function executed
Local variable destroyed
Program end
local to main destroyed
static variable destroyed
global variable destroyed

C++ static variables initialization order

The first scenario is well-defined in [basic.start.init]/2:

Variables with static storage duration (3.7.1) or thread storage duration (3.7.2) shall be zero-initialized (8.5) before any other initialization takes place.

Constant initialization is performed:

if each full-expression (including implicit conversions) that appears in the initializer of a reference with static or thread storage duration is a constant expression (5.19) and the reference is bound to an lvalue designating an object with static storage duration or to a temporary (see 12.2);

if an object with static or thread storage duration is initialized by a constructor call, if the constructor is a constexpr constructor, if all constructor arguments are constant expressions (including conversions), and if, after function invocation substitution (7.1.5), every constructor call and full-expression in the mem-initializers and in the brace-or-equal-initializers for non-static data members is a constant expression;

if an object with static or thread storage duration is not initialized by a constructor call and if every full-expression that appears in its initializer is a constant expression.

Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place. (...)

(Emphasis mine)

The upshot of this fairly lengthy paragraph is that

int n = 2;

is static initialization, while

int k = n;

is dynamic initialization (because n is not a constant expression), and therefore n is initialized before k even if it appears later in the code.
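
Put together as one translation unit, with the order deliberately reversed, it looks like the sketch below. The extern declaration is an addition needed so that k's initializer can name n before n is defined.

```cpp
extern int n;   // declaration only; the definition follows below

int k = n;      // dynamic initialization: n is not a constant expression
int n = 2;      // constant (static) initialization, done before any dynamic one
```

Because static initialization is completed first, k reads the fully initialized n and ends up as 2 rather than 0.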

The same logic applies in the case of the Base::static_constructor example -- because the constructor of Base::static_constructor is not constexpr, Base::constructor is dynamically initialized, whereas Base::i is statically initialized. The initialization of Base::i therefore takes place before the initialization of Base::constructor.

On the other hand, the second case with

int n = func();

puts you squarely in the territory of unspecified behavior, and it is quite explicitly mentioned in [basic.start.init]/3:

An implementation is permitted to perform the initialization of a non-local variable with static storage duration as a static initialization even if such initialization is not required to be done statically, provided that

the dynamic version of the initialization does not change the value of any other object of namespace scope prior to its initialization, and

the static version of the initialization produces the same value in the initialized variable as would be produced by the dynamic initialization if all variables not required to be initialized statically were initialized dynamically.

[Note: As a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope potentially requiring dynamic initialization and defined later in the same translation unit, it is unspecified whether the value of obj2 used will be the value of the fully initialized obj2 (because obj2 was statically initialized) or will be the value of obj2 merely zero-initialized. For example,
inline double fd() { return 1.0; }
extern double d1;
double d2 = d1;     // unspecified:
                    // may be statically initialized to 0.0 or
                    // dynamically initialized to 0.0 if d1 is
                    // dynamically initialized, or 1.0 otherwise
double d1 = fd();   // may be initialized statically or dynamically to 1.0
-- end note]

Why does GCC not assign the static variable when it is initialized to 0

TL;DR: GCC knows the BSS is guaranteed to be zero-initialized on the platform it's targeting, so it puts zero-initialized static data there.

Big picture

The program loader of most modern operating systems gets two different sizes for each part of the program, like the data part. The first size it gets is the size of data stored in the executable file (like a PE/COFF .EXE file on Windows or an ELF executable on Linux), while the second size is the size of the data part in memory while the program is running.

If the data size for the running program is bigger than the amount of data stored in the executable file, the remaining part of the data section is filled with bytes containing zero. In your program, the .comm line tells the linker to reserve 4 bytes without initializing them, so that the OS zero-initializes them on start.

What does gcc do?

gcc (or any other C compiler) allocates zero-initialized variables with static storage duration in the .bss section. Everything allocated in that section will be zero-initialized on program startup. For allocation, it uses the .comm directive, and it just specifies the size (4 bytes).

You can see the size of the main section types (code, data, bss) using the size command. If you initialize the variable with one, it is included in a data section, and occupies 4 bytes there. If you initialize it with zero (or not at all), it is instead allocated in the .bss section.

What does ld do?

ld merges all data-type sections of all object files (even those from static libraries) into one data section, followed by all .bss-type sections. The executable output contains a simplified view for the operating system's program loader. For ELF files, this is the "program header". You can take a look at it using objdump -p for any format, or readelf for ELF files.

The program headers consist of entries of different types. Among them are a couple of entries with the type PT_LOAD describing the "segments" to be loaded by the operating system. One of these PT_LOAD entries is for the data area (where the .data section is linked). It contains an entry called p_filesz that specifies how many bytes for initialized variables are provided in the ELF file, and an entry called p_memsz telling the loader how much space in the address space should be reserved. The details on which sections get merged into what PT_LOAD entries differ between linkers and depend on command line options, but generally you will find a PT_LOAD entry that describes a region that is both readable and writeable, but not executable, and has a p_filesz value that is smaller than the p_memsz entry (potentially zero if there's only a .bss, no .data section). p_filesz is the size of all read+write data sections, whereas p_memsz is bigger to also provide space for zero-initialized variables.

The amount by which p_memsz exceeds p_filesz is the sum of all .bss sections linked into the executable. (The values might be off a bit due to alignment to pages or disk blocks.)

See chapter 5 in the System V ABI specification, especially pages 5-2 and 5-3 for a description of the program header entries.

What does the operating system do?

The Linux kernel (or another ELF-compliant kernel) iterates over all entries in the program header. For each entry containing the type PT_LOAD it allocates virtual address space. It associates the beginning of that address space with the corresponding region in the executable file, and if the space is writeable, it enables copy-on-write.

If p_memsz exceeds p_filesz, the kernel arranges the remaining address space to be completely zeroed out. So the variable that got allocated in the .bss section by gcc ends up in the "tail" of the read-write PT_LOAD entry in the ELF file, and the kernel provides the zero.

Any whole pages that have no backing data can start out copy-on-write mapped to a shared physical page of zeros.
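
The effect of p_filesz and p_memsz can be modeled in a few lines. This is a deliberately simplified user-space simulation, not kernel code: load_segment is a made-up function, the file contents are passed in as a byte vector whose size plays the role of p_filesz, and real loaders map pages instead of copying bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the in-memory image of one read-write PT_LOAD entry:
// the first file_bytes.size() bytes (p_filesz) come from the file,
// everything up to p_memsz is zero-filled.
std::vector<std::uint8_t> load_segment(const std::vector<std::uint8_t>& file_bytes,
                                       std::size_t p_memsz) {
    std::vector<std::uint8_t> memory(p_memsz, 0);   // start with all zeros
    const std::size_t n = std::min(file_bytes.size(), p_memsz);
    std::copy(file_bytes.begin(),
              file_bytes.begin() + static_cast<std::ptrdiff_t>(n),
              memory.begin());                      // overlay the file-backed part
    return memory;
}
```

With four bytes of file data and a p_memsz of 8, the result is the four data bytes followed by four zero bytes; that zero tail is where the .bss variables live.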
