What Happens to Global and Static Variables in a Shared Library When It Is Dynamically Linked

What happens to global and static variables in a shared library when it is dynamically linked?

This is a pretty famous difference between Windows and Unix-like systems.

No matter what:

Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).

So, the key issue here is really visibility.

In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.

Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.

In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).

To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:

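/* Define COMPILING_THE_DLL only when building the DLL itself. */
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif

/* Header declaration seen by both sides; the DLL must still define
   my_global in one of its source files. */
MY_DLL_EXPORT int my_global;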

When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.

In the case of Unix-like environments (like Linux), dynamic libraries, called "shared objects" and given the extension .so, export all extern global variables (and functions) by default. In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.

Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
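As a minimal sketch of the run-time lookup described above, on the dlopen()/dlsym() side only: the helper name find_exported_int is hypothetical, and it assumes the variable being looked up really is an int.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlerror
#include <cassert>

// Hypothetical helper (not part of any library): returns the address of an
// exported int variable called `name` in the module behind `handle`,
// or nullptr if no such symbol is exported.
int* find_exported_int(void* handle, const char* name) {
    assert(name != nullptr);
    dlerror();                             // clear any stale error state
    void* addr = dlsym(handle, name);      // same call as for a function
    if (dlerror() != nullptr) {            // non-null means the lookup failed
        return nullptr;
    }
    // dlsym has no type information; the caller must know the real type.
    return static_cast<int*>(addr);
}
```

With a real library you would pass the handle returned by something like dlopen("./libfoo.so", RTLD_NOW) and the name of a variable that library exports (both names here are placeholders). On older glibc versions the program may need to be linked with -ldl.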

And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).

Does using shared libraries lead to having a single instance of global variables?

prog links to libA and libB dynamically, but both of those link to libX statically. (I assume two instances?)

In this case, the answer depends on which symbols are exported from libA.so and libB.so.

If the variable (let's call it glob) has internal linkage (i.e., it is declared static), then it will not be exported and you will have two separate instances.

Likewise, if the variable has external linkage, but libX is compiled with e.g. -fvisibility=hidden, or if either libA.so or libB.so is linked with a linker script which prevents glob from being exported, you will have two separate instances.

However, if the variable has external linkage and its visibility is not restricted via one of the above mechanisms, then (by default) it will be exported from both libA.so and libB.so, and in that case all references to that variable will bind to the definition in whichever library is loaded first.
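The three cases above can be sketched in source form. This is only an illustration for GCC/Clang; the macro and variable names are made up, and the visibility attribute is used here as a per-symbol alternative to compiling everything with -fvisibility=hidden.

```cpp
#include <cassert>

// Hypothetical macros; on non-GNU compilers they expand to nothing.
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_EXPORT __attribute__((visibility("default")))
#  define DEMO_HIDDEN __attribute__((visibility("hidden")))
#else
#  define DEMO_EXPORT
#  define DEMO_HIDDEN
#endif

// External linkage, default visibility: exported from the .so, so a
// reference may be bound to another module's definition of the same name.
DEMO_EXPORT int glob_exported = 1;

// External linkage, hidden visibility: usable from other translation units
// of the same library, but not exported from it.
DEMO_HIDDEN int glob_hidden = 2;

// Internal linkage: private to this translation unit.
static int glob_internal = 3;

// Inside the library all three behave like ordinary variables.
void check_visible_inside_module() {
    assert(glob_exported == 1 && glob_hidden == 2 && glob_internal == 3);
}

int sum_of_globals() {
    return glob_exported + glob_hidden + glob_internal;
}
```

If this file were built into both libA.so and libB.so, only glob_exported would be a candidate for the "first library loaded wins" behavior; the other two would stay separate per library.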

Update:

will there be two instances of that variable in memory, but just the first one is accessible, or the linker will not reserve any space at all for the second variable?

There will be two instances in memory.

When the linker builds libA.so, or libB.so, it has no idea what other libraries exist, and so it must reserve space in the readable and writable segment (the segment into which .data and .bss sections usually go) of the corresponding library.

At runtime, the loader mmaps the entire segment, and thus has no chance of not reserving memory space for the variable in each library.

But when the code references the variable at runtime, the loader will resolve all such references to the first symbol it encounters.

Note: above is the default behavior on ELF systems. Windows DLLs behave differently, and linking libraries with -Bsymbolic may change the outcome of symbol resolution as well.
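One way to see which copy a reference was actually bound to is to ask the loader which module contains the variable's address. The following is a sketch using dladdr(), which is available on glibc and macOS; the helper name module_containing is hypothetical, and glob here is only a stand-in for the variable that would live in libX.

```cpp
#include <dlfcn.h>   // dladdr, Dl_info
#include <cassert>
#include <string>

// Hypothetical helper: path of the loaded module whose mapping contains
// `addr`, or an empty string if the loader cannot attribute the address.
std::string module_containing(const void* addr) {
    assert(addr != nullptr);
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        return std::string();
    }
    return std::string(info.dli_fname);
}

// Stand-in definition so the sketch compiles on its own; in the scenario
// above, glob would be defined in libX and copied into libA.so and libB.so.
int glob = 0;
```

Calling module_containing(&glob) from code in libA.so and from code in libB.so should, under the default ELF rules described above, report the same library in both cases, even though each library has reserved its own storage.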

prog links to libA and libB dynamically, and both of those link to libX dynamically. (I assume one instance?)

Correct.

prog links to libA and libB statically, and both of those link to libX statically. (I assume one instance again?)

This is an impossible scenario: you can't link libA.a against libX.a.

But when linking prog against libA.a, libB.a and libX.a, yes: you will end up with one instance of glob.

How to share a global variable between a main process and a dynamic library via a static library (macOS)?

Doing more experiments and tweaking the linker flags led me to some other SO questions on the same problem.

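Instead of -rdynamic, which works on Linux, this is what works on macOS:

-undefined dynamic_lookup has to be added to the linker flags of the dynamic library, so that its reference to the shared variable is left undefined at link time and resolved against the main executable at load time.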

In my example, the change is as follows:

# It is important that we DO NOT link against static_lib and at the same time
# the -undefined dynamic_lookup is provided.
# target_link_libraries(dynamic_lib static_lib)
target_link_options(dynamic_lib PRIVATE -undefined dynamic_lookup)

The output that I see now is:

Hello, World!
SHARED: 0x109ead030 123
SHARED: 0x109ead030 123

Process finished with exit code 0
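The source files that print these lines are not shown here. As a rough sketch of what they might look like, assuming a variable named shared_value that is defined once in the static library (both the name and the helper are assumptions), each side would format the variable's address and value like this:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Assumed name; in the real project this definition would live in the
// static library and be declared `extern int shared_value;` elsewhere.
int shared_value = 123;

// Builds a line of the form "SHARED: <address> <value>".
std::string format_shared() {
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "SHARED: %p %d",
                          static_cast<void*>(&shared_value), shared_value);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```

When the main executable and the dynamic library both print the same address, as in the output above, they are referring to a single instance of the variable.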

Both static variables and global variables show different addresses in dynamic library and static library on Linux?

The first command you showed builds a shared object d.so. Based on the context of your question, I surmise that you also intended to link with d.so, but your second command seems to be missing that part. I'm assuming that it's a typo, as this is the only explanation for the program output you showed -- that A.cpp is both linked to directly, and is also built into your d.so library.

Given that, quoting from the article you linked:

Object code routines used by both should not be duplicated in each.
This is especially true for code which use static variables such as
singleton classes. A static variable is global and thus can only be
represented once. Including it twice will provide unexpected results.

But that's exactly the rule you seem to be breaking: you're representing the statically-scoped instance of the A class twice, once in your d.so and once in your main application executable.

So, that seems to be the indicated outcome: "unexpected results".

Will static map(variables) be freed multiple times if the static library which contains the map is linked with executable and dynamic library?

Static libraries are linked at build time (in the link step), whereas dynamic libraries are linked at load time or run-time. That means each linked binary must have the static library baked into it: the library in question will be statically linked into both the .exe and the .dll.

That is, when the DLL is linked, it receives its own copy of the static library in question. The executable, which also uses the static library, receives its own copy when it is linked.

For this reason, the .dll and the .exe will each have their own separate instance of the static library's data. So, any variables created through the static lib in the .exe will be independent of the ones created within the .dll. Each copy of the map is destroyed once by the module that owns it, so the same object is not freed multiple times.
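The independence of the two copies can be modeled in a single file. This is only a model, not a real .exe/.dll pair: each namespace stands in for the copy of the static library that ends up in one module, and all names are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the copy of the static library linked into the executable.
namespace exe_copy { std::map<std::string, int> registry; }

// Stand-in for the copy of the static library linked into the DLL.
namespace dll_copy { std::map<std::string, int> registry; }

// Writes to the executable's copy only.
void insert_into_exe_copy(const std::string& key, int value) {
    assert(!key.empty());
    exe_copy::registry[key] = value;
}

// True if the two maps are distinct objects and the DLL's copy does not
// see the given key.
bool dll_copy_unaffected(const std::string& key) {
    return &exe_copy::registry != &dll_copy::registry
        && dll_copy::registry.count(key) == 0;
}
```

After insert_into_exe_copy("key", 1), dll_copy_unaffected("key") still returns true: the two maps are separate objects, and each one has its own destructor run exactly once.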
