Why is statically linking glibc discouraged?
The reasons given in other answers are correct, but they are not the most important reason.
The most important reason why glibc should not be statically linked is that it makes extensive internal use of dlopen to load NSS (Name Service Switch) modules and iconv conversions. The modules themselves refer to C library functions. If the main program is dynamically linked with the C library, that's no problem. But if the main program is statically linked with the C library, dlopen has to load a second copy of the C library to satisfy the modules' load requirements.
This means your "statically linked" program still needs a copy of libc.so.6 to be present on the file system, plus the NSS or iconv or whatever modules themselves, plus other dynamic libraries that the modules might need, like ld-linux.so.2, libresolv.so.2, etc. This is not what people usually want when they statically link programs.
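As a minimal sketch of the kind of call that pulls in NSS, the helper below (the name count_addresses is made up for illustration) does nothing more than call getaddrinfo. When a program like this is linked with -static against glibc, the linker normally prints a warning that using getaddrinfo in statically linked applications requires, at runtime, the shared libraries from the glibc version used for linking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: count the addresses getaddrinfo() returns for `host`.
 * Returns -1 if the lookup fails. `flags` is passed through as ai_flags. */
int count_addresses(const char *host, int flags)
{
    assert(host != NULL);

    struct addrinfo hints;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* one entry per address, not per socket type */
    hints.ai_flags = flags;

    struct addrinfo *res = NULL;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    int n = 0;
    for (struct addrinfo *p = res; p != NULL; p = p->ai_next)
        n++;
    freeaddrinfo(res);
    return n;
}
```

Calling count_addresses("example.org", 0) goes through whatever sources nsswitch.conf lists, which is where the dlopen of NSS modules happens. Passing AI_NUMERICHOST with a literal address such as "127.0.0.1" skips name resolution altogether.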
It also means the statically linked program has two copies of the C library in its address space, and they might fight over whose stdout buffer is to be used, who gets to call sbrk with a nonzero argument, that sort of thing. There is a bunch of defensive logic inside glibc to try to make this work, but it's never been guaranteed to work.
You might think your program doesn't need to worry about this because it doesn't ever call getaddrinfo or iconv, but locale support uses iconv internally, which means any stdio.h function might trigger a call to dlopen. You don't control this; the user's environment variable settings do.
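To make the "environment decides" point concrete, here is a small sketch (user_codeset is a hypothetical name) that adopts the locale named by the user's environment and reports its character set. Which character sets need a loadable conversion module depends on how glibc was built, so treat this as an illustration of where the choice comes from, not as a test for whether dlopen will be called.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: switch LC_CTYPE to whatever the environment asks for
 * (LC_ALL, LC_CTYPE, LANG) and copy the resulting charset name into `buf`.
 * Returns 0 on success, -1 if `buf` is too small. */
int user_codeset(char *buf, size_t buf_size)
{
    assert(buf != NULL);

    /* "" means "take the locale from the environment". If that locale is
     * not available, setlocale() returns NULL and the current one stays. */
    setlocale(LC_CTYPE, "");

    const char *codeset = nl_langinfo(CODESET);
    if (strlen(codeset) + 1 > buf_size)
        return -1;
    strcpy(buf, codeset);
    return 0;
}
```

Running the same binary with different LANG or LC_ALL values can yield different charset names without any change to the program, which is the sense in which the user's settings, not the code, determine what the C library ends up loading.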
And if your program does call iconv, for example, then things get even worse, especially when a "statically linked" executable is built on one distro, and then copied to another. The iconv modules are sometimes located in different places on different distros, so an executable that was built, say, on a Red Hat distro may fail to run properly on a Debian one, which is exactly the opposite of what people want from statically linked executables.
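For reference, an explicit iconv call looks roughly like the sketch below (latin1_to_utf8 is a made-up helper name). The point of interest is iconv_open: that is where the C library has to locate a conversion for the requested pair of charsets, and it is the call that fails when the modules are not where the library expects them.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns 0 on success, -1 on failure (including "conversion not found"). */
int latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    /* Argument order is (to, from). */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return -1;  /* e.g. the conversion could not be located */

    char *src = (char *)in;          /* iconv() takes char **, but does not modify the input */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size - 1;  /* keep room for the terminating NUL */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;  /* output buffer too small or invalid input */

    *dst = '\0';
    return 0;
}
```

In a dynamically linked program this works wherever the system's own libc and its modules are installed. The failure mode described above appears when the static copy of glibc inside the executable looks for modules in the location used on the build machine.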
Why can't you statically link dynamic libraries?
Why is this the case?
Most linkers (the AIX linker is a notable exception) discard information in the process of linking.
For example, suppose you have foo.o with foo in it, and bar.o with bar in it. Suppose foo calls bar.
After you link foo.o and bar.o together into a shared library, the linker merges code and data sections, and resolves references. The call from foo to bar becomes CALL $relative_offset. After this operation, you can no longer tell where the boundary between code that came from foo.o and code that came from bar.o was, nor the name that CALL $relative_offset used in foo.o -- the relocation entry has been discarded.
Suppose now you want to link foobar.so with your main.o statically, and suppose main.o already defines its own bar.
If you had libfoobar.a, that would be trivial: the linker would pull foo.o from the archive, would not use bar.o from the archive, and resolve the call from foo.o to bar from main.o.
But it should be clear that none of the above is possible with foobar.so -- the call has already been resolved to the other bar, and you can't discard code that came from bar.o because you don't know where that code is.
On AIX it's possible (or at least it used to be possible 10 years ago) to "unlink" a shared library and turn it back into an archive, which could then be linked statically into a different shared library or a main executable.
If foo.o and bar.o are linked into a foobar.so, wouldn't it make sense that the call from foo to bar is always resolved to the one in bar.o?
This is one place where UNIX shared libraries work very differently from Windows DLLs. On UNIX (under common conditions), the call from foo to bar will resolve to the bar in the main executable.
This allows one to e.g. implement malloc and free in the main a.out, and have all calls to malloc use that one heap implementation consistently. On Windows you would have to always keep track of "which heap implementation did this memory come from".
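A rough sketch of that malloc case, assuming a dynamically linked Linux program: the main executable defines the four core allocation functions on top of a toy bump allocator, and calls made from inside libc (strdup, for instance) then land in these definitions. The names arena, bump_alloc and malloc_calls are invented for the example. The allocator never reuses memory and is not thread-safe, so it is only good for demonstrating symbol resolution.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE ((size_t)1 << 20)  /* 1 MiB, arbitrary for the demo */
#define ALIGNMENT  ((size_t)16)

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;
size_t malloc_calls;  /* how many blocks this implementation has handed out */

/* Each block is preceded by an ALIGNMENT-sized header holding its size. */
static void *bump_alloc(size_t size)
{
    if (size > ARENA_SIZE)  /* also keeps the rounding below from overflowing */
        return NULL;
    size_t rounded = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    if (rounded + ALIGNMENT > ARENA_SIZE - arena_used)
        return NULL;
    unsigned char *header = arena + arena_used;
    memcpy(header, &size, sizeof size);
    arena_used += ALIGNMENT + rounded;
    malloc_calls++;
    return header + ALIGNMENT;
}

void *malloc(size_t size) { return bump_alloc(size); }

void free(void *ptr) { (void)ptr; }  /* toy allocator: nothing is ever reused */

void *calloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;
    /* The arena is static storage and never reused, so it is still zeroed. */
    return bump_alloc(n * size);
}

void *realloc(void *ptr, size_t size)
{
    void *fresh = bump_alloc(size);
    if (fresh != NULL && ptr != NULL) {
        size_t old;
        memcpy(&old, (unsigned char *)ptr - ALIGNMENT, sizeof old);
        memcpy(fresh, ptr, old < size ? old : size);
    }
    return fresh;
}
```

With these definitions in a.out, a call such as strdup("hello") returns a pointer inside arena and bumps malloc_calls, even though strdup itself lives in libc.so.6. A real replacement would also need thread safety, errno handling, and the aligned allocation functions.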
The UNIX model is not without disadvantages though, as the shared library is not a self-contained mostly hermetic unit (unlike a Windows DLL).
Why would you want to resolve it to another bar from main.o?
If you don't resolve the call to main.o, you end up with a totally different program, compared to linking against libfoobar.a.
Why are __stat & __fstat linked statically?
I would love to understand: Why this is the case?
This answer explains why that is the case.
You'll need to wrap __xstat instead.
Finding whether a Haskell executable is statically linked against glibc or musl
One way of finding out, although it's limited to Haskell-based executables, is to use the RTS --info option:
Example:
$ ./tldr +RTS --info -RTS
[("GHC RTS", "YES")
,("GHC version", "8.6.5")
,("RTS way", "rts_thr")
,("Build platform", "x86_64-alpine-linux")
,("Build architecture", "x86_64")
,("Build OS", "linux")
,("Build vendor", "alpine")
,("Host platform", "x86_64-alpine-linux")
,("Host architecture", "x86_64")
,("Host OS", "linux")
,("Host vendor", "alpine")
,("Target platform", "x86_64-alpine-linux")
,("Target architecture", "x86_64")
,("Target OS", "linux")
,("Target vendor", "alpine")
,("Word size", "64")
,("Compiler unregisterised", "NO")
,("Tables next to code", "YES")
]
From x86_64-alpine-linux, I can confirm that the build was done on Alpine Linux, which is based on musl. You can then explicitly confirm via ldd that it is indeed statically linked:
$ ldd ./tldr
not a dynamic executable
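The same question can be asked of the ELF file directly. A dynamically linked executable names its loader in a PT_INTERP program header, so the absence of that header is a reasonable heuristic for "statically linked". The sketch below assumes a 64-bit ELF file in the host's byte order; elf_has_interp is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Returns 1 if the ELF64 file at `path` has a PT_INTERP
 * program header (it asks for a dynamic loader), 0 if it has none, and -1 if
 * the file cannot be read or is not a 64-bit ELF file. */
int elf_has_interp(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    int result = -1;
    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_phentsize == sizeof(Elf64_Phdr)) {
        result = 0;
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long offset = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
            if (fseek(f, offset, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                result = -1;
                break;
            }
            if (ph.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    }
    fclose(f);
    return result;
}
```

For a binary like the tldr example above, elf_has_interp("./tldr") would be expected to return 0. This only says that the file does not request a dynamic loader; it says nothing about which libc was linked in, which is what the --info output is for.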