Why does the same executable use different RUNPATHs for different library lookups?
My question is: how are these variants on RUNPATH concocted?
Unlike older RPATH
, the RUNPATH
applies only when searching for direct dependencies of the binary.
That is, if a.out
has RUNPATH
of /foo
, and NEEDED
of libfoo.so
(located in /foo
), then libfoo.so
will be found. But if libfoo.so
itself depends on libbar.so
(also located in /foo
), and if libfoo.so
does not have RUNPATH
, then libbar.so
will not be found.
This behavior promotes "every ELF binary should be self-sufficient". In the case above, libfoo.so
is not self-sufficient (needs libbar.so
but doesn't say where to find it).
If you use RPATH
instead, the path there would apply to every search, and libbar.so
will be found. You can achieve this with -Wl,--disable-new-dtags
when linking a.out
.
Link only needed symbols when compiling an executable with a Shared Library
Details of dynamic linking and the kinds of objects involved vary across environments and toolchains. On Linux, where you say you are, and on Solaris, and several other UNIX-y platforms, you are looking at ELF objects and semantics.
So far I tried using
-Wl,--as-needed
,
-Wl,--unresolved-symbols=ignore-in-shared-libs
,
These both have their full effect at (static) link time. The first tells the linker that the libraries following it on the command line should be linked in only if they resolve at least one as-yet undefined symbol. The latter tells the linker to not worry about resolving symbols in shared libraries included in the link. That has nothing to do with the behavior of the dynamic linker when you run the program.
and opening the shared object with
dlopen
to get the function I want withdlsym
.
dlopen
instructs the dynamic linker to link in a shared object at runtime that was not specified in the binary as a required shared library. Its behavior at that point can be modulated by the flags passed to dlopen
, but the options available are not more than can be specified at link time. There is little reason to use dlopen
when you actually know at link time what libraries you need.
Are you forced to resolve every undefined symbol of a dynamic library
when linking it against an executable ?
Focusing on ELF and the GNU toolchain, no. -Wl,--unresolved-symbols=ignore-in-shared-libs
serves precisely the purpose of avoiding that. But as you've discovered, that comes with caveats.
In the first place, in every shared object, every symbol referring to data needs to be resolved at runtime by the dynamic linker, no matter how you linked the various shared objects, including the main program. This is primarily an operational consideration -- the dynamic linker has no way to defer resolving symbols referring to objects because it has no good way to trap attempts to access them.
On the other hand, it is possible to defer resolution of symbols referring to functions until their first use. In fact, this is the GNU linker's default, but you can reaffirm this by passing -Wl,-z,lazy
to gcc
when linking. Note well, however, that this sets a property of the object being linked, so you should ensure that every shared object is built with that link option (but ordinarily they are because, again, that's the default).
Additionally, you should be aware that the dynamic linker's behavior can be influenced by environment variables. In particular, lazy binding will be disabled if the dynamic linker finds LD_BIND_NOW
set to a nonempty string in the runtime environment.
A simple workaround would be to add all the dependencies of my
libraries when compiling the executable. But they're so full of
dependencies that this sometimes means adding 10+ libraries to the
command line, and this would be for something like a hundred
executable.
And what's the big deal with that, really? Surely you have a well-factored Makefile (or several) to help you, so it shouldn't be a big deal to ensure that all the libraries are linked. Right?
But you should also consider refactoring your libraries, especially if "interdependent" means there are loops in the dependency graph. Dynamic linking is different from static linking, as you've discovered, and the differences are sometimes more subtle than those you're presently struggling with. Although it is not a hard rule, I urge you to avoid creating situations where the shared objects used by one process contain among them multiple definitions of the same external symbol, especially if that symbol is actually used.
Update
The above discussion focuses on linking shared libraries to an executable, but there is another important consideration: how the libraries themselves are linked. Each ELF object, whether executable or shared library, carries its own list of needed shared libraries. The dynamic linker will recursively include all of these in the list of shared libraries to be loaded (immediately) at program startup, notwithstanding its behavior with respect to lazy binding of symbols referring to functions.
Therefore, if you want an executable not to require a given shared library X, then not only that executable itself but also every shared library it does rely upon must avoid expressing a dependency on X. If some of the shared libs require X when used in conjunction with other programs, then that puts the onus on you to link in all the needed libraries when building those programs (otherwise, you can arrange to link only direct dependencies). You can tell the GNU linker to build shared libraries this way by passing it the --allow-shlib-undefined
flag.
Here is a complete proof of concept:
main.c
int mul(int, int);
int main(void) {
return mul(2, 3);
}
mul.c
int add(int, int);
int mul(int x, int y) {
return x * y;
}
int mul2(int x, int y) {
return add(x, y) * add(x, -y);
}
Makefile
CC = gcc
LD = gcc
CFLAGS = -g -O2 -fPIC -DPIC
LDFLAGS = -Wl,--unresolved-symbols=ignore-in-shared-libs
SHLIB_LDFLAGS = -shared -Wl,--allow-shlib-undefined
all: main
main: main.o libmul.so
$(LD) $(CFLAGS) $(LDFLAGS) -o $@ $^
libmul.so: mul.o
$(LD) $(CFLAGS) $(SHLIB_LDFLAGS) -o $@ $^
clean:
rm -f main main.o libmul.so mul.o
Demo
$ make
gcc -g -O2 -fPIC -DPIC -c -o main.o main.c
gcc -g -O2 -fPIC -DPIC -c -o mul.o mul.c
gcc -g -O2 -fPIC -DPIC -shared -Wl,--allow-shlib-undefined -o libmul.so mul.o
gcc -g -O2 -fPIC -DPIC -Wl,--unresolved-symbols=ignore-in-shared-libs -o main main.o libmul.so
$ LD_LIBRARY_PATH=$(pwd) ./main
$ echo $?
6
$
Note that the -zlazy
linker option discussed in comments is omitted, as it's the default.
What's the difference between -rpath and -L?
You must be reading some outdated copies of the manpages (emphasis added):
-rpath=dir
Add a directory to the runtime library search path. This is used
when linking an ELF executable with shared objects. All -rpath
arguments are concatenated and passed to the runtime linker, which
uses them to locate shared objects at runtime.
vs.
-L searchdir
--library-path=searchdir
Add path searchdir to the list of paths that ld will search for
archive libraries and ld control scripts.
So, -L
tells ld
where to look for libraries to link against when linking. You use this (for example) when you're building against libraries in your build tree, which will be put in the normal system library paths by make install
. --rpath
, on the other hand, stores that path inside the executable, so that the runtime dynamic linker can find the libraries. You use this when your libraries are outside the system library search path.
Difference between relative path and using $ORIGIN as RPATH
If your use $ORIGIN
, the lookup is relative to the directory that contains the executable. If you specifiy a relative directory, it's relative to the current working directory, which is hardly ever what you want.
How can I set an executable's rpath and check its value after building it?
If I didnt miss something here, you are not linking any libs in your build command.
Lets say you want to link libusb.so shared library, which is located in libusb sub-folder of your current folder where is main.cpp.
I will not take any details here, about soname, linkname of lib etc, just to make clear about rpath.
rpath will provide runtime linker path to library, not for linktime, cause even shared library need to be present(accessible) in compile/link time. So, to provide your application loader with possibility to look for needed library in start time, relatively to your app folder, there is $ORIGIN variable, you can see it with readelf but only if you link some library with $ORIGIN in rpath.
Here is example based on your question:
g++ main.cpp -o main -L./libusb -Wl,-rpath,'$ORIGIN/libusb' -lusb
As you see, you need to provide -L directory for compile/link time search, and rpath for runtime linker. Now you will be able to examin all needed libs for your app using readelf and location for search.
Related Topics
Reusing Compression Dictionary
Linux: Changing File Ownership Without a Copy
Installing a Fully Functional Postgis 2.0 on Ubuntu Linux Geos/Gdal Issues
Can You Explain This Sed One-Liner
Lapack/Blas/Openblas Proper Installation from Source - Replace System Libraries with New Ones
Copy File to Backup Directory Preserving Folder Structure
Whole-System Snapshot on Operating System
Using Find But Only in Subdirectories Matching Certain Pattern
Postgres Copy Command, Binary File
How to Include Cutil.H in Linux
Cannot Push to My Github Private Repository
How to Write on Serial Port Using Qextserialport
Remove Strings by a Specific Delimiter
Init Script '/Dev/Tty: No Such Device or Address' Error on Redirect
What Is The Side Effect of Setting Tcp_Max_Tw_Buckets to a Very Small Value
Write Script to Create Multiple Users with Pre-Defined Passwords
How to Automatically Start an Application That Needs X in Linux