Difference Between -Shared and -Wl,-Shared of the Gcc Options

Difference between -shared and -Wl,-shared of the GCC options

There is a difference between passing -shared to gcc or -shared to ld (via -Wl). Passing -shared to GCC may enable or disable other flags at link time. In particular, different crt* files might be involved.

To get more information, grep for -shared in GCC's gcc/config/ directory and subdirectories.

Edit: To give a specific example: on i386 FreeBSD, gcc -shared will link in object file crtendS.o, while without -shared, it will link in crtend.o instead. Thus, -shared and -Wl,-shared are not equivalent.

How does gcc `-shared` option affect the output?

Shared libraries and executables use the same format: they are both loadable images. However,

Shared libraries are usually position-independent, executables are often not. This affects code generation: for position-independent, you have to load globals or jump to functions using relative addresses.
Executables have an "entry point" which is where execution starts. This is usually not main(), because main() is a function, and functions return, but execution should never return from the entry point.

Now, this doesn't answer the question about what -shared does. You can ask GCC by using the -v flag. Here are the differences on my system between an invocation without and with -shared.

Parameters for `collect2` without `-shared`:

-dynamic-linker
/lib64/ld-linux-x86-64.so.2
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crt1.o
/usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o
/usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o

Parameters for `collect2` with `-shared`:

-shared
/usr/lib/gcc/x86_64-linux-gnu/4.7/crtbeginS.o
/usr/lib/gcc/x86_64-linux-gnu/4.7/crtendS.o

Observations

It looks like code generation is not affected: you still have to use -fpic or -fPIC.

You can see that crt1.o (the "C runtime") is only included when linking the executable. Using nm, we can find out what it contains:

$ nm /usr/lib/x86_64-linux-gnu/crt1.o
0000000000000000 R _IO_stdin_used
0000000000000000 D __data_start
                 U __libc_csu_fini
                 U __libc_csu_init
                 U __libc_start_main
0000000000000000 T _start
0000000000000000 W data_start
                 U main

So you can see it seems to define something to do with stdin, as well as _start (which is the entry point), and it has an undefined reference to main.

I'm not sure what the rest of the files are, but at least you know how to find them and you can poke around, or look at the source code if you like.

What's the difference between the -symbolic and -shared GCC flags?

Summary: -symbolic prevents intra-shared object function interposition

Linking with shared objects allows for a feature called symbol interposition. The idea is that you can 'interpose' a new definition of a global symbol so that it is called rather then the 'regular' definition.

One classic example is malloc(). In the most common case, malloc() is defined within libc. But you can interpose your own version of malloc by loading a library that defines that symbol before you load libc (most runtime linkers allow you to use LD_PRELOAD to specific libraries to load before the executable).

By default, any function within a shared object that is not static is a global symbol. Because of that, any functions within the shared object can be interposed on. Consider a scenario where a shared object has function high_level() and low_level() and high_level() calls low_level() as part of it's implementation and neither high_level() nor low_level() are static functions.

It's possible to interpose low_level() such that high_level() is calling a low_level() from a different shared object.

This is where -symbolic comes in. When creating your shared object, the linker will see that low_level() is defined in the same shared object as high_level() and bind the call such that it can't be interposed on. This way, you know that any calls from one function in your shared object to another in the same shared object will never be interposed on.

Is there a difference between the -Wl,option and -Xlinker option syntax for GCC?

From man gcc:

-Xlinker option
Pass option as an option to the linker. You can use this to supply system-specific linker options which GCC does not know how to recognize.

If you want to pass an option that takes a separate argument, you must use -Xlinker twice, once for the option and once for the argument. For example, to pass -assert definitions, you must write -Xlinker -assert -Xlinker definitions. It does not work to write -Xlinker "-assert definitions", because this passes the entire string as a single argument, which is not what the linker expects.

When using the GNU linker, it is usually more convenient to pass arguments to linker options using the option=value syntax than as separate arguments. For example, you can specify -Xlinker -Map=output.map rather than -Xlinker -Map -Xlinker output.map. Other linkers may not support this syntax for command-line options.

-Wl,option
Pass option as an option to the linker. If option contains commas, it is split into multiple options at the commas. You can use this syntax to pass an argument to the option.

For example, -Wl,-Map,output.map passes -Map output.map to the linker. When using the GNU linker, you can also get the same effect with -Wl,-Map=output.map.

As you can se, the only difference is that -Wl allows you to specify multiple arguments by means of a comma, like -Wl,-rpath,/my/libs, which you cannot do with -Xlinker; on the other hand, -Xlinker is maybe a bit more self-descriptive. Take your pick. Also check other compilers (nvcc comes to mind, and clang) to see if any of them agree on the syntax, and then use that for portability if that's important to you.

gcc compilation with linker - differences

-lm will link in libm. By default gcc will search for the shared library first. If the shared version is not found it will then search for the static version.
The trailing -Wl,-Bdynamic is to ensure that the shared version of standard libraries (ie, libc) are used.
-static prevents linking with shared libraries. It can be placed anywhere on the command line and will have the same effect. This is different to -Wl,-Bstatic in that -static applies to linking of all libraries whereas -Wl,-Bstatic only applies to libraries after it in the command line. Please note that -static is also different to -Wl,-static. The former is a gcc driver option and prevents all dynamic linking. The latter is an ld option and is an alias for -Wl,-Bstatic.

I don't understand -Wl,-rpath -Wl,

The -Wl,xxx option for gcc passes a comma-separated list of tokens as a space-separated list of arguments to the linker. So

gcc -Wl,aaa,bbb,ccc

eventually becomes a linker call

ld aaa bbb ccc

In your case, you want to say "ld -rpath .", so you pass this to gcc as -Wl,-rpath,. Alternatively, you can specify repeat instances of -Wl:

gcc -Wl,aaa -Wl,bbb -Wl,ccc

Note that there is no comma between aaa and the second -Wl.

Or, in your case, -Wl,-rpath -Wl,..

What's the difference between `-rpath-link` and `-L`?

Here is a demo, for GNU ld, of the difference between -L and -rpath-link -
and for good measure, the difference between -rpath-link and -rpath.

foo.c

#include <stdio.h>

void foo(void)
{
    puts(__func__);
}

bar.c

#include <stdio.h>

void bar(void)
{
    puts(__func__);
}

foobar.c

extern void foo(void);
extern void bar(void);

void foobar(void)
{
    foo();
    bar();
}

main.c

extern void foobar(void);

int main(void)
{
    foobar();
    return 0;
}

Make two shared libraries, libfoo.so and libbar.so:

$ gcc -c -Wall -fPIC foo.c bar.c
$ gcc -shared -o libfoo.so foo.o
$ gcc -shared -o libbar.so bar.o

Make a third shared library, libfoobar.so that depends on the first two;

$ gcc -c -Wall -fPIC foobar.c
$ gcc -shared -o libfoobar.so foobar.o -lfoo -lbar
/usr/bin/ld: cannot find -lfoo
/usr/bin/ld: cannot find -lbar
collect2: error: ld returned 1 exit status

Oops. The linker doesn't know where to look to resolve -lfoo or -lbar.

The -L option fixes that.

$ gcc -shared -o libfoobar.so foobar.o -L. -lfoo -lbar

The -Ldir option tells the linker that dir is one of the directories to
search for libraries that resolve the -lname options it is given. It searches
the -L directories first, in their commandline order; then it searches its
configured default directories, in their configured order.

Now make a program that depends on libfoobar.so:

$ gcc -c -Wall main.c
$ gcc -o prog main.o -L. -lfoobar
/usr/bin/ld: warning: libfoo.so, needed by ./libfoobar.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libbar.so, needed by ./libfoobar.so, not found (try using -rpath or -rpath-link)
./libfoobar.so: undefined reference to `bar'
./libfoobar.so: undefined reference to `foo'
collect2: error: ld returned 1 exit status

Oops again. The linker detects the dynamic dependencies requested by libfoobar.so
but can't satisfy them. Let's resist its advice - try using -rpath or -rpath-link -
for a bit and see what we can do with -L and -l:

$ gcc -o prog main.o -L. -lfoobar -lfoo -lbar

So far so good. But:

$ ./prog
./prog: error while loading shared libraries: libfoobar.so: cannot open shared object file: No such file or directory

at runtime, the loader can't find libfoobar.so.

What about the linker's advice then? With -rpath-link, we can do:

$ gcc -o prog main.o -L. -lfoobar -Wl,-rpath-link=$(pwd)

and that linkage also succeeds. ($(pwd) means "Print Working Directory" and just "copies" the current path.)

The -rpath-link=dir option tells the linker that when it encounters an input file that
requests dynamic dependencies - like libfoobar.so - it should search directory dir to
resolve them. So we don't need to specify those dependencies with -lfoo -lbar and don't
even need to know what they are. What they are is information already written in the
dynamic section of libfoobar.so:-

$ readelf -d libfoobar.so

Dynamic section at offset 0xdf8 contains 26 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so]
 0x0000000000000001 (NEEDED)             Shared library: [libbar.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 ...
 ...

We just need to know a directory where they can be found, whatever they are.

But does that give us a runnable prog?

$ ./prog
./prog: error while loading shared libraries: libfoobar.so: cannot open shared object file: No such file or directory

No. Same as story as before. That's because -rpath-link=dir gives the linker the information
that the loader would need to resolve some of the dynamic dependencies of prog
at runtime - assuming it remained true at runtime - but it doesn't write that information into the dynamic section of prog.
It just lets the linkage succeed, without our needing to spell out all the recursive dynamic
dependencies of the linkage with -l options.

At runtime, libfoo.so, libbar.so - and indeed libfoobar.so -
might well not be where they are now - $(pwd) - but the loader might be able to locate them
by other means: through the ldconfig cache or a setting
of the LD_LIBRARY_PATH environment variable, e.g:

$ export LD_LIBRARY_PATH=.; ./prog
foo
bar

rpath=dir provides the linker with the same information as rpath-link=dir
and instructs the linker to bake that information into the dynamic section of
the output file. Let's try that:

$ export LD_LIBRARY_PATH=
$ gcc -o prog main.o -L. -lfoobar -Wl,-rpath=$(pwd)
$ ./prog
foo
bar

All good. Because now, prog contains the information that $(pwd) is a runtime search
path for shared libraries that it depends on, as we can see:

$ readelf -d prog

Dynamic section at offset 0xe08 contains 26 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libfoobar.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/home/imk/develop/so/scrap]
 ...                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 ...

That search path will be tried after the directories listed in LD_LIBRARY_PATH, if any are set, and before the system defaults - the ldconfig-ed directories, plus /lib and /usr/lib.

What is the 'soname' option for building shared libraries for?

soname is used to indicate what binary api compatibility your library support.

SONAME is used at compilation time by linker to determine from the library file what actual target library version. gcc -lNAME will seek for libNAME.so link or file then capture its SONAME that will certainly be more specific ( ex libnuke.so links to libnuke.so.0.1.4 that contains SONAME libnuke.so.0 ).

At run time it will link with this is then set into ELF dynamic section NEEDED, then a library with this name ( or a link to it ) should exists.
At Run time SONAME is disregarded, so only the link or the file existence is enough.

Remark: SONAME is enforced only at link/build time and not at run time.

'SONAME' of library can be seen with 'objdump -p file |grep SONAME'.
'NEEDED' of binaries can be seen with 'objdump -p file |grep NEEDED'.

[EDIT] WARNING Following is a general remark, not the one deployed in linux. See at the end.

Let's assume you have a library with libnuke.so.1.2 name and you develop a new libnuke library :

if your new library is a fix from previous without api change, you should just keep same soname, increase the version of filename. ie file will be libnuke.so.1.2.1 but soname will still be libnuke.so.1.2.
if you have a new library that only added new function but didn't break functionality and is still compatible with previous you would like to use same soname than previous plus a new suffix like .1. ie file and soname will be libnuke.so.1.2.1. Any program linked with libnuke.1.2 will still work with that one. New programs linked with libnuke.1.2.1 will only work with that one ( until new subversion come like libnuke.1.2.1.1 ).
if your new library is not compatible with any libnuke : libnuke.so.2
if your new library is compatible with bare old version : libnuke.so.1.3 [ ie still compatible with libnuke.so.1 ]

[EDIT] to complete : linux case.

In linux real life SONAME as a specific form :
lib[NAME][API-VERSION].so.[major-version]
major-version is only one integer value that increase at each major library change.
API-VERSION is empty by default

ex libnuke.so.0

Then real filename include minor versions and subversions ex : libnuke.so.0.1.5

I think that not providing a soname is a bad practice since renaming of file will change its behavior.

Difference Between -Shared and -Wl,-Shared of the Gcc Options