_Attribute_((Weak)) and Static Libraries

attribute((weak)) and static libraries

To explain what's going on here, let's talk first about your original source files, with

a.h (1):

void foo() __attribute__((weak));

and:

a.c (1):

#include "a.h"
#include <stdio.h>

void foo() { printf("%s\n", __FILE__); }

The mixture of .c and .cpp files in your sample code is irrelevant to the
issues, and all the code is C, so we'll say that main.cpp is main.c and
do all compiling and linking with gcc:

$ gcc -Wall -c main.c a.c b.c
ar rcs a.a a.o
ar rcs b.a b.o

First let's review the differences between a weakly declared symbol, like
your:

void foo() __attribute__((weak));

and a strongly declared symbol, like

void foo();

which is the default:

When a weak reference to foo (i.e. a reference to weakly declared foo) is linked in a program, the
linker need not find a definition of foo anywhere in the linkage: it may remain
undefined. If a strong reference to foo is linked in a program,
the linker needs to find a definition of foo.
A linkage may contain at most one strong definition of foo (i.e. a definition
of foo that declares it strongly). Otherwise a multiple-definition error results.
But it may contain multiple weak definitions of foo without error.
If a linkage contains one or more weak definitions of foo and also a strong
definition, then the linker chooses the strong definition and ignores the weak
ones.
If a linkage contains just one weak definition of foo and no strong
definition, inevitably the linker uses the one weak definition.
If a linkage contains multiple weak definitions of foo and no strong
definition, then the linker chooses one of the weak definitions arbitrarily.

Next, let's review the differences between inputting an object file in a linkage
and inputting a static library.

A static library is merely an ar archive of object files that we may offer to
the linker from which to select the ones it needs to carry on the linkage.

When an object file is input to a linkage, the linker unconditionally links it
into the output file.

When static library is input to a linkage, the linker examines the archive to
find any object files within it that provide definitions it needs for unresolved symbol references
that have accrued from input files already linked. If it finds any such object files
in the archive, it extracts them and links them into the output file, exactly as
if they were individually named input files and the static library was not mentioned at all.

With these observations in mind, consider the compile-and-link command:

gcc main.c a.o b.o

Behind the scenes gcc breaks it down, as it must, into a compile-step and link
step, just as if you had run:

gcc -c main.c     # compile
gcc main.o a.o b.o  # link

All three object files are linked unconditionally into the (default) program ./a.out. a.o contains a
weak definition of foo, as we can see:

$ nm --defined a.o
0000000000000000 W foo

Whereas b.o contains a strong definition:

$ nm --defined b.o
0000000000000000 T foo

The linker will find both definitions and choose the strong one from b.o, as we can
also see:

$ gcc main.o a.o b.o -Wl,-trace-symbol=foo
main.o: reference to foo
a.o: definition of foo
b.o: definition of foo
$ ./a.out
b.c

Reversing the linkage order of a.o and b.o will make no difference: there's
still exactly one strong definition of foo, the one in b.o.

By contrast consider the compile-and-link command:

gcc main.cpp a.a b.a

which breaks down into:

gcc -c main.cpp     # compile
gcc main.o a.a b.a  # link

Here, only main.o is linked unconditionally. That puts an undefined weak reference
to foo into the linkage:

$ nm --undefined main.o
                 w foo
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

That weak reference to foo does not need a definition. So the linker will
not attempt to find a definition that resolves it in any of the object files in either a.a or b.a and
will leave it undefined in the program, as we can see:

$ gcc main.o a.a b.a -Wl,-trace-symbol=foo
main.o: reference to foo
$ nm --undefined a.out
                 w __cxa_finalize@@GLIBC_2.2.5
                 w foo
                 w __gmon_start__
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U __libc_start_main@@GLIBC_2.2.5
                 U puts@@GLIBC_2.2.5

Hence:

$ ./a.out
no foo

Again, it doesn't matter if you reverse the linkage order of a.a and b.a,
but this time it is because neither of them contributes anything to the linkage.

Let's turn now to the different behavior you discovered by changing a.h and a.c
to:

a.h (2):

void foo();

a.c (2):

#include "a.h"
#include <stdio.h>

void __attribute__((weak)) foo() { printf("%s\n", __FILE__); }

Once again:

$ gcc -Wall -c main.c a.c b.c
main.c: In function ‘main’:
main.c:4:18: warning: the address of ‘foo’ will always evaluate as ‘true’ [-Waddress]
 int main() { if (foo) foo(); else printf("no foo\n"); }

See that warning? main.o now contains a strongly declared reference to foo:

$ nm --undefined main.o
                 U foo
                 U _GLOBAL_OFFSET_TABLE_

so the code (when linked) must have a non-null address for foo. Proceeding:

$ ar rcs a.a a.o
$ ar rcs b.a b.o

Then try the linkage:

$ gcc main.o a.o b.o
$ ./a.out
b.c

And with the object files reversed:

$ gcc main.o b.o a.o
$ ./a.out
b.c

As before, the order makes no difference. All the object files are linked. b.o provides
a strong definition of foo, a.o provides a weak one, so b.o wins.

Next try the linkage:

$ gcc main.o a.a b.a
$ ./a.out
a.c

And with the order of the libraries reversed:

$ gcc main.o b.a a.a
$ ./a.out
b.c

That does make a difference. Why? Let's redo the linkages with diagnostics:

$ gcc main.o a.a b.a -Wl,-trace,-trace-symbol=foo
/usr/bin/x86_64-linux-gnu-ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
main.o
(a.a)a.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
main.o: reference to foo
a.a(a.o): definition of foo

Ignoring the default libraries, the only object files of ours that get
linked were:

main.o
(a.a)a.o

And the definition of foo was taken from the archive member a.o of a.a:

a.a(a.o): definition of foo

Reversing the library order:

$ gcc main.o b.a a.a -Wl,-trace,-trace-symbol=foo
/usr/bin/x86_64-linux-gnu-ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
main.o
(b.a)b.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
main.o: reference to foo
b.a(b.o): definition of foo

This time the object files linked were:

main.o
(b.a)b.o

And the definition of foo was taken from b.o in b.a:

b.a(b.o): definition of foo

In the first linkage, the linker had an unresolved strong reference to
foo for which it needed a definition when it reached a.a. So it
looked in the archive for an object file that provides a definition,
and found a.o. That definition was a weak one, but that didn't matter. No
strong definition had been seen. a.o was extracted from a.a and linked,
and the reference to foo was thus resolved. Next b.a was reached, where
a strong definition of foo would have been found in b.o, if the linker still needed one
and looked for it. But it didn't need one any more and didn't look. The linkage:

gcc main.o a.a b.a

is exactly the same as:

gcc main.o a.o

And likewise the linkage:

$ gcc main.o b.a a.a

is exactly the same as:

$ gcc main.o b.o

Your real problem...

... emerges in one of your comments to the post:

I want to override [the] original function implementation when linking with a testing framework.

You want to link a program inputting some static library lib1.a
which has some member file1.o that defines a symbol foo, and you want to knock out
that definition of foo and link a different one that is defined in some other object
file file2.o.

__attribute__((weak)) isn't applicable to that problem. The solution is more
elementary. You just make sure to input file2.o to the linkage before you input
lib1.a (and before any other input that provides a definition of foo).
Then the linker will resolve references to foo with the definition provided in file2.o and will not try to find any other
definition when it reaches lib1.a. The linker will not consume lib1.a(file1.o) at all. It might as well not exist.

And what if you have put file2.o in another static library lib2.a? Then inputting
lib2.a before lib1.a will do the job of linking lib2.a(file2.o) before
lib1.a is reached and resolving foo to the definition in file2.o.

Likewise, of course, every definition provided by members of lib2.a will be linked in
preference to a definition of the same symbol provided in lib1.a. If that's not what
you want, then don't like lib2.a: link file2.o itself.

Finally

Is it possible to use [the] weak attribute with static linking at all?

Certainly. Here is a first-principles use-case:

foo.h (1)

#ifndef FOO_H
#define FOO_H

int __attribute__((weak)) foo(int i)
{
    return i != 0;
}

#endif

aa.c

#include "foo.h"

int a(void)
{
    return foo(0);
}

bb.c

#include "foo.h"

int b(void)
{
    return foo(42);
}

prog.c

#include <stdio.h>

extern int a(void);
extern int b(void);

int main(void)
{
    puts(a() ? "true" : "false");
    puts(b() ? "true" : "false");
    return 0;
}

Compile all the source files, requesting a seperate ELF section for each function:

$ gcc -Wall -ffunction-sections -c prog.c aa.c bb.c

Note that the weak definition of foo is compiled via foo.h into both
aa.o and bb.o, as we can see:

$ nm --defined aa.o
0000000000000000 T a
0000000000000000 W foo
$ nm --defined bb.o
0000000000000000 T b
0000000000000000 W foo

Now link a program from all the object files, requesting the linker to
discard unused sections (and give us the map-file, and some diagnostics):

$ gcc prog.o aa.o bb.o -Wl,--gc-sections,-Map=mapfile,-trace,-trace-symbol=foo
/usr/bin/x86_64-linux-gnu-ld: mode elf_x86_64
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o
prog.o
aa.o
bb.o
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/lib/x86_64-linux-gnu/libc.so.6
(/usr/lib/x86_64-linux-gnu/libc_nonshared.a)elf-init.oS
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
libgcc_s.so.1 (/usr/lib/gcc/x86_64-linux-gnu/7/libgcc_s.so.1)
/usr/lib/gcc/x86_64-linux-gnu/7/crtendS.o
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/crtn.o
aa.o: definition of foo

This linkage is no different from:

$ ar rcs libaabb.a aa.o bb.o
$ gcc prog.o libaabb.a

Despite the fact that both aa.o and bb.o were loaded, and each contains
a definition of foo, no multiple-definition error results, because each definition
is weak. aa.o was loaded before bb.o and the definition of foo was linked from aa.o.

So what happened to the definition of foo in bb.o? The mapfile shows us:

mapfile (1)

...
...
Discarded input sections
...
...
 .text.foo      0x0000000000000000       0x13 bb.o
...
...

The linker discarded the function section that contained the definition
in bb.o

Let's reverse the linkage order of aa.o and bb.o:

$ gcc prog.o bb.o aa.o -Wl,--gc-sections,-Map=mapfile,-trace,-trace-symbol=foo
...
prog.o
bb.o
aa.o
...
bb.o: definition of foo

And now the opposite thing happens. bb.o is loaded before aa.o. The
definition of foo is linked from bb.o and:

mapfile (2)

...
...
Discarded input sections
...
...
 .text.foo      0x0000000000000000       0x13 aa.o
...
...

the definition from aa.o is chucked away.

There you see how the linker arbitrarily chooses one of multiple
weak definitions of a symbol, in the absence of a strong definition. It simply
picks the first one you give it and ignores the rest.

What we've just done here is effectively what the GCC C++ compiler does for us when we
define a global inline function. Rewrite:

foo.h (2)

#ifndef FOO_H
#define FOO_H

inline int foo(int i)
{
    return i != 0;
}

#endif

Rename our source files *.c -> *.cpp; compile and link:

$ g++ -Wall -c prog.cpp aa.cpp bb.cpp

Now there is a weak definition of foo (C++ mangled) in each of aa.o and bb.o:

$ nm --defined aa.o bb.o

aa.o:
0000000000000000 T _Z1av
0000000000000000 W _Z3fooi

bb.o:
0000000000000000 T _Z1bv
0000000000000000 W _Z3fooi

The linkage uses the first definition it finds:

$ g++ prog.o aa.o bb.o -Wl,-Map=mapfile,-trace,-trace-symbol=_Z3fooi
...
prog.o
aa.o
bb.o
...
aa.o: definition of _Z3fooi
bb.o: reference to _Z3fooi

and throws away the other one:

mapfile (3)

...
...
Discarded input sections
...
...
 .text._Z3fooi  0x0000000000000000       0x13 bb.o
...
...

And as you may know, every instantiation of the C++ function template in
global scope (or instantiation of a class template member function) is
an inline global function. Rewrite again:

#ifndef FOO_H
#define FOO_H

template<typename T>
T foo(T i)
{
    return i != 0;
}

#endif

Recompile:

$ g++ -Wall -c prog.cpp aa.cpp bb.cpp

Again:

$ nm --defined aa.o bb.o

aa.o:
0000000000000000 T _Z1av
0000000000000000 W _Z3fooIiET_S0_

bb.o:
0000000000000000 T _Z1bv
0000000000000000 W _Z3fooIiET_S0_

each of aa.o and bb.o has a weak definition of:

$ c++filt _Z3fooIiET_S0_
int foo<int>(int)

and the linkage behaviour is now familiar. One way:

$ g++ prog.o aa.o bb.o -Wl,-Map=mapfile,-trace,-trace-symbol=_Z3fooIiET_S0_
...
prog.o
aa.o
bb.o
...
aa.o: definition of _Z3fooIiET_S0_
bb.o: reference to _Z3fooIiET_S0_

and the other way:

$ g++ prog.o bb.o aa.o -Wl,-Map=mapfile,-trace,-trace-symbol=_Z3fooIiET_S0_
...
prog.o
bb.o
aa.o
...
bb.o: definition of _Z3fooIiET_S0_
aa.o: reference to _Z3fooIiET_S0_

Our program's behavior is unchanged by the rewrites:

$ ./a.out
false
true

So the application of the weak attribute to symbols in the linkage of ELF objects -
whether static or dynamic - enables the GCC implementation of C++ templates
for the GNU linker. You could fairly say it enables the GCC implementation of modern C++.

Why does weak attribute in gcc work differently in static library than for object files?

The purpose of weak is (in my mind) so the user can overwrite a function with their own when it is needed (or for the language internal usage, e.g., for inline),

Yes and no. I would narrow that a bit and change the focus: the purpose of weak symbols is so that a library can provide default implementations of certain functions that others of the library's functions use, yet also allow programs using that library to substitute their own implementations.

Moreover, although that does not preclude use by static libraries, it is intended primarily for dynamic libraries, because you don't need weak symbols for that purpose when you are linking static libraries. Shared libraries, on the other hand, need to be built differently and have support from the dynamic linker to allow such substitutions.

but when a library creator overwrites a function inside of the same (or underlying) archive it is not overwritten, the linker simply chooses the first definition it sees.

And? The creator of a static library has control of its contents, including their order. If they mean to provide a particular strong definition of a given symbol, then it is at minimum wasteful to put any other external definitions of that symbol into the same library.

Yes, a static linker could nevertheless cover for library authors on that score, but this scenario is not the one that weak symbols were designed to support, so why should linker maintainers be expected to expend work to implement that, and commit to maintaining it indefinitely?

Which is not consistent with the "flat" (libraryless) structure.

That depends on how you perform the link in the "flat" case. Also, linking a collection of (only) object files into a complete program is still not the scenario that weak symbols were designed to support.

why did they make it that way?

You would have to ask the designers if you want an authoritative answer, but my take is that the authors and maintainers of the GCC toolchain decided that the static link behavior you are asking about was a reasonable compromise between the complexity of the tools and the ideal semantics of weak symbols. They were likely influenced in this area by the SUN tools, which may effectively have been where the decision actually was made.

The chosen details serve the intended purpose for weak symbols just fine, and I am fairly confident that one library containing both weak and strong definitions of the same symbol was not among the use cases that guided the the tools' design. That the behavior of linking object files directly differs from that of linking them from a static library could be considered inconsistent, but that's a question of your mental model of the process. In any case, I am inclined to think that the question asked at the time was "what is the user most likely to mean?", and by no means is it clear that the answer to that would be the same for linking multiple object files (presumably all provided by the program / library author) as for linking a static library (frequently provided by a third-party).

Weak-linking with static libraries

While I don't know the exact workings of weak symbols it looks like you are getting what you ask for: if no one else is forcing the weakFunction() to be present, main() won't either. To me this makes sense: if you are trying to write code which works with facility X present as well as without it, then you don't want your code to force X to be included in your build at all costs. It looks like "weak" is meant to ask if something is present, not to request that something is present.
Maybe you can force inclusion of weak symbols with "-u weakFunction" as linker option in your case.

How to make gcc link strong symbol in static library to overwrite weak symbol?

Generally speaking: if you don't put a weak implementation into your main, the linker will resolve it at last at runtime. But if you implement it in main.c, you will only be able to override it with a strong bound (bar.c) when linking this static.

Please read http://www.bottomupcs.com/libraries_and_the_linker.html - it contains a lot of interesting stuff on this topic.

I've made a test myself:

bar.c

#include <stdio.h>

void bar()
{
        puts("bar.c: i'm the strong bar()");
}

baz.c

#include <stdio.h>

void __attribute__((weak)) bar() 
{
        puts("baz.c: i'm the weak bar()");
}

main.c

#include <stdio.h>

#ifdef V2
        void __attribute__((weak)) bar()
        {
                puts("main: i'm the build in weak bar()");
        }
#else
        void __attribute__((weak)) bar();
#endif

int main()
{
    bar();
    return 0;
}

My Makefile:

all:
    gcc -c -o bar.o bar.c
    gcc -shared -fPIC -o libbar.so bar.o
    gcc -c -o baz.o baz.c
    gcc -shared -fPIC -o libbaz.so baz.o
    gcc -o main1 main.c -L. -lbar -lbaz
    gcc -o main2 main.c -L. -lbaz -lbar
    LD_LIBRARY_PATH=. ./main1                                   # => bar.c
    LD_LIBRARY_PATH=. ./main2                                   # => baz.c
    LD_LIBRARY_PATH=. LD_PRELOAD=libbaz.so ./main1              # => baz.c (!!)
    LD_LIBRARY_PATH=. LD_PRELOAD=libbaz.so ./main2              # => baz.c
    gcc -o main3 main.c bar.o baz.o
    gcc -o main4 main.c baz.o bar.o
    ./main3                                                     # => bar.c
    ./main4                                                     # => bar.c
    gcc -DV2 -o main5 main.c -L. -lbar -lbaz
    gcc -DV2 -o main6 main.c -L. -lbaz -lbar
    LD_LIBRARY_PATH=. ./main5                                   # => main's implementation
    LD_LIBRARY_PATH=. ./main6                                   # => main's implementation
    gcc -DV2 -o main7 main.c -L. -lbar -lbaz
    gcc -DV2 -o main8 main.c -L. -lbaz -lbar
    LD_LIBRARY_PATH=. LD_PRELOAD=libbaz.so ./main7              # => main's implementation
    LD_LIBRARY_PATH=. LD_PRELOAD=libbaz.so ./main8              # => main's implementation
    gcc -DV2 -o main9  main.c -L. -lbar -lbaz
    gcc -DV2 -o main10 main.c -L. -lbaz -lbar
    LD_LIBRARY_PATH=. LD_PRELOAD=libbar.so ./main9              # => main's implementation
    LD_LIBRARY_PATH=. LD_PRELOAD=libbar.so ./main10             # => main's implementation
    gcc -c bar.c
    gcc -c baz.c
    gcc -o main11 main.c bar.o baz.o
    gcc -o main12 main.c baz.o bar.o
    ./main11                                                    # => bar.c
    ./main12                                                    # => bar.c
    gcc -o main13 -DV2 main.c bar.o baz.o
    gcc -o main14 -DV2 main.c baz.o bar.o
    ./main13                                                    # => bar.c
    ./main14                                                    # => bar.c

Take a look at main1 && main2... if you don't put any weak implementation into main.c but keep the weak one in a library and the strong one in another lib., you'll be able to override the weak one if the strong lib defines a strong implementation of bar().

How exactly does attribute((constructor)) work?

It runs when a shared library is loaded, typically during program startup.
That's how all GCC attributes are; presumably to distinguish them from function calls.
GCC-specific syntax.
Yes, this works in C and C++.
No, the function does not need to be static.
The destructor runs when the shared library is unloaded, typically at program exit.

So, the way the constructors and destructors work is that the shared object file contains special sections (.ctors and .dtors on ELF) which contain references to the functions marked with the constructor and destructor attributes, respectively. When the library is loaded/unloaded the dynamic loader program (ld.so or somesuch) checks whether such sections exist, and if so, calls the functions referenced therein.

Come to think of it, there is probably some similar magic in the normal static linker so that the same code is run on startup/shutdown regardless if the user chooses static or dynamic linking.

Why the weak symbol defined in the same .a file but different .o file is not used as fall back?

these subtle behaviors

There isn't really anything subtle here.

A weak definition means: use this symbol unless another strong definition is also present, in which case use the other symbol.

Normally two same-named symbols result in a multiply-defined link error, but when all but one definitions are weak, no multiply-defined error is produced.
A weak (unresolved) reference means: don't consider this symbol when deciding whether to pull an object which defines this symbol out of archive library or not (an object may still be pulled in if it satisfies a different strong undefined symbol).

Normally if the symbol is unresolved after all objects are selected, the linker will report unresolved symbol error. But if the unresolved symbol is weak, the error is suppressed.

That's really all there is to it.

Update:

You are repeating incorrect understanding in comments.

What makes me feel subtle is, for a weak reference, the linker doesn't pull an object from an archive library, but still check a standalone object file.

This is entirely consistent with the answer above. When a linker deals with archive library, it has to make a decision: to select contained foo.o into the link or not. It is that decision that is affected by the type of reference.

When bar.o is given on the link line as a "standalone object file", the linker makes no decisions about it -- bar.o will be selected into the link.

And if that object happens to contain a definition for the weak reference, will the weak reference be also resolved by the way?

Yes.

Even the weak attribute tells the linker not to.

This is the apparent root of misunderstanding: the weak attribute doesn't tell the linker not to resolve the reference; it only tells the linker (pardon repetition) "don't consider this symbol when deciding whether to pull an object which defines this symbol out of archive library".

I think it's all about whether or not an object containing a definition for that weak reference is pulled in for linking.

Correct.

Be it a standalone object or from an archive lib.

Wrong: a standalone object is always selected into the link.

_Attribute_((Weak)) and Static Libraries