Convert a Static Library to a Shared Library (Create Libsome.So from Libsome.A): Where's My Symbols

Assuming you're using the GNU linker, you need to specify the --whole-archive option so that you get all the contents of the static archive. Since that's a linker option, you'll need -Wl to tell gcc to pass it through to the linker:

g++ -std=c++98 -fpic -g -O1 -shared -o libsome.so -Wl,--whole-archive libsome.a

If you were doing something more complicated, where you want all of libsome.a but only the parts of libsupport.a that libsome needs, you would turn whole-archive off again after using it on libsome.a:

... -Wl,--whole-archive libsome.a -Wl,--no-whole-archive libsupport.a

If you're not using the GNU linker, you'll need to see if your linker supports it and what it's called. On the Sun linker, it's called -z allextract and -z defaultextract.
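
To see why the option matters, here is a toy model in Python of how a linker chooses members from a static archive. It is only a sketch: the function select_members and the member and symbol names are made up for illustration, and a real linker does a great deal more. By default only members that define a currently undefined symbol are extracted, so a -shared link that names nothing but the archive extracts nothing.

```python
# Toy model of how a linker picks members out of a static archive.
# Each member is modelled as (name, defined_symbols, undefined_symbols).
# Illustration only: select_members and all names below are hypothetical.

def select_members(archive, undefined, whole_archive=False):
    """Return the names of the archive members that would be linked.

    archive:       list of (member_name, defines, needs) tuples
    undefined:     symbols still undefined when the archive is reached
    whole_archive: if True, take every member (like --whole-archive)
    """
    if whole_archive:
        return [name for name, _, _ in archive]

    wanted = set(undefined)
    defined = set()
    chosen = []
    progress = True
    while progress:  # rescan until no further member gets pulled in
        progress = False
        for name, defines, needs in archive:
            if name in chosen:
                continue
            if wanted & set(defines):
                chosen.append(name)
                defined |= set(defines)
                wanted |= set(needs)
                wanted -= defined
                progress = True
    return chosen


# Hypothetical contents of libsome.a
libsome = [
    ("foo.o", {"foo"}, {"helper"}),
    ("bar.o", {"bar"}, set()),
    ("helper.o", {"helper"}, set()),
]

# Nothing is undefined yet, so a default archive search extracts nothing.
print(select_members(libsome, undefined=set()))
# With whole-archive behaviour every member is taken.
print(select_members(libsome, undefined=set(), whole_archive=True))
# A reference to foo pulls in foo.o and, through it, helper.o.
print(select_members(libsome, undefined={"foo"}))
```

The second call corresponds to libsome.a in the command above; the third corresponds to how libsupport.a would be treated after --no-whole-archive.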

Convert a Static Library to a Shared Library?

Does this (with appropriate -L's of course)

gcc -shared -o megalib.so foo.o bar.o -la_static_lib -lb_static_lib

Not do it?

Keep all exported symbols when creating a shared library from a static library

What you observe results when some of the global symbol definitions in some of
the object files archived in libxxx.a were compiled with the function attribute
or variable attribute visibility("hidden").

This attribute has the effect that when the object file containing
the global symbol definition is linked into a shared library:

  • The linkage of the symbol is changed from global to local in the static symbol table (.symtab) of the output shared library,
    so that when that shared library is linked with anything else, the linker cannot see the definition of the symbol.
  • The symbol definition is not added to the dynamic symbol table (.dynsym) of the output shared library (which by default it would be)
    so that when the shared library is loaded into a process, the loader is likewise unable to find a definition of the symbol.

In short, the global symbol definition in the object file is hidden for the purposes of dynamic linkage.

Check this out with:

$ readelf -s libxxx.a | grep HIDDEN

and I expect you to get hits for the unexported global symbols. If you don't,
you need read no further because I have no other explanation of what you see
and wouldn't count on any workaround I suggested not to shoot you in the foot.
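
If you would rather get a list of names than grep output, the same check can be sketched in Python. This assumes the usual column layout of readelf -s (Num, Value, Size, Type, Bind, Vis, Ndx, Name); the function name hidden_globals and the sample text are mine, not part of any tool.

```python
import re

# Matches the symbol rows printed by `readelf -s`, for example:
#   10: 0000000000000000    19 FUNC    GLOBAL HIDDEN     1 bb
# Captured groups: type, binding, visibility, section index, name.
ROW = re.compile(
    r"^\s*\d+:\s+[0-9a-fA-F]+\s+\S+\s+(\w+)\s+(\w+)\s+(\w+)\s+(\S+)\s+(\S+)")


def hidden_globals(readelf_output):
    """Return names of defined symbols with GLOBAL binding and HIDDEN visibility."""
    names = []
    for line in readelf_output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # table headers, blank lines, rows without a name
        _type, bind, vis, ndx, name = match.groups()
        if bind == "GLOBAL" and vis == "HIDDEN" and ndx != "UND":
            names.append(name)
    return names


# Hypothetical sample in the layout readelf prints for an object file
SAMPLE = """\
Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
    10: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 aa
    11: 0000000000000000    19 FUNC    GLOBAL HIDDEN     1 bb
"""

print(hidden_globals(SAMPLE))
```

In practice you would feed it the captured output of readelf -s libxxx.a, for instance via subprocess.run(..., capture_output=True, text=True).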

Here is an illustration:

a.c

#include <stdio.h>

void aa(void)
{
    puts(__func__);
}

b.c

#include <stdio.h>

void __attribute__((visibility("hidden"))) bb(void)
{
    puts(__func__);
}

de.c

#include <stdio.h>

void __attribute__((visibility("default"))) dd(void)
{
    puts(__func__);
}

void ee(void)
{
    puts(__func__);
}

We'll compile a.c and b.c like so:

$ gcc -Wall -c a.c b.c

And we can see that symbols aa and bb are defined and global in their respective object files:

$ nm --defined-only a.o b.o

a.o:
0000000000000000 T aa
0000000000000000 r __func__.2361

b.o:
0000000000000000 T bb
0000000000000000 r __func__.2361

But we can also observe this difference:

$ readelf -s a.o

Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
...
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 aa
...

as compared with:

$ readelf -s b.o

Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
...
10: 0000000000000000 19 FUNC GLOBAL HIDDEN 1 bb
...

aa is a GLOBAL symbol with DEFAULT visibility and bb is a GLOBAL
symbol with HIDDEN visibility.

We'll compile de.c differently:

$ gcc -Wall -fvisibility=hidden -c de.c

Here, we're instructing the compiler that any symbol shall be given hidden
visibility unless a countervailing visibility attribute is specified for
it in the source code. And accordingly we see:

$ readelf -s de.o

Symbol table '.symtab' contains 15 entries:
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
...
11: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 dd
...
14: 0000000000000013 19 FUNC GLOBAL HIDDEN 1 ee

Archiving these object files in a static library changes them in no way:

$ ar rcs libabde.a a.o b.o de.o

And then if we link all of them into a shared library:

$ gcc -o libabde.so -shared -Wl,--whole-archive libabde.a -Wl,--no-whole-archive

we find that:

$ readelf -s libabde.so | egrep '(aa|bb|dd|ee|Symbol table)'
Symbol table '.dynsym' contains 8 entries:
6: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
7: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
Symbol table '.symtab' contains 59 entries:
45: 0000000000001118 19 FUNC LOCAL DEFAULT 12 bb
51: 000000000000113e 19 FUNC LOCAL DEFAULT 12 ee
54: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
56: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd

bb and ee, which were GLOBAL with HIDDEN visibility in the object files,
are LOCAL in the static symbol table of libabde.so and are absent altogether
from its dynamic symbol table.
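
The same comparison can be automated. The following Python sketch splits readelf -s output into its tables and reports functions and objects that are defined in .symtab but missing from .dynsym. The function names are hypothetical, and the approach assumes the library has not been stripped (otherwise there is no .symtab to compare against). Note that it cannot tell a hidden global from an ordinary static function: both end up LOCAL.

```python
import re

TABLE = re.compile(r"^Symbol table '([^']+)'")
# Captured groups: type, section index, name.
ROW = re.compile(
    r"^\s*\d+:\s+[0-9a-fA-F]+\s+\S+\s+(\w+)\s+\w+\s+\w+\s+(\S+)\s+(\S+)")


def symbols_by_table(readelf_output):
    """Map each symbol table name to the set of FUNC/OBJECT symbols it defines."""
    tables = {}
    current = None
    for line in readelf_output.splitlines():
        header = TABLE.match(line)
        if header:
            current = tables.setdefault(header.group(1), set())
            continue
        row = ROW.match(line)
        if row and current is not None:
            sym_type, ndx, name = row.groups()
            if sym_type in ("FUNC", "OBJECT") and ndx != "UND":
                current.add(name.split("@")[0])  # drop any @VERSION suffix
    return tables


def defined_but_not_exported(readelf_output):
    """Names present in .symtab but absent from .dynsym, sorted."""
    tables = symbols_by_table(readelf_output)
    return sorted(tables.get(".symtab", set()) - tables.get(".dynsym", set()))


# The filtered readelf output for libabde.so shown above
OUTPUT = """\
Symbol table '.dynsym' contains 8 entries:
6: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
7: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
Symbol table '.symtab' contains 59 entries:
45: 0000000000001118 19 FUNC LOCAL DEFAULT 12 bb
51: 000000000000113e 19 FUNC LOCAL DEFAULT 12 ee
54: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
56: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
"""

print(defined_but_not_exported(OUTPUT))
```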

In this light, you may wish to re-evaluate your mission:

The symbols that have been given hidden visibility in the object files in libxxx.a have
been hidden because the person who compiled them had a reason for
wishing to conceal them from dynamic linkage. Do you have a countervailing need
to export them for dynamic linkage? Or do you maybe just want to export them because
you've noticed that they're not exported and don't know why not?

If you nonetheless want to unhide the hidden symbols, and cannot change the source code
of the object files archived in libxxx.a, your least worst resort is to:

  • Extract each object file from libxxx.a
  • Doctor it to replace HIDDEN with DEFAULT visibility on its global definitions
  • Put it into a new archive libyyy.a
  • Then use libyyy.a instead of libxxx.a.

The binutils tool for doctoring object files is objcopy.
But objcopy has no operations to directly manipulate the dynamic visibility of
a symbol and you'd have to settle for a circuitous kludge that "achieves the effect
of" unhiding the hidden symbols:

  • With objcopy --redefine-sym, rename each hidden global symbol S as, say, __hidden__S.
  • With objcopy --add-symbol, add a new global symbol S that has the same value as __hidden__S
    but gets DEFAULT visibility by default.

ending up with two symbols with the same definition: the original hidden one
and a new unhidden alias for it.
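
For concreteness, here is a Python sketch that only builds the two objcopy command lines for one symbol; it does not run them. It assumes the symbol is a function, that you have read its section name and value from readelf -s yourself, and that your objcopy accepts the global and function flags for --add-symbol. The helper name alias_commands is hypothetical, and the size of the original symbol is not carried over to the alias.

```python
def alias_commands(objfile, symbol, section, value, prefix="__hidden__"):
    """Return two objcopy argument lists implementing the rename-and-alias kludge.

    objfile: object file to modify in place
    symbol:  name of the hidden global function, e.g. "bb"
    section: section that holds the definition, e.g. ".text"
    value:   the symbol's value (offset within that section)
    """
    renamed = prefix + symbol
    return [
        # Step 1: move the hidden definition out of the way.
        ["objcopy", "--redefine-sym", "{}={}".format(symbol, renamed), objfile],
        # Step 2: add a fresh global symbol with the old name and the same value.
        ["objcopy", "--add-symbol",
         "{}={}:{:#x},global,function".format(symbol, section, value), objfile],
    ]


# bb in b.o from the illustration: value 0 in .text
for argv in alias_commands("b.o", "bb", ".text", 0):
    print(" ".join(argv))
```

Each list can be passed to subprocess.run(argv, check=True); the two steps are kept as separate invocations so the rename is finished before the alias is added.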

Preferable to that would be a means of simply and solely changing the visibility of a symbol in
an ELF object file, and a means is to hand in the LIEF library (Library to Instrument
Executable Formats), a Swiss Army chainsaw for object and executable file alterations [1].

Here is a Python script that calls on pylief, the LIEF Python module, to unhide the
hidden globals in an ELF object file:

unhide.py

#!/usr/bin/python
# unhide.py - Replace hidden with default visibility on global symbols defined
#   in an ELF object file

import argparse, sys, lief
from lief.ELF import SYMBOL_BINDINGS, SYMBOL_VISIBILITY, SYMBOL_TYPES

def warn(msg):
    sys.stderr.write("WARNING: " + msg + "\n")

def unhide(objfile_in, objfile_out = None, namedsyms=None):
    if not objfile_out:
        objfile_out = objfile_in
    binary = lief.parse(objfile_in)
    allsyms = { sym.name for sym in binary.symbols }
    selectedsyms = set()
    # nasyms: symbols that are not applicable, i.e. not hidden global definitions
    nasyms = { sym.name for sym in binary.symbols if \
                            sym.type == SYMBOL_TYPES.NOTYPE or \
                            sym.binding != SYMBOL_BINDINGS.GLOBAL or \
                            sym.visibility != SYMBOL_VISIBILITY.HIDDEN }
    if namedsyms:
        namedsyms = set(namedsyms)
        nosyms = namedsyms - allsyms
        for nosym in nosyms:
            warn("No symbol " + nosym + " in " + objfile_in + ": ignored")
        for sym in namedsyms & nasyms:
            warn("Input symbol " + sym + \
                " is not a hidden global symbol defined in " + objfile_in + \
                ": ignored")
        selectedsyms = namedsyms - nosyms
    else:
        selectedsyms = allsyms

    selectedsyms -= nasyms
    unhidden = 0
    for sym in binary.symbols:
        if sym.name in selectedsyms:
            sym.visibility = SYMBOL_VISIBILITY.DEFAULT
            unhidden += 1
            print("Unhidden: " + sym.name)
    print("{} symbols were unhidden".format(unhidden))
    binary.write(objfile_out)

def get_args():
    parser = argparse.ArgumentParser(
        description="Replace hidden with default visibility on " + \
            "global symbols defined in an ELF object file.")
    parser.add_argument("ELFIN",help="ELF object file to read")
    parser.add_argument("-s","--symbol",metavar="SYMBOL",action="append",
        help="Unhide SYMBOL. " + \
            "If unspecified, unhide all hidden global symbols defined in ELFIN")
    parser.add_argument("--symfile",
        help="File of whitespace-delimited symbols to unhide")
    parser.add_argument("-o","--out",metavar="ELFOUT",
        help="ELF object file to write. If unspecified, rewrite ELFIN")
    return parser.parse_args()

def main():
    args = get_args()
    objfile_in = args.ELFIN
    objfile_out = args.out
    symlist = args.symbol
    if not symlist:
        symlist = []
    symfile = args.symfile
    if symfile:
        with open(symfile,"r") as fh:
            symlist += [word for line in fh for word in line.split()]
    unhide(objfile_in,objfile_out,symlist)

main()

Usage:

$ ./unhide.py -h
usage: unhide.py [-h] [-s SYMBOL] [--symfile SYMFILE] [-o ELFOUT] ELFIN

Replace hidden with default visibility on global symbols defined in an ELF
object file.

positional arguments:
  ELFIN                 ELF object file to read

optional arguments:
  -h, --help            show this help message and exit
  -s SYMBOL, --symbol SYMBOL
                        Unhide SYMBOL. If unspecified, unhide all hidden
                        global symbols defined in ELFIN
  --symfile SYMFILE     File of whitespace-delimited symbols to unhide
  -o ELFOUT, --out ELFOUT
                        ELF object file to write. If unspecified, rewrite
                        ELFIN

And here is a shell script:

unhide.sh

#!/bin/bash

OLD_ARCHIVE=$1
NEW_ARCHIVE=$2
# Member names are assumed to contain no whitespace
OBJS=$(ar t "$OLD_ARCHIVE")
for obj in $OBJS; do
    rm -f "$obj"
    ar xv "$OLD_ARCHIVE" "$obj"
    ./unhide.py "$obj"
done
rm -f "$NEW_ARCHIVE"
ar rcs "$NEW_ARCHIVE" $OBJS
echo "$NEW_ARCHIVE made"

that takes:

  • $1 = Name of an existing static library
  • $2 = Name for a new static library

and creates $2 containing the object files from $1, each modified
with unhide.py to unhide all of its hidden global definitions.

Back with our illustration, we can run:

$ ./unhide.sh libabde.a libnew.a
x - a.o
0 symbols were unhidden
x - b.o
Unhidden: bb
1 symbols were unhidden
x - de.o
Unhidden: ee
1 symbols were unhidden
libnew.a made

and confirm that worked with:

$ readelf -s libnew.a | grep HIDDEN; echo Done
Done
$ readelf -s libnew.a | egrep '(aa|bb|dd|ee)'
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 aa
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 bb
11: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 dd
14: 0000000000000013 19 FUNC GLOBAL DEFAULT 1 ee

Finally, if we relink the shared library with the new archive:

$ gcc -o libabde.so -shared -Wl,--whole-archive libnew.a -Wl,--no-whole-archive

all of the global symbols from the archive are exported:

$ readelf --dyn-syms libabde.so | egrep '(aa|bb|dd|ee)'
6: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
7: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
8: 0000000000001118 19 FUNC GLOBAL DEFAULT 12 bb
9: 000000000000113e 19 FUNC GLOBAL DEFAULT 12 ee

[1] Download C/C++/Python libraries

Debian/Ubuntu provides the C/C++ dev package lief-dev.

Creating both static and shared C++ libraries

The common approach that I've seen is, in fact, compiling your source twice, once with PIC and once without. If you don't do that, you either wind up with PIC overhead in the static library, or a shared object that can't be relocated by the OS (effectively meaning it's NOT shared across multiple clients of the library).

In compiling C++, how to make a .a file to .so?

If you are building the library yourself and using CMake (as I guess from your question), it will be defined something like this:

add_library(name-of-library
    source1.cpp
    source2.cpp
)

You can add the type of library you want to build after the name of the library. It can be STATIC or SHARED. So if you want to build a shared library (.so), then the above should be transformed like this:

add_library(name-of-library SHARED
    source1.cpp
    source2.cpp
)

Hope this helps.

undefined reference to function in shared library created directly from static library

I have figured it out. My feeling was correct -- it is a stupid mistake:

The declaration of the sim() function in main.c needs to be put into an extern "C" block, since I am using g++ as the compiler. Without it, g++ looks for a C++-mangled name that the library does not define.

extern "C" {
    extern uint64_t sim(uint64_t *a1, uint64_t *a2, uint64_t *a3, uint64_t len);
}
int main(int argc, char **argv)
{
    ... // preparing a1, a2, a3, len
    uint64_t act_sum = sim(a1, a2, a3, len);
    ...
}

The linker error is gone after this change.

Another shorter declaration that works:

extern "C" uint64_t sim(uint64_t *a1, uint64_t *a2, uint64_t *a3, uint64_t len);

