How to Convert a Dynamically Linked Application to a Statically Linked One

How can I convert a dynamically linked application to a statically linked one?

It is theoretically possible. You basically have to do the same job that the dynamic linker does, with some modifications, i.e.

  • dump all sections from the original file
  • resolve symbols
  • locate libraries
  • instead of loading them into memory, assemble them into a "virtual image"
  • resolve internal links
  • dump the whole thing into an independent file.

So objdump, readelf, and objcopy will be some of your friends.
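Before attempting any of this, it can help to confirm what the binary actually asks of the dynamic linker. The following is a minimal sketch, not a replacement for readelf: it assumes a 64-bit little-endian ELF file and only reports whether the program header table contains PT_INTERP (a requested dynamic linker) or PT_DYNAMIC (dynamic linking information). The function names are made up for illustration.

```python
import struct

PT_DYNAMIC = 2  # segment holding dynamic linking information
PT_INTERP = 3   # segment holding the path of the requested dynamic linker


def program_header_types(data: bytes) -> list:
    """Return the p_type of every program header in a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        # Assumption of this sketch: ELFCLASS64 and ELFDATA2LSB only.
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at byte 32, e_phentsize at byte 54, e_phnum at byte 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    # p_type is the first 4-byte field of each program header entry.
    return [
        struct.unpack_from("<I", data, e_phoff + i * e_phentsize)[0]
        for i in range(e_phnum)
    ]


def dynamic_markers(data: bytes) -> dict:
    """Report which dynamic-linking program headers are present."""
    types = program_header_types(data)
    return {"interp": PT_INTERP in types, "dynamic": PT_DYNAMIC in types}
```

Calling dynamic_markers(open(path, "rb").read()) on an ordinary dynamically linked executable should report interp as True; readelf -l shows the same program headers in far more detail.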

The task is not easy and the result will be neither automatic, nor (probably) stable.

You may want to check out this code by someone who tried the same thing by actually intercepting the dynamic linker (i.e. performing all the steps above except the last) and dumping the results to disk.

It is based on this tool, so it's anyone's bet whether it works on the newest kernels.

(It probably doesn't - and you need at least to patch it to reflect the new structures. This is my attempt at doing so. Caveat emptor).

Combine Dynamically Linked Libraries into one Statically Linked Library

You have a few options:

  1. Remove the dependencies. A lot of projects have dependencies that they do not really need.

  2. Ship the dynamically linked libraries you depend on, with your own built executables or libraries.

  3. Use a package manager to provide your dependencies so users don't have to build them from sources.

Unfortunately, there is no easy way to turn dynamically linked libraries into static ones. If you really want to try it (knowing it probably won't work out), see: How can I convert a dynamically linked application to a statically linked one?

Convert a dynamically linked elf binary to statically linked

Tools that may help: Ermine, Statifier, and jumpstart.

Convert a statically linked elf binary to dynamically linked

What you are attempting is not possible in any automated way. At the time of static linking, all relocation information identifying calls to libc as calls to libc has been resolved and removed. If debugging symbols exist in the binary, it's possible to identify "this range of bytes in the text segment corresponds to such-and-such libc function", but there is no way to identify references to the function, which will be embedded in the instruction byte stream with no markup to identify them. You could use heuristics based on disassembly, but they would be incomplete and unreliable (possibility of both false negatives and false positives).

As for shifting offsets, you absolutely cannot change anything about the load addresses of a statically linked binary. If you need to insert headers before the load segments, you'd have to insert a whole page and update the file offsets in the program header table (adding one page to each) while leaving the virtual load addresses the same. However, since what you're trying to do is not possible overall, the offset-shifting issue is the least of your worries.
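To see why a whole page has to be inserted, recall that for a loadable ELF segment the file offset and the virtual address must agree modulo the segment's alignment. The sketch below models only that arithmetic; the Segment class, the 4096-byte page size, and the example offsets are assumptions chosen for illustration, and a real file has further offsets (the section header table, for instance) that would also need adjusting.

```python
from dataclasses import dataclass, replace

PAGE_SIZE = 4096  # assumed page size; the real value depends on the target


@dataclass(frozen=True)
class Segment:
    p_offset: int            # where the segment starts in the file
    p_vaddr: int             # where the segment is mapped in memory
    p_align: int = PAGE_SIZE


def is_congruent(seg: Segment) -> bool:
    """ELF rule for loadable segments: p_offset and p_vaddr agree modulo p_align."""
    return seg.p_offset % seg.p_align == seg.p_vaddr % seg.p_align


def shift_file_offsets(segments, inserted_bytes):
    """Move every segment later in the file; virtual addresses are left untouched."""
    return [replace(s, p_offset=s.p_offset + inserted_bytes) for s in segments]


# Hypothetical load segments of a statically linked binary.
segments = [Segment(0x1000, 0x401000), Segment(0x2E10, 0x603E10)]

shifted_by_page = shift_file_offsets(segments, PAGE_SIZE)  # still loadable
shifted_by_64 = shift_file_offsets(segments, 64)           # breaks the congruence
```

Shifting by exactly one page keeps every segment congruent, while shifting by a smaller amount such as 64 bytes does not, which is why the inserted headers cost a full page.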

Perhaps, if the program doesn't require high performance, you could run it under qemu app-level emulation, with qemu going through the sockets emulation/wrapper.

Why can't you statically link dynamic libraries?

Why is this the case?

Most linkers (the AIX linker is a notable exception) discard information in the process of linking.

For example, suppose you have foo.o with foo in it, and bar.o with bar in it. Suppose foo calls bar.

After you link foo.o and bar.o together into a shared library, the linker merges code and data sections, and resolves references. The call from foo to bar becomes CALL $relative_offset. After this operation, you can no longer tell where the boundary between code that came from foo.o and code that came from bar.o was, nor the name that CALL $relative_offset used in foo.o -- the relocation entry has been discarded.

Suppose now you want to link foobar.so with your main.o statically, and suppose main.o already defines its own bar.

If you had libfoobar.a, that would be trivial: the linker would pull foo.o from the archive, would not use bar.o from the archive, and resolve the call from foo.o to bar from main.o.

But it should be clear that none of the above is possible with foobar.so -- the call has already been resolved to the other bar, and you can't discard code that came from bar.o because you don't know where that code is.
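The difference can be illustrated with a toy model. This is a sketch of the reasoning only, not of how any real linker is implemented: the ObjectFile class and both functions are hypothetical, and a list of (caller, callee) pairs stands in for relocation entries.

```python
from dataclasses import dataclass


@dataclass
class ObjectFile:
    name: str
    defines: set
    calls: list  # (caller, callee) pairs; each stands in for a relocation entry


def link_archive(main, archive):
    """Archive semantics: pull a member only if it defines a still-undefined symbol."""
    pulled = [main]
    defined = {sym: main.name for sym in main.defines}
    changed = True
    while changed:
        changed = False
        undefined = {callee for obj in pulled for _, callee in obj.calls
                     if callee not in defined}
        for member in archive:
            if member not in pulled and member.defines & undefined:
                pulled.append(member)
                for sym in member.defines:
                    defined.setdefault(sym, member.name)
                changed = True
                break
    resolved = {call: defined.get(call[1], "<undefined>")
                for obj in pulled for call in obj.calls}
    return [obj.name for obj in pulled], resolved


def link_shared(objects):
    """Shared-library build as described above: calls are bound inside the library
    and the relocation entries are dropped, so the binding cannot be redone later."""
    defined = {}
    for obj in objects:
        for sym in obj.defines:
            defined.setdefault(sym, obj.name)
    return {call: defined[call[1]] for obj in objects for call in obj.calls}


main_o = ObjectFile("main.o", {"main", "bar"}, [("main", "foo")])
foo_o = ObjectFile("foo.o", {"foo"}, [("foo", "bar")])
bar_o = ObjectFile("bar.o", {"bar"}, [])

pulled, from_archive = link_archive(main_o, [foo_o, bar_o])
from_shared = link_shared([foo_o, bar_o])
```

With the archive, bar.o is never pulled in and the call from foo binds to the bar in main.o. In the shared-library model the same call is already bound to bar.o, and nothing in the result records that it was ever a reference to the name bar.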

On AIX it's possible (or at least it used to be possible 10 years ago) to "unlink" a shared library and turn it back into an archive, which could then be linked statically into a different shared library or a main executable.

If foo.o and bar.o are linked into a foobar.so, wouldn't it make sense that the call from foo to bar is always resolved to the one in bar.o?

This is one place where UNIX shared libraries work very differently from Windows DLLs. On UNIX (under common conditions), the call from foo to bar will resolve to the bar in the main executable.

This allows one to e.g. implement malloc and free in the main a.out, and have all calls to malloc use that one heap implementation consistently. On Windows you would have to always keep track of "which heap implementation did this memory come from".
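A simplified way to picture this is a lookup that walks the loaded modules in order, starting with the main executable, and takes the first definition it finds. The sketch below is a toy model under that assumption; the module names and export sets are invented, and real dynamic linkers have additional rules (symbol visibility, versioning, and so on) that are ignored here.

```python
def resolve(symbol, load_order):
    """Return the name of the first loaded module that exports `symbol`."""
    for module_name, exports in load_order:
        if symbol in exports:
            return module_name
    raise LookupError(f"undefined symbol: {symbol}")


# Hypothetical process image: the main executable comes first in the search order.
load_order = [
    ("a.out", {"main", "malloc", "free"}),
    ("libfoobar.so", {"foo", "bar"}),
    ("libc.so", {"malloc", "free", "printf"}),
]

malloc_provider = resolve("malloc", load_order)  # the executable's own malloc wins
printf_provider = resolve("printf", load_order)  # only libc.so defines printf
```

Because every module's reference to malloc goes through the same search, they all end up on the heap implementation supplied by a.out.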

The UNIX model is not without disadvantages though, as the shared library is not a self-contained mostly hermetic unit (unlike a Windows DLL).

Why would you want to resolve it to another bar from main.o?

If you don't resolve the call to main.o, you end up with a totally different program, compared to linking against libfoobar.a.

Statically and dynamically linking the same library

Yes, this configuration is possible.

In answer to your question as to how the system knows how to use the symbols, remember that all of the links happen at build time. After it's been built, it isn't a question of "symbols", just calls to various functions at various addresses.

When building libB.so, it sets up its links to libA.1.0.so. It does not know or care what other applications that use it will do; it just knows how to map its own function calls.

When building the application itself, the application links to libB.so. Whatever libB.so calls is completely unknown to the application. The application also statically links to a library, which libB.so does not care about.

One gotcha: if libA uses static variables, there will be one set of statics accessible to libB.so, and a different, independent set of statics accessible to the application.


