How to undo strip - i.e. add symbols back to stripped binary
Valgrind supports separate debug files, so you should use the answer here, and valgrind should work properly with the externalized debug file.
How to reverse the objcopy's strip with only-keep-debug?
For ELF, the elfutils package contains a tool called eu-unstrip that does the job. In the context of your example:
eu-unstrip binary binary.dbg
binary.dbg now has both the binary and debug symbols. I'd include a reference to documentation if I could find any...
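A minimal sketch of that merge as a reusable helper, assuming elfutils is installed. The function name recombine and its argument order are illustrative and not part of elfutils; the -o option of eu-unstrip writes the merged result to a separate file, so neither input is modified.

```shell
# recombine STRIPPED DEBUG OUT: merge a stripped ELF file with its
# separate debug file into OUT. The function name is hypothetical.
recombine() {
    if [ "$#" -ne 3 ]; then
        echo "usage: recombine STRIPPED DEBUG OUT" >&2
        return 2
    fi
    if ! command -v eu-unstrip >/dev/null 2>&1; then
        echo "eu-unstrip not found (install elfutils)" >&2
        return 127
    fi
    # -o places the merged output into OUT and leaves the inputs untouched.
    eu-unstrip -o "$3" "$1" "$2"
}
```

For the example above, this could be called as: recombine binary binary.dbg binary.full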
Separating out symbols and stripping unneeded symbols at the same time
I've just stumbled upon the very same question.
Previously I just separated my built ELF binaries with objcopy's --only-keep-debug, --strip-debug and --add-gnu-debuglink flags.
Now I need to save even more space, so I'm considering using --strip-unneeded instead of --strip-debug.
But like you, I'm afraid that this may affect my debugging experience.
So, I've made several tests and come to the following conclusions:
- --strip-unneeded strips everything that --strip-debug strips, and more. I.e. 'debug' info is treated as part of the 'unneeded' info.
- Debug binaries created with the --only-keep-debug flag store not only the info stripped by --strip-debug, but also the info stripped by --strip-unneeded.
- I've not noticed any difference when debugging against --strip-debug vs --strip-unneeded binaries, provided the debug binaries were created with --only-keep-debug.
Details below:
I've created a very simple C++ project, which contains an executable and a shared library.
The library contains a globally exported function, which is called by the application.
The library also contains several local functions (i.e. static or in an anonymous namespace), which are called by the globally exported function. The code in the local functions simply causes a crash by throwing an unhandled exception.
First, I've compiled both binaries with the -g -O0 flags.
Second, I've extracted the debug information from them into separate files and linked these debug files to the original binaries. I.e., for both files:
objcopy --only-keep-debug $FILE $FILE.debug
objcopy --add-gnu-debuglink=$FILE.debug $FILE
At this point I had unstripped binaries, each with a separate linked debug file.
Then I've copied these files into two additional directories. In the first one I've run --strip-debug against the original binaries, and in the other I've run --strip-unneeded.
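The whole split can be sketched as one helper, assuming GNU binutils. The function name split_debug is hypothetical. Note that this sketch follows the order given in the gdb manual (extract, strip, then add the debug link), whereas in the test described here the link was added before stripping.

```shell
# split_debug FILE [MODE]: move debug info into FILE.debug, strip FILE,
# and link the two. MODE is --strip-debug (default) or --strip-unneeded.
# The function name is hypothetical.
split_debug() {
    file=$1
    mode=${2:-"--strip-debug"}
    case $mode in
        --strip-debug|--strip-unneeded) ;;
        *) echo "unsupported mode: $mode" >&2; return 2 ;;
    esac
    # 1. Copy the debug info (and symbol tables) into a separate file.
    objcopy --only-keep-debug "$file" "$file.debug" &&
    # 2. Strip the original in place with the chosen mode.
    objcopy "$mode" "$file" &&
    # 3. Record the debug file's name in a .gnu_debuglink section.
    objcopy --add-gnu-debuglink="$file.debug" "$file"
}
```

Running split_debug libfoo.so --strip-unneeded in one directory and split_debug libfoo.so in another would reproduce the two variants compared here (libfoo.so is a placeholder name).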
As for file sizes, the original files were obviously the biggest, the files in the strip-unneeded dir were the smallest, and the files in the strip-debug dir were in the middle.
Additionally, running --strip-debug against the files in the strip-unneeded dir did not change the file sizes, meaning that --strip-debug strips just a subset of what is stripped by --strip-unneeded.
I've then compared the section listings of all three variants by running readelf -S against each of them.
Looking at them, it could be seen that --strip-debug strips the following sections: .debug_aranges, .debug_info, .debug_abbrev, .debug_line and .debug_str, and also somewhat reduces the .symtab and .strtab sections.
--strip-unneeded additionally removes the .symtab and .strtab sections completely.
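The section comparison can be automated along these lines. This is a sketch assuming GNU readelf, sed, sort, comm and mktemp are available; the helper names section_names and removed_sections are made up for illustration.

```shell
# section_names: read `readelf -S` output on stdin and print one section
# name per line, sorted. Only names starting with a dot are matched,
# which also skips the unnamed NULL section at index 0.
section_names() {
    sed -n 's/^ *\[ *[0-9][0-9]*\] *\(\.[^ ]*\).*/\1/p' | sort
}

# removed_sections ORIGINAL STRIPPED: print the sections present in
# ORIGINAL but missing from STRIPPED. The function name is hypothetical.
removed_sections() {
    a=$(mktemp) && b=$(mktemp) || return 1
    readelf -S -W "$1" | section_names > "$a"
    readelf -S -W "$2" | section_names > "$b"
    # comm -23 keeps only the lines unique to the first (sorted) file.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, removed_sections original/app strip-debug/app would list the sections that --strip-debug removed entirely (the paths are placeholders). Sections that were only reduced in size, such as .symtab here, would not show up in this listing.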
I've then run readelf -S against the debug binaries, which I'd created with the --only-keep-debug flag. They contained all the sections removed by --strip-unneeded: not only .debug_aranges, .debug_info, .debug_abbrev, .debug_line and .debug_str, but also .symtab and .strtab. And the sizes of those sections were almost identical to their original sizes.
I've then tried to step-by-step debug all the three variants and haven't noticed any difference between them. Also I've produced crashes and core dumps with all of them and then tried to debug against the core dumps - also no difference.
Versions used:
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
GNU objcopy (GNU Binutils for Ubuntu) 2.26.1
GNU strip (GNU Binutils for Ubuntu) 2.26.1
How to give symbols from unstripped binary to gdb?
There are two main ways to do this.
One way is to start gdb on the unstripped executable, and then attach:
$ gdb unstripped
(gdb) attach 12345
This way is easy! However it has a hidden danger, which is that you might accidentally mismatch the stripped and unstripped programs, leading to a very confusing debugging session.
Another way is to take the time to properly split the debug information into a separate file when stripping. There are some instructions in the gdb manual.
With this approach, be sure to use the build-id feature. If you do this properly, then you can simply point gdb at your archive of separate debug info, and gdb will pick up the proper information automatically.
The main advantage of this approach is that it avoids the possibility of debuginfo mismatch. FWIW this is what the distros use to build their debuginfo archives.
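To guard against the mismatch described above, the build IDs of the two files can be compared before debugging. This is a sketch assuming GNU readelf and binaries linked with build IDs enabled; the helper names build_id and same_build are hypothetical.

```shell
# build_id: read `readelf -n` output on stdin and print the first
# build ID found (a hex string), or nothing if there is none.
build_id() {
    sed -n 's/^ *Build ID: *\([0-9a-f]*\).*/\1/p' | head -n 1
}

# same_build STRIPPED DEBUG: succeed only if both files carry the same,
# non-empty build ID. The function name is hypothetical.
same_build() {
    id1=$(readelf -n "$1" | build_id)
    id2=$(readelf -n "$2" | build_id)
    [ -n "$id1" ] && [ "$id1" = "$id2" ]
}
```

Usage might look like: same_build stripped unstripped && gdb unstripped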
How can dlsym successfully import function from stripped binary library?
Try readelf -s a.so. The dynamic symbols are still there after that strip.
(Or just switch to nm -D a.so.)
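A quick way to check one specific name is to filter the nm -D output. This is a sketch assuming GNU nm; the helper name has_symbol is hypothetical, and it drops any @VERSION suffix that newer versions of nm may print after the symbol name.

```shell
# has_symbol NAME: read `nm -D` output on stdin and succeed if NAME
# appears as a symbol name (the last field of a line).
has_symbol() {
    awk -v name="$1" '
        { s = $NF; sub(/@.*/, "", s); if (s == name) found = 1 }
        END { exit !found }
    '
}
```

For the library in question: nm -D --defined-only a.so | has_symbol my_function && echo "still exported" (my_function stands in for whatever name is passed to dlsym).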
Strip/Remove debug symbols and archive names from a static library
This script implements Sigismondo's suggestion (unpacks the archive, strips each object file individually, renames them 1000.o, 1001.o, etc., and repacks). The parameters for ar crus may vary depending on your version of ar.
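#!/bin/bash
# usage: repack.sh file.a
# Strips debug symbols from every member of a static library and renames
# the members to 1000.o, 1001.o, ... so their original names are hidden.
if [ -z "$1" ]; then
    echo "usage: repack file.a"
    exit 1
fi
# Start from a clean scratch directory.
if [ -d tmprepack ]; then
    /bin/rm -rf tmprepack
fi
mkdir tmprepack
cp "$1" tmprepack
pushd tmprepack || exit 1
basename=${1##*/}
# Unpack the archive, then delete the copy so only object files remain.
ar xv "$basename"
/bin/rm -f "$basename"
i=1000
for p in *.o ; do
    strip -d "$p"
    mv "$p" "${i}.o"
    ((i++))
done
# Repack the renamed objects; the s modifier rebuilds the symbol index.
ar crus "$basename" *.o
# The repacked archive lands in the directory the script was run from.
mv "$basename" ..
popd
/bin/rm -rf tmprepack
exit 0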
Why OSX's strip can not remove weak symbols?
My first instinct was to blame radar bug 5614542 for that weird symbol, but it's not related to it.
I'll make some assumptions: since you seem to be using nlist relocations rather than the newer bytecode-based relocations (you can check by looking for the dyld info load command), this was either built with an ancient toolchain or is an MH_OBJECT file for a main executable that has not gone through the final linking step. I'm not 100% sure that is the case here, but either way,
Sorry for my assumption above, but the original answer still applies unless you really want to opt out of symbol coalescing, in which case build your application with private linkage. But again, this template instantiation forces the symbol to be weak for a very good reason: it has a static constructor and an implicitly instantiated template, and it prefers safety, so it keeps the symbol. You can not export it at all outside of the executable. While you have a small case here, C++ programs tend to use things like boost, or C++ libs that depend on other C++ libs; that all creates chains, and eventually you end up with multiple definitions within the shared namespace just because of C++ semantics. In your small test case you can get away with it; in a larger application, unless you really know what you're doing and are examining things like dependency trees for dylibs, just let dyld do its job. I think my original answer still largely applies, as it explains why your symbol is marked as weak (ODR is a C++-specific concept, but it's dealt with differently by different static linkers):
For a longer explanation: it has to do with C++ semantics, namely the one definition rule (ODR), which is a close but not identical concept to not being able to have duplicate strong symbols in the same namespace (I mean a link namespace, not a C++ namespace; this gets confusing very quickly).
If you want to know why it's marked as weak: it's so that dyld can coalesce it during dynamic linking, since reusing that template would instantiate it again (causing an ODR violation and, depending on the context, a link-time error). It's an implicit instantiation, which may or may not require coalescing; that is not known until static or even dynamic link time, unless of course you define it as hidden, in which case you have to be extremely careful, since semantics will vary a lot depending on factors like whether it's a modular build or not (I mean LLVM "modules", not the Modules TS for C++).
Without it being weak, you'd be causing an ODR violation per C++ rules by defining it as hidden across more than one translation unit (if you reused that template, say in a header within the module, you would get duplicate symbol errors). You could get away with violating ODR since it's not actually enforced, but be prepared for some nasty surprises (i.e. by using non-modular builds, aka "every translation unit is a module").
By defining it as weak, dyld is able to select correct definitions per final linked object be that a shared library or an executable (and don't forget about the shared cache) at runtime and bind/relocate them appropriately within the otherwise flat namespace.
The above is a lot for a compiler to deduce without any form of hint. Hidden linkage is a really bad idea unless you understand the implications; you want internal visibility if you really want to re-instantiate and copy the template every time. OSX has a fairly complicated linking model in general, with a lot of landmines to potentially step on.
And if I'm right about the object file thing, you shouldn't really run strip on object files before they are fed into the static linker.