Getting stacktrace from core dump
gdb /usr/bin/myapp.binary corefile
Then, use one of:
(gdb) bt
(gdb) bt full
(gdb) info threads
(gdb) thread apply all bt
(gdb) thread apply all bt full
Note that installing debug symbols for the related libraries will help gdb resolve function names and line numbers in the backtrace.
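If you need the same backtrace without an interactive session (for example from a crash-collection script), gdb can be run in batch mode. The following is a minimal sketch, not a definitive tool: the helper names build_gdb_command and dump_backtrace are made up for illustration, and the paths are the ones from the example above.

```python
import shutil
import subprocess


def build_gdb_command(executable, core, full=False):
    """Return an argv list that prints a backtrace of every thread and exits."""
    bt = "thread apply all bt full" if full else "thread apply all bt"
    # --batch makes gdb exit after running the commands given with -ex.
    return ["gdb", "--batch", "-ex", bt, executable, core]


def dump_backtrace(executable, core, full=False):
    """Run gdb non-interactively; return its stdout, or None if gdb is not installed."""
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        build_gdb_command(executable, core, full),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Show the command line that would be executed for the example paths.
    print(" ".join(build_gdb_command("/usr/bin/myapp.binary", "corefile")))
```

Calling dump_backtrace("/usr/bin/myapp.binary", "corefile", full=True) would correspond to typing thread apply all bt full at the (gdb) prompt.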
Dumping only stack trace in linux core dumps
You can set /proc/$PID/coredump_filter to 0x10 (bit 4, which core(5) describes as "dump ELF headers").
See http://man7.org/linux/man-pages/man5/core.5.html
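To check what a given mask actually selects, it can help to decode it against the bit table in core(5). The sketch below assumes the bit meanings listed in that man page (bits 0 to 8); describe_filter and set_filter are illustrative names, and set_filter only works on Linux for a process you are allowed to modify.

```python
# Bit meanings of /proc/PID/coredump_filter as listed in core(5).
COREDUMP_FILTER_BITS = [
    "anonymous private mappings",    # bit 0
    "anonymous shared mappings",     # bit 1
    "file-backed private mappings",  # bit 2
    "file-backed shared mappings",   # bit 3
    "ELF headers",                   # bit 4
    "private huge pages",            # bit 5
    "shared huge pages",             # bit 6
    "private DAX pages",             # bit 7
    "shared DAX pages",              # bit 8
]


def describe_filter(mask):
    """Return the names of the mapping types selected by a coredump_filter mask."""
    return [
        name
        for bit, name in enumerate(COREDUMP_FILTER_BITS)
        if mask & (1 << bit)
    ]


def set_filter(mask, pid="self"):
    """Write the mask to /proc/<pid>/coredump_filter (Linux only)."""
    with open(f"/proc/{pid}/coredump_filter", "w") as f:
        f.write(f"{mask:#x}")


if __name__ == "__main__":
    for mask in (0x10, 0x33):
        print(f"{mask:#x}: {describe_filter(mask)}")
```

The filter is inherited by child processes, so setting it in a launcher before starting the application is one way to apply it from the outside.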
Analyzing core dump with stack corrupted
If frame 1 does not make sense at a source level, you might try looking at the disassembly of frame 1. After selecting that frame, disass $pc should show you the disassembly for the entire function, with => to indicate the return address (the instruction immediately after the call to frame 0).
In the case of a null function pointer dereference, the instruction for the call to frame 0 might involve a simple register dereference, in which case you'd want to understand how that register obtained the null value. In some cases, including /m in a disass command can be helpful, although it can cause confusion because of the distinction between instruction boundaries and source line boundaries. Omitting /m is more likely to display a meaningful return address.
The => in the updated disassembly (without /m) makes sense. In any frame aside from frame 0, the pc value (what the => points at in the disassembly) indicates the instruction which will execute when the next lowest numbered frame returns (which, due to the crash, did not occur in this case). The pc value in frame 1 is not the value of the pc register at the time of the crash, but rather the saved pc value pushed on the stack by the call instruction. One way to see that is to compare output from x/a $sp in frame 0 to x/i $pc in frame 1.
One way to interpret this disassembly is that edx is some object, and [edx+0x14] points into its vtable. One way the vtable might wind up with a null pointer is a memory allocation issue with a stale reference to a chunk of memory which has been deallocated and subsequently overwritten by its rightful owner (the next piece of code to allocate that chunk). If any of that is applicable here, it can work either way (the code in frame 1 might be the culprit, or it might be the victim). There are other reasons memory might be overwritten with incorrect contents, but double allocation might be a good place to start.
It probably makes sense to examine the contents of the object referenced by edx in frame 1, to see if there are any other anomalies besides what could be an incorrect vtable. Both the print command and the x command (within gdb) can be useful for this. My best guess about which object is referenced by edx, based on disass/m output (at this writing, visible only in the edit history of the question), is _listener, but it would be best to confirm that by further study of the disassembly (the excerpt available here does not seem to include the instruction that determines the value of edx).
Minimal core dump (stack trace + current frame only)
I have "solved" this issue in two ways:
- I installed a signal handler for SIGSEGV, and used backtrace/backtrace_symbols to print out the stack trace. I compiled my code with -rdynamic, so even after stripping the debug info I still get a backtrace with meaningful names (while keeping the executable compact enough). I stripped the debug info using strip and put it in a separate file, which I will store somewhere safe; from there, I will use addr2line with the info saved from the backtrace (addresses) to understand where the problem happened. This way I have to store only a few bytes.
- Alternatively, I found I could use /proc/self/coredump_filter to dump no memory (setting its content to "0"): only thread and proc info, registers, stacktrace etc. are saved in the core. See more in this answer.
I still lose information that could be precious (the contents of global and local variables, parameters, and so on). I could easily figure out which page(s) to dump, but unfortunately there is no way to specify "dump these pages" for normal core dumps (unless you are willing to go and patch the maydump() function in the kernel).
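The addr2line step of the first approach can be sketched as follows. This is only an illustration under assumptions: the sample lines imitate the "module(function+offset) [address]" layout that glibc's backtrace_symbols typically prints, myapp.debug is a hypothetical name for the separated debug file, and for position-independent executables or shared libraries the absolute addresses would first need the load base subtracted.

```python
import re

# Matches the trailing "[0x...]" address of a backtrace_symbols-style line.
_ADDRESS = re.compile(r"\[(0x[0-9a-fA-F]+)\]\s*$")


def extract_addresses(backtrace_lines):
    """Return the bracketed addresses found at the end of each line, in order."""
    addresses = []
    for line in backtrace_lines:
        match = _ADDRESS.search(line)
        if match:
            addresses.append(match.group(1))
    return addresses


def build_addr2line_command(debug_file, backtrace_lines):
    """Return an argv list for addr2line: -e file, -f function names, -C demangle."""
    return ["addr2line", "-e", debug_file, "-f", "-C"] + extract_addresses(backtrace_lines)


if __name__ == "__main__":
    # Illustrative lines only, not output from a real crash.
    saved = [
        "./myapp(process_request+0x1f) [0x4007c6]",
        "./myapp() [0x4006e1]",
        "this line carries no address",
    ]
    print(" ".join(build_addr2line_command("myapp.debug", saved)))
```

Only the addresses need to be kept from the crash report, which matches the "few bytes" mentioned above; the symbol lookup happens later against the stored debug file.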
For now, I'm quite happy with these 2 solutions (it is better than nothing). My next moves will be:
- see how difficult it would be to port Breakpad to powerpc-linux: there are already powerpc-darwin and i386-linux so.. how hard can it be? :)
- try to use google-coredumper to dump only a few pages around the current ESP (that should give me locals and parameters) and around "&some_global" (that should give me globals).