How to Get a Linux Coredump That Only Contains Callstack, Threads, and Local Variables

Is it possible to get a Linux coredump that only contains callstack, threads, and local variables?

You can pipe core dumps to a program, which lets you write your own filter.
An extract from man core:

Since kernel 2.6.19, Linux supports an alternate syntax for the
/proc/sys/kernel/core_pattern file. If the first character of this
file is a pipe symbol (|), then the remainder of the line is
interpreted as a program to be executed. Instead of being written to
a disk file, the core dump is given as standard input to the program.
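As a rough sketch of what such a piped program could look like, the C fragment below reads the core from a stream and writes it to a file. The names copy_stream and save_core are hypothetical, and the copy loop is only a placeholder for whatever filtering you actually want to do; a real filter would have to parse the ELF core format, which is not shown here.

```c
#include <stdio.h>

/* Copy every byte from `in` to `out`.
 * Returns the number of bytes copied, or -1 on a read or write error.
 * A real filter would inspect or drop data here instead of copying blindly. */
long copy_stream(FILE *in, FILE *out)
{
    char buf[65536];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}

/* Hypothetical helper: store the core arriving on `core_in`
 * (stdin when run by the kernel) in the file `out_path`. */
long save_core(const char *out_path, FILE *core_in)
{
    FILE *out = fopen(out_path, "wb");
    long n;

    if (out == NULL)
        return -1;
    n = copy_stream(core_in, out);
    if (fclose(out) != 0)
        return -1;
    return n;
}
```

A main() for the handler would call save_core(argv[1], stdin). Assuming the binary were installed at a made-up path such as /usr/local/bin/core_handler, the core_pattern line could read `|/usr/local/bin/core_handler /var/crash/core.%p`, where %p is expanded by the kernel to the PID of the dumped process.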

You can also control which mappings are written to the core dump, which may be used to reduce its size.

Since kernel 2.6.23, the Linux-specific /proc/PID/coredump_filter
file can be used to control which memory segments are written to the core dump file

Of course, all of this depends on the kernel version and configuration options.

See the man page quoted above for examples and details.
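To make the coredump_filter mechanism concrete, here is a minimal sketch in C that writes a bit mask to that file. The bit meanings follow the core(5) man page; the constant names and the function write_coredump_filter are made up for this example, and the path is a parameter only so the function can be tried against an ordinary file.

```c
#include <stdio.h>

/* Bit positions as documented in core(5); the names are illustrative only. */
enum {
    CORE_ANON_PRIVATE = 1 << 0, /* anonymous private mappings (stack, heap) */
    CORE_ANON_SHARED  = 1 << 1, /* anonymous shared mappings */
    CORE_FILE_PRIVATE = 1 << 2, /* file-backed private mappings */
    CORE_FILE_SHARED  = 1 << 3, /* file-backed shared mappings */
    CORE_ELF_HEADERS  = 1 << 4  /* ELF headers */
};

/* Write `mask` in hexadecimal to `path`.
 * For the running process, `path` would be "/proc/self/coredump_filter".
 * Returns 0 on success, -1 on failure. */
int write_coredump_filter(const char *path, unsigned mask)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fprintf(f, "0x%x\n", mask) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_coredump_filter("/proc/self/coredump_filter", CORE_ANON_PRIVATE | CORE_ELF_HEADERS) early in main() would request only anonymous private memory plus ELF headers. Note that the filter works per mapping type, so the stack cannot be selected without the heap, which is also anonymous private memory.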

Minimal core dump (stack trace + current frame only)

I have "solved" this issue in two ways:


  1. I installed a signal handler for SIGSEGV, and used backtrace/backtrace_symbols to print out the stack trace. I compiled my code with -rdynamic, so even after stripping the debug info I still get a backtrace with meaningful names (while keeping the executable compact enough).

    I stripped the debug info with strip and put it in a separate file, which I will store somewhere safe; from there, I will use addr2line with the addresses saved from the backtrace to understand where the problem happened. This way I have to store only a few bytes.
  2. Alternatively, I found I could use the /proc/self/coredump_filter to dump no memory (setting its content to "0"): only thread and proc info, registers, stacktrace etc. are saved in the core. See more in this answer

I still lose information that could be precious (global and local variable(s) content, params..). I could easily figure out which page(s) to dump, but unfortunately there is no way to specify a "dump-these-pages" for normal core dumps (unless you are willing to go and patch the maydump() function in the kernel).

For now, I'm quite happy with these two solutions (it is better than nothing). My next moves will be:

  • see how difficult it would be to port Breakpad to powerpc-linux: there are already powerpc-darwin and i386-linux so.. how hard can it be? :)
  • try to use google-coredumper to dump only a few pages around the current ESP (that should give me locals and parameters) and around "&some_global" (that should give me globals).

Linux kernel: sequence of events/paths before process coredump happens

When a SIGSEGV is generated, the kernel checks whether a handler is installed for it. If there is one, the kernel calls it just as it would for any other signal, and no core is generated. This happens in get_signal_to_deliver: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/signal.c#n2192

If it gets to the default action for SIGSEGV, it will generate a coredump and exit. The coredump is generated by do_coredump in fs/coredump.c: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/coredump.c#n485

Debugging core files generated on a Customer's box

What happens when a core file is generated from a Linux distro other than the one we are running in Dev? Is the stack trace even meaningful?

If the executable is dynamically linked, as yours is, the stack GDB produces will (most likely) not be meaningful.

The reason: GDB knows that your executable crashed by calling something in libc.so.6 at address 0x00454ff1, but it doesn't know what code was at that address. So it looks into your copy of libc.so.6 and discovers that this is in select, so it prints that.

But the chances that 0x00454ff1 is also in select in your customer's copy of libc.so.6 are quite small. Most likely the customer had some other procedure at that address, perhaps abort.

You can use disas select and check whether 0x00454ff1 is in the middle of an instruction, or whether the previous instruction is not a CALL. If either of these holds, your stack trace is meaningless.

You can however help yourself: you just need to get a copy of all libraries that are listed in (gdb) info shared from the customer system. Have the customer tar them up with e.g.

cd /
tar cvzf to-you.tar.gz lib/libc.so.6 lib/ld-linux.so.2 ...

Then, on your system:

mkdir /tmp/from-customer
tar xzf to-you.tar.gz -C /tmp/from-customer
gdb /path/to/binary
(gdb) set solib-absolute-prefix /tmp/from-customer
(gdb) core core # Note: very important to set solib-... before loading core
(gdb) where # Get meaningful stack trace!

We then advise the Customer to run a -g binary so it becomes easier to debug.

A much better approach is:

  • build with -g -O2 -o myexe.dbg
  • strip -g myexe.dbg -o myexe
  • distribute myexe to customers
  • when a customer gets a core, use myexe.dbg to debug it

You'll have full symbolic info (file/line, local variables), without having to ship a special binary to the customer, and without revealing too many details about your sources.

How to check the jitted Java method parameters from the JVM core dump through assembly code?

You won't see Java frames with gdb backtrace command. However, you don't need to extract VM structures from a coredump manually - there are better options.

1. HotSpot Serviceability Agent

Serviceability Agent is an instrument designed specially for analyzing memory of a Java process or a coredump. It has Java API available in sa-jdi.jar supplied with a standard JDK package.

Here is an example that prints extended Java stacktraces with local variable info. It can also parse coredumps.

2. HotSpot debug functions

Debug builds of HotSpot JVM include special debugging functions that can be called from gdb. E.g.

  • psf() print stack frames;
  • pfl() print frame layout;
  • disnm(intptr_t addr) disassemble compiled Java method at given address;
  • pp(intptr_t addr) print Java object at given address;
  • etc. See other commands in debug.cpp.

These functions work while debugging an active process; not suitable for coredumps though.

BTW, a quick guide to building debug version of JVM by yourself.


