How to analyse a crash dump file using GDB
The first thing to look for is the error message that you get when the program crashes. This will often tell you what kind of error occurred. For example, "segmentation fault" or "SIGSEGV" almost certainly means that your program has dereferenced a NULL or otherwise invalid pointer. If the program is written in C++, then the error message will often tell you the name of any uncaught exception.
If you aren't seeing the error message, then run the program from the command line, or redirect its output (both standard output and standard error) into a file.
In order for a core file to be really useful, you need to compile your program without optimisation and with debugging information. GCC needs the following options: -g -O0. (Make sure your build doesn't have any other -O options.)
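On many systems the core file size limit is set to zero by default, in which case no core file is written at all. The following is a minimal sketch of one way a program could raise that limit for itself at startup using the POSIX getrlimit/setrlimit calls. The function name enable_core_dumps is made up for this example, and where the core file ends up (and what it is called) still depends on how your system is configured.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit of the
 * current process to its hard limit, so that a crash is allowed to
 * write a core file. Returns 0 on success, -1 on failure (errno is
 * set by getrlimit/setrlimit). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling this once near the start of main is usually enough. If the hard limit is itself zero, the call succeeds but core files stay disabled, and the limit has to be raised from outside the program instead.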
Once you have the core file, then open it in gdb with:
gdb YOUR-APP COREFILE
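Type where to see the point where the crash occurred. You are basically in a normal debugging session, except that the process is dead, so you cannot step or continue. You can still examine variables (print), move up and down the stack (up, down) and switch between threads (info threads, thread N).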
If your program has crashed with a segmentation fault, then the cause is probably an invalid memory access, so you need to look for a pointer that is zero, or that points to bad-looking data. You might not find the problem in the innermost frame (frame 0, where the crash actually happened). You might have to move up the stack a few levels before you find where the bad value came from.
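To make the "move up the stack" advice concrete, here is a small sketch of the kind of bug it describes. All of the names (struct user, find_user, name_length, name_length_for_id) are invented for illustration. The invalid access happens in one function, but the mistake that produced the NULL pointer is in its caller.

```c
#include <stddef.h>
#include <string.h>

struct user {
    const char *name;
    int id;
};

/* Returns the matching entry, or NULL when no entry has this id. */
const struct user *find_user(const struct user *table, size_t count, int id)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].id == id)
            return &table[i];
    }
    return NULL;
}

/* The invalid access happens here: reading u->name when u is NULL. */
size_t name_length(const struct user *u)
{
    return strlen(u->name);
}

/* One frame up: the real bug. The result of find_user is never checked. */
size_t name_length_for_id(const struct user *table, size_t count, int id)
{
    const struct user *u = find_user(table, count, id);
    return name_length(u);
}
```

If you call name_length_for_id with an id that is not in the table, from a program built with -g -O0, you would typically expect a segmentation fault with name_length as the innermost frame. Printing u there shows a zero pointer, but only after moving up one frame and looking at id and the unchecked find_user call does the cause become clear.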
Good luck!
Debugging kernel hang
Here are some options, depending on the specifics of your situation. If you can provide more detail about the platform and the nature of the kernel mode driver, it would be helpful.
Assuming you have reason to be confident in the hardware, your likely sources of lockups are locking problems in the kernel, uninitialized variables, and infinite loops with preemption disabled.
Can you configure a timer interrupt to run periodically and blink a LED? You might find it useful to see if interrupts continue to be handled while in a lockup.
Enable soft lockup detection in the Linux kernel hacking menu, and any other relevant kernel hacking features. It may take Linux a minute or two to detect and report a soft lockup. Have you waited long enough to check for this?
Enable lock dependency checking in kernel hacking, and fix any reported locking errors in your driver.
Try changing the kernel preemption mode. This changes the behaviour of some system locks, in some cases turning deadlocks into less harmful locks. If it's relevant/possible, disable SMP.
Viewing registers in a crash dump
Depending on the calling convention, you can recover some of the registers which are saved on the stack. For example, in the cdecl calling convention on x86, all of the registers except for EAX, ECX, and EDX must be preserved by the callee. Those three registers are clobberable (caller-saved), so you generally won't be able to get their values from higher up in the call stack. If a function doesn't use a register that must be preserved, then it won't save it, but since it doesn't modify it either, that register has the same value in the next higher stack frame.