Analyzing CPU Registers During Kernel Crash Dump

How to understand the ARM registers dumped by kernel panic?

How the registers are used in an OS is defined by the ABI (Application Binary Interface).

However, we can give a quick, informal and simplified explanation of the dump.

I'm not an expert on Linux on ARM, but some of the names are quite intuitive:

  • sp is the Stack Pointer. It points to the top of a memory area called the stack.
  • fp is the Frame Pointer. A routine uses it to access its local variables.
  • lr is the Link Register. It contains the return address of a call.
  • nzCv are the flags. If a flag is in uppercase it is set, otherwise it is clear. Here:

    • n = Last result was Negative
    • z = Last result was Zero
    • C = Last result needed/produced a Carry bit
    • v = Last result Overflowed
  • IRQ on means hardware interrupts are enabled (not masked).
  • FIRQ on means that fast interrupts (FIQ on ARM), which are handled with a faster context switch, are enabled as well.
  • Mode is the CPU mode; here it indicates that the code was privileged.
  • The remaining values are control structures for the CPU, set up by the kernel.
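The flag and interrupt fields above are all decoded from a single status register value. As a rough sketch (the function name `decode_psr` and the returned dictionary layout are made up for illustration; the bit positions are the usual ARM ones: N, Z, C, V in bits 31 to 28, the IRQ and FIQ mask bits in bits 7 and 6, and the mode in the low five bits), the decoding could look like this:

```python
# Sketch only: decode a 32-bit ARM program status register value.
# decode_psr and the dict keys are hypothetical names, not kernel code.

MODES = {
    0x10: "USER",
    0x11: "FIQ",
    0x12: "IRQ",
    0x13: "SVC",
    0x17: "ABT",
    0x1B: "UND",
    0x1F: "SYS",
}

def decode_psr(psr: int) -> dict:
    # Uppercase letter = flag set, lowercase = flag clear.
    flags = "".join(
        letter.upper() if psr & (1 << bit) else letter
        for letter, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    mode_bits = psr & 0x1F
    return {
        "flags": flags,
        # A set mask bit means the interrupt is disabled.
        "irqs_on": not psr & (1 << 7),
        "fiqs_on": not psr & (1 << 6),
        "mode": MODES.get(mode_bits, "unknown"),
        # Every mode except USER is privileged.
        "privileged": mode_bits != 0x10,
    }

# Example value chosen for illustration: only the carry flag set, SVC mode.
print(decode_psr(0x20000013))
```

With the illustrative value `0x20000013`, the flags decode to `nzCv`, both interrupt kinds are on, and the mode is the privileged SVC mode.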

As a convenience, the dump treats the sp, r5 and r8 register values as pointers and shows the memory at those addresses.

The block below SP: 0xc0705970:, for example, is a dump of the memory at 0xc0705970. Each row is formatted as follows:

  • The first column is the current address. Only the last four digits are shown, as it is obvious what the full address is (i.e. there is no ambiguity, the addresses start from 0xc0705970).

  • The following eight columns are 32-bit values dumped from memory. Each row shows 32 bytes of memory.

For example by looking at

R5: 0xe88aff80:
ff80 bf10f0b0 e8aca4c0 e88aff8c e88b1680 00000000 bf05b70c e87c3580 00000000
ffa0 bf095024 e87c3580 00000000 bf095024 e87c3580 00000000 bf095024 00000001
ffc0 00000004 ebd83000 00000793 e8cc2500 00000002 00000004 00000043 ffffffff
ffe0 40320354 be9ee8d8 00030444 40320380 20000010 00000000 70cfe821 70cfec21
0000 bf81e1f8 e88b0018 e88b000c e88e9a00 00000000 bf095024 00000000 fffffffe
0020 00000000 00000000 fffffffe 00000000 00000000 fffffffe 00000000 00000000
0040 00000001 e91dd000 00001073 0010051b 00080000 f1e4d900 00000001 00000002
0060 000000c8 6df9eca0 00008044 e8895700 00000040 00000026 00000003 0b56e8b8

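You can tell that the 32-bit value r5 was pointing to was 0xbf10f0b0, that the 32-bit value at 0xe88b0000 was 0xbf81e1f8 (the row labelled 0000 follows ffe0, so the upper half of the address has moved on from 0xe88a to 0xe88b), and that the 32-bit value at 0xe88b0028 was 0xfffffffe.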
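To make the address arithmetic concrete, here is a small sketch that rebuilds the full address of every word in the R5 block above. The function name `parse_dump` is made up for illustration; the rows are copied from the dump.

```python
# Sketch only: map each full address to the 32-bit word shown in the dump.
# parse_dump is a hypothetical helper, not part of any kernel tooling.

R5_BASE = 0xE88AFF80
R5_DUMP = """
ff80 bf10f0b0 e8aca4c0 e88aff8c e88b1680 00000000 bf05b70c e87c3580 00000000
ffa0 bf095024 e87c3580 00000000 bf095024 e87c3580 00000000 bf095024 00000001
ffc0 00000004 ebd83000 00000793 e8cc2500 00000002 00000004 00000043 ffffffff
ffe0 40320354 be9ee8d8 00030444 40320380 20000010 00000000 70cfe821 70cfec21
0000 bf81e1f8 e88b0018 e88b000c e88e9a00 00000000 bf095024 00000000 fffffffe
0020 00000000 00000000 fffffffe 00000000 00000000 fffffffe 00000000 00000000
0040 00000001 e91dd000 00001073 0010051b 00080000 f1e4d900 00000001 00000002
0060 000000c8 6df9eca0 00008044 e8895700 00000040 00000026 00000003 0b56e8b8
"""

def parse_dump(base: int, text: str) -> dict:
    words = {}
    row_addr = base
    for line in text.strip().splitlines():
        cols = line.split()
        low16 = int(cols[0], 16)
        # Move forward to the next address whose low 16 bits match the row
        # label; this also handles the wrap from ffe0 to 0000.
        row_addr += (low16 - row_addr) & 0xFFFF
        for i, col in enumerate(cols[1:]):
            words[row_addr + 4 * i] = int(col, 16)
    return words

words = parse_dump(R5_BASE, R5_DUMP)
for addr in (0xE88AFF80, 0xE88B0000, 0xE88B0028):
    print(f"{addr:#010x}: {words[addr]:#010x}")
```

The three addresses printed at the end are the ones discussed in the text.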

All of this information is useful to the developer of the code that panicked.

Is WinDbg supposed to be so excruciatingly slow?

This is the symbol server being really slow. Others have noticed as well: https://twitter.com/BruceDawson0xB/status/772586358556667904

Your symbol path contains a local cache, so it should load faster next time around, but it seems that the cache is not effective. I can't really tell why (I suspect the downloaded symbols are not a perfect match and are being downloaded again every time).

I would recommend changing _NT_SYMBOL_PATH (or however your sympath is initialized) to SRV*C:\SymCache only, i.e. do not attempt to download automatically, just use the symbols you already have cached locally. The image should then open fairly fast. Only enable the symbol server if you discover missing symbols.
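As an illustration of the two settings, the sketch below builds both forms of the symbol path. The helper names are hypothetical, and the server URL is an assumption on my part (it is Microsoft's public symbol server, which may or may not be the one in your current path).

```python
# Sketch only: build _NT_SYMBOL_PATH values for a cache-only setup and for a
# cache backed by a symbol server. Helper names are made up for illustration.
import os

# Assumption: Microsoft's public symbol server; substitute your own if needed.
SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"

def symbol_path(cache_dir: str, use_server: bool = False) -> str:
    path = f"SRV*{cache_dir}"
    if use_server:
        # Cache first, then fall back to downloading from the server.
        path += f"*{SYMBOL_SERVER}"
    return path

def debugger_env(cache_dir: str, use_server: bool = False) -> dict:
    # Copy of the current environment with the symbol path overridden,
    # suitable for passing to a process launcher.
    env = dict(os.environ)
    env["_NT_SYMBOL_PATH"] = symbol_path(cache_dir, use_server)
    return env

print(symbol_path(r"C:\SymCache"))
print(symbol_path(r"C:\SymCache", use_server=True))
```

Start with the cache-only form and switch `use_server` on only when a module turns out to have no symbols in the cache.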

How does the OS detect a crash of a process?

On an x86-compatible processor, the CPU raises an exception when any of the following happens:

  • EIP points to a page that does not have read permission, to a page that is not mapped, or to an invalid instruction.

  • A valid instruction tries to access a memory page without permission, or a page that is not mapped.

  • A divide instruction sees that the denominator is zero.

  • An INT instruction is executed.

  • A bunch of other things.

When an exception occurs in protected mode while the current privilege level (CPL) is > 0, the processor does the following:

  • Loads new values for SS and ESP from a memory structure called the task state segment (TSS).

  • Pushes the values of SS, ESP, EFLAGS, CS and EIP onto the stack. The SS and ESP values are the previous ones, not the new ones from the TSS.

  • Some exceptions also push an error code onto the stack.

  • Gets the values for CS and EIP from the interrupt descriptor table and puts these values in CS and EIP.

Note that the kernel has set up these tables and segments in advance.
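The sequence above can be modelled with a toy simulation. This is only a sketch: the dictionaries standing in for the CPU, TSS, IDT and memory, and all the register values, are invented for illustration, and real hardware performs many checks that are left out here.

```python
# Toy model of exception entry from CPL > 0 on 32-bit x86.
# All data structures and values are hypothetical stand-ins.

def enter_exception(cpu, tss, idt, memory, vector, error_code=None):
    old_ss, old_esp = cpu["ss"], cpu["esp"]

    # Step 1: switch to the kernel stack described by the TSS.
    cpu["ss"], cpu["esp"] = tss["ss0"], tss["esp0"]

    def push(value):
        cpu["esp"] -= 4
        memory[cpu["esp"]] = value

    # Step 2: save the interrupted context (the OLD SS and ESP).
    for value in (old_ss, old_esp, cpu["eflags"], cpu["cs"], cpu["eip"]):
        push(value)

    # Step 3: some exceptions (e.g. page fault) also push an error code.
    if error_code is not None:
        push(error_code)

    # Step 4: jump to the handler named by the interrupt descriptor table.
    cpu["cs"], cpu["eip"] = idt[vector]

cpu = {"ss": 0x2B, "esp": 0xBFFFF000, "eflags": 0x246, "cs": 0x23, "eip": 0x08048123}
tss = {"ss0": 0x10, "esp0": 0xC1000000}
idt = {14: (0x08, 0xC0100000)}  # vector 14 is the page fault
memory = {}

enter_exception(cpu, tss, idt, memory, vector=14, error_code=0x4)
for addr in sorted(memory, reverse=True):
    print(f"{addr:#010x}: {memory[addr]:#010x}")
```

After the call, the handler runs on the kernel stack with the user-mode SS, ESP, EFLAGS, CS and EIP (plus the error code) sitting just below the address taken from the TSS.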

Then:

  • The kernel decides what to do with the exception. This depends on the specific kernel. Usually, it decides to kill your program. On Linux, you can override this default using signal handling and on Windows you can override it using Structured Exception Handling.

(This is not an exhaustive reference to x86 exception handling. This is a brief overview of the most common case.)
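From user space on Linux, the outcome of that kernel decision is visible to the parent process: it can see which signal killed its child. A minimal sketch, assuming a POSIX system (the helper name `run_and_report` is made up, and the child fakes the crash by sending itself SIGSEGV rather than performing a real invalid memory access):

```python
# Sketch only: observe how a child process died, from the parent's side.
# Assumes a POSIX system such as Linux; run_and_report is a hypothetical name.
import signal
import subprocess
import sys

def run_and_report(code: str) -> str:
    proc = subprocess.run([sys.executable, "-c", code])
    if proc.returncode < 0:
        # On POSIX, a negative return code means "killed by signal N".
        return signal.Signals(-proc.returncode).name
    return f"exit {proc.returncode}"

# The child sends itself SIGSEGV, the signal the kernel delivers after an
# invalid memory access, and takes the default action (termination).
CRASH = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

print(run_and_report(CRASH))
print(run_and_report("raise SystemExit(3)"))
```

A program that wants to override the default would install its own handler for the signal before the fault happens, which is what the signal handling mentioned above refers to.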


