Segfault with Rip-Relative Addressing on Linux

On the mainline gnu binutils, on i386 and x86_64, the .align n directive tells the assembler to align to n bytes (however, on some architectures and platforms, it has other meanings. Consult the documentation for full details).

On OS X, the .align n directive tells the assembler to align to 2^n bytes. This is why your code works on the Mac.

If you want consistent cross-platform behavior, use the .p2align directive instead, which is supported on both platforms, and tells the assembler to align to 2^n bytes.
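The difference between the two interpretations can be sketched with a little arithmetic. The following C++ helpers are hypothetical (they are not part of any assembler) and simply compute where the next item would land under each reading of the operand:

```cpp
#include <cassert>
#include <cstdint>

inline bool is_power_of_two(std::uint64_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Smallest offset >= `offset` that is a multiple of `alignment`.
// `alignment` must be a power of two.
inline std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment) {
    assert(is_power_of_two(alignment));
    return (offset + alignment - 1) & ~(alignment - 1);
}

// ".align n" read as a byte count (GNU as on i386 / x86_64 Linux).
inline std::uint64_t align_as_bytes(std::uint64_t offset, unsigned n) {
    return align_up(offset, n);
}

// ".align n" on OS X, and ".p2align n" everywhere: align to 2^n bytes.
inline std::uint64_t align_as_power(std::uint64_t offset, unsigned n) {
    return align_up(offset, std::uint64_t{1} << n);
}
```

With 5 bytes already emitted, an operand of 4 moves the location counter to offset 8 under the byte-count reading and to offset 16 under the power-of-two reading.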

Segmentation fault when attempting to copy an address into a struct?

Peter Cordes noted in a comment that:

It looks like inst resb mystruct is reserving 0 bytes in the BSS, so your process doesn't have a BSS at all. But it still assembles and links somehow. I don't know what the right syntax is for sizeof() in NASM; I never use its struct syntax.

It turns out that what I needed to do was change:

act resb mystruct

...to...

act resb mystruct_size

This symbol is automatically defined by the assembler and is set to the size of the struct in bytes.

The program no longer crashes on that section of code.

Segmentation fault assembly i386:x86_64

It's segfaulting because you are attempting to write data to a code (.text) area of memory. Executable code areas are almost always marked as read-only.

Here's your code with some additional comments.

.section .data
.section .text
.globl _start

_start:
    xor %rax, %rax
    mov $70, %al
    xor %rbx, %rbx
    xor %rcx, %rcx
    # call sys_setreuid(0,0)
    int $0x80

    jmp ender

starter:
    # take the return address off the stack
    # rbx will point to the /bin/sh string after the call instruction
    pop %rbx
    # zero rax
    xor %rax, %rax
    # save a zero byte to the end of the /bin/sh string (it's 7 characters long)...
    # (it will segfault here because you're writing to a read-only area)
    mov %al, 0x07(%rbx)
    # ...followed by a pointer to the string...
    mov %rbx, 0x08(%rbx)
    # ...followed by another zero value
    mov %rax, 0x0c(%rbx)
    # setup the parameters for a sys_execve call
    mov $11, %al
    lea 0x08(%rbx), %rcx
    lea 0x0c(%rbx), %rdx
    int $0x80

    # what happens when int 0x80 returns?
    # you should do something here or starter will be called again

ender:
    call starter
    .string "/bin/sh"

There are other issues with the code. Consider:

mov %rbx, 0x08(%rbx)
mov %rax, 0x0c(%rbx)

%rbx is an 8 byte value, but the code only gives it 4 bytes worth of space (0x0c-0x08 = 4).
If you want to get it working, you'll need to move the string into a .data area (with some additional space after it) and change the code to make it 64-bit friendly.
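The size mismatch is easier to see when the data that follows the string is written out as a structure. This is a C++ sketch rather than assembly, and the struct and field names are hypothetical; it only illustrates where the fields have to sit once the pointer is 8 bytes wide:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout of the writable block: the string, then a
// one-element argv array terminated by a null pointer.
struct ExecveBlock {
    char          path[8];   // "/bin/sh" is 7 characters, plus the zero byte
    std::uint64_t argv0;     // 64-bit pointer to path, at offset 0x08
    std::uint64_t argv_end;  // 64-bit null pointer that terminates argv
};

static_assert(offsetof(ExecveBlock, argv0) == 0x08,
              "the pointer starts right after the string");
static_assert(offsetof(ExecveBlock, argv_end) == 0x10,
              "0x10, not 0x0c: the pointer before it needs 8 bytes");
static_assert(sizeof(ExecveBlock) == 24,
              "space to reserve, including the string itself");
```

Storing the second value at 0x0c, as the original code does, would overwrite the upper half of the pointer stored at 0x08.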

How to access segment register with out linking libc.so?

Accessing a segment register is no problem, just mov eax, fs. But what you're trying to do is access thread-local storage at a small offset from the FS segment base, which libc init stuff will have asked the kernel to set up.

The simplest thing would be to just access your stack canary with a normal RIP-relative addressing mode, not relative to FS base, like GCC will do when targeting other ISAs. Only if you want to make it harder for some other exploit to reach the canary (and for its address to be separately randomizable) do you need TLS. (Or so library code can access it without the indirection of loading a pointer from the GOT, instead of only being efficient for code in the main executable.)

You can of course make the same system calls libc does to set up thread-local storage and use it, if you want to copy GCC's stack-canary code.

Fun fact: sub rax, qword fs:[0x28] is a more efficient way to check the canary than XOR - it can macro-fuse with the JCC into a single uop. That's why current GCC changed to using sub. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 - fixed in GCC10+.

My GCC bug report actually included self-contained microbenchmark code (to prove that sub can macro-fuse even with an FS: addressing mode).

Without libc, in a static executable, the code has to set up FS itself so that its base address is the address of a buffer and [fs: 0x28] will work. This is a basic form of TLS.

global _start
_start:

cookie equ 12345
    mov  eax, 158       ; __NR_arch_prctl
    mov  edi, 0x1002    ; ARCH_SET_FS
    lea  rsi, [buf]
    syscall

    mov  qword [fs: 0x28], cookie

...


section .bss
buf:    resb 4096         ; fs.base will point at this buffer

If the kernel enabled wrfsbase for user-space use, you could use wrfsbase rsi instead of making a system call. I think the most recent Linux kernel (5.10) maybe has started using wrfsbase itself, but I don't know if it enables user-space use of it.

(It probably doesn't toggle FSGSBASE on/off every time it uses it, so kernel usage would mean user-space can use it; the fault conditions in the manual don't mention privilege level, only the CPUID feature bit and a bit in the CR4 control register. And only in 64-bit mode; it will #UD in other modes including compat mode.)
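For comparison, a program that does link libc can ask the kernel where its FS base currently points, using the same system call with a different sub-function. This is a minimal C++ sketch for x86-64 Linux, assuming the kernel headers provide ARCH_GET_FS in <asm/prctl.h>; read_fs_base is a hypothetical helper name:

```cpp
#include <asm/prctl.h>     // ARCH_GET_FS
#include <sys/syscall.h>   // SYS_arch_prctl
#include <unistd.h>        // syscall()

// Returns the current FS base address, or 0 if the system call fails.
// This only reads the base: changing it underneath a running libc
// would break that libc's thread-local storage.
inline unsigned long read_fs_base() {
    unsigned long base = 0;
    if (syscall(SYS_arch_prctl, ARCH_GET_FS, &base) != 0) {
        return 0;
    }
    return base;
}
```

In a process started through libc, the returned address is the base that an access like fs:[0x28] is resolved against.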

x86_64 Assembly - Segfault when trying to edit a byte within an array in x64 assembly

I would assume it's because you're trying to access data that is in the .text section. Usually you're not allowed to write to code segment for security. Modifiable data should be in the .data section. (Or .bss if zero-initialized.)

For actual shellcode, where you don't want to use a separate section, see Segfault when writing to string allocated by db [assembly] for alternate workarounds.

Also I would never suggest using the side effects of call pushing the address after it to the stack to get a pointer to data following it, except for shellcode.

This is a common trick in shellcode (which must be position-independent); 32-bit mode needs a call to get EIP somehow. The call must have a backwards displacement to avoid 00 bytes in the machine code, so putting the call somewhere that creates a "return" address you specifically want saves an add or lea.

Even in 64-bit code where RIP-relative addressing is possible, jmp / call / pop is about as compact as jumping over the string for a RIP-relative LEA with a negative displacement.

Outside of the shellcode / constrained-machine-code use case, it's a terrible idea and you should just lea reg, [rel buf] like a normal person with the data in .data and the code in .text. (Or read-only data in .rodata.) This way you're not trying to execute code next to data, or put data next to code.

(Code-injection vulnerabilities that allow shellcode already imply the existence of a page with write and exec permission, but normal processes from modern toolchains don't have any W+X pages unless you do something to make that happen. W^X is a good security feature for this reason, so normal toolchain security features / defaults must be defeated to test shellcode.)

Segfault from tiny assembly file

In DOS, ret was valid because it returned to an interrupt call in the PSP that exits with error level 0. On other platforms, a plain ret may not be valid.

NASM x86_64 assembly in 32-bit mode: Why does this instruction produce RIP-Relative Addressing code?

The answer is that it isn't. x86-64 doesn't have RIP-relative addressing in 32-bit (compatibility) mode; this should be obvious because RIP doesn't exist in 32-bit code. What's happening is that nasm is assembling some lovely 32-bit opcodes that you're then trying to run as 64-bit. GDB is disassembling your 32-bit opcodes as 64-bit, and telling you that in 64-bit mode, those bytes mean a RIP-relative mov. 64-bit and 32-bit opcodes on x86-64 overlap a lot to make use of common decoding logic in the silicon, and you're getting confused because the code that GDB is disassembling looks similar to the 32-bit code you wrote, but in reality you're just throwing garbage bytes at the processor.

This isn't anything to do with nasm. You're using the wrong architecture for the process you're in. Either run the 32-bit code in a 32-bit process or assemble your code for [BITS 64].
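The overlap can be made concrete with the ModRM byte. The combination mod = 00, r/m = 101 selects a bare 32-bit displacement in 32-bit mode, while 64-bit mode reinterprets the same bits as RIP plus that displacement. A minimal C++ sketch with hypothetical names, assuming the default address size for each mode (no 0x67 prefix):

```cpp
#include <cstdint>

enum class Mode { Bits32, Bits64 };
enum class AddressKind { AbsoluteDisp32, RipRelativeDisp32, Other };

// Classifies the memory operand selected by a ModRM byte.
// Only the mod = 00, r/m = 101 case changes meaning between modes.
inline AddressKind classify_modrm(std::uint8_t modrm, Mode mode) {
    const std::uint8_t mod = modrm >> 6;   // top two bits
    const std::uint8_t rm  = modrm & 0x7;  // low three bits
    if (mod == 0 && rm == 5) {
        return mode == Mode::Bits64 ? AddressKind::RipRelativeDisp32
                                    : AddressKind::AbsoluteDisp32;
    }
    return AddressKind::Other;
}
```

For example, the bytes 8B 1D followed by four address bytes are mov ebx, [addr] in 32-bit code; a 64-bit disassembler shows the very same bytes as a mov ebx with a RIP-relative operand.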

Assembly of PIC

Just to be completely clear: the CALL instruction pushes the address of the instruction following it onto the stack and jumps to the target address. This means that

x: call start 
y:

is morally equivalent to (ignoring that we trash %rax here):

x: lea y(%rip), %rax
   push %rax
   jmp start 
y:

Conversely RET pops an address from the stack and jumps to it.

Now in your code you do popq %rsi and then later ret jumps back to whatever called you. If you just change the popq to lea str(%rip), %rsi to load %rsi with the address of str, you still have the return address (the address of str) on the stack! To fix your code, simply pop the return address off the stack manually (add $8, %rsp) OR, more sanely, move str to after the function so you don't need the awkward call.

Updated with complete stand alone example:

# p.s
#
# Compile using:
# gcc -c -fPIC -o p.o p.s
# gcc -fPIC -nostdlib -o p -Wl,-estart p.o

.text
.global start # So we can use it as an entry point
start:
    movq $1, %rax #sys_write
    movq $1, %rdi
    lea str(%rip), %rsi
    movq $5, %rdx
    syscall

    mov $60, %rax #sys_exit
    mov $0, %rdi
    syscall

.data
str:
    .string "test\n"

Disassembling the code with objdump -d p reveals that the code is indeed position independent, even when using .data.

p:     file format elf64-x86-64
Disassembly of section .text:
000000000040010c <start>:
  40010c:   48 c7 c0 01 00 00 00    mov    $0x1,%rax
  400113:   48 c7 c7 01 00 00 00    mov    $0x1,%rdi
  40011a:   48 8d 35 1b 00 20 00    lea    0x20001b(%rip),%rsi        # 60013c <str>
  400121:   48 c7 c2 05 00 00 00    mov    $0x5,%rdx
  400128:   0f 05                   syscall 
  40012a:   48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  400131:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  400138:   0f 05                   syscall

Why I cannot single stepping into aeskeygenassist instruction in self-modifying code?

I assume you forgot to link with --omagic to make the .text section writable.

So mov BYTE PTR ds:0x804807f,ah segfaults, and it's right before aeskeygenassist. You can't keep single-stepping after your program crashes. (You have no handler for SIGSEGV, and the default action is to terminate your program).

When I tried this on my desktop out of curiosity, I could see how the behaviour might be interpreted as single-stepping getting "stuck" before aeskeygenassist, if you ignore the segfault message and the fact that trying again says "the program is no longer running".

From a GDB session:

(gdb) layout reg
(gdb) starti          # like run with an implicit breakpoint on the first instruction
(gdb) si
0x0000000000401004 in _start ()
0x0000000000401008 in _start ()     ## I kept pressing return to repeat the command
0x000000000040100c in _start ()
0x000000000040100e in roundloop ()
0x0000000000401012 in roundloop ()
0x0000000000401014 in roundloop ()    # the MOV store

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401014 in roundloop ()    # still pointing at the MOV store

Notice that RIP is still pointing at the mov. 0x8048074 in your 32-bit build, 0x401014 in my 64-bit build of the same source.

From the ld manual:

-N

--omagic

Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC". Note: Although a writable text section is allowed for PE-COFF targets, it does not conform to the format specification published by Microsoft.

Your code works fine for me if I link with:

  nasm -felf64 aes.asm &&
  ld --omagic aes.o -o aes

Alternatively, you could make an mprotect system call to give the page containing this code PROT_READ|PROT_WRITE|PROT_EXEC.
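In C++ terms, the mprotect route looks roughly like the sketch below. The helper names are hypothetical; the one real constraint shown is that mprotect wants a page-aligned start address, so the address of the instruction has to be rounded down first:

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Round an address down to the start of the page that contains it.
// page_size must be a power of two.
inline std::uintptr_t page_floor(std::uintptr_t addr, std::uintptr_t page_size) {
    return addr & ~(page_size - 1);
}

// Apply `prot` to every page that overlaps [addr, addr + len).
inline bool set_protection(void* addr, std::size_t len, int prot) {
    const std::uintptr_t page  = static_cast<std::uintptr_t>(sysconf(_SC_PAGESIZE));
    const std::uintptr_t first = reinterpret_cast<std::uintptr_t>(addr);
    const std::uintptr_t start = page_floor(first, page);
    return mprotect(reinterpret_cast<void*>(start), first + len - start, prot) == 0;
}
```

For self-modifying code the call would pass PROT_READ | PROT_WRITE | PROT_EXEC for the bytes being patched; some hardened systems may refuse that combination.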

GDB's layout reg disassembly window even updates disassembly for aeskeygenassist after its immediate is modified by store.

Also note that self-modifying code (SMC) is extremely slow on modern x86: every store near instructions being executed causes a full pipeline nuke. You'd be much better off unrolling with an assembler macro.

Also, you can't ret from _start under Linux; it's not a function. The stack pointer points to argc, not a return address. Make an _exit system call with int 0x80 for 32-bit code. When I said it "works", I meant it reaches that ret and segfaults on code-fetch from address 1 after popping argc into RIP.

Also, use default rel for RIP-relative addressing of the store; it's more compact. Or I guess you're building a 32-bit executable out of this for some reason, based on your code addresses. I didn't notice that at first, that's why I tested as a 64-bit executable. Fortunately you used labels correctly, and aeskeygenassist is the same length in both modes, so it still works.

Analyzing segmentation fault without core file

In the past, I had to deal with this kind of restriction on several occasions. A segmentation fault or, more generally, abnormal process termination had to be investigated with the caveat that a core dump was not available.

For Linux, our platform of choice for this walkthrough, a few reasons come to mind:

Core dump generation is disabled altogether (using limits.conf or ulimit)
The target directory (current working directory or a directory configured in /proc/sys/kernel/core_pattern) does not exist or is inaccessible due to filesystem permissions or SELinux
The target filesystem has insufficient disk space, resulting in a partial dump

For all of those, the net result is the same: there's no (valid) core dump to use for analysis. Fortunately, a workaround exists for post-mortem debugging that has the potential to save the day, but given its inherent limitations, your mileage may vary from case to case.
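The first two reasons can be checked from within a process. The following is a minimal C++ sketch with hypothetical helper names; it only reads the soft core-size limit and the configured pattern, and says nothing about permissions or free disk space:

```cpp
#include <fstream>
#include <string>
#include <sys/resource.h>

// -1: the query failed, 0: core dumps are disabled by the soft limit,
//  1: the soft limit allows core dumps.
inline int core_limit_state() {
    struct rlimit limit = {};
    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    return limit.rlim_cur == 0 ? 0 : 1;
}

// First line of /proc/sys/kernel/core_pattern, or an empty string
// if the file cannot be read.
inline std::string core_pattern() {
    std::ifstream file("/proc/sys/kernel/core_pattern");
    std::string pattern;
    std::getline(file, pattern);
    return pattern;
}
```

A state of 0 corresponds to ulimit -c 0; a pattern that starts with a pipe character means the dump is handed to a helper program instead of being written to a directory.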

Identifying the Faulting Instruction

The following sample contains a classic use-after-free memory error:

#include <iostream>

struct Test
{
  const std::string &m_value;

  Test(const std::string &value):
    m_value(value)
  {
  }

  void print()
  {
    std::cout << m_value << std::endl;
  }
};

int main()
{
  std::string *value = new std::string("this is a test");
  Test test(*value);
  delete value;
  test.print();
  return 0;
}

After delete value, the std::string reference Test::m_value points to inaccessible memory. Therefore, running it results in a segmentation fault:

$ ./a.out
Segmentation fault

When a process terminates due to an access violation, the Linux kernel creates a log entry accessible via dmesg and, depending on the system's configuration, the syslog (usually /var/log/messages). The example (compiled with -O0) creates the following entry:

$ dmesg | grep segfault
[80440.957955] a.out[7098]: segfault at ffffffffffffffe8 ip 00007f9f2c2b56a3 sp 00007ffc3e75bc48 error 5 in libstdc++.so.6.0.19[7f9f2c220000+e9000]

The corresponding Linux kernel source from arch/x86/mm/fault.c:

    printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",
        loglvl, tsk->comm, task_pid_nr(tsk), address,
        (void *)regs->ip, (void *)regs->sp, error_code);

The error code (error_code) reveals what triggered the fault. It is an x86-specific bit field. In our case, the value 5 (101 in binary) has bits 0 and 2 set and bit 1 clear: the page at the faulting address 0xffffffffffffffe8 was mapped but inaccessible due to page protection (bit 0), the access was a read rather than a write (bit 1 clear), and it was made from user mode (bit 2).
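Decoding the value by hand gets tedious, so here is a small C++ sketch of the bit tests. The struct and function names are hypothetical; the bit positions are those of the x86 page-fault error code:

```cpp
#include <cstdint>

struct PageFaultInfo {
    bool protection_violation;  // bit 0: set = page present but access denied,
                                //        clear = page not present
    bool write;                 // bit 1: set = write access, clear = read access
    bool user_mode;             // bit 2: set = access made from user mode
    bool instruction_fetch;     // bit 4: set = fault on an instruction fetch
};

inline PageFaultInfo decode_page_fault(std::uint64_t error_code) {
    return PageFaultInfo{
        (error_code & 0x01) != 0,
        (error_code & 0x02) != 0,
        (error_code & 0x04) != 0,
        (error_code & 0x10) != 0,
    };
}
```

For the log entry above, decode_page_fault(5) reports a protection violation on a read from user mode. A value of 6, by contrast, would be a user-mode write to a page that is not mapped.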

The log message identifies the module that executed the faulting instruction: libstdc++.so.6.0.19. The sample was compiled without optimization, so the call to std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) was not inlined:

  400bef:       e8 4c fd ff ff          callq  400940 <_ZStlsIcSt11char_traitsIcESaIcEERSt13basic_ostreamIT_T0_ES7_RKSbIS4_S5_T1_E@plt>

The STL performs the read access. Knowing those basics, how can we identify where the segmentation fault occurred exactly? The log entry features two essential addresses we need for doing so:

ip 00007f9f2c2b56a3 [...] error 5 in
   ^^^^^^^^^^^^^^^^ 
  libstdc++.so.6.0.19[7f9f2c220000+e9000]                                     
                      ^^^^^^^^^^^^

The first is the instruction pointer (rip) at the time of the access violation; the second is the start address of the library's executable mapping (the one that contains .text), followed by the mapping's size. By subtracting that base address from rip, we get the relative address of the instruction in the library and can disassemble the implementation using objdump (you can simply search for the offset):

0x7f9f2c2b56a3-0x7f9f2c220000=0x956a3

$ objdump --demangle -d /usr/lib64/libstdc++.so.6
[...]
00000000000956a0 <std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@@GLIBCXX_3.4>:
   956a0:       48 8b 36                mov    (%rsi),%rsi
   956a3:       48 8b 56 e8             mov    -0x18(%rsi),%rdx
   ^^^^^
   956a7:       e9 24 4e fc ff          jmpq   5a4d0 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
   956ac:       0f 1f 40 00             nopl   0x0(%rax)
[...]

Is that the correct instruction? We can consult GDB to confirm our analysis:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b686a3 in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-323.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64 libstdc++-4.8.5-44.el7.x86_64
(gdb) disass
Dump of assembler code for function _ZStlsIcSt11char_traitsIcESaIcEERSt13basic_ostreamIT_T0_ES7_RKSbIS4_S5_T1_E:
   0x00007ffff7b686a0 <+0>: mov    (%rsi),%rsi
=> 0x00007ffff7b686a3 <+3>: mov    -0x18(%rsi),%rdx
   0x00007ffff7b686a7 <+7>: jmpq   0x7ffff7b2d4d0 <_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@plt>
End of assembler dump.

GDB shows the very same instruction. We can also use a debugging session to verify the read address:

(gdb) print /x $rsi-0x18
$2 = 0xffffffffffffffe8

This value matches the read address in the log entry.
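The address arithmetic used in this section can be automated. The following C++ sketch pulls the instruction pointer, module name, and mapping base out of a kernel log line and computes the offset to look up in objdump. The names are hypothetical, and the parser assumes the line format shown above:

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>

struct SegfaultLocation {
    std::string   module;  // e.g. "libstdc++.so.6.0.19"
    std::uint64_t ip;      // instruction pointer at the time of the fault
    std::uint64_t base;    // start of the mapping the kernel reported
    std::uint64_t offset;  // ip - base, the address to search for in objdump
};

// Parses "... ip <hex> ... in <module>[<base>+<size>]".
// Returns false if the line does not have that shape.
inline bool parse_segfault_line(const std::string& line, SegfaultLocation& out) {
    const std::size_t npos   = std::string::npos;
    const std::size_t ip_pos = line.find(" ip ");
    const std::size_t in_pos = line.rfind(" in ");
    const std::size_t open   = line.rfind('[');
    const std::size_t plus   = line.rfind('+');
    if (ip_pos == npos || in_pos == npos || open == npos || plus == npos ||
        !(ip_pos < in_pos && in_pos < open && open < plus)) {
        return false;
    }
    try {
        // stoull stops at the first character that is not a hex digit.
        out.ip   = std::stoull(line.substr(ip_pos + 4), nullptr, 16);
        out.base = std::stoull(line.substr(open + 1, plus - open - 1), nullptr, 16);
    } catch (const std::exception&) {
        return false;
    }
    out.module = line.substr(in_pos + 4, open - in_pos - 4);
    out.offset = out.ip - out.base;
    return true;
}
```

Fed the dmesg line from this example, the sketch yields the module libstdc++.so.6.0.19 and the offset 0x956a3.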

Identifying the Callers

So, despite the absence of a core dump, the kernel output enables us to identify the exact location of the segmentation fault. In many scenarios, though, that is far from being enough. For one thing, we're missing the list of calls that got us to that point - the call stack or stack trace.

Without a dump in the backpack, you have two options to get hold of the callers: you can start your process using catchsegv (a glibc utility) or you can implement your own signal handler.

catchsegv serves as a wrapper, generates the stack trace, and also dumps register values and the memory map:

$ catchsegv ./a.out
*** Segmentation fault
Register dump:

 RAX: 0000000002158040   RBX: 0000000002158040   RCX: 0000000002158000
[...]
Backtrace:
/lib64/libstdc++.so.6(_ZStlsIcSt11char_traitsIcESaIcEERSt13basic_ostreamIT_T0_ES7_RKSbIS4_S5_T1_E+0x3)[0x7f1794fd36a3]
??:?(_ZN4Test5printEv)[0x400bf4]
??:?(main)[0x400b2d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f179467a555]
??:?(_start)[0x4009e9]

Memory map:

00400000-00401000 r-xp 00000000 08:02 50331747 /home/user/a.out
[...]
7f1794f3e000-7f1795027000 r-xp 00000000 08:02 33600977 /usr/lib64/libstdc++.so.6.0.19
7f1795027000-7f1795227000 ---p 000e9000 08:02 33600977 /usr/lib64/libstdc++.so.6.0.19
7f1795227000-7f179522f000 r--p 000e9000 08:02 33600977 /usr/lib64/libstdc++.so.6.0.19
7f179522f000-7f1795231000 rw-p 000f1000 08:02 33600977 /usr/lib64/libstdc++.so.6.0.19
[...]

How does catchsegv work? It essentially injects a signal handler using LD_PRELOAD and the library libSegFault.so. If your application already happens to install a signal handler for SIGSEGV and you intend to take advantage of libSegFault.so, your signal handler needs to forward the signal to the original handler (as returned in the third argument of sigaction(SIGSEGV, NULL, &oldact)).

The second option is to implement the stack trace functionality yourself using a custom signal handler and backtrace(). This allows you to customize the output location and the output itself.
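A custom handler along those lines could look like the following C++ sketch. It assumes glibc's <execinfo.h>; the function names are hypothetical, and the output goes to stderr purely for illustration:

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

extern "C" void print_backtrace_on_segv(int, siginfo_t*, void*) {
    void* frames[64];
    const int count = backtrace(frames, 64);
    static const char header[] = "*** Segmentation fault, backtrace:\n";
    if (write(STDERR_FILENO, header, sizeof(header) - 1) < 0) {
        // Nothing sensible to do if stderr is gone.
    }
    // Writes one line per frame straight to the descriptor, without malloc().
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    // SA_RESETHAND has already restored the default action, so after a real
    // fault, returning re-executes the faulting instruction and the process
    // terminates with SIGSEGV as usual.
}

inline bool install_segv_backtrace_handler() {
    // Call backtrace() once up front so that its support library is loaded
    // now rather than from inside the signal handler.
    void* warmup[1];
    backtrace(warmup, 1);

    struct sigaction action = {};
    action.sa_sigaction = print_backtrace_on_segv;
    action.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGSEGV, &action, nullptr) == 0;
}
```

Calling install_segv_backtrace_handler() early in main() is enough. The frame lines have the same module(symbol+offset)[address] shape as in the catchsegv output, so the same address arithmetic applies. A handler meant to survive stack overflows would additionally need an alternate signal stack, which this sketch leaves out.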

Based on that information, we can essentially do the same we did before (0x7f1794fd36a3-0x7f1794f3e000=0x956a3). This time around, we can go back to the callers to dig deeper. The second frame is represented by the following line:

??:?(_ZN4Test5printEv)[0x400bf4]

0x400bf4 is the return address inside Test::print(), that is, the instruction at which execution resumes once the call to operator<< returns. It is located in the executable. We can visualize the call site as follows:

$ objdump --demangle -d ./a.out
[...]
  400bea:       bf a0 20 60 00          mov    $0x6020a0,%edi
  400bef:       e8 4c fd ff ff          callq  400940 <std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)@plt>
  400bf4:       be 70 09 40 00          mov    $0x400970,%esi
  ^^^^^^
  400bf9:       48 89 c7                mov    %rax,%rdi
  400bfc:       e8 5f fd ff ff          callq  400960 <std::ostream::operator<<(std::ostream& (*)(std::ostream&))@plt>
[...]

Note that the output of objdump matches the address in this instance because we run it against the executable, which has a default base address of 0x400000 on x86_64 - objdump takes that into account. With address space layout randomization (ASLR) enabled (compiled with -fpie, linked with -pie), the base address has to be taken into account as outlined before.
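When the process is still alive (for example inside a signal handler), the base address of each loaded module can also be obtained programmatically. The following C++ sketch uses glibc's dl_iterate_phdr to translate a runtime address into the address objdump would show; the struct and function names are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <link.h>

struct ModuleQuery {
    std::uintptr_t address;  // runtime address we are looking for
    std::uintptr_t base;     // load base of the module that contains it
    bool           found;
};

// Callback for dl_iterate_phdr: checks every loadable segment of one module.
extern "C" int find_containing_module(dl_phdr_info* info, std::size_t, void* data) {
    ModuleQuery* query = static_cast<ModuleQuery*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr)& segment = info->dlpi_phdr[i];
        if (segment.p_type != PT_LOAD) {
            continue;
        }
        const std::uintptr_t start = info->dlpi_addr + segment.p_vaddr;
        if (query->address >= start && query->address < start + segment.p_memsz) {
            query->base = info->dlpi_addr;
            query->found = true;
            return 1;  // a non-zero return value stops the iteration
        }
    }
    return 0;
}

// Translates a runtime address into a module-relative address.
inline bool module_relative_address(const void* runtime_address, std::uintptr_t& out) {
    ModuleQuery query = {reinterpret_cast<std::uintptr_t>(runtime_address), 0, false};
    dl_iterate_phdr(find_containing_module, &query);
    if (!query.found) {
        return false;
    }
    out = query.address - query.base;
    return true;
}
```

For a non-PIE executable the reported base is 0, so the result equals the runtime address, which matches the observation above. For a PIE executable or a shared library, the subtraction performs the same translation we did by hand.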

Going back further involves the same steps:

??:?(main)[0x400b2d]

$ objdump --demangle -d ./a.out
[...]
  400b1c:       e8 af fd ff ff          callq  4008d0 <operator delete(void*)@plt>
  400b21:       48 8d 45 d0             lea    -0x30(%rbp),%rax
  400b25:       48 89 c7                mov    %rax,%rdi
  400b28:       e8 a7 00 00 00          callq  400bd4 <Test::print()>
  400b2d:       b8 00 00 00 00          mov    $0x0,%eax
  ^^^^^^
  400b32:       eb 2a                   jmp    400b5e <main+0xb1>
[...]

Until now, we've been manually translating the absolute address to a relative address. Instead, the base address of the module can be passed to objdump via --adjust-vma=<base-address>. That way, the value of rip or a caller's address can be used directly.

Adding Debug Symbols

We've come a long way without a dump. For debugging to be effective, another critical puzzle piece is absent, however: debug symbols. Without them, it can be difficult to map the assembly to the corresponding source code. Compiling the sample with -O3 and without debug information illustrates the problem:

[98161.650474] a.out[13185]: segfault at ffffffffffffffe8 ip 0000000000400a4b sp 00007ffc9e738270 error 5 in a.out[400000+1000]

As a consequence of inlining, the log entry now points to our executable as the trigger. Using objdump gets us to the following:

  400a3e:       e8 dd fe ff ff          callq  400920 <operator delete(void*)@plt>
  400a43:       48 8b 33                mov    (%rbx),%rsi
  400a46:       bf a0 20 60 00          mov    $0x6020a0,%edi
  400a4b:       48 8b 56 e8             mov    -0x18(%rsi),%rdx
  ^^^^^^
  400a4f:       e8 4c ff ff ff          callq  4009a0 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
  400a54:       48 89 c5                mov    %rax,%rbp
  400a57:       48 8b 00                mov    (%rax),%rax

Part of the stream implementation was inlined, making it harder to identify the associated source code. Without symbols, you have to use export symbols, calls (like operator delete(void*)) and the surrounding instructions (mov $0x6020a0 loads the address of std::cout: 00000000006020a0 <std::cout@@GLIBCXX_3.4>) for the purpose of orientation.

With debug symbols (-g), more context is available by calling objdump with --source:

  400a43:       48 8b 33                mov    (%rbx),%rsi
    operator<<(basic_ostream<_CharT, _Traits>& __os,
               const basic_string<_CharT, _Traits, _Alloc>& __str)
    {
      // _GLIBCXX_RESOLVE_LIB_DEFECTS
      // 586. string inserter not a formatted function
      return __ostream_insert(__os, __str.data(), __str.size());
  400a46:       bf a0 20 60 00          mov    $0x6020a0,%edi
  400a4b:       48 8b 56 e8             mov    -0x18(%rsi),%rdx
  ^^^^^^
  400a4f:       e8 4c ff ff ff          callq  4009a0 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
  400a54:       48 89 c5                mov    %rax,%rbp

That worked as expected. In the real world, debug symbols are usually not embedded in the binaries; they are managed in separate debuginfo packages. In those circumstances, objdump ignores debug symbols even if they are installed. To address this limitation, symbols have to be re-added to the affected binary. The following procedure creates detached symbols and re-adds them using eu-unstrip from elfutils to the benefit of objdump:

# compile with debug info
g++ segv.cxx -O3 -g
# create detached debug info
objcopy --only-keep-debug a.out a.out.debug
# remove debug info from executable
strip -g a.out
# re-add debug info to executable
eu-unstrip ./a.out ./a.out.debug -o ./a.out-debuginfo
# objdump with executable containing debug info
objdump --demangle -d ./a.out-debuginfo --source

Using GDB instead of objdump

Thus far, we've been using objdump because it's usually available, even on production systems. Can we just use GDB instead? Yes, by executing gdb with the module of interest. I use 0x400a4b as in the previous objdump invocation:

$ gdb ./a.out
[...]
(gdb) disass 0x400a4b
Dump of assembler code for function main():
[...]
   0x0000000000400a43 <+67>:    mov    (%rbx),%rsi
   0x0000000000400a46 <+70>:    mov    $0x6020a0,%edi
   0x0000000000400a4b <+75>:    mov    -0x18(%rsi),%rdx
   0x0000000000400a4f <+79>:    callq  0x4009a0 <_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@plt>
   0x0000000000400a54 <+84>:    mov    %rax,%rbp

In contrast to objdump, GDB can deal with external symbol information without a hitch. disass /m corresponds to objdump --source:

(gdb) disass /m 0x400a4b
Dump of assembler code for function main():
[...]
21    Test test(*value);
22    delete value;
   0x0000000000400a25 <+37>:    test   %rbx,%rbx
   0x0000000000400a28 <+40>:    je     0x400a43 <main()+67>
   0x0000000000400a3b <+59>:    mov    %rbx,%rdi
   0x0000000000400a3e <+62>:    callq  0x400920 <_ZdlPv@plt>

23    test.print();
24    return 0;
25  }
   0x0000000000400a88 <+136>:   add    $0x18,%rsp
[...]
End of assembler dump.

In case of an optimized binary, GDB might skip instructions in this mode if the source code cannot be mapped unambiguously. Our instruction at 0x400a4b is not listed. objdump never skips instructions and might skip the source context instead, an approach that I prefer for debugging at this level. This does not mean that GDB is not useful for this task; it's just something to be aware of.

Final Thoughts

Termination reason, registers, memory map, and stack trace. It's all there without even a trace of a core dump.
