Why Does %rbp Point to Nothing

Why does %rbp point to nothing?

RBP is a general-purpose register, so it can contain any value that you (or your compiler) wants it to contain. It is only by convention that RBP is used to point to the procedure frame. According to this convention, the stack looks like this:

Low       |====================|
addresses |    Unused space    |
          |                    |
          |====================|  ← RSP points here
    ↑     |     Function's     |
    ↑     |  local variables   |
    ↑     |                    |  ↑ RBP - x
direction |--------------------|  ← RBP points here
of stack  | Original/saved RBP |  ↓ RBP + x
growth    |--------------------|
    ↑     |   Return pointer   |
    ↑     |--------------------|
    ↑     |     Function's     |
          |     parameters     |
          |                    |
          |====================|
          |       Parent       |
          |  function's data   |
          |====================|
          |    Grandparent     |
High      |  function's data   |
addresses |====================|

As such, the boilerplate prologue code for a function is:

push %rbp
mov %rsp, %rbp

The first instruction saves the caller's value of RBP by pushing it onto the stack, and the second instruction then copies the new value of RSP (which now points at that saved value) into RBP. After this, and once the function has reserved space for its local variables, the stack looks exactly like the one depicted above, in the beautiful ASCII art.

The function then does its thing, executing whatever code it wants to execute. As suggested in the drawing, it can access any parameters it was passed on the stack by using positive offsets from RBP (i.e., RBP+x), and it can access any local variables it has allocated space for on the stack by using negative offsets from RBP (i.e., RBP-x). If you understand that the stack grows downward in memory (addresses get smaller), then this offsetting scheme makes sense.

Finally, the boilerplate epilogue code to end a function is:

leaveq

or, equivalently:

mov %rbp, %rsp
pop %rbp

The first instruction sets RSP to the value of RBP (the working value used throughout the function's code), which discards the function's local variables, and the second instruction pops the "original/saved RBP" off the stack, into RBP. It is no coincidence that this is precisely the opposite of what was done in the prologue code we looked at above.
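To make the symmetry between prologue and epilogue concrete, here is a minimal sketch in C. It is a toy model, not real machine state: an array of 64-bit slots stands in for stack memory, rsp and rbp hold slot indices rather than addresses, and every name in it (ToyCpu, toy_push, toy_prologue, and so on) is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (assumption): a small array of 64-bit slots stands in for stack
   memory, and rsp/rbp hold slot indices instead of real addresses. */
enum { STACK_SLOTS = 16 };

typedef struct {
    uint64_t mem[STACK_SLOTS];
    uint64_t rsp;   /* index of the most recently pushed slot */
    uint64_t rbp;   /* frame pointer, also a slot index */
} ToyCpu;

/* push: the stack grows toward lower indices, then the value is stored. */
static void toy_push(ToyCpu *cpu, uint64_t value) {
    assert(cpu->rsp > 0);
    cpu->rsp -= 1;
    cpu->mem[cpu->rsp] = value;
}

/* pop: read the top slot, then shrink the stack. */
static uint64_t toy_pop(ToyCpu *cpu) {
    assert(cpu->rsp < STACK_SLOTS);
    uint64_t value = cpu->mem[cpu->rsp];
    cpu->rsp += 1;
    return value;
}

/* push %rbp ; mov %rsp, %rbp ; then reserve `locals` slots below RBP. */
static void toy_prologue(ToyCpu *cpu, uint64_t locals) {
    toy_push(cpu, cpu->rbp);
    cpu->rbp = cpu->rsp;
    assert(cpu->rsp >= locals);
    cpu->rsp -= locals;
}

/* mov %rbp, %rsp ; pop %rbp (what leave does in one instruction). */
static void toy_epilogue(ToyCpu *cpu) {
    cpu->rsp = cpu->rbp;
    cpu->rbp = toy_pop(cpu);
}
```

If you start with rsp equal to STACK_SLOTS (an empty stack), push a pretend return address, and then call toy_prologue followed by toy_epilogue, both rsp and rbp come back to the values they had at function entry. In between, mem[rbp] holds the saved RBP and mem[rbp + 1] holds the return address, just as in the drawing.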

Note, though, that this is merely a convention. Unless required by the ABI, the compiler is free to use RBP as a general-purpose register, with no relation to the stack pointer. This works because the compiler can just calculate the required offsets from RSP at compile time, and it is a common optimization, known as "frame pointer elision" (or "frame pointer omission"). It is especially common in 32-bit mode, where the number of available general-purpose registers is extremely small, but you'll sometimes see it in 64-bit code, too. When the compiler has elided the frame pointer, it doesn't need the prologue and epilogue code to manipulate it, so this can be omitted, too.

The reason you see all of this frame-pointer book-keeping is that you're analyzing unoptimized code. There, the frame pointer is never elided, because having it around often makes debugging easier and execution speed is not a significant concern.

The reason why RBP is 0 upon entry to your function appears to be a peculiarity of GDB, and not something that you really need to concern yourself with. As Shift_Left notes in the comments, GDB under Linux pre-initializes all registers (except RSP) to 0 before handing off control to an application. If you had run this program outside of the debugger, and simply printed the initial value of RBP to stdout, you'd see that it would be non-zero.

But, again, the exact value shouldn't matter to you. Understanding the schematic drawing of the call stack above is the key. Assuming that frame pointers have not been elided, the compiler has no idea when it generates the prologue and epilogue code what value RBP will have upon entry, because it doesn't know where on the call stack the function will end up being called.

What is the purpose of the RBP register in x86_64 assembler?

rbp is the frame pointer on x86_64. In your generated code, it gets a snapshot of the stack pointer (rsp) so that when adjustments are made to rsp (e.g. reserving space for local variables or pushing values onto the stack), local variables and function parameters are still accessible at a constant offset from rbp.

A lot of compilers offer frame pointer omission as an optimization option; this will make the generated assembly code access variables relative to rsp instead and free up rbp as another general purpose register for use in functions.

In the case of GCC, which I'm guessing you're using from the AT&T assembler syntax, that switch is -fomit-frame-pointer. Try compiling your code with that switch and see what assembly code you get. You will probably notice that when accessing values relative to rsp instead of rbp, the offset from the pointer varies throughout the function.

Unable to understand the base pointer calculation in assembly code

1) Why sub $0x10, %rsp?

It is actually subtracting 16 bytes; in other words, it's making space for the two 'long' arguments. Try printing 'sizeof(long)' and I'm pretty sure you'll get '8' as the answer on the machine you're on.

2) Why move register values to memory?

Again this is where the computer is loading the two long values from the registers 'rcx' and 'rdx' into the memory space it made in '1)'. 0x10 and 0x18 have a difference of 8 bytes.

3) Why is the return value stored with mov %rax,-0x8(%rbp)?

It's stored temporarily because, before leaving the function, the %rax register is used for some other computations. If it were not saved, it would have been overwritten, and you can see that after those computations are done the value is loaded into rax again.

mov %rax,-0x8(%rbp) <--- saving
jmp 0x100401114 <absdiff+52>
...
mov %rax,-0x8(%rbp)
mov -0x8(%rbp),%rax <--- retrieving

A Suggestion

I'm pretty sure you'll find this link really helpful:

https://www.recurse.com/blog/7-understanding-c-by-learning-assembly

Why is no value showing up in the rax register?

push $3 is a 64-bit push; see "How many bytes does the push instruction push onto the stack when I don't specify the operand size?"

But that doesn't even matter because you moved RSP to point 12 bytes below the saved RBP value (which RBP points at) before doing the first push.

At process startup in a static executable run by Linux, i.e. at _start the way you apparently built your program, all the registers (except RSP) are zeroed, and stack memory below the initial RSP starts out zeroed. This is not officially guaranteed by the ABI but is in practice how Linux works. That's why you loaded zeros.

mov -4(%rbp), %rax loads 8 bytes. The low 4 bytes of that load come from space you skipped with sub $12, %rsp. The high 4 bytes are the low 4 bytes of the saved RBP value. Both of those things are 0 because Linux zeroed them while initializing a fresh process.
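As a rough illustration of how that 8-byte load straddles two different things, here is a small C sketch. The helper names (load_qword_le, load_at_rbp_minus_4) and the 12-byte frame layout are assumptions made up for this example; the code only models the byte arithmetic and does not touch the real stack.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble an 8-byte value from bytes in little-endian order, the way an
   x86-64 qword load interprets memory. */
static uint64_t load_qword_le(const uint8_t *bytes) {
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value |= (uint64_t)bytes[i] << (8 * i);
    }
    return value;
}

/* Hypothetical helper modeling `mov -4(%rbp), %rax`.
   frame[0..3]  = the 4 bytes just below RBP (RBP-4 .. RBP-1)
   frame[4..11] = the 8-byte saved RBP (RBP+0 .. RBP+7)
   The load starts at RBP-4 and covers RBP-4 .. RBP+3. */
static uint64_t load_at_rbp_minus_4(const uint8_t frame[12]) {
    return load_qword_le(frame);
}
```

With an all-zero frame the result is 0, which matches what you saw. If the four bytes below RBP were 11 22 33 44 and the saved RBP began with AA BB CC DD, the load would produce 0xDDCCBBAA44332211: the low half comes from below RBP and the high half from the start of the saved RBP.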

The value you load from memory into RAX was never going to be a pointer so it makes no sense as an arg to GDB's x command. x eXamines memory at a given address. What would make sense is x /16gx $rsp to dump 16 qwords above RSP.

Also note that sub $12, %rsp looks like it was naively ported from 32-bit code. That misaligns the stack. At _start, it was already aligned by 16. _start is not a function; nothing called it and you can't ret from it. You don't need to save the old RBP, or even do anything with RBP at all; one pointer to the stack (RSP) is generally enough.

At the top of a function that does get called, RSP-8 would be 16-byte aligned, so one push would re-align the stack.
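The alignment arithmetic can be checked with a few lines of C. This is only a sketch of the modular arithmetic; the function names are invented for illustration, and any starting RSP value you feed it is a made-up 16-byte-aligned number rather than a real address.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the (pretend) stack pointer is a multiple of 16. */
static bool is_aligned_16(uint64_t rsp) {
    return (rsp % 16) == 0;
}

/* call pushes an 8-byte return address, so RSP drops by 8. */
static uint64_t after_call(uint64_t rsp) {
    return rsp - 8;
}

/* A 64-bit push also drops RSP by 8. */
static uint64_t after_push(uint64_t rsp) {
    return rsp - 8;
}

/* sub $n, %rsp */
static uint64_t after_sub(uint64_t rsp, uint64_t n) {
    return rsp - n;
}
```

Starting from an aligned value such as 0x7fffffffe000, after_sub(rsp, 12) is not 16-byte aligned, which is the problem with sub $12, %rsp at _start. In a called function, after_call(rsp) is off by 8, and after_push(after_call(rsp)) is aligned again.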

Why is a returned stack-pointer replaced by a null-pointer by gcc?

Putting the return value in a pointer variable seems to change the behavior of the compiler and it generates the assembly code that returns a pointer to stack:

int* func(int i) {
    int j = 3;
    j += i;
    int *p = &j;
    return p;
}

Why are rbp and rsp called general purpose registers?

General purpose means all of these registers might be used with any instructions doing computation with general purpose registers while, for example, you cannot do whatever you want with the instruction pointer (RIP) or the flags register (RFLAGS).

Some of these registers were designed with a specific use in mind, and are commonly used that way. The most critical ones are RSP and RBP.

Should you need to use them for your own purpose, you should save their contents before storing something else inside, and restore them to their original value when done.

What is exactly the base pointer and stack pointer? To what do they point?

esp is as you say it is, the top of the stack.

ebp is usually set to esp at the start of the function. Function parameters and local variables are accessed by adding and subtracting, respectively, a constant offset from ebp. All x86 calling conventions define ebp as being preserved across function calls. ebp itself actually points to the previous frame's base pointer, which is what lets a debugger walk the stack and view other frames' local variables.

Most function prologs look something like:

push ebp       ; Preserve current frame pointer
mov ebp, esp   ; Create new frame pointer pointing to current stack top
sub esp, 20    ; Allocate 20 bytes' worth of locals on the stack

Then later in the function you may have code like this (presuming both local variables are 4 bytes):

mov [ebp - 4], eax   ; Store eax in first local
mov ebx, [ebp - 8]   ; Load ebx from second local

Frame pointer omission (FPO) is an optimization you can enable that eliminates this setup, uses ebp as another general register, and accesses locals directly off of esp. It makes debugging a bit more difficult, though, since the debugger can no longer directly access the stack frames of earlier function calls.
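Because each saved ebp points at the caller's saved ebp, the frames form a linked list that a debugger can follow. The C sketch below models that walk on simulated memory. It is a toy under stated assumptions: memory is an array of 4-byte slots, frame pointers are slot indices rather than addresses, the outermost frame is assumed to store 0 as its saved frame pointer, and the name toy_walk_frames is invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model (assumption): `mem` is simulated stack memory in 4-byte slots and
   frame pointers are slot indices. mem[fp] is the caller's saved frame
   pointer and mem[fp + 1] is the return address, mirroring [ebp] and
   [ebp+4] in 32-bit code. */
#define TOY_END_OF_CHAIN 0u  /* assumed sentinel: outermost frame saves 0 */

/* Follow the chain of saved frame pointers starting at `fp`, collecting one
   return address per frame. Returns the number of frames visited. */
static size_t toy_walk_frames(const uint32_t *mem, size_t mem_slots,
                              uint32_t fp, uint32_t *return_addrs,
                              size_t max_frames) {
    size_t count = 0;
    while (fp != TOY_END_OF_CHAIN && count < max_frames) {
        if ((size_t)fp + 1 >= mem_slots) {
            break;  /* chain points outside the simulated memory: stop */
        }
        return_addrs[count++] = mem[fp + 1];
        fp = mem[fp];  /* step to the caller's frame */
    }
    return count;
}
```

When the frame pointer is omitted, there is no such chain in memory, so a walk like this has nothing to follow. That is why debuggers then need separate unwind information to reconstruct earlier frames.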

EDIT:

For your updated question, the missing two entries in the stack are:

var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4
*savedFramePointer = dword ptr 0*
*return address = dword ptr 4*
hInstance = dword ptr 8h
PrevInstance = dword ptr 0Ch
hlpCmdLine = dword ptr 10h
nShowCmd = dword ptr 14h

This is because the flow of the function call is:

  • Push parameters (hInstance, etc.)
  • Call function, which pushes return address
  • Push ebp
  • Allocate space for locals

