Calling Printf in X86_64 Using Gnu Assembler

Calling printf in x86_64 using GNU assembler

There are a number of issues with this code. The AMD64 System V ABI calling convention used by Linux requires a few things. It requires that just before a CALL that the stack be at least 16-byte (or 32-byte) aligned:

The end of the input argument area shall be aligned on a 16 (32, if __m256 is
passed on stack) byte boundary.

After the C runtime calls your main function the stack is misaligned by 8 because the return pointer was placed on the stack by CALL. To realign to 16-byte boundary you can simply PUSH any general purpose register onto the stack and POP it off at the end.

The calling convention also requires that AL contain the number of vector registers used for a variable argument function:

%al is used to indicate the number of vector arguments passed to a function requiring a variable number of arguments

printf is a variable argument function, so AL needs to be set. In this case you don't pass any parameters in a vector register so you can set AL to 0.

You also dereference the $format pointer when it is already an address. So this is wrong:

mov  $format, %rbx
mov (%rbx), %rdi

This takes the address of format and places it in RBX. Then you take the 8 bytes at that address in RBX and place them in RDI. RDI needs to be a pointer to a string of characters, not the characters themselves. The two lines could be replaced with:

lea  format(%rip), %rdi

This uses RIP Relative Addressing.

You should also NUL terminate your strings. Rather than use .ascii you can use .asciz on the x86 platform.

A working version of your program could look like:

# global data  #
.data
format: .asciz "%d\n"
.text
.global main
main:
push %rbx
lea format(%rip), %rdi
mov $1, %esi # Writing to ESI zero extends to RSI.
xor %eax, %eax # Zeroing EAX is efficient way to clear AL.
call printf
pop %rbx
ret


Other Recommendations/Suggestions

You should also be aware from the 64-bit Linux ABI, that the calling convention also requires functions you write to honor the preservation of certain registers. The list of registers and whether they should preserved is as follows:

Sample Image

Any register that says Yes in the Preserved across
function calls
column are ones you must ensure are preserved across your function. Function main is like any other C function.


If you have strings/data that you know will be read only you can place them in the .rodata section with .section .rodata rather than .data


In 64-bit mode: if you have a destination operand that is a 32-bit register, the CPU will zero extend the register across the entire 64-bit register. This can save bytes on the instruction encoding.


It is possible your executable is being compiled as position independent code. You may receive an error similar to:

relocation R_X86_64_PC32 against symbol `printf@@GLIBC_2.2.5' can not be used when making a shared object; recompile with -fPIC

To fix this you'll have to call the external function printf this way:

call printf@plt 

This calls the external library function via the Procedure Linkage Table (PLT)

gnu assembler printf error

Change both how you assemble and link the program and how you're pushing $par1 instead of par1:

.section .data
n:
.int 33
fmt:
.asciz "n: %d\n"
.section .text
.global _start
_start:
pushl n
pushl $fmt
call printf

movl $1, %eax
movl $0, %ebx
int $0x80

Assemble and link with:

cc -nostdlib -Os -Wall -g3 -m32 -lc printf-x86.S -o printf-x86

cc here is just an alias for gcc. The regular compiler driver would know the right options to pass to as and ld plus it means your assembler source (.S) gets passed through the C preprocessor and you can use header files like <sys/sdt.h>.

Here's a GNU make fragment in case you need it:

%: %.S
$(CC) -nostdlib $(CFLAGS) $(LDFLAGS) $< -o $@
printf-x86: CFLAGS+=-m32
printf-x86: LDFLAGS+=-lc

Printing a number in assembly NASM using printf

Apparently you don't even know how printf works which makes it hard to invoke it from assembly.

To print a number, printf expects two arguments, a format string and the number to print of course. Example: printf("%d\n", 12345).

Now to turn that into assembly, you obviously need to declare that format string, and then pass both arguments using the appropriate convention.

Since you seem to be using sysv abi, this means the first two arguments go into rdi and rsi, respectively. You already seem to know you have to zero al to indicate no SSE registers used. As such, the relevant part could look like:

lea rdi, [rel fmt]
mov rsi, 12345 ; or mov rsi, [count]
mov al, 0
call printf
...
fmt: db "%d", 0x0a, 0

GNU as, puts works but printf does not

puts appends a newline implicitly, and stdout is line-buffered (by default on terminals). So the text from printf may just be sitting there in the buffer. Your call to _exit(2) doesn't flush buffers, because it's the exit_group(2) system call, not the exit(3) library function. (See my version of your code below).

Your call to printf(3) is also not quite right, because you didn't zero %al before calling a var-args function with no FP arguments. (Good catch @RossRidge, I missed that). xor %eax,%eax is the best way to do that. %al will be non-zero (from puts()'s return value), which is presumably why printf segfaults. I tested on my system, and printf doesn't seem to mind when the stack is misaligned (which it is, since you pushed twice before calling it, unlike puts).


Also, you don't need any push instructions in that code. The first arg goes in %rdi. The first 6 integer args go in registers, the 7th and later go on the stack. You're also neglecting to pop the stack after the functions return, which only works because your function never tries to return after messing up the stack.

The ABI does require aligning the stack by 16B, and a push is one way to do that, which can actually be more efficient than sub $8, %rsp on recent Intel CPUs with a stack engine, and it takes fewer bytes. (See the x86-64 SysV ABI, and other links in the x86 tag wiki).


Improved code:

.text
.global main
main:
lea message, %rdi # or mov $message, %edi if you don't need the code to be position-independent: default code model has all labels in the low 2G, so you can use shorter 32bit instructions
push %rbx # align the stack for another call
mov %rdi, %rbx # save for later
call puts

xor %eax,%eax # %al = 0 = number of FP args for var-args functions
mov %rbx, %rdi # or mov %ebx, %edi will normally be safe, since the pointer is known to be pointing to static storage, which will be in the low 2G
call printf

# optionally putchar a '\n', or include it in the string you pass to printf

#xor %edi,%edi # exit with 0 status
#call exit # exit(3) does an fflush and other cleanup

pop %rbx # restore caller's rbx, and restore the stack

xor %eax,%eax # return 0
ret

.section .rodata # constants should go in .rodata
message: .asciz "Hello, World!"

lea message, %rdi is cheap, and doing it twice is fewer instructions than the two mov instructions to make use of %rbx. But since we needed to adjust the stack by 8B to strictly follow the ABI's 16B-aligned guarantee, we might as well do it by saving a call-preserved register. mov reg,reg is very cheap and small, so taking advantage of the call-preserved reg is natural.

Using mov %edi, %ebx and stuff like that saves the REX prefix in the machine-code encoding. If you're not sure / don't understand why it's safe to only copy the low 32bits, zeroing the upper 32b, then use 64bit registers. Once you understand what's going on, you'll know when you can save machine-code bytes by using 32bit operand-size.

Linux X86-64 assembly and printf

Linux (and Windows) x86-64 calling convention has the first few arguments not on the stack, but in registers instead

See http://www.x86-64.org/documentation/abi.pdf (page 20)

Specifically:

  1. If the class is MEMORY, pass the argument on the stack.
  2. If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used.
  3. If the class is SSE, the next available vector register is used, the registers are taken in the order from %xmm0 to %xmm7.
  4. If the class is SSEUP, the eightbyte is passed in the next available eightbyte chunk of the last used vector register.
  5. If the class is X87, X87UP or COMPLEX_X87, it is passed in memory.

The INTEGER class is anything that will fit in a general purpose register, so that's what you would use for string pointers as well.



Related Topics



Leave a reply



Submit