Writing a Putchar in Assembly for X86_64 with 64 Bit Linux

Writing a putchar in Assembly for x86_64 with 64 bit Linux?

From man 2 write, you can see the signature of write is,

ssize_t write(int fd, const void *buf, size_t count);

It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.

(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)

A NASM version of GCC: putchar(char) in inline assembly:

; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from RDI

mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret

If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).

See also What are the return values of system calls in Assembly?

Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out, especially if you're writing your own _start without libc there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?

GCC: putchar(char) in inline assembly

When using GNU C inline asm, use constraints to tell the compiler where you want things, instead of doing it "manually" with instructions inside the asm template.

For writechar and readchar, we only need a "syscall" as the template, with constraints to set up all the inputs in registers (and the pointed-to char in memory for the write(2) system call), according to the x86-64 Linux system-call convention (which very closely matches the System V ABI's function-calling convention). What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.

This also makes it easy to avoid clobbering the red-zone (128 bytes below RSP), where the compiler might be keeping values. You must not clobber it from inline asm (so push / pop aren't usable unless you sub rsp, 128 first: see Using base pointer register in C++ inline asm for that and many useful links about GNU C inline asm), and there's no way to tell the compiler you clobber it. You could build with -mno-redzone, but in this case input/output operands are much better.


I'm hesitant to call these putchar and getchar. You can do that if you're implementing your own stdio that doesn't support buffering yet, but some functions require input buffering to implement correctly. For example, scanf has to examine characters to see if they match the format string, and leave them "unread" if they don't. Output buffering is optional, though; you could I think fully implement stdio with functions that create a private buffer and write() it, or directly write() their input pointer.

writechar():

int writechar(char my_char)
{
int retval; // sys_write uses ssize_t, but we only pass len=1
// so the return value is either 1 on success or -1..-4095 for error
// and thus fits in int

asm volatile("syscall #dummy arg picked %[dummy]\n"
: "=a" (retval) /* output in EAX */
/* inputs: ssize_t read(int fd, const void *buf, size_t count); */
: "D"(1), // RDI = fd=stdout
"S"(&my_char), // RSI = buf
"d"(1) // RDX = length
, [dummy]"m" (my_char) // dummy memory input, otherwise compiler doesn't store the arg
/* clobbered regs */
: "rcx", "r11" // clobbered by syscall
);
// It doesn't matter what addressing mode "m"(my_char) picks,
// as long as it refers to the same memory as &my_char so the compiler actually does a store

return retval;
}

This compiles very efficiently with gcc -O3, on the Godbolt compiler explorer.

writechar:
movb %dil, -4(%rsp) # store my_char into the red-zone
movl $1, %edi
leaq -4(%rsp), %rsi
movl %edi, %edx # optimize because fd = len
syscall # dummy arg picked -4(%rsp)

ret

@nrz's test main inlines it much more efficiently than the unsafe (red-zone clobbering) version in that answer, taking advantage of the fact that syscall leaves most registers unmodified so it can just set them once.

main:
movl $97, %r8d # my_char = 'a'
leaq -1(%rsp), %rsi # rsi = &my_char
movl $1, %edx # len
.L6: # do {
movb %r8b, -1(%rsp) # store the char into the buffer
movl %edx, %edi # silly compiler doesn't hoist this out of the loop
syscall #dummy arg picked -1(%rsp)

addl $1, %r8d
cmpb $123, %r8b
jne .L6 # } while(++my_char < 'z'+1)

movb $10, -1(%rsp)
syscall #dummy arg picked -1(%rsp)

xorl %eax, %eax # return 0
ret

readchar(), done the same way:

int readchar(void)
{
int retval;
unsigned char my_char;
asm volatile("syscall #dummy arg picked %[dummy]\n"
/* outputs */
: "=a" (retval)
,[dummy]"=m" (my_char) // tell the compiler the asm dereferences &my_char

/* inputs: ssize_t read(int fd, void *buf, size_t count); */
: "D"(0), // RDI = fd=stdin
"S" (&my_char), // RDI = buf
"d"(1) // RDX = length

: "rcx", "r11" // clobbered by syscall
);
if (retval < 0) // -1 .. -4095 are -errno values
return retval;
return my_char; // else a 0..255 char / byte
}

Callers can check for error by checking c < 0.

GCC: putchar(char) in inline assembly

When using GNU C inline asm, use constraints to tell the compiler where you want things, instead of doing it "manually" with instructions inside the asm template.

For writechar and readchar, we only need a "syscall" as the template, with constraints to set up all the inputs in registers (and the pointed-to char in memory for the write(2) system call), according to the x86-64 Linux system-call convention (which very closely matches the System V ABI's function-calling convention). What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.

This also makes it easy to avoid clobbering the red-zone (128 bytes below RSP), where the compiler might be keeping values. You must not clobber it from inline asm (so push / pop aren't usable unless you sub rsp, 128 first: see Using base pointer register in C++ inline asm for that and many useful links about GNU C inline asm), and there's no way to tell the compiler you clobber it. You could build with -mno-redzone, but in this case input/output operands are much better.


I'm hesitant to call these putchar and getchar. You can do that if you're implementing your own stdio that doesn't support buffering yet, but some functions require input buffering to implement correctly. For example, scanf has to examine characters to see if they match the format string, and leave them "unread" if they don't. Output buffering is optional, though; you could I think fully implement stdio with functions that create a private buffer and write() it, or directly write() their input pointer.

writechar():

int writechar(char my_char)
{
int retval; // sys_write uses ssize_t, but we only pass len=1
// so the return value is either 1 on success or -1..-4095 for error
// and thus fits in int

asm volatile("syscall #dummy arg picked %[dummy]\n"
: "=a" (retval) /* output in EAX */
/* inputs: ssize_t read(int fd, const void *buf, size_t count); */
: "D"(1), // RDI = fd=stdout
"S"(&my_char), // RSI = buf
"d"(1) // RDX = length
, [dummy]"m" (my_char) // dummy memory input, otherwise compiler doesn't store the arg
/* clobbered regs */
: "rcx", "r11" // clobbered by syscall
);
// It doesn't matter what addressing mode "m"(my_char) picks,
// as long as it refers to the same memory as &my_char so the compiler actually does a store

return retval;
}

This compiles very efficiently with gcc -O3, on the Godbolt compiler explorer.

writechar:
movb %dil, -4(%rsp) # store my_char into the red-zone
movl $1, %edi
leaq -4(%rsp), %rsi
movl %edi, %edx # optimize because fd = len
syscall # dummy arg picked -4(%rsp)

ret

@nrz's test main inlines it much more efficiently than the unsafe (red-zone clobbering) version in that answer, taking advantage of the fact that syscall leaves most registers unmodified so it can just set them once.

main:
movl $97, %r8d # my_char = 'a'
leaq -1(%rsp), %rsi # rsi = &my_char
movl $1, %edx # len
.L6: # do {
movb %r8b, -1(%rsp) # store the char into the buffer
movl %edx, %edi # silly compiler doesn't hoist this out of the loop
syscall #dummy arg picked -1(%rsp)

addl $1, %r8d
cmpb $123, %r8b
jne .L6 # } while(++my_char < 'z'+1)

movb $10, -1(%rsp)
syscall #dummy arg picked -1(%rsp)

xorl %eax, %eax # return 0
ret

readchar(), done the same way:

int readchar(void)
{
int retval;
unsigned char my_char;
asm volatile("syscall #dummy arg picked %[dummy]\n"
/* outputs */
: "=a" (retval)
,[dummy]"=m" (my_char) // tell the compiler the asm dereferences &my_char

/* inputs: ssize_t read(int fd, void *buf, size_t count); */
: "D"(0), // RDI = fd=stdin
"S" (&my_char), // RDI = buf
"d"(1) // RDX = length

: "rcx", "r11" // clobbered by syscall
);
if (retval < 0) // -1 .. -4095 are -errno values
return retval;
return my_char; // else a 0..255 char / byte
}

Callers can check for error by checking c < 0.

Printing a character to standard output in Assembly x86

Sure, you can use any normal C function. Here's a NASM example that uses printf to print some output:

;
; assemble and link with:
; nasm -f elf test.asm && gcc -m32 -o test test.o
;
section .text

extern printf ; If you need other functions, list them in a similar way

global main

main:

mov eax, 0x21 ; The '!' character
push eax
push message
call printf
add esp, 8 ; Restore stack - 4 bytes for eax, and 4 bytes for 'message'
ret

message db 'The character is: %c', 10, 0

If you only want to print a single character, you could use putchar:

push eax
call putchar

If you want to print out a number, you could do it like this:

mov ebx, 8
push ebx
push message
call printf
...
message db 'The number is: %d', 10, 0

How to print a string to the terminal in x86-64 assembly (NASM) without syscall?

This is a good exercise. You will use syscall (you cannot access stdout otherwise), but you can do a "bare-metal" write without any external library providing the output routine (like calling printf). As an example of the basic "bare-metal" write to stdout in x86_64, I put together a example without any internal or system function calls:

section .data
string1 db 0xa, " Hello StackOverflow!!!", 0xa, 0xa, 0

section .text
global _start

_start:
; calculate the length of string
mov rdi, string1 ; string1 to destination index
xor rcx, rcx ; zero rcx
not rcx ; set rcx = -1
xor al,al ; zero the al register (initialize to NUL)
cld ; clear the direction flag
repnz scasb ; get the string length (dec rcx through NUL)
not rcx ; rev all bits of negative results in absolute value
dec rcx ; -1 to skip the null-terminator, rcx contains length
mov rdx, rcx ; put length in rdx
; write string to stdout
mov rsi, string1 ; string1 to source index
mov rax, 1 ; set write to command
mov rdi,rax ; set destination index to rax (stdout)
syscall ; call kernel

; exit
xor rdi,rdi ; zero rdi (rdi hold return value)
mov rax, 0x3c ; set syscall number to 60 (0x3c hex)
syscall ; call kernel

; Compile/Link
;
; nasm -f elf64 -o hello-stack_64.o hello-stack_64.asm
; ld -o hello-stack_64 hello-stack_64.o

output:

$ ./hello-stack_64

Hello StackOverflow!!!

For general use, I split the process into two parts (1) getting the length and (2) writing to stdout. Below the strprn function will write any string to stdout. It calls strsz to get the length while preserving the destination index on the stack. This reduces the task of writing a string to stdout and prevents a lot of repitition in your code.

; szstr computes the lenght of a string.
; rdi - string address
; rdx - contains string length (returned)
section .text
strsz:
xor rcx, rcx ; zero rcx
not rcx ; set rcx = -1 (uses bitwise id: ~x = -x-1)
xor al,al ; zero the al register (initialize to NUL)
cld ; clear the direction flag
repnz scasb ; get the string length (dec rcx through NUL)
not rcx ; rev all bits of negative -> absolute value
dec rcx ; -1 to skip the null-term, rcx contains length
mov rdx, rcx ; size returned in rdx, ready to call write
ret

; strprn writes a string to the file descriptor.
; rdi - string address
; rdx - contains string length
section .text
strprn:
push rdi ; push string address onto stack
call strsz ; call strsz to get length
pop rsi ; pop string to rsi (source index)
mov rax, 0x1 ; put write/stdout number in rax (both 1)
mov rdi, rax ; set destination index to rax (stdout)
syscall ; call kernel
ret

To further automate general output to stdout NASM macros provide a convenient solution. Example strn (short for string_n). It takes two arguments, the addresses of the string, and the number of characters to write:

%macro  strn    2
mov rax, 1
mov rdi, 1
mov rsi, %1
mov rdx, %2
syscall
%endmacro

Useful for indents, newlines or writing complete strings. You could generalize further by passing 3 arguments including the destination for rdi.

How to properly pass buffer pointers to Linux system calls in x86_64 assembly?

The actual problem is not with the buffer but with its length. Notice in the prototype you have socklen_t *addrlen so that should be a pointer. The value 15 that you pass is not a pointer hence the -EFAULT.

You should change the .length: equ $-ip_buff to ip_length: dd $-ip_buff and then use syscall getpeername,r12,ip_buff,ip_length

How to compare a char in a string with another char in NASM x86_64 Linux Assembly

rdi is a pointer, so cmp rdi, 0 checks for a null pointer. What you meant was cmp byte [rdi + rcx], 0 to check the end of string. Note you need to check the current character so have to add the index obviously.

As for cmp rsi, [byte rdi+rcx] the byte there makes no sense, since you are comparing the whole of rsi which is 8 bytes. That should be cmp sil, [rdi + rcx].

Finally, strchr is supposed to return a pointer so you should change mov rax, [rdi+rcx] to lea rax, [rdi + rcx].



Related Topics



Leave a reply



Submit