Writing a putchar in Assembly for x86_64 with 64 bit Linux?
From man 2 write, you can see the signature of write is:
ssize_t write(int fd, const void *buf, size_t count);
It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
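To make the buffering advice concrete, here is a minimal C sketch (not from the original answer) that builds a whole line in memory and hands it to the kernel with one write() call. The function name write_alphabet is made up for illustration, and it uses the libc write() wrapper rather than a raw syscall instruction.

```c
#include <unistd.h>

/* Hypothetical demo helper: emit "abc...z\n" with a single write()
   instead of 27 separate one-byte system calls. */
ssize_t write_alphabet(int fd)
{
    char buf[27];
    int i;

    for (i = 0; i < 26; i++)
        buf[i] = (char)('a' + i);      /* fill the buffer in user space */
    buf[26] = '\n';

    return write(fd, buf, sizeof buf); /* one system call for the whole line */
}
```

Calling write_alphabet(1) prints the alphabet to stdout; the return value is the byte count from write(), or -1 with errno set.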
A NASM version of GCC: putchar(char) in inline assembly:
; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from DIL into the red zone
mov eax, 1 ; EAX = __NR_write = 1 (syscall number)
mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret
If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
See also What are the return values of system calls in Assembly?
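The bad-pointer behaviour can be observed from C as well. The sketch below is an illustration, not part of the original answer: write_errno_for_address is a hypothetical helper, and it goes through the libc write() wrapper, which reports the kernel's -EFAULT as a return value of -1 with errno set to EFAULT. It writes into a pipe because some targets (such as /dev/null) never read the buffer and so never notice the bad address.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical demo helper: try to write 1 byte located at `addr` into a pipe.
   Returns 0 if write() succeeded, the errno value if it failed,
   or -1 if the pipe could not be created. */
int write_errno_for_address(uintptr_t addr)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* addr is treated as a pointer; 50 (the value of '2') is not a mapped address */
    ssize_t n = write(fds[1], (const void *)addr, 1);
    int err = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

With addr = 50 the process is not killed by SIGSEGV; the call simply fails and the helper returns EFAULT.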
Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out. That works especially well if you're writing your own _start without libc: there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?
GCC: putchar(char) in inline assembly
When using GNU C inline asm, use constraints to tell the compiler where you want things, instead of doing it "manually" with instructions inside the asm template.
For writechar and readchar, we only need a "syscall" as the template, with constraints to set up all the inputs in registers (and the pointed-to char in memory for the write(2) system call), according to the x86-64 Linux system-call convention (which very closely matches the System V ABI's function-calling convention). What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.
This also makes it easy to avoid clobbering the red-zone (128 bytes below RSP), where the compiler might be keeping values. You must not clobber it from inline asm (so push / pop aren't usable unless you sub rsp, 128 first: see Using base pointer register in C++ inline asm for that and many useful links about GNU C inline asm), and there's no way to tell the compiler you clobber it. You could build with -mno-redzone, but in this case input/output operands are much better.
I'm hesitant to call these putchar and getchar. You can do that if you're implementing your own stdio that doesn't support buffering yet, but some functions require input buffering to implement correctly. For example, scanf has to examine characters to see if they match the format string, and leave them "unread" if they don't. Output buffering is optional, though; I think you could fully implement stdio with functions that create a private buffer and write() it, or directly write() their input pointer.
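As a rough illustration of the "directly write() their input pointer" option, here is a small C sketch of an unbuffered fputs-like function. The name unbuffered_puts and its fd parameter are assumptions made for this example; it uses the libc write() wrapper and loops because write() is allowed to accept fewer bytes than requested.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical unbuffered output helper: passes the caller's own buffer
   straight to write(), with no private stdio-style buffer in between.
   Returns 0 on success, -1 on error (errno is set by write()). */
int unbuffered_puts(int fd, const char *s)
{
    size_t left = strlen(s);

    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n < 0)
            return -1;
        s += n;               /* skip what the kernel accepted */
        left -= (size_t)n;    /* retry with the remainder after a short write */
    }
    return 0;
}
```

The trade-off is one system call (or more) per call to the function, which is why real stdio implementations buffer output by default.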
writechar():
int writechar(char my_char)
{
int retval; // sys_write uses ssize_t, but we only pass len=1
// so the return value is either 1 on success or -1..-4095 for error
// and thus fits in int
asm volatile("syscall #dummy arg picked %[dummy]\n"
: "=a" (retval) /* output in EAX */
/* inputs: ssize_t write(int fd, const void *buf, size_t count); */
: "a"(1), // EAX = __NR_write = 1 (syscall number)
"D"(1), // RDI = fd=stdout
"S"(&my_char), // RSI = buf
"d"(1) // RDX = length
, [dummy]"m" (my_char) // dummy memory input, otherwise compiler doesn't store the arg
/* clobbered regs */
: "rcx", "r11" // clobbered by syscall
);
// It doesn't matter what addressing mode "m"(my_char) picks,
// as long as it refers to the same memory as &my_char so the compiler actually does a store
return retval;
}
This compiles very efficiently with gcc -O3, on the Godbolt compiler explorer.
writechar:
movb %dil, -4(%rsp) # store my_char into the red-zone
movl $1, %edi
leaq -4(%rsp), %rsi
movl %edi, %edx # optimize because fd = len
syscall # dummy arg picked -4(%rsp)
ret
@nrz's test main inlines it much more efficiently than the unsafe (red-zone clobbering) version in that answer, taking advantage of the fact that syscall leaves most registers unmodified so it can just set them once.
main:
movl $97, %r8d # my_char = 'a'
leaq -1(%rsp), %rsi # rsi = &my_char
movl $1, %edx # len
.L6: # do {
movb %r8b, -1(%rsp) # store the char into the buffer
movl %edx, %edi # silly compiler doesn't hoist this out of the loop
syscall #dummy arg picked -1(%rsp)
addl $1, %r8d
cmpb $123, %r8b
jne .L6 # } while(++my_char < 'z'+1)
movb $10, -1(%rsp)
syscall #dummy arg picked -1(%rsp)
xorl %eax, %eax # return 0
ret
readchar(), done the same way:
int readchar(void)
{
int retval;
unsigned char my_char;
asm volatile("syscall #dummy arg picked %[dummy]\n"
/* outputs */
: "=a" (retval)
,[dummy]"=m" (my_char) // tell the compiler the asm dereferences &my_char
/* inputs: ssize_t read(int fd, void *buf, size_t count); */
: "a"(0), // EAX = __NR_read = 0 (syscall number)
"D"(0), // RDI = fd=stdin
"S" (&my_char), // RSI = buf
"d"(1) // RDX = length
: "rcx", "r11" // clobbered by syscall
);
if (retval < 0) // -1 .. -4095 are -errno values
return retval;
return my_char; // else a 0..255 char / byte (retval == 0, i.e. EOF, is not distinguished here)
}
Callers can check for error by checking c < 0.
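A caller loop built on that convention might look like the following C sketch. It is an illustration under stated assumptions: readchar_fd is a stand-in written with the libc read() wrapper (so it can be tried on any file descriptor), count_bytes is a made-up caller, and READCHAR_EOF is an assumed sentinel chosen outside the -1..-4095 errno range so that end of input can be told apart from an error.

```c
#include <errno.h>
#include <unistd.h>

#define READCHAR_EOF (-4096)  /* assumed sentinel, outside the -errno range */

/* Stand-in for readchar(): returns 0..255 for a byte,
   -errno on error, READCHAR_EOF at end of input. */
int readchar_fd(int fd)
{
    unsigned char c;
    ssize_t n = read(fd, &c, 1);

    if (n < 0)
        return -errno;        /* same convention as the raw syscall */
    if (n == 0)
        return READCHAR_EOF;  /* nothing was stored in c */
    return c;
}

/* Hypothetical caller: counts bytes until EOF.
   Returns the count, or the negative -errno value if a read failed. */
long count_bytes(int fd)
{
    long total = 0;
    int c;

    while ((c = readchar_fd(fd)) >= 0)   /* any negative value ends the loop */
        total++;
    return (c == READCHAR_EOF) ? total : c;
}
```

Because every valid byte is returned as a non-negative value, a single c < 0 test covers both errors and end of input.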
Printing a character to standard output in Assembly x86
Sure, you can use any normal C function. Here's a NASM example that uses printf to print some output:
;
; assemble and link with:
; nasm -f elf test.asm && gcc -m32 -o test test.o
;
section .text
extern printf ; If you need other functions, list them in a similar way
global main
main:
mov eax, 0x21 ; The '!' character
push eax
push message
call printf
add esp, 8 ; Restore stack - 4 bytes for eax, and 4 bytes for 'message'
ret
message db 'The character is: %c', 10, 0
If you only want to print a single character, you could use putchar:
push eax
call putchar
add esp, 4 ; Restore stack - 4 bytes for eax
If you want to print out a number, you could do it like this:
mov ebx, 8
push ebx
push message
call printf
...
message db 'The number is: %d', 10, 0
How to print a string to the terminal in x86-64 assembly (NASM) without syscall?
This is a good exercise. You will use syscall (you cannot access stdout otherwise), but you can do a "bare-metal" write without any external library providing the output routine (like calling printf). To show the basic "bare-metal" write to stdout in x86_64, I put together an example that makes no library function calls:
section .data
string1 db 0xa, " Hello StackOverflow!!!", 0xa, 0xa, 0
section .text
global _start
_start:
; calculate the length of string
mov rdi, string1 ; string1 to destination index
xor rcx, rcx ; zero rcx
not rcx ; set rcx = -1
xor al,al ; zero the al register (initialize to NUL)
cld ; clear the direction flag
repnz scasb ; get the string length (dec rcx through NUL)
not rcx ; rev all bits of negative results in absolute value
dec rcx ; -1 to skip the null-terminator, rcx contains length
mov rdx, rcx ; put length in rdx
; write string to stdout
mov rsi, string1 ; string1 to source index
mov rax, 1 ; set write to command
mov rdi,rax ; set destination index to rax (stdout)
syscall ; call kernel
; exit
xor rdi,rdi ; zero rdi (rdi hold return value)
mov rax, 0x3c ; set syscall number to 60 (0x3c hex)
syscall ; call kernel
; Compile/Link
;
; nasm -f elf64 -o hello-stack_64.o hello-stack_64.asm
; ld -o hello-stack_64 hello-stack_64.o
output:
$ ./hello-stack_64
Hello StackOverflow!!!
For general use, I split the process into two parts: (1) getting the length and (2) writing to stdout. Below, the strprn function will write any string to stdout. It calls strsz to get the length while preserving the destination index (rdi) on the stack. This reduces the task of writing a string to stdout to a single call and prevents a lot of repetition in your code.
; strsz computes the length of a NUL-terminated string.
; rdi - string address (points past the terminator on return)
; rdx - contains string length (returned); al and rcx are clobbered
section .text
strsz:
xor rcx, rcx ; zero rcx
not rcx ; set rcx = -1 (uses bitwise id: ~x = -x-1)
xor al,al ; zero the al register (initialize to NUL)
cld ; clear the direction flag
repnz scasb ; get the string length (dec rcx through NUL)
not rcx ; rev all bits of negative -> absolute value
dec rcx ; -1 to skip the null-term, rcx contains length
mov rdx, rcx ; size returned in rdx, ready to call write
ret
; strprn writes a NUL-terminated string to stdout.
; rdi - string address
; rdx - set by strsz to the string length
section .text
strprn:
push rdi ; push string address onto stack
call strsz ; call strsz to get length
pop rsi ; pop string to rsi (source index)
mov rax, 0x1 ; put write/stdout number in rax (both 1)
mov rdi, rax ; set destination index to rax (stdout)
syscall ; call kernel
ret
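The not / dec arithmetic around repnz scasb is easier to follow when modelled step by step. The C function below is only a model written for this explanation (strsz_model is a made-up name); it mirrors what each instruction in strsz does to rcx, assuming the string is NUL-terminated.

```c
#include <stddef.h>
#include <stdint.h>

/* C model of strsz: rcx starts at -1 (all ones) and is decremented once
   per byte scanned, including the terminating NUL. */
size_t strsz_model(const char *s)
{
    uint64_t rcx = ~(uint64_t)0;   /* xor rcx, rcx / not rcx  -> rcx = -1 */
    const char *rdi = s;
    char byte;

    do {                           /* repnz scasb with al = 0 */
        byte = *rdi++;             /* scasb: compare [rdi] with al, then rdi++ */
        rcx--;                     /* the rep prefix decrements rcx per byte */
    } while (byte != 0 && rcx != 0);

    rcx = ~rcx;                    /* not rcx -> bytes scanned, NUL included */
    rcx--;                         /* dec rcx -> drop the NUL from the count */
    return (size_t)rcx;
}
```

For "abc" the scan touches 4 bytes, so rcx goes from -1 to -5; inverting the bits gives 4, and the final decrement yields the length 3.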
To further automate general output to stdout, NASM macros provide a convenient solution. One example is strn (short for string_n). It takes two arguments: the address of the string and the number of characters to write:
%macro strn 2
mov rax, 1
mov rdi, 1
mov rsi, %1
mov rdx, %2
syscall
%endmacro
Useful for indents, newlines or writing complete strings. You could generalize further by passing 3 arguments including the destination for rdi.
How to properly pass buffer pointers to Linux system calls in x86_64 assembly?
The actual problem is not with the buffer but with its length. Notice in the prototype you have socklen_t *addrlen, so that should be a pointer. The value 15 that you pass is not a pointer, hence the -EFAULT.
You should change the .length: equ $-ip_buff to ip_length: dd $-ip_buff and then use syscall getpeername,r12,ip_buff,ip_length
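The same mistake can be reproduced from C. This sketch is an illustration, not the asker's code: peername_errno is a hypothetical helper that calls getpeername() on one end of an AF_UNIX socketpair, either with a proper socklen_t * or with the raw value 15 in its place. It goes through the libc wrapper, so the kernel's -EFAULT shows up as errno == EFAULT.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical demo helper. If bad_len_arg is nonzero, the integer 15 is
   passed where the socklen_t * belongs, mimicking the bug.
   Returns 0 on success, the errno value on failure, -1 if setup failed. */
int peername_errno(int bad_len_arg)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    struct sockaddr_un addr;
    socklen_t len = sizeof addr;   /* in: buffer size, out: address size */
    socklen_t *lenp = bad_len_arg ? (socklen_t *)(uintptr_t)15 : &len;

    int rc = getpeername(sv[0], (struct sockaddr *)&addr, lenp);
    int err = (rc == 0) ? 0 : errno;

    close(sv[0]);
    close(sv[1]);
    return err;
}
```

The length argument is value-result: the kernel reads the buffer size through the pointer and writes the actual address size back, which is why it has to live in memory.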
How to compare a char in a string with another char in NASM x86_64 Linux Assembly
rdi is a pointer, so cmp rdi, 0 checks for a null pointer. What you meant was cmp byte [rdi + rcx], 0 to check the end of string. Note you need to check the current character, so you have to add the index.
As for cmp rsi, [byte rdi+rcx], the byte there makes no sense, since you are comparing the whole of rsi which is 8 bytes. That should be cmp sil, [rdi + rcx].
Finally, strchr is supposed to return a pointer so you should change mov rax, [rdi+rcx] to lea rax, [rdi + rcx].
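For reference, the three fixes correspond to the following C model. This is a sketch for comparison only: my_strchr is a made-up name, the asker's original loop is not shown here, and the comments map each C line to the corrected instruction.

```c
#include <stddef.h>

/* C model of a strchr-style loop using the corrected comparisons. */
char *my_strchr(const char *s, int ch)
{
    size_t i = 0;                      /* rcx: index into the string */

    for (;;) {
        if (s[i] == (char)ch)          /* cmp sil, [rdi + rcx]: one byte, not 8 */
            return (char *)&s[i];      /* lea rax, [rdi + rcx]: the address */
        if (s[i] == '\0')              /* cmp byte [rdi + rcx], 0: end of string */
            return NULL;               /* no match found */
        i++;
    }
}
```

Checking for a match before checking for the terminator means that searching for 0 returns a pointer to the terminator, which is how the standard strchr behaves.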