Writing a Putchar in Assembly for X86_64 with 64 Bit Linux


From man 2 write, you can see the signature of write is,

ssize_t write(int fd, const void *buf, size_t count);

It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
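As a minimal sketch of that point in C (put_byte is a hypothetical name, not something from the question): taking the address of a by-value parameter forces the compiler to store it to memory, which is what write() needs.

```c
#include <unistd.h>   // write(2)

// Hypothetical helper: write one byte to file descriptor fd.
// write() takes an address, so we pass a pointer to the by-value parameter;
// that forces the compiler to give c a location in memory.
// Returns 1 on success, or -1 with errno set on error.
ssize_t put_byte(int fd, char c)
{
    return write(fd, &c, 1);
}
```

Calling put_byte(1, 'a') would write the single byte 'a' to stdout.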

(Don't print one char at a time unless you only have one to print; that's really inefficient. Construct a buffer in memory and print that, as in this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
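A rough C illustration of the "build a buffer, then print it" advice, under the assumption that we want an unsigned integer followed by a newline (print_uint is a made-up name). All digits go into a local buffer and leave the process in a single write() call:

```c
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: format n as decimal digits plus '\n' into a local
// buffer, then emit everything with one write() call.
// Returns what write() returns: bytes written, or -1 with errno set.
ssize_t print_uint(int fd, unsigned long n)
{
    char buf[24];                 // 20 digits of a 64-bit value + '\n' fit easily
    char *p = buf + sizeof buf;   // fill from the end, least-significant digit first

    *--p = '\n';
    do {
        *--p = (char)('0' + n % 10);
        n /= 10;
    } while (n != 0);

    return write(fd, p, (size_t)(buf + sizeof buf - p));
}
```

One system call per number is far cheaper than one per digit, because each call pays the full user/kernel transition cost.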

A NASM version of the writechar function from "GCC: putchar(char) in inline assembly" (below):

; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX,  RCX, R11 (last 2 by syscall itself)
; returns:  RAX = write return value: 1 for success, -1..-4095 for error
writechar:
    mov    byte [rsp-4], dil      ; store the char from DIL into the red zone below RSP

    mov     edi, 1                ; EDI = fd=1 = stdout
    lea     rsi, [rsp-4]          ; RSI = buf
    mov     edx, edi              ; RDX = len = 1
    mov     eax, edi              ; EAX = __NR_write = 1 (the system-call number)
    syscall                    ; rax = write(1, buf, 1)
    ret

If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
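The same behaviour can be observed from C. This sketch goes through the libc write() wrapper, which turns the kernel's -EFAULT into a return value of -1 with errno set to EFAULT. write_errno is a hypothetical helper name, and the address is taken as an integer parameter purely for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

// Hypothetical helper: try to write 1 byte from an arbitrary address.
// Returns 0 if the write succeeded, otherwise the errno value it failed with.
int write_errno(int fd, uintptr_t addr)
{
    errno = 0;
    if (write(fd, (const void *)addr, 1) == -1)
        return errno;          // EFAULT (14) for an unmapped address
    return 0;
}
```

With fd referring to a pipe or terminal, write_errno(fd, '2') should report EFAULT rather than crash the process. (Some special files, such as /dev/null, never read the buffer, so they won't notice the bad pointer.)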

See also What are the return values of system calls in Assembly?

Instead of writing code to check return values, in toy programs and experiments you can just run them under strace ./a.out. That is especially convenient if you're writing your own _start without libc: there won't be any system calls during startup that you don't make yourself, so the output is very easy to read. See How should strace be used?

GCC: putchar(char) in inline assembly

When using GNU C inline asm, use constraints to tell the compiler where you want things, instead of doing it "manually" with instructions inside the asm template.

For writechar and readchar, we only need a "syscall" as the template, with constraints to set up all the inputs in registers (and the pointed-to char in memory for the write(2) system call), according to the x86-64 Linux system-call convention (which very closely matches the System V ABI's function-calling convention). See What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.

This also makes it easy to avoid clobbering the red-zone (128 bytes below RSP), where the compiler might be keeping values. You must not clobber it from inline asm (so push / pop aren't usable unless you sub rsp, 128 first: see Using base pointer register in C++ inline asm for that and many useful links about GNU C inline asm), and there's no way to tell the compiler you clobber it. You could build with -mno-redzone, but in this case input/output operands are much better.

I'm hesitant to call these putchar and getchar. You can do that if you're implementing your own stdio that doesn't support buffering yet, but some functions require input buffering to implement correctly. For example, scanf has to examine characters to see if they match the format string, and leave them "unread" if they don't. Output buffering is optional, though; you could I think fully implement stdio with functions that create a private buffer and write() it, or directly write() their input pointer.

`writechar()`:

int writechar(char my_char)
{
    int retval;  // sys_write uses ssize_t, but we only pass len=1
                 // so the return value is either 1 on success or  -1..-4095 for error
                 // and thus fits in int

    asm volatile("syscall  #dummy arg picked %[dummy]\n"
                    : "=a" (retval)  /* output in EAX */
                    /* inputs: ssize_t write(int fd, const void *buf, size_t count); */
                    : "a"(1),         // RAX = __NR_write = 1 (the system-call number)
                      "D"(1),         // RDI = fd=stdout
                      "S"(&my_char),  // RSI = buf
                      "d"(1)          // RDX = length
                      , [dummy]"m" (my_char) // dummy memory input, otherwise compiler doesn't store the arg
                    /* clobbered regs */
                    : "rcx", "r11"  // clobbered by syscall
                );
    // It doesn't matter what addressing mode "m"(my_char) picks,
    // as long as it refers to the same memory as &my_char so the compiler actually does a store

    return retval;
}

This compiles very efficiently with gcc -O3, on the Godbolt compiler explorer.

writechar:
    movb    %dil, -4(%rsp)        # store my_char into the red-zone
    movl    $1, %edi
    leaq    -4(%rsp), %rsi
    movl    %edi, %edx            # optimize because fd = len
    syscall               # dummy arg picked -4(%rsp)

    ret

@nrz's test main inlines it much more efficiently than the unsafe (red-zone clobbering) version in that answer, taking advantage of the fact that syscall leaves most registers unmodified so it can just set them once.

main:
    movl    $97, %r8d            # my_char = 'a'
    leaq    -1(%rsp), %rsi       # rsi = &my_char
    movl    $1, %edx             # len
.L6:                           # do {
    movb    %r8b, -1(%rsp)       # store the char into the buffer
    movl    %edx, %edi           # silly compiler doesn't hoist this out of the loop
    syscall  #dummy arg picked -1(%rsp)

    addl    $1, %r8d
    cmpb    $123, %r8b
    jne     .L6                # } while(++my_char < 'z'+1)

    movb    $10, -1(%rsp)
    syscall  #dummy arg picked -1(%rsp)

    xorl    %eax, %eax         # return 0
    ret

readchar(), done the same way:

int readchar(void)
{
    int retval;
    unsigned char my_char;
    asm volatile("syscall  #dummy arg picked %[dummy]\n"
                    /* outputs */
                    : "=a" (retval)
                     ,[dummy]"=m" (my_char) // tell the compiler the asm writes to my_char

                    /* inputs: ssize_t read(int fd, void *buf, size_t count); */
                    : "a"(0),         // RAX = __NR_read = 0 (the system-call number)
                      "D"(0),         // RDI = fd=stdin
                      "S" (&my_char), // RSI = buf
                      "d"(1)          // RDX = length

                    : "rcx", "r11"  // clobbered by syscall
                );
    if (retval < 0)   // -1 .. -4095 are -errno values
        return retval;
    return my_char;   // else a 0..255 char / byte
}

Callers can check for error by checking c < 0.
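One case the c < 0 check does not cover is end of input: read() returns 0 there and stores nothing, so the byte variable holds no meaningful value. Below is a portable sketch of the same return convention built on the libc read() wrapper, with an explicit end-of-input value. read_byte and READ_EOF are hypothetical names, and the sentinel -4096 is an assumption chosen only because it lies outside the -1..-4095 error range.

```c
#include <errno.h>
#include <unistd.h>

enum { READ_EOF = -4096 };   // hypothetical sentinel, outside the -errno range

// Returns 0..255 for a byte, -errno (-1..-4095) on error,
// or READ_EOF when read() reports end of input.
int read_byte(int fd)
{
    unsigned char c;
    ssize_t n = read(fd, &c, 1);
    if (n < 0)
        return -errno;       // mimic the raw system-call convention
    if (n == 0)
        return READ_EOF;     // nothing was stored in c
    return c;
}
```

A caller would then loop with something like while ((c = read_byte(0)) >= 0) and inspect c afterwards to tell an error from end of input.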

Printing a character to standard output in Assembly x86

Sure, you can use any normal C function. Here's a NASM example that uses printf to print some output:

;
; assemble and link with:
; nasm -f elf test.asm && gcc -m32 -o test test.o
;
section .text

extern printf   ; If you need other functions, list them in a similar way

global main

main:

    mov eax, 0x21  ; The '!' character
    push eax
    push message
    call printf
    add esp, 8     ; Restore stack - 4 bytes for eax, and 4 bytes for 'message'
    ret

message db 'The character is: %c', 10, 0

If you only want to print a single character, you could use putchar:

push eax
call putchar
add esp, 4     ; Restore stack - 4 bytes for the one argument

If you want to print out a number, you could do it like this:

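mov eax, 8     ; a call-clobbered register; EBX is call-preserved in the i386 ABI
push eax
push message
call printf
add esp, 8     ; Restore stack - two 4-byte arguments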
...    
message db 'The number is: %d', 10, 0

How to print a string to the terminal in x86-64 assembly (NASM) without syscall?

This is a good exercise. You will use syscall (you cannot access stdout otherwise), but you can do a "bare-metal" write without any external library providing the output routine (such as printf). As an example of a basic "bare-metal" write to stdout in x86_64, I put together an example that makes no library function calls, only system calls:

section .data
    string1 db  0xa, "  Hello StackOverflow!!!", 0xa, 0xa, 0

section .text
    global _start

    _start:
        ; calculate the length of string
        mov     rdi, string1        ; string1 to destination index
        xor     rcx, rcx            ; zero rcx
        not     rcx                 ; set rcx = -1
        xor     al,al               ; zero the al register (initialize to NUL)
        cld                         ; clear the direction flag
        repnz   scasb               ; scan for the NUL; rcx is decremented once per byte, NUL included
        not     rcx                 ; ~rcx = number of bytes scanned = length + 1
        dec     rcx                 ; -1 to skip the null-terminator, rcx contains length
        mov     rdx, rcx            ; put length in rdx
        ; write string to stdout
        mov     rsi, string1        ; string1 to source index
        mov     rax, 1              ; syscall number 1 = write
        mov     rdi,rax             ; file descriptor 1 = stdout (same value as the syscall number)
        syscall                     ; call kernel

        ; exit 
        xor     rdi,rdi             ; zero rdi (rdi holds the exit status)
        mov     rax, 0x3c           ; syscall number 60 (0x3c hex) = exit
        syscall                     ; call kernel

; Compile/Link
;
; nasm -f elf64 -o hello-stack_64.o hello-stack_64.asm
; ld  -o hello-stack_64 hello-stack_64.o

output:

$ ./hello-stack_64

  Hello StackOverflow!!!
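The not / repnz scasb / not / dec arithmetic in the length calculation can be hard to follow, so here is a small C model of it. It is only an illustration of the register arithmetic (scasb_length is a made-up name), not a replacement for strlen.

```c
#include <stdint.h>

// Hypothetical C model of "xor rcx,rcx / not rcx / repnz scasb / not rcx / dec rcx".
uint64_t scasb_length(const char *s)
{
    uint64_t rcx = ~(uint64_t)0;   // xor rcx,rcx ; not rcx  -> all ones (-1)
    const char *rdi = s;

    // repnz scasb with AL = 0: compare the byte at [rdi] with AL, advance rdi,
    // decrement rcx, and stop once a byte matched or rcx reached 0.
    for (;;) {
        char byte = *rdi++;
        rcx--;
        if (byte == 0 || rcx == 0)
            break;
    }

    // rcx is now -1 - (len + 1), because the NUL was counted too.
    rcx = ~rcx;                    // not rcx -> len + 1   (~x == -x - 1)
    rcx--;                         // dec rcx -> len
    return rcx;
}
```

For "abc" the scan touches four bytes, so rcx goes from -1 to -5; not gives 4 and dec gives 3.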

For general use, I split the process into two parts: (1) getting the length and (2) writing to stdout. The strprn function below writes any NUL-terminated string to stdout. It calls strsz to get the length, saving the string address on the stack across the call because scasb advances rdi. This reduces writing a string to stdout to a single call and avoids a lot of repetition in your code.

; strsz computes the length of a NUL-terminated string.
; rdi - string address (advanced past the terminator on return)
; rdx - contains string length (returned); rcx holds the same value, al is zeroed
section .text
        strsz:
                xor     rcx, rcx                ; zero rcx
                not     rcx                     ; set rcx = -1 (uses bitwise id: ~x = -x-1)
                xor     al,al                   ; zero the al register (initialize to NUL)
                cld                             ; clear the direction flag
                repnz scasb                     ; scan for the NUL (rcx decrements per byte, NUL included)
                not     rcx                     ; ~rcx = bytes scanned = length + 1
                dec     rcx                     ; -1 to skip the null-term, rcx contains length
                mov     rdx, rcx                ; size returned in rdx, ready to call write
                ret

; strprn writes a NUL-terminated string to stdout.
; rdi - string address
; clobbers rax, rcx, rdx, rsi, rdi, r11 (rdx is set to the length by strsz)
section .text
        strprn:
                push    rdi                     ; push string address onto stack
                call    strsz                   ; call strsz to get length
                pop     rsi                     ; pop string to rsi (source index)
                mov     rax, 0x1                ; put write/stdout number in rax (both 1)
                mov     rdi, rax                ; set destination index to rax (stdout)
                syscall                         ; call kernel
                ret

To further automate general output to stdout, NASM macros provide a convenient solution. An example is strn (short for string_n), which takes two arguments: the address of the string and the number of characters to write:

%macro  strn    2
        mov     rax, 1
        mov     rdi, 1
        mov     rsi, %1
        mov     rdx, %2
        syscall
%endmacro

Useful for indents, newlines or writing complete strings. You could generalize further by passing 3 arguments including the destination for rdi.
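For comparison, a C sketch of the three-argument generalization, with the descriptor passed in. The names write_n and write_str are hypothetical, and the retry loop for short writes and EINTR is an addition that the macro above does not perform.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper: write exactly n bytes of s to fd, retrying after
// short writes and EINTR. Returns 0 on success, -1 with errno set on failure.
int write_n(int fd, const char *s, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, s, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;          // interrupted before writing anything: retry
            return -1;
        }
        s += w;                    // skip the bytes that were accepted
        n -= (size_t)w;
    }
    return 0;
}

// Counterpart of strprn with the descriptor as a parameter.
int write_str(int fd, const char *s)
{
    return write_n(fd, s, strlen(s));
}
```

write_str(1, "Hello\n") corresponds to strprn, and write_n(2, msg, len) to a strn-style call aimed at stderr.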

How to properly pass buffer pointers to Linux system calls in x86_64 assembly?

The actual problem is not with the buffer but with its length. Notice in the prototype you have socklen_t *addrlen so that should be a pointer. The value 15 that you pass is not a pointer hence the -EFAULT.

You should change the .length: equ $-ip_buff to ip_length: dd $-ip_buff and then use syscall getpeername,r12,ip_buff,ip_length
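The same in/out-pointer requirement is visible from C. In this sketch (function names are hypothetical), the first function passes the address of a socklen_t variable, while the second repeats the mistake of passing a plain number where the pointer belongs, which the libc wrapper reports as errno EFAULT.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>

// Correct: addrlen is an in/out variable, so pass its address.
int peer_name_ok(int sock)
{
    struct sockaddr_storage addr;
    socklen_t len = sizeof addr;     // in: buffer size, out: actual address size
    return getpeername(sock, (struct sockaddr *)&addr, &len);
}

// Buggy on purpose: an integer such as 15 used as the addrlen "pointer".
// Returns the errno of the failed call, or 0 if it unexpectedly succeeded.
int peer_name_bad_len(int sock, uintptr_t bogus)
{
    struct sockaddr_storage addr;
    errno = 0;
    if (getpeername(sock, (struct sockaddr *)&addr, (socklen_t *)bogus) == -1)
        return errno;                // EFAULT when bogus is not a mapped address
    return 0;
}
```

The assembly fix has the same shape: ip_length becomes a dword in memory, and its address, not its value, goes into the third argument register.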

How to compare a char in a string with another char in NASM x86_64 Linux Assembly

rdi is a pointer, so cmp rdi, 0 checks for a null pointer. What you meant was cmp byte [rdi + rcx], 0 to check for the end of the string. Note that you need to check the current character, so you have to add the index.

As for cmp rsi, [byte rdi+rcx] the byte there makes no sense, since you are comparing the whole of rsi which is 8 bytes. That should be cmp sil, [rdi + rcx].

Finally, strchr is supposed to return a pointer so you should change mov rax, [rdi+rcx] to lea rax, [rdi + rcx].
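Put together, the three fixes correspond to the following C sketch (my_strchr is a hypothetical name; the comments map each step to the corrected instruction):

```c
#include <stddef.h>

// Hypothetical reference version of the corrected assembly loop.
char *my_strchr(const char *s, int c)
{
    size_t i = 0;                      // plays the role of rcx
    for (;;) {
        if (s[i] == (char)c)           // cmp sil, [rdi + rcx]: one byte against one byte
            return (char *)&s[i];      // lea rax, [rdi + rcx]: return the address, not the data
        if (s[i] == '\0')              // cmp byte [rdi + rcx], 0: end of string
            return NULL;
        i++;
    }
}
```

Checking for a match before checking for the terminator means a search for '\0' returns a pointer to the terminator, as the standard strchr does.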
