Reading Input from Keyboard with X64 Linux Syscalls (Assembly)

it automatically prints on screen the keys

This is the default setting in Linux (independent of the programming language):

  • Keyboard input is printed to the screen
  • sys_read will wait until the return (enter) key is pressed

To change this behaviour, call the tcsetattr() function (in C). Call tcgetattr() first to save the current settings, and restore them before the program exits.

If you want to use system calls directly: tcsetattr and tcgetattr are both implemented with sys_ioctl. To find out which ioctl() request codes they use, write a small C program that calls tcgetattr and tcsetattr and run it under "strace" to see which syscalls are made.
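
In C, the save/modify/restore sequence described above might look like the following minimal sketch. It assumes a POSIX system with <termios.h>. The helper names (without_canon_echo, enter_key_mode, leave_key_mode) are made up for illustration, and clearing ECHO and setting VMIN/VTIME are choices made for this sketch rather than something the answer prescribes.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: return a copy of `t` with canonical mode and
   echo switched off, so read() can return after a single key. */
struct termios without_canon_echo(struct termios t)
{
    t.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    t.c_cc[VMIN] = 1;   /* assumption: block until at least one byte arrives */
    t.c_cc[VTIME] = 0;  /* assumption: no inter-byte timeout */
    return t;
}

/* Save the current settings in *saved, then apply the modified copy.
   Returns 0 on success, -1 on failure (e.g. fd is not a terminal). */
int enter_key_mode(int fd, struct termios *saved)
{
    if (tcgetattr(fd, saved) == -1)
        return -1;
    struct termios t = without_canon_echo(*saved);
    return tcsetattr(fd, TCSANOW, &t);
}

/* Put back the settings captured by enter_key_mode(). */
int leave_key_mode(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would call enter_key_mode(STDIN_FILENO, &saved) once at startup and leave_key_mode(STDIN_FILENO, &saved) on every exit path.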

it doesn't exit when i press esc

There are three problems in the file:

  1. As far as I can tell, you read two bytes (that is, two keystrokes) on every call to sys_read
  2. sys_read will wait until the return key is pressed (see above)
  3. You compare a 64-bit value to a piece of memory that is only one (or two) byte(s) long.

You should read only one byte using sys_read. Then you should do a bytewise compare instead of a 64-bit compare:

cmp bl,key

instead of:

cmp rbx,key
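
The third problem can be modelled in C. This is only an illustrative sketch: the function names are hypothetical, and the byte that follows the key in memory ('x' in the usage note) is an arbitrary stand-in for whatever happens to be stored after the one-byte variable.

```c
#include <stdint.h>
#include <string.h>

/* Models a 64-bit compare against memory: all 8 bytes starting at `mem`
   take part, including bytes that do not belong to the key. */
int wide_compare(const unsigned char mem[8], uint64_t reg)
{
    uint64_t loaded;
    memcpy(&loaded, mem, sizeof loaded);
    return loaded == reg;
}

/* Models a byte-wide compare: only the first byte in memory and the
   lowest byte of the register are compared. */
int byte_compare(const unsigned char mem[8], uint64_t reg)
{
    return mem[0] == (uint8_t)reg;
}
```

With memory holding {27, 'x', 0, ...} and the register holding 27 (the ESC code), wide_compare reports a mismatch because the neighbouring byte is included, while byte_compare reports a match.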

Reading a single-key input on Linux (without waiting for return) using x86_64 sys_call

Syscalls in 64-bit linux

The tables from man syscall provide a good overview here:

arch/ABI   instruction   syscall #   retval   Notes
──────────────────────────────────────────────────────────────────
i386       int $0x80     eax         eax
x86_64     syscall       rax         rax      See below

arch/ABI   arg1   arg2   arg3   arg4   arg5   arg6   arg7   Notes
──────────────────────────────────────────────────────────────────
i386       ebx    ecx    edx    esi    edi    ebp    -
x86_64     rdi    rsi    rdx    r10    r8     r9     -

I have omitted the lines that are not relevant here. In 32-bit mode, the parameters are passed in ebx, ecx, etc., and the syscall number goes in eax. 64-bit mode is a little different: the registers are 64 bits wide and have different names. The syscall number still goes in eax, which is now the low half of rax, but the parameters are passed in rdi, rsi, etc. In addition, the syscall instruction is used instead of int 0x80 to trigger the system call.

The order of the parameters can also be read in the man pages, here man 2 ioctl and man 2 read:

int ioctl(int fd, unsigned long request, ...);
ssize_t read(int fd, void *buf, size_t count);

So here the value of int fd is in rdi, the second parameter in rsi etc.
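
The same argument order can be seen from C through glibc's generic syscall() wrapper. This is a sketch assuming Linux with glibc; raw_read is a hypothetical name. Unlike the raw instruction, the wrapper returns -1 and sets errno on failure instead of returning a negative error code.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* read(2) via the generic wrapper: the arguments after the syscall
   number go, in order, to rdi, rsi and rdx on x86-64 Linux. */
long raw_read(int fd, void *buf, size_t count)
{
    return syscall(SYS_read, fd, buf, count);
}
```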

How to get rid of waiting for a newline

First, reserve space for a termios structure in memory (in the .bss section):

termios:
    c_iflag resd 1     ; input mode flags
    c_oflag resd 1     ; output mode flags
    c_cflag resd 1     ; control mode flags
    c_lflag resd 1     ; local mode flags
    c_line  resb 1     ; line discipline
    c_cc    resb 19    ; control characters

Then get the current terminal settings and disable canonical mode:

; Get current settings
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5401 ; request: TCGETS
mov rdx, termios ; request data
syscall

; Modify flags
and byte [c_lflag], 0FDh ; Clear ICANON to disable canonical mode

; Write termios structure back
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5402 ; request: TCSETS
mov rdx, termios ; request data
syscall
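
For comparison, here is the same get/modify/set sequence as a C sketch that talks to ioctl() directly. The struct mirrors the 36-byte layout reserved above, and the request numbers and the ICANON mask are the values used in the assembly. The names kernel_termios_sketch, SK_TCGETS, SK_TCSETS, SK_ICANON and clear_icanon are hypothetical; real C code would normally use tcgetattr/tcsetattr instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Same field order and sizes as the termios block in .bss. */
struct kernel_termios_sketch {
    uint32_t c_iflag;   /* input mode flags */
    uint32_t c_oflag;   /* output mode flags */
    uint32_t c_cflag;   /* control mode flags */
    uint32_t c_lflag;   /* local mode flags */
    uint8_t  c_line;    /* line discipline */
    uint8_t  c_cc[19];  /* control characters */
};

enum { SK_TCGETS = 0x5401, SK_TCSETS = 0x5402, SK_ICANON = 0x2 };

/* Returns 0 on success, -1 if either ioctl fails
   (for example because fd is not a terminal). */
int clear_icanon(int fd)
{
    struct kernel_termios_sketch t;
    if (ioctl(fd, SK_TCGETS, &t) == -1)
        return -1;
    t.c_lflag &= ~(uint32_t)SK_ICANON;  /* same effect as the 0FDh byte mask */
    return ioctl(fd, SK_TCSETS, &t) == -1 ? -1 : 0;
}
```

The assembly version can get away with a byte-sized AND because x86 is little-endian and ICANON sits in the lowest byte of c_lflag.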

Now you can use sys_read to read in the keystroke:

mov eax, 0      ; syscall number: SYS_read
mov edi, 0      ; int fd: STDIN_FILENO
mov rsi, buf    ; void* buf
mov rdx, len    ; size_t count
syscall

Afterwards check the return value in rax: It contains the number of characters read.

(Or a -errno code on error, e.g. if you closed stdin by running ./a.out <&- in bash. Use strace to print a decoded trace of the system calls your program makes, so you don't need to actually write error handling in toy experiments.)
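
A C sketch of that return-value check, assuming the libc read() wrapper (which reports errors as -1 plus errno rather than as a negative code in rax). The name read_key and the KEY_EOF/KEY_ERROR constants are made up for this example.

```c
#include <errno.h>
#include <unistd.h>

enum { KEY_EOF = -1, KEY_ERROR = -2 };

/* Read exactly one byte from fd.
   Returns the byte (0..255), KEY_EOF at end of input, KEY_ERROR on failure. */
int read_key(int fd)
{
    unsigned char c;
    ssize_t n;

    do {
        n = read(fd, &c, 1);
    } while (n == -1 && errno == EINTR);  /* retry if a signal interrupted us */

    if (n == 1)
        return c;
    return n == 0 ? KEY_EOF : KEY_ERROR;
}
```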


References:

  • What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
  • Why does the sys_read system call end when it detects a new line?
  • How do i read single character input from keyboard using nasm (assembly) under ubuntu?
  • Using the raw keyboard mode under Linux (external site with example in 32-bit assembly)

Basic input with x64 assembly code

In your first code section you have to set the syscall number in rax to 0 for SYS_READ (as briefly mentioned in the other answer).

So check a Linux x64 SYS_CALL list for the appropriate parameters and try

_start:
    mov rax, 0        ; syscall number 0 = SYS_READ
    sub rsp, 8        ; allocate 8 bytes on the stack as the read buffer
    mov rdi, 0        ; fd 0 = STDIN
    lea rsi, [rsp]    ; void *buf: the 8-byte space on the stack
    mov rdx, 1        ; size_t count: 1 for one char
    syscall

How to read input from STDIN in x86_64 assembly?

First of all: there are no variables in assembly. There are just labels for some kind of data. The data is, by design, untyped - at least in real assemblers, not HLA (e.g. MASM).

Reading from the standard input is achieved by using the system call read. I assume you've already read the post you mentioned and you know how to call system calls in x64 Linux. Assuming that you're using NASM (or something that resembles its syntax), and that you want to store the input from stdin at the address buffer, where you have reserved BUFSIZE bytes of memory, executing the system call would look like this :

xor eax, eax        ; rax <- 0 (syscall number for 'read')
xor edi, edi        ; edi <- 0 (stdin file descriptor)
mov rsi, buffer     ; rsi <- address of the buffer (or: lea rsi, [rel buffer])
mov edx, BUFSIZE    ; rdx <- size of the buffer
syscall             ; execute read(0, buffer, BUFSIZE)

Upon returning, rax will contain the result of the syscall. If you want to know more about how it works, please consult man 2 read. Note that the syscall number for read on macOS is 0x2000003 instead of 0, so that first line would instead be mov rax, 0x2000003.

Parsing an integer in assembly language is not that simple, though. Since read only gives you plain binary data that appears on the standard input, you need to convert the integer value yourself. Keep in mind that what you type on the keyboard is sent to the application as ASCII codes (or any other encoding you might be using - I'm assuming ASCII here). Therefore, you need to convert the data from an ASCII-encoded decimal to binary.

A function in C for converting such a string to a normal unsigned int could look something like this:

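unsigned int parse_ascii_decimal(const char *str, unsigned int len)
{
    unsigned int ret = 0, mul = 1;
    int i = (int)len - 1;

    /* Walk from the last digit to the first, scaling by powers of ten. */
    while (i >= 0)
    {
        ret += (str[i] & 0xf) * mul;   /* low nibble of '0'..'9' is the digit value */
        mul *= 10;
        --i;
    }
    return ret;
}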

Converting this to assembly (and extending to support signed numbers) is left as an exercise for the reader. :) (Or see NASM Assembly convert input to integer? - a simpler algorithm only has 1 multiply per iteration, with total = total*10 + digit. And you can check for the first non-digit character as you iterate instead of doing strlen separately, if the length isn't already known.)
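
The simpler total = total*10 + digit algorithm mentioned above could be sketched in C as follows. parse_decimal_prefix is a hypothetical name. The sketch assumes the digits are followed by some non-digit byte (a newline or a NUL terminator), and it does not guard against overflow; unsigned arithmetic simply wraps.

```c
#include <stddef.h>

/* Parse leading decimal digits of `s`, stopping at the first non-digit.
   If `consumed` is not NULL, it receives the number of digits used. */
unsigned int parse_decimal_prefix(const char *s, size_t *consumed)
{
    unsigned int total = 0;
    size_t i = 0;

    while (s[i] >= '0' && s[i] <= '9') {
        total = total * 10 + (unsigned int)(s[i] - '0');  /* one multiply per digit */
        ++i;
    }
    if (consumed)
        *consumed = i;
    return total;
}
```

Because the loop stops at the first non-digit, the trailing newline that read delivers in canonical mode needs no special handling.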


Last but not least - the write syscall requires you to always pass a pointer to a buffer with the data that's supposed to be written to a given file descriptor. Therefore, if you want to output a newline, there is no other way but to create a buffer containing the newline sequence.
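
In C terms, that one-byte buffer might look like this. It is a small sketch using the libc write() wrapper, and write_newline is a hypothetical helper name.

```c
#include <unistd.h>

/* write() needs an address, so the newline lives in a one-byte object.
   Returns write()'s result: 1 on success, -1 on error. */
ssize_t write_newline(int fd)
{
    static const char newline = '\n';
    return write(fd, &newline, 1);
}
```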

How do i read single character input from keyboard using nasm (assembly) under ubuntu?

It can be done from assembly, but it isn't easy. You can't use int 21h, that's a DOS system call and it isn't available under Linux.

To get characters from the terminal under UNIX-like operating systems (such as Linux), you read from STDIN (file number 0). Normally, the read system call will block until the user presses enter. This is called canonical mode. To read a single character without waiting for the user to press enter, you must first disable canonical mode. Of course, you'll have to re-enable it if you want line input later on, and before your program exits.

To disable canonical mode on Linux, you send an IOCTL (IO ControL) to STDIN, using the ioctl syscall. I assume you know how to make Linux system calls from assembler.

The ioctl syscall has three parameters. The first is the file to send the command to (STDIN), the second is the IOCTL number, and the third is typically a pointer to a data structure. ioctl returns 0 on success, or a negative error code on failure.

The first IOCTL you need is TCGETS (number 0x5401) which gets the current terminal parameters in a termios structure. The third parameter is a pointer to a termios structure. From the kernel source, the termios structure is defined as:

struct termios {
tcflag_t c_iflag; /* input mode flags */
tcflag_t c_oflag; /* output mode flags */
tcflag_t c_cflag; /* control mode flags */
tcflag_t c_lflag; /* local mode flags */
cc_t c_line; /* line discipline */
cc_t c_cc[NCCS]; /* control characters */
};

where tcflag_t is 32 bits long, cc_t is one byte long, and NCCS is currently defined as 19. See the NASM manual for how you can conveniently define and reserve space for structures like this.

So once you've got the current termios, you need to clear the canonical flag. This flag is in the c_lflag field, with mask ICANON (0x00000002). To clear it, compute c_lflag AND (NOT ICANON) and store the result back into the c_lflag field.

Now you need to notify the kernel of your changes to the termios structure. Use the TCSETS (number 0x5402) ioctl, with the third parameter set to the address of your termios structure.

If all goes well, the terminal is now in non-canonical mode. You can restore canonical mode by setting the canonical flag (by ORing c_lflag with ICANON) and calling the TCSETS ioctl again. Always restore canonical mode before you exit.

As I said, it isn't easy.

Linux asm - int 16h analogue to read raw keyboard scancodes

Normally the kernel translates keyboard scancodes into ASCII characters that you can read on a tty. But there are ways to get raw scancodes, e.g. look at how showkey(1) does it (http://kbd-project.org/) on a text console. https://wiki.archlinux.org/index.php/Keyboard_input

https://github.com/legionus/kbd/blob/2.0.4/src/showkey.c shows that you can use an ioctl(2) on a file descriptor for the console terminal to set the KBD translation mode to RAW (scancodes) or MEDIUMRAW (keycodes). Then you can make normal read system calls.

ioctl(fd, KDSKBMODE, show_keycodes ? K_MEDIUMRAW : K_RAW)

Obviously you can make these system calls from hand-written asm using syscall on x86-64 or int 0x80 on 32-bit x86, looking up the syscall numbers in asm/unistd_64.h, and the values of other constants in their respective headers.


showkey takes care to set up a watchdog timer to exit cleanly, and catch signals, because doing this intercepts keys before the kernel processes control-C or ctrl+alt+f2 sequences. So without a timeout, there'd be no way to exit the program. And if you exited without restoring normal mode, there'd be no way to type on the console to run a command to restore normal keyboard mode.
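
A heavily simplified C sketch of the mode switch is shown below. It assumes the Linux header <linux/kd.h> is available and that console_fd refers to a real text console; the function names are hypothetical. It deliberately leaves out the watchdog timer and signal handling that showkey uses, so it should not be run on a console you cannot recover from another session.

```c
#include <linux/kd.h>
#include <sys/ioctl.h>

/* Remember the current keyboard mode in *saved_mode, then switch the
   console to keycode (MEDIUMRAW) mode. Returns 0 on success, -1 on failure. */
int enter_mediumraw(int console_fd, int *saved_mode)
{
    if (ioctl(console_fd, KDGKBMODE, saved_mode) == -1)
        return -1;
    return ioctl(console_fd, KDSKBMODE, K_MEDIUMRAW) == -1 ? -1 : 0;
}

/* Put back the mode captured by enter_mediumraw(). */
int restore_kbd_mode(int console_fd, int saved_mode)
{
    return ioctl(console_fd, KDSKBMODE, saved_mode) == -1 ? -1 : 0;
}
```

Both calls fail with -1 when the descriptor is not a console, which is the case for a terminal emulator or an SSH session.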

How to get keyboard ASCII input and save it to a value in .data in x86_64 Assembly?

You should be able to store a single byte by calling x64's SYS_READ system call. Here is some modified code based on your example.

input:
    ; get ASCII for keyboard input
    ; save ASCII into cha
    push rbp
    mov rdx, 1      ; max length
    mov rsi, cha    ; buffer
    mov rdi, 0      ; stdin
    mov rax, 0      ; sys_read
    syscall
    pop rbp

section .data
    cha dw 0

I recommend looking up system calls in Linux for additional details.

I want my Assembly Code to takes user input and outputs it along with other text but the output isn't correct

The output Bob!!!!!!!!!!!!!!!!!! is exactly as long as the string "Welcome to the club, ".
This is no coincidence.
After the write(2) system call, rax contains the number of bytes successfully written.
(This might be less than the requested number of bytes, as the manual page describes.)

As David C. Rankin commented, you need to mind the return value of read(2).
On success, read(2) returns the number of bytes read in rax.
However, you overwrite this value twice: first when you load rax with the syscall number for the intervening write(2), and then with that call's return value.
Save the number of bytes read somewhere (e.g. with push/pop), reload it afterwards, and you're good.
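
The same fix, sketched in C: the byte count from read is kept in its own variable so the intervening write cannot clobber it. The function name greet, the 64-byte buffer and the file-descriptor parameters are assumptions for this example; only the greeting text comes from the question.

```c
#include <string.h>
#include <unistd.h>

/* Returns the number of name bytes echoed, or -1 if nothing was read
   or one of the calls failed. */
ssize_t greet(int in_fd, int out_fd)
{
    static const char greeting[] = "Welcome to the club, ";
    char name[64];

    ssize_t name_len = read(in_fd, name, sizeof name);  /* keep this value */
    if (name_len <= 0)
        return -1;

    if (write(out_fd, greeting, strlen(greeting)) == -1)
        return -1;
    /* Use the saved length, not the result of the write() just above. */
    if (write(out_fd, name, (size_t)name_len) == -1)
        return -1;
    return name_len;
}
```

In the assembly version, the variable name_len corresponds to pushing rax right after the read syscall and popping it into rdx before the second write.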

PS:
You could save one write(2) system call by rearranging the buffer to follow after greet_1.
Then you could write(2) rax + greet1_len bytes at once.
But one problem at a time.


