Assembly: Read Integer from Stdin, Increment It and Print to Stdout

movl %edi, %ecx    # store input in register %edi
movl $4, %edx       # read one byte

This part is all wrong. You can't store the result of read in a register. What that's actually doing is storing the result at the address contained in %edi, which since you haven't set it, is probably somewhere you have no business storing anything. You first need to make room in memory to store the string at. You're also reading four bytes and not one.

I would replace that with something like this:

subl $4, %esp
movl %esp, %ecx
movl $4, %edx

This will make room for 4 bytes on the stack, then use the top of the stack as the address to store the string at. You'll also have to modify the arguments for the write syscall to use this address.

Another problem you'll have to deal with is that stdin and stdout usually carry text, so what you read will probably be a string of ASCII digits and not a number. To use it as a number you'll have to convert it first, and then convert the result back to text before you write it out.
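The convert, increment, convert-back round trip can be sketched in C before translating it to assembly. This is only an illustration under stated assumptions: the helper name increment_decimal_text is made up for this example, the input is assumed to be unsigned ASCII decimal digits, and overflow is not checked.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original answer): parse the leading
 * decimal digits in `in`, add one, and write the result plus '\n' to `out`.
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t increment_decimal_text(const char *in, size_t len,
                              char *out, size_t outsz)
{
    unsigned long value = 0;
    size_t i = 0;

    /* Text -> number: stop at the first non-digit (e.g. the newline). */
    while (i < len && in[i] >= '0' && in[i] <= '9') {
        value = value * 10 + (unsigned long)(in[i] - '0');
        i++;
    }

    value += 1;

    /* Number -> text: digits come out least significant first. */
    char tmp[32];
    size_t n = 0;
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n + 1 > outsz)
        return 0;

    /* Reverse into the output buffer and append the newline. */
    for (size_t k = 0; k < n; k++)
        out[k] = tmp[n - 1 - k];
    out[n] = '\n';
    return n + 1;
}
```

The returned length is what you would pass as the byte count to the write syscall.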

How to read input from STDIN in x86_64 assembly?

First of all: there are no variables in assembly. There are just labels for some kind of data. The data is, by design, untyped, at least in real assemblers as opposed to high-level ones (e.g. MASM).

Reading from the standard input is achieved by using the system call read. I assume you've already read the post you mentioned and you know how to call system calls in x64 Linux. Assuming that you're using NASM (or something that resembles its syntax), and that you want to store the input from stdin at the address buffer, where you have reserved BUFSIZE bytes of memory, executing the system call would look like this:

xor eax, eax      ; rax <- 0 (syscall number for 'read')
xor edi, edi      ; edi <- 0 (stdin file descriptor)
mov rsi, buffer   ; rsi <- address of the buffer (alternative: lea rsi, [rel buffer])
mov edx, BUFSIZE  ; rdx <- size of the buffer
syscall           ; execute  read(0, buffer, BUFSIZE)

Upon returning, rax will contain the result of the syscall. If you want to know more about how it works, please consult man 2 read. Note that on macOS the syscall number for read is 0x2000003 instead of 0, so that first line would instead be mov rax, 0x2000003.
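What the value returned in rax means is easiest to see from the C side of the same call. The sketch below is an assumption-laden illustration, not part of the original answer: read_line_fd is a hypothetical helper name, and it treats a positive return as a byte count, 0 as end of file, and a negative return as an error, as described in man 2 read.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until the last byte received
 * is '\n', end of file is reached, or the buffer is full.
 * Returns the number of bytes stored, or -1 on error. */
ssize_t read_line_fd(int fd, char *buf, size_t cap)
{
    size_t used = 0;

    while (used < cap) {
        ssize_t n = read(fd, buf + used, cap - used);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real error */
        }
        if (n == 0)
            break;                 /* end of file */
        used += (size_t)n;
        if (buf[used - 1] == '\n')
            break;                 /* a complete line has arrived */
    }
    return (ssize_t)used;
}
```

In assembly the same checks are a compare of rax against 0 after the syscall instruction; note that the raw Linux syscall reports errors as a negative errno value in rax rather than through errno.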

Parsing an integer in assembly language is not that simple, though. Since read only gives you plain binary data that appears on the standard input, you need to convert the integer value yourself. Keep in mind that what you type on the keyboard is sent to the application as ASCII codes (or any other encoding you might be using - I'm assuming ASCII here). Therefore, you need to convert the data from an ASCII-encoded decimal to binary.

A function in C for converting such a string of ASCII digits to a normal unsigned int could look something like this:

unsigned int parse_ascii_decimal(char *str,unsigned int strlen)
{
    unsigned int ret = 0, mul = 1;
    int i = strlen-1;
    while(i >= 0)
    {
        ret += (str[i] & 0xf) * mul;
        mul *= 10;
        --i;
    }
    return ret;
}

Converting this to assembly (and extending to support signed numbers) is left as an exercise for the reader. :) (Or see NASM Assembly convert input to integer? - a simpler algorithm only has 1 multiply per iteration, with total = total*10 + digit. And you can check for the first non-digit character as you iterate instead of doing strlen separately, if the length isn't already known.)
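The simpler total = total*10 + digit approach mentioned above, extended with an optional leading minus sign, might look like the following in C. Treat it as a sketch: parse_signed_decimal and its consumed out-parameter are names invented for this example, and overflow is deliberately not handled.

```c
#include <stddef.h>

/* Hypothetical helper: parse an optional '-' followed by decimal digits.
 * Stops at the first non-digit, so no separate strlen pass is needed.
 * If `consumed` is non-NULL it receives the number of characters used.
 * No overflow checking is done. */
long parse_signed_decimal(const char *s, size_t len, size_t *consumed)
{
    size_t i = 0;
    int negative = 0;
    long total = 0;

    if (i < len && s[i] == '-') {
        negative = 1;
        i++;
    }

    /* One multiply per digit: total = total*10 + digit. */
    while (i < len && s[i] >= '0' && s[i] <= '9') {
        total = total * 10 + (s[i] - '0');
        i++;
    }

    if (consumed)
        *consumed = i;
    return negative ? -total : total;
}
```

Each loop iteration maps to a handful of instructions in assembly: load a byte, range-check it, multiply the running total by 10, and add the digit.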

Last but not least - the write syscall requires you to always pass a pointer to a buffer with the data that's supposed to be written to a given file descriptor. Therefore, if you want to output a newline, there is no other way but to create a buffer containing the newline sequence.
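The point about write always needing a buffer applies even to a single character. A minimal C sketch follows; write_newline is a hypothetical helper name. The newline is stored in memory first, and its address is what gets passed:

```c
#include <unistd.h>

/* Hypothetical helper: write() takes a pointer and a length, so even a
 * single newline must live in memory before it can be written. */
ssize_t write_newline(int fd)
{
    static const char nl = '\n';   /* the one-byte "buffer" */
    return write(fd, &nl, 1);
}
```

In NASM the equivalent is a one-byte label such as nl: db 0xa in the data section, with its address loaded into rsi before the syscall.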

How should I work with dynamically-sized input in NASM Assembly?

First of all, you are generating a 32-bit program, not a 64-bit program. This is no problem, as 64-bit Linux can run 32-bit programs if they are either statically linked (this is the case for you) or the 32-bit shared libraries are installed.

Your program contains a real bug: You are reading and writing the "EAX" register from a 1-byte field in RAM:

mov EAX, [num1]

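This will normally work on little-endian computers (x86), because the byte you wanted ends up in the low 8 bits of EAX. However, if the byte you want to read is at the very end of the last mapped memory page of your program, the 4-byte read crosses into unmapped memory and you'll get a fault (typically reported as a segmentation fault on Linux).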

Even more critical is the write command:

mov [result], EAX

This command will overwrite 3 bytes of memory following the "result" variable. If you extend your program by additional bytes:

num1 resb 1
num2 resb 1
result resb 1
newVariable1 resb 1

You'll overwrite these variables! To correct your program you must use the AL (and BL) register instead of the complete EAX register:

mov AL, [num1]
mov BL, [num2]
...
mov [result], AL
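The effect of the two stores can be modelled in C. This is an illustrative sketch only: the struct mirrors the resb layout above, the pad field is a hypothetical addition that keeps the 4-byte store inside the object, and eax_bytes stands for the four bytes of EAX in little-endian memory order.

```c
#include <stddef.h>
#include <string.h>

struct vars {
    unsigned char num1;
    unsigned char num2;
    unsigned char result;
    unsigned char newVariable1;
    unsigned char pad[4];   /* hypothetical: keeps the wide store in bounds */
};

/* Mimics `mov [result], EAX`: stores 4 bytes starting at `result`,
 * so the 3 bytes after it are overwritten as well. */
void store_dword_at_result(struct vars *v, const unsigned char eax_bytes[4])
{
    memcpy((unsigned char *)v + offsetof(struct vars, result), eax_bytes, 4);
}

/* Mimics `mov [result], AL`: stores only the low byte. */
void store_byte_at_result(struct vars *v, const unsigned char eax_bytes[4])
{
    v->result = eax_bytes[0];
}
```

With EAX holding 7, the 4-byte version leaves newVariable1 set to 0 regardless of its previous value, while the 1-byte version leaves it untouched.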

Another finding in your program is: You are reading from file handle #1. This is the standard output. Your program should read from file handle #0 (standard input):

mov EAX, 3 ; read
mov EBX, 0 ; standard input
...
int 0x80

But now the answer to the actual question:

The C library functions (e.g. fgets()) use buffered input. Doing it like this would be a bit too complicated for a beginner, so reading one byte at a time is a reasonable alternative.

It helps to ask yourself: "How would I solve this problem using a high-level language like C?" If you don't use libraries in your assembler program, you can only use system calls (section 2 man pages) as functions (e.g. you cannot use "fgets()" but only "read()").

In your case a C program reading a number from standard input could look like this:

int num1;
char c;
...
num1 = 0;
while(1)
{
    if(read(0,&c,1)!=1) break;
    if(c=='\r' || c=='\n') break;
    num1 = 10*num1 + c - '0';
}

Now you may think about the assembler code (I typically use GNU assembler, which has another syntax, so maybe this code contains some bugs):

c resb 1
num1 resb 4

...

    ; Set "num1" to 0
  mov EAX, 0
  mov [num1], EAX
    ; Here our while-loop starts
next_digit:
    ; Read one character
  mov EAX, 3
  mov EBX, 0
  mov ECX, c
  mov EDX, 1
  int 0x80
    ; Check for the end-of-input
  cmp EAX, 1
  jnz end_of_loop
    ; This will cause EBX to be 0.
    ; When modifying the BL register the
    ; low 8 bits of EBX are modified.
    ; The high 24 bits remain 0.
    ; So clearing the EBX register before
    ; reading an 8-bit number into BL is
    ; a method for converting an 8-bit
    ; number to a 32-bit number!
  xor EBX, EBX
    ; Load the character read into BL
    ; Check for "\r" or "\n" as input
  mov BL, [c]
  cmp BL, 10
  jz end_of_loop
  cmp BL, 13
  jz end_of_loop
    ; read "num1" into EAX
  mov EAX, [num1]
    ; Multiply "num1" with 10
  mov ECX, 10
  mul ECX
    ; Add one digit
  sub EBX, '0'
  add EAX, EBX
    ; write "num1" back
  mov [num1], EAX
    ; Do the while loop again
  jmp next_digit
    ; The end of the loop...
end_of_loop:
    ; Done

Writing decimal numbers with more digits is more difficult!

will work out how far apart two letters are in the alphabet

If you are still stuck, your question requires that you determine the distance between two characters. That brings with it a number of checks you must implement. Though your question is silent on whether you need to handle both uppercase and lowercase distances, unless you are converting everything to one case or the other, you will need to determine whether both characters are of the same case to make the distance within the alphabet between those two characters valid.

Since two characters are involved, you need a way of saving the case of the first for comparison with the case of the second. Here, and in all cases where a simple state is needed, just using a byte (flag) to store the state is about as simple as anything else. For example, use a byte that holds 0 if the ASCII character is not an alpha character, 1 if the character is uppercase and 2 if the character is lowercase (or whatever consistent scheme you like).

That way, when you are done with the comparisons and tests, you can simply compare the two flags for equality. If they are equal, you can subtract one character from the other to get the distance (swapping if necessary) and then convert that number to ASCII digits for output.
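Before writing the assembly, the flag scheme and the distance calculation can be sketched in C. The names classify and letter_distance are invented for this illustration, and returning -1 for the error case is an assumption of the sketch rather than something the assembly version does.

```c
/* Flag values follow the 0/1/2 scheme described above. */
enum { NOT_ALPHA = 0, UPPER = 1, LOWER = 2 };

/* Hypothetical helper: classify one ASCII character. */
int classify(char c)
{
    if (c >= 'A' && c <= 'Z') return UPPER;
    if (c >= 'a' && c <= 'z') return LOWER;
    return NOT_ALPHA;
}

/* Hypothetical helper: distance between two letters of the same case.
 * Returns -1 if either is not a letter or the cases differ. */
int letter_distance(char a, char b)
{
    int fa = classify(a);
    int fb = classify(b);

    if (fa == NOT_ALPHA || fa != fb)
        return -1;

    /* Subtract the smaller from the larger ("swapping if necessary"). */
    return a >= b ? a - b : b - a;
}
```

The two flag bytes in the assembly version play the role of fa and fb here.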

To test if the character is an uppercase character, similar to isupper() in C, a short function is all that is needed:

; check if character isupper()
; parameters:
;   ecx - address holding character
; returns;
;   eax - 0 (false), 1 (true)
_isupr:
    
    mov eax, 1                  ; set return true
    
    cmp byte[ecx], 'A'          ; compare with 'A'
    jge _chkZ                   ; char >= 'A'
    
    mov eax, 0                  ; set return false
    ret
    
  _chkZ:
    cmp byte[ecx], 'Z'          ; compare with 'Z'
    jle _rtnupr                 ; <= is uppercase
    
    mov eax, 0                  ; set return false
    
  _rtnupr:
    ret

You can handle the storage for the local arrays and values you need in a couple of ways. You can either subtract from the current stack pointer to create temporary storage on the stack, or in a slightly more readable way, create labels to storage within the uninitialized segment (.bss) and use the labels as variable names. Your initialized variables go in the .data segment. For example, storage for the program could be:

section .bss

    buf     resb 32             ; general buffer, used by _prnuint32
    bufa    resb 8              ; storage for first letter line
    bufb    resb 8              ; storage for second letter line
    lena    resb 4              ; length of first letter line
    lenb    resb 4              ; length of second letter line
    nch     resb 1              ; number of digit characters in _prnuint32
    ais     resb 1              ; what 1st char is, 0-notalpha, 1-upper, 2-lower
    bis     resb 1              ; same for 2nd char

Rather than sprinkling bare numbers through your syscall setups, declaring named constants (with equ) for, e.g., stdin and stdout instead of using 0 and 1 makes things more readable:

section .data

    bufsz:  equ 32
    babsz:  equ 8
    tmsg:   db "first letter : "
    tlen:   equ $-tmsg
    ymsg:   db "second letter: "
    ylen:   equ $-ymsg
    dmsg:   db "char distance: "
    dlen:   equ $-dmsg
    emsg:   db "error: not alpha or same case", 0xa
    elen:   equ $-emsg
    nl:     db 0xa
    stdin:  equ 0
    stdout: equ 1
    read:   equ 3
    write:  equ 4
    exit:   equ 1

Then, for reading your character input, you would have, e.g.

    mov     eax, write          ; prompt for 1st letter
    mov     ebx, stdout
    mov     ecx, tmsg
    mov     edx, tlen
    int 80h                     ; __NR_write
    
    mov     eax, read           ; read 1st letter line
    mov     ebx, stdin
    mov     ecx, bufa
    mov     edx, babsz
    int 80h                     ; __NR_read
    
    mov     [lena], eax         ; save no. of character in line

To then check the case of the character input, you could do:

    call    _isupr              ; check if uppercase
    cmp     eax, 1              ; check return 0-false, 1-true
    jne chkalwr                 ; if not, branch to check lowercase
    
    mov     byte[ais], 1        ; set uppercase flag for 1st letter
    
    jmp getb                    ; branch to get 2nd letter
    
  chkalwr:
    call    _islwr              ; check if lowercase
    cmp     eax, 1              ; check return
    jne notalpha                ; 1st letter not alpha char, display error
    
    mov     byte[ais], 2        ; set lowercase flag for 1st char

The notalpha: label is just a block that outputs an error when a character isn't an alpha character or the two characters' cases don't match:

  notalpha:                     ; show not alpha or not same case error
    mov     eax, write
    mov     ebx, stdout
    mov     ecx, emsg
    mov     edx, elen
    int 80h                     ; __NR_write
    
    mov     ebx, 1              ; set EXIT_FAILURE

After you have completed input and classification of both characters, you need to verify that both characters are of the same case. If so, compute the distance between the characters (swapping if necessary, or using an absolute value) and finally convert that distance from a numeric value to ASCII digits for output. You can do something similar to the following:

  chkboth:
    mov     al, byte[ais]       ; load flags into al, bl
    mov     bl, byte[bis]
    cmp     al, bl              ; compare flags equal, else not same case
    jne notalpha
    
    mov     eax, write          ; display distance output
    mov     ebx, stdout
    mov     ecx, dmsg
    mov     edx, dlen
    int 80h                     ; __NR_write
    
    mov     al, byte[bufa]      ; load chars into al, bl
    mov     bl, byte[bufb]
    cmp     al, bl              ; chars equal, zero difference
    
    jns getdiff                 ; 1st char >= 2nd char
    
    push    eax                 ; swap chars
    push    ebx
    pop     eax
    pop     ebx
    
  getdiff:
    sub     eax, ebx            ; subtract 2nd char from 1st char
    call _prnuint32             ; output difference
    
    xor     ebx, ebx            ; set EXIT_SUCCESS
    jmp     done

Putting it all together and including the _prnuint32 function below for conversion and output of the numeric distance between characters, you would have:

section .bss

    buf     resb 32             ; general buffer, used by _prnuint32
    bufa    resb 8              ; storage for first letter line
    bufb    resb 8              ; storage for second letter line
    lena    resb 4              ; length of first letter line
    lenb    resb 4              ; length of second letter line
    nch     resb 1              ; number of digit characters in _prnuint32
    ais     resb 1              ; what 1st char is, 0-notalpha, 1-upper, 2-lower
    bis     resb 1              ; same for 2nd char

section .data

    bufsz:  equ 32
    babsz:  equ 8
    tmsg:   db "first letter : "
    tlen:   equ $-tmsg
    ymsg:   db "second letter: "
    ylen:   equ $-ymsg
    dmsg:   db "char distance: "
    dlen:   equ $-dmsg
    emsg:   db "error: not alpha or same case", 0xa
    elen:   equ $-emsg
    nl:     db 0xa
    stdin:  equ 0
    stdout: equ 1
    read:   equ 3
    write:  equ 4
    exit:   equ 1

section .text

    global _start
_start:
    
    mov     byte[ais], 0        ; zero flags
    mov     byte[bis], 0
    
    mov     eax, write          ; prompt for 1st letter
    mov     ebx, stdout
    mov     ecx, tmsg
    mov     edx, tlen
    int 80h                     ; __NR_write
    
    mov     eax, read           ; read 1st letter line
    mov     ebx, stdin
    mov     ecx, bufa
    mov     edx, babsz
    int 80h                     ; __NR_read
    
    mov     [lena], eax         ; save no. of character in line
    
    call    _isupr              ; check if uppercase
    cmp     eax, 1              ; check return 0-false, 1-true
    jne chkalwr                 ; if not, branch to check lowercase
    
    mov     byte[ais], 1        ; set uppercase flag for 1st letter
    
    jmp getb                    ; branch to get 2nd letter
    
  chkalwr:
    call    _islwr              ; check if lowercase
    cmp     eax, 1              ; check return
    jne notalpha                ; 1st letter not alpha char, display error
    
    mov     byte[ais], 2        ; set lowercase flag for 1st char
    
  getb:
    mov     eax, write          ; prompt for 2nd letter
    mov     ebx, stdout
    mov     ecx, ymsg
    mov     edx, ylen
    int 80h                     ; __NR_write
    
    mov     eax, read           ; read 2nd letter line
    mov     ebx, stdin
    mov     ecx, bufb
    mov     edx, babsz
    int 80h                     ; __NR_read
    
    mov     [lenb], eax         ; save no. of character in line
    
    call    _isupr              ; same checks for 2nd character
    cmp     eax, 1
    jne chkblwr
    
    mov     byte[bis], 1
    
    jmp chkboth
    
  chkblwr:
    call    _islwr
    cmp     eax, 1
    jne notalpha
    
    mov     byte[bis], 2
    
  chkboth:
    mov     al, byte[ais]       ; load flags into al, bl
    mov     bl, byte[bis]
    cmp     al, bl              ; compare flags equal, else not same case
    jne notalpha
    
    mov     eax, write          ; display distance output
    mov     ebx, stdout
    mov     ecx, dmsg
    mov     edx, dlen
    int 80h                     ; __NR_write
    
    mov     al, byte[bufa]      ; load chars into al, bl
    mov     bl, byte[bufb]
    cmp     al, bl              ; chars equal, zero difference
    
    jns getdiff                 ; 1st char >= 2nd char
    
    push    eax                 ; swap chars
    push    ebx
    pop     eax
    pop     ebx
    
  getdiff:
    sub     eax, ebx            ; subtract 2nd char from 1st char
    call _prnuint32             ; output difference
    
    xor     ebx, ebx            ; set EXIT_SUCCESS
    jmp     done
    
  notalpha:                     ; show not alpha or not same case error
    mov     eax, write
    mov     ebx, stdout
    mov     ecx, emsg
    mov     edx, elen
    int 80h                     ; __NR_write
    
    mov     ebx, 1              ; set EXIT_FAILURE
    
  done:
    mov     eax, exit           ; __NR_exit
    int 80h
    
; print unsigned 32-bit number to stdout
; arguments:
;   eax - number to output
; returns:
;   none
_prnuint32:
    mov     byte[nch], 0        ; zero nch counter
    
    mov     ecx, 0xa            ; base 10  (and newline)
    lea     esi, [buf + 31]     ; load address of last char in buf
    mov     [esi], cl           ; put newline in buf
    inc     byte[nch]           ; increment char count in buf

_todigit:                       ; do {
    xor     edx, edx            ; zero remainder register
    div     ecx                 ; edx=remainder = low digit = 0..9.  eax/=10
    
    or      edx, '0'            ; convert to ASCII
    dec     esi                 ; backup to next char in buf 
    mov     [esi], dl           ; copy ASCII digit to buf
    inc     byte[nch]           ; increment char count in buf

    test    eax, eax            ; } while (eax);
    jnz     _todigit

    mov     eax, 4              ; __NR_write from /usr/include/asm/unistd_32.h
    mov     ebx, 1              ; fd = STDOUT_FILENO
    mov     ecx, esi            ; copy address in esi to ecx (addr of 1st digit)
    movzx   edx, byte[nch]      ; length, including the \n (zero-extended)
    int     80h                 ; write(1, string,  digits + 1)

    ret

; check if character islower()
; parameters:
;   ecx - address holding character
; returns;
;   eax - 0 (false), 1 (true)
_islwr:
    
    mov eax, 1                  ; set return true
    
    cmp byte[ecx], 'a'          ; compare with 'a'
    jge _chkz                   ; char >= 'a'
    
    mov eax, 0                  ; set return false
    ret
    
  _chkz:
    cmp byte[ecx], 'z'          ; compare with 'z'
    jle _rtnlwr                 ; <= is lowercase
    
    mov eax, 0                  ; set return false
    
  _rtnlwr:
    ret


; check if character isupper()
; parameters:
;   ecx - address holding character
; returns;
;   eax - 0 (false), 1 (true)
_isupr:
    
    mov eax, 1                  ; set return true
    
    cmp byte[ecx], 'A'          ; compare with 'A'
    jge _chkZ                   ; char >= 'A'
    
    mov eax, 0                  ; set return false
    ret
    
  _chkZ:
    cmp byte[ecx], 'Z'          ; compare with 'Z'
    jle _rtnupr                 ; <= is uppercase
    
    mov eax, 0                  ; set return false
    
  _rtnupr:
    ret

There are many ways to write the various pieces; this version is intended to be easy to follow rather than as efficient as possible.

Example Use/Output

After you assemble and link the code, e.g.

nasm -f elf -o ./obj/char_dist_32.o char_dist_32.asm
ld -m elf_i386 -o ./bin/char_dist_32 ./obj/char_dist_32.o

You can test with the inputs given in your question and others, e.g.

$ ./bin/char_dist_32
first letter : a
second letter: e
char distance: 4

$ ./bin/char_dist_32
first letter : d
second letter: b
char distance: 2

$ ./bin/char_dist_32
first letter : D
second letter: B
char distance: 2

$ ./bin/char_dist_32
first letter : a
second letter: Z
error: not alpha or same case

Look things over and let me know if you have further questions.

Nasm increment register over 9 can't display

The problem is you are displaying a number as a character.

add ebx, '0'

is a good way to convert a digit to a character for display. It is a bad way to convert a number to a character for display.
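The difference between a digit and a number shows up as soon as the value passes 9. The C sketch below uses hypothetical helper names and assumes ASCII; it contrasts the single add with splitting the value into tens and ones first:

```c
/* Digit -> character: only meaningful for values 0..9. */
char digit_to_char(unsigned value)
{
    return (char)('0' + value);
}

/* Hypothetical helper for values 0..99: split into two digits first,
 * then convert each digit separately. */
void two_digits_to_chars(unsigned value, char out[2])
{
    out[0] = (char)('0' + value / 10);   /* tens digit */
    out[1] = (char)('0' + value % 10);   /* ones digit */
}
```

For the value 10, the single add yields ':' (the ASCII character after '9'), whereas the split yields '1' followed by '0'.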

You want the following:

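; variable in ebx
itoa:
   mov eax, ebx
   mov ecx, 10
   mov esi, buf + 10    ; one past the end of buf; digits are stored backwards
.nxt:
   xor edx, edx         ; div divides edx:eax, so clear edx on every pass
   div ecx              ; eax = quotient, edx = remainder (0..9)
   add dl, '0'
   dec esi
   mov [esi], dl
   or  eax, eax
   jnz .nxt
   mov edx, buf + 10
   sub edx, esi
   ret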

   ; pointer in esi, length in edx

;... (bss area)
buf resb 10
