Assembly: Read integer from stdin, increment it and print to stdout
movl %edi, %ecx # store input in register %edi
movl $4, %edx # read one byte
This part is all wrong. You can't store the result of read in a register. What that's actually doing is storing the result at the address contained in %edi, which, since you haven't set it, is probably somewhere you have no business storing anything. You first need to make room in memory to store the string. You're also reading four bytes, not one.
I would replace that with something like this
subl $4, %esp
movl %esp, %ecx
movl $4, %edx
This will make room for 4 bytes on the stack, then use the top of the stack as the address to store the string at. You'll also have to modify the arguments for the write syscall to use this address.
Another problem that you'll have to deal with is that stdin and stdout usually deal with text, so what you're reading will probably be a string and not a number. To use it as a number, you'll have to convert it first, and then convert the result back to text before you write it out.
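In C terms, that round trip looks roughly like this. It's only a sketch: increment_text is a made-up name, and it leans on the C library's strtol and snprintf, both of which you would have to write by hand in assembly.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse decimal text, add one, and format the result
 * back into text followed by a newline. Returns the number of characters
 * written to out, or -1 if the input has no digits, the increment would
 * overflow, or out is too small. */
int increment_text(const char *in, char *out, size_t out_size)
{
    char *end;
    long value = strtol(in, &end, 10);    /* text -> number */
    if (end == in || value == LONG_MAX)
        return -1;                        /* nothing parsed, or no room to add 1 */

    int n = snprintf(out, out_size, "%ld\n", value + 1);  /* number -> text */
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The buffer that read() filled goes in as in, and out is what you would hand to write().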
How to read input from STDIN in x86_64 assembly?
First of all: there are no variables in assembly. There are just labels for some kind of data. The data is, by design, untyped - at least in real assemblers, not HLA (e.g. MASM).
Reading from the standard input is achieved by using the system call read. I assume you've already read the post you mentioned and you know how to call system calls in x64 Linux. Assuming that you're using NASM (or something that resembles its syntax), and that you want to store the input from stdin at the address buffer, where you have reserved BUFSIZE bytes of memory, executing the system call would look like this:
xor eax, eax ; rax <- 0 (syscall number for 'read')
xor edi, edi ; edi <- 0 (stdin file descriptor)
mov rsi, buffer ; rsi <- address of the buffer (alternative: lea rsi, [rel buffer])
mov edx, BUFSIZE ; rdx <- size of the buffer
syscall ; execute read(0, buffer, BUFSIZE)
Upon returning, rax will contain the result of the syscall. If you want to know more about how it works, please consult man 2 read. Note that the syscall number for read on mac is 0x2000003 instead of 0, so that first line would instead be mov rax, 0x2000003.
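For comparison, here is roughly the same call made through the C library wrapper. This is a sketch, and read_input is a hypothetical name. One difference worth knowing: the raw Linux syscall reports failure as a negative errno value in rax, while the C wrapper returns -1 and sets errno.

```c
#include <unistd.h>

/* Hypothetical wrapper around read(2): fills buf with up to size-1 bytes
 * from fd and NUL-terminates it. Returns the byte count (0 means end of
 * input) or -1 on error. */
ssize_t read_input(int fd, char *buf, size_t size)
{
    if (size == 0)
        return -1;                 /* no room even for the terminator */
    ssize_t n = read(fd, buf, size - 1);
    if (n < 0)
        return -1;                 /* errno holds the reason */
    buf[n] = '\0';
    return n;
}
```

Calling read_input(0, buffer, BUFSIZE) corresponds to the assembly above, except for the extra terminator byte.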
Parsing an integer in assembly language is not that simple, though. Since read only gives you plain binary data that appears on the standard input, you need to convert the integer value yourself. Keep in mind that what you type on the keyboard is sent to the application as ASCII codes (or any other encoding you might be using - I'm assuming ASCII here). Therefore, you need to convert the data from an ASCII-encoded decimal to binary.
A function in C for converting such an ASCII string to a normal unsigned int could look something like this:
unsigned int parse_ascii_decimal(char *str,unsigned int strlen)
{
unsigned int ret = 0, mul = 1;
int i = strlen-1;
while(i >= 0)
{
ret += (str[i] & 0xf) * mul;
mul *= 10;
--i;
}
return ret;
}
Converting this to assembly (and extending to support signed numbers) is left as an exercise for the reader. :) (Or see NASM Assembly convert input to integer? - a simpler algorithm only has 1 multiply per iteration, with total = total*10 + digit. And you can check for the first non-digit character as you iterate instead of doing strlen separately, if the length isn't already known.)
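A minimal sketch of that simpler algorithm in C, with an optional leading minus sign. The name parse_decimal and its return convention are my own assumptions, and overflow is not checked.

```c
#include <stddef.h>

/* Hypothetical helper: converts an optional '-' followed by ASCII digits.
 * Stops at the first non-digit, so no separate strlen pass is needed.
 * Stores the value in *out and returns how many characters were consumed
 * (0 if there were no digits). Overflow is not checked in this sketch. */
size_t parse_decimal(const char *s, size_t len, int *out)
{
    size_t i = 0;
    int negative = 0;
    unsigned int total = 0;

    if (i < len && s[i] == '-') {
        negative = 1;
        i++;
    }
    size_t first_digit = i;
    while (i < len && s[i] >= '0' && s[i] <= '9') {
        total = total * 10 + (unsigned int)(s[i] - '0');  /* one multiply per digit */
        i++;
    }
    if (i == first_digit)
        return 0;                      /* no digits at all */
    *out = negative ? -(int)total : (int)total;
    return i;
}
```

Because the loop walks left to right, the trailing newline that read() delivers simply ends the loop.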
Last but not least - the write syscall requires you to always pass a pointer to a buffer with the data that's supposed to be written to a given file descriptor. Therefore, if you want to output a newline, there is no other way but to create a buffer containing the newline sequence.
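The same constraint is visible from C, where write() also takes a pointer and a length. A small sketch (write_newline is a hypothetical name):

```c
#include <unistd.h>

/* write(2) takes a pointer and a length, so even a single newline has to
 * live in memory first. Returns 0 on success, -1 on failure. */
int write_newline(int fd)
{
    static const char nl = '\n';      /* one-byte buffer holding the newline */
    return write(fd, &nl, 1) == 1 ? 0 : -1;
}
```

In assembly the equivalent is a one-byte db 0xa in the data section whose address goes into the buffer argument.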
How should I work with dynamically-sized input in NASM Assembly?
First of all you are generating a 32-bit program, not a 64-bit program. This is no problem as Linux 64-bit can run 32-bit programs if they are either statically linked (this is the case for you) or the 32-bit shared libraries are installed.
Your program contains a real bug: you are loading and storing the full 32-bit "EAX" register from and to a 1-byte field in RAM:
mov EAX, [num1]
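This will normally work on little-endian computers (x86). However, if the byte you want to read is the last byte of the last mapped memory page of your program, the 4-byte load reaches into unmapped memory and faults (on Linux this normally shows up as a segmentation fault).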
Even more critical is the write command:
mov [result], EAX
This command will overwrite 3 bytes of memory following the "result" variable. If you extend your program by additional bytes:
num1 resb 1
num2 resb 1
result resb 1
newVariable1 resb 1
You'll overwrite these variables! To correct your program you must use the AL (and BL) register instead of the complete EAX register:
mov AL, [num1]
mov BL, [num2]
...
mov [result], AL
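To see the overwrite without an assembler, here is a small C model of that memory layout. It is only an illustration: the array, the enum names and the two store functions are made up, and the array is padded so the wide store stays in bounds.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout mirroring the .bss example: offsets 0..3 stand for
 * num1, num2, result and newVariable1; the rest is padding so the wide
 * store in this demo stays inside the array. */
enum { NUM1, NUM2, RESULT, NEW_VARIABLE1, LAYOUT_SIZE = 8 };

/* Like "mov [result], EAX": writes 4 bytes starting at result. */
void store_wide(unsigned char *mem, uint32_t value)
{
    memcpy(&mem[RESULT], &value, sizeof value);
}

/* Like "mov [result], AL": writes only the low byte. */
void store_narrow(unsigned char *mem, uint32_t value)
{
    mem[RESULT] = (unsigned char)(value & 0xff);
}
```

After store_narrow the byte standing in for newVariable1 is untouched; after store_wide it has been replaced by part of the 32-bit value.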
Another finding in your program is: You are reading from file handle #1. This is the standard output. Your program should read from file handle #0 (standard input):
mov EAX, 3 ; read
mov EBX, 0 ; standard input
...
int 0x80
But now the answer to the actual question:
The C library functions (e.g. fgets()) use buffered input. Doing it like that would be a bit too complicated for a beginner, so reading one byte at a time is a reasonable alternative.
It helps to ask yourself: "how would I solve this problem in a high-level language like C?" If you don't use libraries in your assembler program, you can only use system calls (section 2 man pages) as functions (e.g. you cannot use "fgets()" but only "read()").
In your case a C program reading a number from standard input could look like this:
int num1;
char c;
...
num1 = 0;
while(1)
{
if(read(0,&c,1)!=1) break;
if(c=='\r' || c=='\n') break;
num1 = 10*num1 + c - '0';
}
Now you may think about the assembler code (I typically use GNU assembler, which has another syntax, so maybe this code contains some bugs):
c resb 1
num1 resb 4
...
; Set "num1" to 0
mov EAX, 0
mov [num1], EAX
; Here our while-loop starts
next_digit:
; Read one character
mov EAX, 3
mov EBX, 0
mov ECX, c
mov EDX, 1
int 0x80
; Check for the end-of-input
cmp EAX, 1
jnz end_of_loop
; This will cause EBX to be 0.
; When modifying the BL register the
; low 8 bits of EBX are modified.
; The high 24 bits remain 0.
; So clearing the EBX register before
; reading an 8-bit number into BL is
; a method for converting an 8-bit
; number to a 32-bit number!
xor EBX, EBX
; Load the character read into BL
; Check for "\r" or "\n" as input
mov BL, [c]
cmp BL, 10
jz end_of_loop
cmp BL, 13
jz end_of_loop
; read "num1" into EAX
mov EAX, [num1]
; Multiply "num1" with 10
mov ECX, 10
mul ECX
; Add one digit
sub EBX, '0'
add EAX, EBX
; write "num1" back
mov [num1], EAX
; Do the while loop again
jmp next_digit
; The end of the loop...
end_of_loop:
; Done
Writing decimal numbers with more digits is more difficult!
will work out how far apart two letters are in the alphabet
If you are still stuck, your question requires that you determine the distance between two characters. That brings with it a number of checks you must implement. Though your question is silent on whether you need to handle both uppercase and lowercase distances, unless you are converting everything to one case or the other, you will need to determine whether both characters are of the same case for the distance between them within the alphabet to be valid.
Since two characters are involved, you need a way of saving the case of the first for comparison with the case of the second. Here, and in all cases where a simple state is needed, just using a byte (flag) to store the state is about as simple as anything else. For example, a byte to hold 0 if the ASCII character is not an alpha character, 1 if the character is uppercase and 2 if the character is lowercase (or whatever consistent scheme you like).
That way, when you are done with the comparisons and tests, you can simply compare the two flags for equality. If they are equal, you can subtract one character from the other to get the distance (swapping if necessary) and then convert the number to ASCII digits for output.
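If it helps to see the logic before the assembly, here is the same flag scheme as a short C sketch. The names classify and letter_distance are hypothetical, and ASCII input is assumed.

```c
/* Flag values follow the 0/1/2 scheme: not alpha, uppercase, lowercase. */
enum { NOT_ALPHA = 0, UPPER = 1, LOWER = 2 };

/* Hypothetical helper: classifies an ASCII character. */
int classify(char c)
{
    if (c >= 'A' && c <= 'Z') return UPPER;
    if (c >= 'a' && c <= 'z') return LOWER;
    return NOT_ALPHA;
}

/* Returns the distance between two letters of the same case,
 * or -1 if either is not a letter or the cases differ. */
int letter_distance(char a, char b)
{
    int fa = classify(a), fb = classify(b);
    if (fa == NOT_ALPHA || fa != fb)
        return -1;
    return a > b ? a - b : b - a;     /* subtract the smaller from the larger */
}
```

With this, letter_distance('a', 'e') gives 4 and letter_distance('a', 'Z') gives -1, which is the error case.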
To test if the character is an uppercase character, similar to isupper() in C, a short function is all that is needed:
; check if character isupper()
; parameters:
; ecx - address holding character
; returns:
; eax - 0 (false), 1 (true)
_isupr:
mov eax, 1 ; set return true
cmp byte[ecx], 'A' ; compare with 'A'
jge _chkZ ; char >= 'A'
mov eax, 0 ; set return false
ret
_chkZ:
cmp byte[ecx], 'Z' ; compare with 'Z'
jle _rtnupr ; <= is uppercase
mov eax, 0 ; set return false
_rtnupr:
ret
You can handle the storage for the local arrays and values you need in a couple of ways. You can either subtract from the current stack pointer to create temporary storage on the stack, or in a slightly more readable way, create labels to storage within the uninitialized segment (.bss) and use the labels as variable names. Your initialized variables go in the .data segment. For example, storage for the program could be:
section .bss
buf resb 32 ; general buffer, used by _prnuint32
bufa resb 8 ; storage for first letter line
bufb resb 8 ; storage for second letter line
lena resb 4 ; length of first letter line
lenb resb 4 ; length of second letter line
nch resb 1 ; number of digit characters in _prnuint32
ais resb 1 ; what 1st char is, 0-notalpha, 1-upper, 2-lower
bis resb 1 ; same for 2nd char
Rather than sprinkling numbers through your syscall setups, declaring named constants (equ) for, e.g., stdin and stdout instead of using 0 and 1 makes things more readable:
section .data
bufsz: equ 32
babsz: equ 8
tmsg: db "first letter : "
tlen: equ $-tmsg
ymsg: db "second letter: "
ylen: equ $-ymsg
dmsg: db "char distance: "
dlen: equ $-dmsg
emsg: db "error: not alpha or same case", 0xa
elen: equ $-emsg
nl: db 0xa
stdin: equ 0
stdout: equ 1
read: equ 3
write: equ 4
exit: equ 1
Then, for reading your character input, you would have, e.g.
mov eax, write ; prompt for 1st letter
mov ebx, stdout
mov ecx, tmsg
mov edx, tlen
int 80h ; __NR_write
mov eax, read ; read 1st letter line
mov ebx, stdin
mov ecx, bufa
mov edx, babsz
int 80h ; __NR_read
mov [lena], eax ; save no. of character in line
To then check the case of the character input, you could do:
call _isupr ; check if uppercase
cmp eax, 1 ; check return 0-false, 1-true
jne chkalwr ; if not, branch to check lowercase
mov byte[ais], 1 ; set uppercase flag for 1st letter
jmp getb ; branch to get 2nd letter
chkalwr:
call _islwr ; check if lowercase
cmp eax, 1 ; check return
jne notalpha ; 1st letter not alpha char, display error
mov byte[ais], 2 ; set lowercase flag for 1st char
The notalpha: label is just a block that outputs an error if a character isn't an alpha character or the cases of the two characters don't match:
notalpha: ; show not alpha or not same case error
mov eax, write
mov ebx, stdout
mov ecx, emsg
mov edx, elen
int 80h ; __NR_write
mov ebx, 1 ; set EXIT_FAILURE
After you have completed input and classification of both characters, you need to verify that both characters are of the same case. If so, compute the distance between the characters (swapping if necessary, or using an absolute value) and finally convert the distance from a numeric value to ASCII digits for output. You can do something similar to the following:
chkboth:
mov al, byte[ais] ; load flags into al, bl
mov bl, byte[bis]
cmp al, bl ; compare flags equal, else not same case
jne notalpha
mov eax, write ; display distance output
mov ebx, stdout
mov ecx, dmsg
mov edx, dlen
int 80h ; __NR_write
movzx eax, byte[bufa] ; load chars zero-extended into eax, ebx
movzx ebx, byte[bufb]
cmp al, bl ; compare 1st char with 2nd char
jns getdiff ; 1st char >= 2nd char
push eax ; swap chars
push ebx
pop eax
pop ebx
getdiff:
sub eax, ebx ; subtract 2nd char from 1st char
call _prnuint32 ; output difference
xor ebx, ebx ; set EXIT_SUCCESS
jmp done
Putting it all together and including the _prnuint32 function below for conversion and output of the numeric distance between characters, you would have:
section .bss
buf resb 32 ; general buffer, used by _prnuint32
bufa resb 8 ; storage for first letter line
bufb resb 8 ; storage for second letter line
lena resb 4 ; length of first letter line
lenb resb 4 ; length of second letter line
nch resb 1 ; number of digit characters in _prnuint32
ais resb 1 ; what 1st char is, 0-notalpha, 1-upper, 2-lower
bis resb 1 ; same for 2nd char
section .data
bufsz: equ 32
babsz: equ 8
tmsg: db "first letter : "
tlen: equ $-tmsg
ymsg: db "second letter: "
ylen: equ $-ymsg
dmsg: db "char distance: "
dlen: equ $-dmsg
emsg: db "error: not alpha or same case", 0xa
elen: equ $-emsg
nl: db 0xa
stdin: equ 0
stdout: equ 1
read: equ 3
write: equ 4
exit: equ 1
section .text
global _start
_start:
mov byte[ais], 0 ; zero flags
mov byte[bis], 0
mov eax, write ; prompt for 1st letter
mov ebx, stdout
mov ecx, tmsg
mov edx, tlen
int 80h ; __NR_write
mov eax, read ; read 1st letter line
mov ebx, stdin
mov ecx, bufa
mov edx, babsz
int 80h ; __NR_read
mov [lena], eax ; save no. of character in line
call _isupr ; check if uppercase
cmp eax, 1 ; check return 0-false, 1-true
jne chkalwr ; if not, branch to check lowercase
mov byte[ais], 1 ; set uppercase flag for 1st letter
jmp getb ; branch to get 2nd letter
chkalwr:
call _islwr ; check if lowercase
cmp eax, 1 ; check return
jne notalpha ; 1st letter not alpha char, display error
mov byte[ais], 2 ; set lowercase flag for 1st char
getb:
mov eax, write ; prompt for 2nd letter
mov ebx, stdout
mov ecx, ymsg
mov edx, ylen
int 80h ; __NR_write
mov eax, read ; read 2nd letter line
mov ebx, stdin
mov ecx, bufb
mov edx, babsz
int 80h ; __NR_read
mov [lenb], eax ; save no. of character in line
call _isupr ; same checks for 2nd character
cmp eax, 1
jne chkblwr
mov byte[bis], 1
jmp chkboth
chkblwr:
call _islwr
cmp eax, 1
jne notalpha
mov byte[bis], 2
chkboth:
mov al, byte[ais] ; load flags into al, bl
mov bl, byte[bis]
cmp al, bl ; compare flags equal, else not same case
jne notalpha
mov eax, write ; display distance output
mov ebx, stdout
mov ecx, dmsg
mov edx, dlen
int 80h ; __NR_write
movzx eax, byte[bufa] ; load chars zero-extended into eax, ebx
movzx ebx, byte[bufb]
cmp al, bl ; compare 1st char with 2nd char
jns getdiff ; 1st char >= 2nd char
push eax ; swap chars
push ebx
pop eax
pop ebx
getdiff:
sub eax, ebx ; subtract 2nd char from 1st char
call _prnuint32 ; output difference
xor ebx, ebx ; set EXIT_SUCCESS
jmp done
notalpha: ; show not alpha or not same case error
mov eax, write
mov ebx, stdout
mov ecx, emsg
mov edx, elen
int 80h ; __NR_write
mov ebx, 1 ; set EXIT_FAILURE
done:
mov eax, exit ; __NR_exit
int 80h
; print unsigned 32-bit number to stdout
; arguments:
; eax - number to output
; returns:
; none
_prnuint32:
mov byte[nch], 0 ; zero nch counter
mov ecx, 0xa ; base 10 (and newline)
lea esi, [buf + 31] ; load address of last char in buf
mov [esi], cl ; put newline in buf
inc byte[nch] ; increment char count in buf
_todigit: ; do {
xor edx, edx ; zero remainder register
div ecx ; edx=remainder = low digit = 0..9. eax/=10
or edx, '0' ; convert to ASCII
dec esi ; backup to next char in buf
mov [esi], dl ; copy ASCII digit to buf
inc byte[nch] ; increment char count in buf
test eax, eax ; } while (eax);
jnz _todigit
mov eax, 4 ; __NR_write from /usr/include/asm/unistd_32.h
mov ebx, 1 ; fd = STDOUT_FILENO
mov ecx, esi ; copy address in esi to ecx (addr of 1st digit)
movzx edx, byte[nch] ; length, including the \n
int 80h ; write(1, string, digits + 1)
ret
; check if character islower()
; parameters:
; ecx - address holding character
; returns:
; eax - 0 (false), 1 (true)
_islwr:
mov eax, 1 ; set return true
cmp byte[ecx], 'a' ; compare with 'a'
jge _chkz ; char >= 'a'
mov eax, 0 ; set return false
ret
_chkz:
cmp byte[ecx], 'z' ; compare with 'z'
jle _rtnlwr ; <= is lowercase
mov eax, 0 ; set return false
_rtnlwr:
ret
; check if character isupper()
; parameters:
; ecx - address holding character
; returns:
; eax - 0 (false), 1 (true)
_isupr:
mov eax, 1 ; set return true
cmp byte[ecx], 'A' ; compare with 'A'
jge _chkZ ; char >= 'A'
mov eax, 0 ; set return false
ret
_chkZ:
cmp byte[ecx], 'Z' ; compare with 'Z'
jle _rtnupr ; <= is uppercase
mov eax, 0 ; set return false
_rtnupr:
ret
There are many ways to write each of these pieces. This version is meant to be easy to follow rather than as efficient as possible.
Example Use/Output
After you assemble and link the code, e.g.
nasm -f elf -o ./obj/char_dist_32.o char_dist_32.asm
ld -m elf_i386 -o ./bin/char_dist_32 ./obj/char_dist_32.o
You can test with the inputs given in your question and others, e.g.
$ ./bin/char_dist_32
first letter : a
second letter: e
char distance: 4
$ ./bin/char_dist_32
first letter : d
second letter: b
char distance: 2
$ ./bin/char_dist_32
first letter : D
second letter: B
char distance: 2
$ ./bin/char_dist_32
first letter : a
second letter: Z
error: not alpha or same case
Look things over and let me know if you have further questions.
Nasm increment register over 9 can't display
The problem is you are displaying a number as a character.
add ebx, '0'
is a good way to convert a digit to a character for display. It is a bad way to convert a number to a character for display.
You want the following:
; variable in ebx
itoa:
mov eax, ebx
mov ecx, 10
mov esi, buf + 10
.nxt:
xor edx, edx ; clear the high half of the dividend before each div
div ecx
add dl, '0'
dec esi
mov [esi], dl
or eax, eax
jnz .nxt
mov edx, buf + 10
sub edx, esi
ret
; pointer in esi, length in edx
;... (bss area)
buf resb 10
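For reference, the same repeated-division idea in C. Treat it as a sketch: uint_to_decimal is a hypothetical name, and the caller is assumed to pass a buffer large enough for the value (10 bytes covers any 32-bit number). It also shows why adding '0' only works for a single digit: in ASCII, '0' + 12 is the character '<', not "12".

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal digits of value into the end of
 * buf (size bytes), filling backwards like the itoa routine. Returns a
 * pointer to the first digit and stores the digit count in *len. No NUL
 * terminator is added, and size is assumed to be large enough. */
char *uint_to_decimal(unsigned int value, char *buf, size_t size, size_t *len)
{
    char *p = buf + size;                   /* one past the last byte */
    do {
        *--p = (char)('0' + value % 10);    /* remainder is the low digit */
        value /= 10;                        /* quotient feeds the next round */
    } while (value != 0);
    *len = (size_t)(buf + size - p);
    return p;
}
```

The returned pointer and length are exactly what the write syscall needs as its buffer and count arguments.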