Assembly Divisions and Floating Points

  1. You need to zero edx before executing div ecx. With a 32-bit divisor (e.g., ecx), div divides the 64-bit value in edx:eax by its operand, so any junk in edx is treated as the upper half of the dividend.

  2. After the div, you probably want to compare edx, not just dx.
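To make the effect of stale data in edx concrete, here is a minimal C sketch that models a 32-bit div. The function name model_div32 and its signature are made up for illustration; it only mirrors the documented behaviour (dividend in edx:eax, fault on a zero divisor or an oversized quotient) and says nothing about how the CPU implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `div ecx`: divides edx:eax by ecx.
   Returns false where the CPU would raise #DE (divide by zero,
   or a quotient that does not fit in 32 bits). */
bool model_div32(uint32_t edx, uint32_t eax, uint32_t ecx,
                 uint32_t *quot, uint32_t *rem)
{
    assert(quot != NULL && rem != NULL);
    if (ecx == 0)
        return false;                        /* divide by zero */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t q = dividend / ecx;
    if (q > UINT32_MAX)
        return false;                        /* quotient overflow */
    *quot = (uint32_t)q;                     /* would land in eax */
    *rem  = (uint32_t)(dividend % ecx);      /* would land in edx */
    return true;
}
```

With edx = 0 and eax = 100, dividing by 7 yields quotient 14 and remainder 2. With leftover junk such as edx = 7, the same division reports a fault, because the quotient no longer fits in 32 bits.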

x86 Assembly: Division Floating Point Exception dividing by 11

You're getting divide overflow because the quotient doesn't fit within a 16 bit integer.

You can split the dividend into upper and lower halves to produce up to a 32-bit quotient and a 16-bit remainder. The first division computes dx:ax = 0000:upper_dividend divided by the divisor, leaving the upper half of the quotient in ax and a remainder in dx. That remainder becomes the upper half of the dividend for the second division, which computes dx:ax = remainder:lower_dividend divided by the divisor. Neither division can overflow, because the remainder is strictly less than the divisor. This process can be extended for longer dividends and quotients, one step per word of dividend and quotient, with the remainder of each divide step becoming the upper half of the partial dividend for the next step.

Example using MASM syntax:

dvnd    dd 859091
dvsr dw 11
; ...
; bx:ax will end up = quotient of dvnd/dvsr, dx = remainder
mov di,dvsr
xor dx,dx
mov ax,word ptr dvnd+2 ;ax = upr dvnd
div di ;ax = upr quot, dx = rmdr
mov bx,ax ;bx = upr quot
mov ax,word ptr dvnd ;ax = lwr dvnd
div di ;ax = lwr quot, dx = rmdr

Example for a quad word:

dvnd    dq 0123456789abcdefh
dvsr dw 012h
quot dq 0
rmdr dw 0
; ...
mov di,dvsr
xor dx,dx ;dx = 1st upr half dvnd = 0

mov ax,word ptr dvnd+6 ;ax = 1st lwr half dvnd
div di ;ax = 1st quot, dx = rmdr = 2nd upr half dvnd
mov word ptr quot+6,ax

mov ax,word ptr dvnd+4 ;ax = 2nd lwr half dvnd
div di ;ax = 2nd quot, dx = rmdr = 3rd upr half dvnd
mov word ptr quot+4,ax

mov ax,word ptr dvnd+2 ;ax = 3rd lwr half dvnd
div di ;ax = 3rd quot, dx = rmdr = 4th upr half dvnd
mov word ptr quot+2,ax

mov ax,word ptr dvnd ;ax = 4th lwr half dvnd
div di ;ax = 4th quot, dx = rmdr
mov word ptr quot,ax

mov rmdr,dx
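The same word-by-word scheme can be sketched in C for any number of 16-bit words. This is an illustrative sketch, not a translation of the MASM above: the function name div_words and its signature are assumptions, and the words are stored least significant first, as they would be in x86 memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide an n-word dividend (16-bit words, least significant first)
   by a 16-bit divisor. quot receives n words; the return value is
   the remainder. Name and signature are illustrative only. */
uint16_t div_words(const uint16_t *dvnd, uint16_t *quot, size_t n,
                   uint16_t dvsr)
{
    assert(dvsr != 0);
    uint32_t rem = 0;                        /* plays the role of dx */
    for (size_t i = n; i-- > 0; ) {          /* most significant word first */
        uint32_t part = (rem << 16) | dvnd[i];   /* dx:ax */
        quot[i] = (uint16_t)(part / dvsr);   /* fits, because rem < dvsr */
        rem = part % dvsr;                   /* carried into the next step */
    }
    return (uint16_t)rem;
}
```

For the first example's values, 859091 divided by 11, this produces the quotient 78099 and the remainder 2.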

Floating Point Exception - Division between integers

Linux maps the #DE (divide error) exception generated by the CPU to the SIGFPE signal, which is then translated to the human-readable error message "Floating point exception". Unfortunately, that message is quite misleading, as no floating point is involved in the process at all.

Now, given that rbx is 2, of course you are not dividing by zero. Still, there's another case when x86 generates a #DE exception: if the result is too big to fit into the target register.

In your code you are using the 64 bit form of the instruction (you wrote rbx - a 64 bit register - as divisor) which means that you are asking to divide rdx:rax (i.e. the 128 bit value obtained by joining rdx and rax) by rbx, and to put the result into rax (quotient) and rdx (remainder).

Since you are not zeroing out rdx, most probably it contains some big garbage value (residual from some previous computation?), and the division by two results in a quotient too big for rax. Hence, the #DE exception.

Long story short: zero out rdx before the div and everything will work out smoothly.

Why IDIV with -1 causes floating point exception?

Note that I know I need to sign extend edx:eax ...

If you don't sign-extend eax, edx:eax is interpreted as a 64-bit signed number:

In your case, this would be 0x00000000fffffffb which is 4294967291 (and not -5).

div and idiv will cause an exception in two cases:

  • You divide by zero
  • The division result is not in the range that can be represented by the eax register

eax can hold signed numbers in the range from -2147483648 to +2147483647, but the quotient 4294967291 / -1 = -4294967291 lies outside that range, so you get an exception.

should not cause an FPE.

Indeed, div and idiv will cause an "integer division exception", not "a floating-point exception".

However, many OSs will show the message "floating point exception"; POSIX defines SIGFPE as covering any arithmetic exception.
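The difference between a zero-filled and a sign-extended edx can be checked with a small C model of a 32-bit idiv. The function name model_idiv32 is hypothetical, and the sketch assumes two's complement when it reinterprets the 64-bit pattern as signed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit `idiv`: signed divide of edx:eax.
   Returns false where the CPU would raise #DE. */
bool model_idiv32(uint32_t edx, uint32_t eax, int32_t divisor,
                  int32_t *quot)
{
    assert(quot != NULL);
    if (divisor == 0)
        return false;                        /* divide by zero */
    /* Reinterpret the 64-bit pattern as signed (two's complement assumed). */
    int64_t dividend = (int64_t)(((uint64_t)edx << 32) | eax);
    if (dividend == INT64_MIN && divisor == -1)
        return false;                        /* would overflow in C as well */
    int64_t q = dividend / divisor;
    if (q < INT32_MIN || q > INT32_MAX)
        return false;                        /* quotient does not fit in eax */
    *quot = (int32_t)q;
    return true;
}
```

Calling it with edx = 0 and eax = 0xfffffffb (no sign extension) and a divisor of -1 reports a fault, while edx = 0xffffffff (what cdq would produce for -5) gives the expected quotient 5.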

Assembly equation, Divide to get float value

Without diverging far from what I believe is the intent of the code, and from my interpretation of what is being asked for, I would suggest something like:

                .MODEL SMALL          ; I Assume we are producing EXE program

c EQU 3 ; 3 is a constant in the equation

Dane SEGMENT 'DATA'
a DW 20 ; a, b, d are variables in the equation
b DW 10 ; so treat them as variables
d DW 5 ; All these variables should be DW
Wynik DW ?

Dane ENDS

Kod SEGMENT 'CODE'
ASSUME CS:Kod, DS:Dane, SS:Stosik

Start:
mov ax, SEG Dane ; For EXE we need to set DS
mov ds, ax ; To Dane segment manually

mov ax, a ; Multiplying a by 3 is the same
; as multiplying a by 2 and adding a
shl ax, 1 ; Multiply a*2
add ax, a ; Add a to previous result in AX
mov cx, ax ; Copy result of a*3 to CX
mov ax, b ; Do div b/a
xor dx, dx ; We need to ensure DX is zero for this div
; as div divides DX:AX by a
div a
sub cx, ax ; Subtract result of b/a from result of a*3
mov ax, d ; ax = d + 3
add ax, c
mul cx ; Multiply d+3 (AX) by a*3-b/a (CX)

mov Wynik, ax ; Save 16-bit result in memory

mov ax, 4C05h ; Exit with value 5
int 21h

Kod ENDS

Stosik SEGMENT STACK

DB 100h DUP (?)

Stosik ENDS

END

The program keeps to the spirit of the original while fixing the syntax and logic errors. b/a still uses integer division (you will have to ask your TA or professor about that), which rounds the result down to the nearest whole number (for 10/20 that is 0). The main problems in the original code were:

  • Some of the code was placed out of order
  • Your div is the division of DX:AX by a 16-bit value so we need to zero DX.
  • In some places the register names were altered.
  • In this code 3*a is being represented as a*2+a=3a. Multiplying by 2 is the same as shifting the value left by 1.

If the professor requires a better approximation to the result while still using integer division, then Jester's suggestion of rearranging the equation to 3*a*(d+3)-(b*(d+3))/a is a good one. This defers the division to a point where the rounding down of integer division has less effect on the result, so the final result should be off by less than 1. Code that uses this revised equation would look like:

            mov     ax, SEG Dane  ; For EXE we need to set DS
mov ds, ax ; To Dane segment manually

mov cx, a
shl cx, 1
add cx, a ; cx = 2*a+a = a*3
mov ax, d
add ax, c ; ax = d+c = d+3
mov bx, ax ; bx = copy of d+3
mul cx
mov si, ax ; si = a*3*(d+3)
mov ax, bx
mul b ; ax = b*(d+3)
xor dx, dx ; Avoid division overflow, set DX=0
div a ; ax = b*(d+3)/a
sub si, ax ; si = a*3*(d+3) - b*(d+3)/a
mov Wynik, si ; Save 16-bit result in memory

A slight improvement can be made with this variation. Integer division rounds its result down to the nearest whole number: if you divide 99 by 100 with div you get 0 and a remainder of 99, although the true answer is much closer to 1 than to 0. Usually you round up when the fractional part is >= .5 and round down when it is < .5. The remainder (DX) from div can be used to adjust the final result up by 1 when needed, or to keep the result as is. The amended code could look like:

                mov     ax, SEG Dane  ; For EXE we need to set DS
mov ds, ax ; To Dane segment manually

mov cx, a
shl cx, 1
add cx, a ; cx = a*3
mov ax, d
add ax, c ; ax = d+c = d+3
mov bx, ax ; bx = copy of d+3
mul cx
mov si, ax ; si = a*3*(d+3)
mov ax, bx
mul b ; ax = b*(d+3)
xor dx, dx ; Avoid division overflow, set DX=0
div a ; ax = b*(d+3)/a

shl dx, 1 ; Remainder(DX) = Remainder(DX) * 2
cmp dx, a ; Adjustment of whole number needed?
jb .noadjust ; No? Then skip adjust
add ax, 1 ; Else we add 1 to quotient
.noadjust:
sub si, ax ; si = a*3*(d+3) - b*(d+3)/a
mov Wynik, si ; Save 16-bit result in memory

mov ax, 4C05h ; Exit with value 5
int 21h

The adjustment is based on the method in Rounding Half Up. Essentially, if the remainder (DX) times 2 is less than the divisor a, then no adjustment is needed; otherwise the quotient (AX) needs to be increased by 1.


The result of the first version would be 480. The result of the second is 476. The second will be closer to the expected value. In this case the result of 476 happens to be exact: (3*20-10/20)*(5+3) = 59.5*8 = 476.
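These figures can be cross-checked with a short C sketch of the three variants. The function names are invented for this illustration, and 32-bit unsigned arithmetic is used instead of the 16-bit registers of the assembly, which makes no difference for values this small.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounded half up, using the remainder as in the
   assembly: bump the quotient when 2*remainder >= divisor.
   Written as r >= d - r so the comparison cannot overflow. */
uint32_t div_round_half_up(uint32_t n, uint32_t d)
{
    assert(d != 0);
    uint32_t q = n / d, r = n % d;
    return (r >= d - r) ? q + 1 : q;
}

/* (3a - b/a) * (d + 3): divides early, as in the first program. */
uint32_t eq_divide_early(uint32_t a, uint32_t b, uint32_t d)
{
    assert(a != 0);
    return (3 * a - b / a) * (d + 3);
}

/* 3a(d+3) - b(d+3)/a: divides late, as in the second program. */
uint32_t eq_divide_late(uint32_t a, uint32_t b, uint32_t d)
{
    assert(a != 0);
    return 3 * a * (d + 3) - b * (d + 3) / a;
}

/* Same as eq_divide_late, but with the quotient rounded half up. */
uint32_t eq_divide_late_rounded(uint32_t a, uint32_t b, uint32_t d)
{
    return 3 * a * (d + 3) - div_round_half_up(b * (d + 3), a);
}
```

For a = 20, b = 10, d = 5 the early-division variant returns 480, and both late-division variants return 476, since 80/20 leaves no remainder to round.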

Getting floating point exception while trying to use div in assembly

The div instruction divides the two-word value dx:ax by the operand. If the quotient is too large to fit into a word, it will throw that exception.

Reference: http://siyobik.info.gf/main/reference/instruction/DIV

What do you have in the dx register? Most likely dx:ax divided by 15 does not fit in a 16-bit word.

Strength reduction on floating point division by hand

If we adopt Wikipedia's definition that

strength reduction is a compiler optimization where expensive operations are replaced with equivalent but less expensive operations

then we can apply strength reduction here by converting the expensive floating-point division into a floating-point multiply plus two floating-point multiply-adds (FMAs). Assuming that double is mapped to IEEE-754 binary64, that the default rounding mode for floating-point computation is round-to-nearest-or-even, and that int is a 32-bit type, we can prove the transformation correct by a simple exhaustive test:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <math.h>

int main (void)
{
const double rcp_5p3 = 1.0 / 5.3; // 0x1.826a439f656f2p-3
int i = INT_MAX;
do {
double ref = i / 5.3;
double res = fma (fma (-5.3, i * rcp_5p3, i), rcp_5p3, i * rcp_5p3);
if (res != ref) {
printf ("error: i=%2d res=%23.13a ref=%23.13a\n", i, res, ref);
return EXIT_FAILURE;
}
i--;
} while (i >= 0);
return EXIT_SUCCESS;
}

Most modern instances of common processor architectures like x86-64 and ARM64 have hardware support for FMA, such that fma() can be mapped directly to the appropriate hardware instruction. This should be confirmed by looking at the disassembly of the generated binary. Where hardware support for FMA is lacking, the transformation obviously should not be applied, as software implementations of fma() are slow and sometimes functionally incorrect.

The basic idea here is that mathematically, division is equivalent to multiplication with the reciprocal. However, that is not necessarily true for finite-precision floating-point arithmetic. The code above tries to improve the likelihood of bit-accurate computation by determining the error in the naive approach with the help of FMA and applying a correction where necessary. For background including literature references see this earlier question.

To the best of my knowledge, there is not yet a general mathematically proven algorithm to determine for which divisors paired with which dividends the above transformation is safe (that is, delivers bit-accurate results), which is why an exhaustive test is strictly necessary to show that the transformation is valid.

In comments, Pascal Cuoq points out that there is an alternative algorithm to potentially strength-reduce floating-point division with a compile-time constant divisor, by precomputing the reciprocal of the divisor to more than native precision, specifically as a double-double. For background see N. Brisebarre and J.-M. Muller, "Correctly rounded multiplication by arbitrary precision constant", IEEE Transactions on Computers, 57(2): 162-174, February 2008, which also provides guidance on how to determine whether that transformation is safe for any particular constant. Since the present case is simple, I again used an exhaustive test to show it is safe. Where applicable, this will reduce the division down to one FMA plus a multiply:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <math.h> // standard header for fma(); <mathimf.h> is specific to the Intel compiler

int main (void)
{
const double rcp_5p3_hi = 1.8867924528301888e-1; // 0x1.826a439f656f2p-3
const double rcp_5p3_lo = -7.2921377017921457e-18;// -0x1.0d084b1883f6e0p-57
int i = INT_MAX;
do {
double ref = i / 5.3;
double res = fma (i, rcp_5p3_hi, i * rcp_5p3_lo);
if (res != ref) {
printf ("i=%2d res=%23.13a ref=%23.13a\n", i, res, ref);
return EXIT_FAILURE;
}
i--;
} while (i >= 0);
return EXIT_SUCCESS;
}

