NASM idiv can't divide correctly
How can I get the right answer with idiv?
You can't. The idiv instruction performs signed integer division, so it never produces a result with a fractional part such as 0.27.
219 doesn't go into -59 even once, so the quotient is 0, and the remainder is everything that is left, which is -59.
You can use FPU or SSE instructions to calculate non-integer values.
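As a rough illustration in C, whose / and % operators truncate toward zero the same way idiv does, the division from the question can be sketched as below. The function names are made up for this example and are not from the original answer.

```c
/* Integer division as idiv performs it: the quotient is truncated toward zero. */
int int_quotient(int dividend, int divisor)  { return dividend / divisor; }

/* The remainder is whatever is left over; it takes the sign of the dividend. */
int int_remainder(int dividend, int divisor) { return dividend % divisor; }

/* A fractional result needs floating point; on x86 a compiler typically
 * emits SSE2 or x87 instructions for this. */
double fp_quotient(int dividend, int divisor)
{
    return (double)dividend / (double)divisor;
}
```

With the values from the question, int_quotient(-59, 219) is 0 and int_remainder(-59, 219) is -59, while fp_quotient(-59, 219) is roughly -0.27.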
Assembly Language - How to do Modulo?
If your modulus / divisor is a known constant, and you care about performance, see this and this. A multiplicative inverse is even possible for loop-invariant values that aren't known until runtime, e.g. see https://libdivide.com/ (But without JIT code-gen, that's less efficient than hard-coding just the steps necessary for one constant.)
Never use div for known powers of 2: it's much slower than an and instruction for the remainder, or a right shift for the divide. Look at C compiler output for examples of unsigned or signed division by powers of 2, e.g. on the Godbolt compiler explorer. If you know a runtime input is a power of 2, use lea eax, [esi-1] ; and eax, edi or something like that to do x & (y-1). Modulo 256 is even more efficient: movzx eax, cl has zero latency on recent Intel CPUs (mov-elimination), as long as the two registers are separate.
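The power-of-2 shortcuts can be sketched in C roughly as follows. This covers only the unsigned case (signed division by a power of 2 needs extra fix-up steps, as compiler output shows), and the helper names are hypothetical.

```c
#include <stdint.h>

/* x % y when y is known to be a power of 2: the x & (y-1) trick,
 * i.e. what lea eax,[esi-1] followed by and eax,edi computes. */
uint32_t rem_pow2(uint32_t x, uint32_t y) { return x & (y - 1); }

/* x / 2^k for unsigned x is a logical right shift (assumes k < 32). */
uint32_t div_pow2(uint32_t x, unsigned k) { return x >> k; }

/* x % 256 keeps only the low byte, the same value that
 * movzx eax, cl leaves in EAX. */
uint32_t rem_256(uint32_t x) { return (uint8_t)x; }
```

For example, rem_pow2(1234, 16) gives 2 and div_pow2(1234, 4) gives 77, matching 1234 % 16 and 1234 / 16.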
In the simple/general case: unknown value at runtime
The DIV instruction (and its counterpart IDIV for signed numbers) gives both the quotient and remainder. For unsigned, remainder and modulus are the same thing. For signed idiv, it gives you the remainder (not modulus), which can be negative: e.g. -5 / 2 = -2 rem -1. x86 division semantics exactly match C99's % operator.
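To make the remainder-versus-modulus distinction concrete, here is a small C sketch. rem_trunc is simply C99's %, and mod_floor is a hypothetical helper that converts the remainder into a floored modulus; neither name comes from the original answer.

```c
/* Remainder as idiv produces it: same sign as the dividend (C99 %). */
int rem_trunc(int a, int b) { return a % b; }

/* Floored modulus: the result takes the sign of the divisor instead.
 * Assumes b != 0 and that a / b does not overflow. */
int mod_floor(int a, int b)
{
    int r = a % b;
    /* If the remainder is non-zero and its sign differs from the
     * divisor's, shift it by one divisor to land in the right range. */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

So rem_trunc(-5, 2) is -1, as in the example above, whereas mod_floor(-5, 2) is 1.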
DIV r32 divides a 64-bit number in EDX:EAX by a 32-bit operand (in any register or memory) and stores the quotient in EAX and the remainder in EDX. It faults on overflow of the quotient.
Unsigned 32-bit example (works in any mode)
mov eax, 1234 ; dividend low half
mov edx, 0 ; dividend high half = 0. prefer xor edx,edx
mov ebx, 10 ; divisor can be any register or memory
div ebx ; Divides 1234 by 10.
; EDX = 4 = 1234 % 10 remainder
; EAX = 123 = 1234 / 10 quotient
In 16-bit assembly you can do div bx to divide a 32-bit operand in DX:AX by BX. See Intel's Architectures Software Developer’s Manuals for more information.
Normally, always use xor edx,edx before unsigned div to zero-extend EAX into EDX:EAX. This is how you do "normal" 32-bit / 32-bit => 32-bit division.
For signed division, use cdq before idiv to sign-extend EAX into EDX:EAX. See also Why should EDX be 0 before using the DIV instruction?. For other operand-sizes, use cbw (AL->AX), cwd (AX->DX:AX), cdq (EAX->EDX:EAX), or cqo (RAX->RDX:RAX) to set the top half to 0 or -1 according to the sign bit of the low half.
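What sign-extension does to the high half can be modelled in C along these lines. This is only a sketch: the function names are invented, and the conversion from uint64_t to int64_t relies on the two's complement wrap-around that mainstream compilers implement.

```c
#include <stdint.h>

/* Model of cdq: the high half (EDX) becomes 0 or all-ones,
 * copying the sign bit of the low half (EAX). */
uint32_t cdq_high_half(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Reassemble EDX:EAX into a single signed 64-bit value, which is how
 * idiv interprets its dividend. Assumes two's complement conversion. */
int64_t edx_eax_value(uint32_t edx, uint32_t eax)
{
    return (int64_t)(((uint64_t)edx << 32) | eax);
}
```

With a correctly extended high half, edx_eax_value(cdq_high_half(-5), (uint32_t)-5) is still -5. Zeroing the high half instead turns the same low half into 4294967291, which is why xor edx,edx is only right for unsigned division.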
div / idiv are available in operand-sizes of 8, 16, 32, and (in 64-bit mode) 64-bit. 64-bit operand-size is much slower than 32-bit or smaller on current Intel CPUs, but AMD CPUs only care about the actual magnitude of the numbers, regardless of operand-size.
Note that 8-bit operand-size is special: the implicit inputs/outputs are in AH:AL (aka AX), not DL:AL. See 8086 assembly on DOSBox: Bug with idiv instruction? for an example.
Signed 64-bit division example (requires 64-bit mode)
mov rax, 0x8000000000000000 ; INT64_MIN = -9223372036854775808
mov ecx, 10 ; implicit zero-extension is fine for positive numbers
cqo ; sign-extend into RDX, in this case = -1 = 0xFF...FF
idiv rcx
; quotient = RAX = -922337203685477580 = 0xf333333333333334
; remainder = RDX = -8 = 0xfffffffffffffff8
Limitations / common mistakes
div dword 10 is not encodable into machine code (so your assembler will report an error about invalid operands).
Unlike with mul/imul (where you should normally use the faster 2-operand imul r32, r/m32 or 3-operand imul r32, r/m32, imm8/32 instead, which don't waste time writing a high-half result), there is no newer opcode for division by an immediate, or for 32-bit/32-bit => 32-bit division or remainder without the high-half dividend input.
Division is so slow and (hopefully) rare that they didn't bother to add a way to let you avoid EAX and EDX, or to use an immediate directly.
div and idiv will fault if the quotient doesn't fit into one register (AL / AX / EAX / RAX, the same width as the divisor, i.e. half the width of the dividend). This includes division by zero, but it will also happen with a non-zero EDX and a smaller divisor. This is why C compilers just zero-extend or sign-extend instead of splitting up a 32-bit value into DX:AX.
And also why INT_MIN / -1 is C undefined behaviour: it overflows the signed quotient on 2's complement systems like x86. See Why does integer division by -1 (negative one) result in FPE? for an example of x86 vs. ARM. x86 idiv does indeed fault in this case.
The x86 exception is #DE - divide exception. On Unix/Linux systems, the kernel delivers a SIGFPE arithmetic exception signal to processes that cause a #DE exception. (On which platforms does integer divide by zero trigger a floating point exception?)
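One way to avoid the fault from C is to test for the two problematic signed cases before dividing. The wrapper below is a minimal sketch; checked_div is a hypothetical name, not a standard library function.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns false, without dividing, in the cases where idiv would
 * raise #DE; otherwise stores the quotient and remainder. */
bool checked_div(int dividend, int divisor, int *quotient, int *remainder)
{
    if (divisor == 0)
        return false;   /* division by zero */
    if (dividend == INT_MIN && divisor == -1)
        return false;   /* quotient would be INT_MAX + 1: does not fit */
    *quotient  = dividend / divisor;
    *remainder = dividend % divisor;
    return true;
}
```

A caller checks the return value before using the outputs, e.g. checked_div(-7, 2, &q, &r) succeeds with q = -3 and r = -1.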
For div, using a dividend with high_half < divisor is safe. e.g. 0x11:23 / 0x12 is less than 0xff so it fits in an 8-bit quotient.
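The high_half < divisor rule can be checked with a small software model of 8-bit div, where the dividend is AH:AL. This is a sketch with an invented function name, not how any particular emulator implements it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of 8-bit div: dividend is ah:al, quotient and remainder are
 * 8 bits each. Returns false where the hardware would raise #DE. */
bool div8_model(uint8_t ah, uint8_t al, uint8_t divisor,
                uint8_t *quot, uint8_t *rem)
{
    /* The quotient fits in 8 bits exactly when ah < divisor.
     * This test also rejects divisor == 0, since ah >= 0 always. */
    if (ah >= divisor)
        return false;
    unsigned dividend = ((unsigned)ah << 8) | al;
    *quot = (uint8_t)(dividend / divisor);
    *rem  = (uint8_t)(dividend % divisor);
    return true;
}
```

For the example above, div8_model(0x11, 0x23, 0x12, &q, &r) succeeds with q = 0xF3 and r = 13, while a high half of 0x12 with the same divisor is rejected.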
Extended-precision division of a huge number by a small number can be implemented by using the remainder from one chunk as the high-half dividend (EDX) for the next chunk. This is probably why they chose remainder=EDX quotient=EAX instead of the other way around.
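The chunk-by-chunk scheme can be sketched in C as follows, with a 64-bit intermediate standing in for EDX:EAX. The limb layout (most-significant limb first) and the function name are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Divide a big number, stored as 32-bit limbs with the most-significant
 * limb first, by a 32-bit divisor in place. Returns the final remainder.
 * Assumes divisor != 0. */
uint32_t bigdiv_small(uint32_t *limbs, size_t n, uint32_t divisor)
{
    uint32_t rem = 0;   /* plays the role of EDX between steps */
    for (size_t i = 0; i < n; i++) {
        /* Previous remainder becomes the high half of this chunk. */
        uint64_t chunk = ((uint64_t)rem << 32) | limbs[i];
        /* rem < divisor, so this quotient always fits in 32 bits. */
        limbs[i] = (uint32_t)(chunk / divisor);
        rem      = (uint32_t)(chunk % divisor);
    }
    return rem;
}
```

Dividing the two-limb number {1, 0} (that is, 2^32) by 10 leaves {0, 429496729} in the array and returns 6. Because each step starts with a high half smaller than the divisor, the hardware equivalent never hits the quotient-overflow fault.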
Nasm - weird results when multiplying 2 values
Your results are normal, your expectations are wrong. IDK where your wrong expectations came from, but the 64-bit result of 32-bit mul goes in EDX:EAX.
Turns out it was from an online calculator that presumably used JavaScript numbers, i.e. double-precision floating point with a 53-bit mantissa, which will round 0xfffffffe00000001 to the nearest representable double, i.e. 0xfffffffe00000000.
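The rounding is easy to reproduce in C by pushing the exact product through a double. This sketch assumes IEEE 754 binary64 doubles, which is what essentially every current platform provides; through_double is a made-up helper name.

```c
#include <stdint.h>

/* Round-trip a 64-bit integer through a double, which has only a
 * 53-bit significand, the way a JavaScript number would store it.
 * Assumes the rounded value is still below 2^64, so the conversion
 * back to uint64_t is well defined. */
uint64_t through_double(uint64_t x)
{
    /* volatile forces a real store, so the value is rounded to double
     * precision even on targets that compute in wider registers. */
    volatile double d = (double)x;
    return (uint64_t)d;
}
```

Here through_double(0xfffffffe00000001) comes back as 0xfffffffe00000000: the low bit is lost because values of that magnitude are spaced 2048 apart in a double.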
And for the signed case, you were just using the calculator totally wrong, expecting it to interpret your inputs as 32-bit 2's complement.
The low half of a multiply result doesn't depend on whether you interpret the inputs as signed or not, so yes as a shortcut we can use -1 * -1 = 1 to get the low half here. (That's why Intel only bothered to add efficient non-widening forms for imul, e.g. imul eax, edx, -1).
If we simply try it in an extended-precision calculator like calc (packaged in Ubuntu as apcalc, and in Arch as calc):
; base2(16) // ask for hex output as well as decimal
; 0x0ffffffff ^ 2
18446744065119617025 /* 0xfffffffe00000001 */
; 0x0ffffff00 * 0x0ffffffff
18446742969902956800 /* 0xfffffeff00000100 */
So this confirms the CPU's results. Remember that an odd number times an odd number is an odd number, so your EAX=0 guess for 0xffffffff * 0xffffffff can also be ruled out that way.
For signed it's a little trickier to do with an arbitrary-precision calculator:
; (0xfffffff0 - 2^32) * 5
-80 /* -0x50 */
; . + 2^64 // get the 64-bit 2's complement bit-pattern for that negative number
18446744073709551536 /* 0xffffffffffffffb0 */
That small negative number's 64-bit bit-pattern of course splits up into 0xffffffff and 0xffffffb0, with EDX just being all-ones, same as the upper bits of EAX, because that's how sign-extension works.
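The same checks can be done without a calculator by modelling the widening multiplies in C. This is a sketch: the function names are invented, and reinterpreting a uint32_t as int32_t relies on the two's complement conversion that mainstream compilers implement.

```c
#include <stdint.h>

/* Model of one-operand mul r32: unsigned 32x32 => 64-bit product,
 * split into high half (EDX) and low half (EAX). */
void mul32_model(uint32_t a, uint32_t b, uint32_t *edx, uint32_t *eax)
{
    uint64_t product = (uint64_t)a * b;
    *edx = (uint32_t)(product >> 32);
    *eax = (uint32_t)product;
}

/* Model of one-operand imul r32: the same input bits are interpreted
 * as signed 32-bit values before the widening multiply. */
void imul32_model(uint32_t a, uint32_t b, uint32_t *edx, uint32_t *eax)
{
    int64_t product = (int64_t)(int32_t)a * (int32_t)b;
    *edx = (uint32_t)((uint64_t)product >> 32);
    *eax = (uint32_t)product;
}
```

mul32_model(0xffffffff, 0xffffffff, ...) gives EDX = 0xfffffffe and EAX = 1, while imul32_model(0xfffffff0, 5, ...) gives EDX = 0xffffffff and EAX = 0xffffffb0. For any pair of inputs the two models agree on EAX and differ only in EDX.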