Fast Divisibility Tests (By 2,3,4,5,.., 16)

Fast divisibility tests (by 2,3,4,5,.., 16)?

It is not a bad idea AT ALL to figure out alternatives to division instructions (which includes modulo on x86/x64) because they are very slow. Slower (or even much slower) than most people realize. Those suggesting "% n" where n is a variable are giving foolish advice because it will invariably lead to the use of the division instruction. On the other hand "% c" (where c is a constant) will allow the compiler to determine the best algorithm available in its repertoire. Sometimes it will be the division instruction but a lot of the time it won't.

In this document Torbjörn Granlund shows that the ratio of clock cycles required for unsigned 32-bit mults:divs is 4:26 (6.5x) on Sandybridge and 3:45 (15x) on K10. for 64-bit the respective ratios are 4:92 (23x) and 5:77 (14.4x).

The "L" columns denote latency. "T" columns denote throughput. This has to do with the processor's ability to handle multiple instructions in parallell. Sandybridge can issue one 32-bit multiplication every other cycle or one 64-bit every cycle. For K10 the corresponding throughput is reversed. For divisions the K10 needs to complete the entire sequence before it may begin another. I suspect it is the same for Sandybridge.

Using the K10 as an example it means that during the cycles required for a 32-bit division (45) the same number (45) of multiplications can be issued and the next-to-last and last one of these will complete one and two clock cycles after the division has completed. A LOT of work can be performed in 45 multiplications.

It is also interesting to note that divs have become less efficient with the evolution from K8-K9 to K10: from 39 to 45 and 71 to 77 clock cycles for 32- and 64-bit.

Granlund's page at gmplib.org and at the Royal Institute of Technology in Stockholm contain more goodies, some of which have been incorporated into the gcc compiler.

How to check if given number is divisible of 15 in fastest way?

Multiplication takes less time then division, so you can try this:

inline bool divisible15(unsigned int x)
{
//286331153 = (2^32 - 1) / 15
//4008636143 = (2^32) - 286331153
return x * 4008636143u <= 286331153u;
}

This way works because 2^32-1 (max 32-bit value) is divisible of 15, however if you take, for example 7, it would look like working, but wouldn't work in all cases.

EDIT: See this, it proves that this solution (on some compilers) is faster then module.

EDIT: Here is the explanation and the generalization.

Is there a bit-wise trick for checking the divisibility of a number by 2 or 3?

Yes, though it's not very pretty, you can do something analogous to the old "sum all the decimal digits until you have only one left" trick to test if a number is divisible by 9, except in binary and with divisibility by 3. You can use the same principle for other numbers as well, but many combinations of base/divisor introduce annoying scaling factors so you're not just summing digits anymore.

Anyway, 16n-1 is divisible by 3, so you can use radix 16, that is, sum the nibbles. Then you're left with one nibble (well, 5 bits really), and you can just look that up. So for example in C# (slightly tested) edit: brute-force tested, definitely works

static bool IsMultipleOf3(uint x)
{
const uint lookuptable = 0x49249249;
uint t = (x & 0x0F0F0F0F) + ((x & 0xF0F0F0F0) >> 4);
t = (t & 0x00FF00FF) + ((t & 0xFF00FF00) >> 8);
t = (t & 0x000000FF) + ((t & 0x00FF0000) >> 16);
t = (t & 0xF) + ((t & 0xF0) >> 4);
return ((lookuptable >> (int)t) & 1) != 0;
}

The trick from my comment, x * 0xaaaaaaab <= 0x55555555, works through a modular multiplicative inverse trick. 0xaaaaaaab * 3 = 1 mod 232, which means that 0xaaaaaaab * x = x / 3 if and only if

x % 3 = 0. "if" because 0xaaaaaaab * 3 * y = y (because 1 * y = y), so if x is of the form

3 * y then it will map back to y. "only if" because no two inputs are mapped to the same output, so everything not divisible by 3 will map to something higher than the highest thing you can get by dividing anything by 3 (which is 0xFFFFFFFF / 3 = 0x55555555).

You can read more about this (including the more general form, which includes a rotation) in Division by Invariant Integers using Multiplication (T. Granlund and P. L. Montgomery).

You compiler may not know this trick. For example this:

uint32_t foo(uint32_t x)
{
return x % 3 == 0;
}

Becomes, on Clang 3.4.1 for x64,

movl    %edi, %eax
movl $2863311531, %ecx # imm = 0xAAAAAAAB
imulq %rax, %rcx
shrq $33, %rcx
leal (%rcx,%rcx,2), %eax
cmpl %eax, %edi
sete %al
movzbl %al, %eax
ret

G++ 4.8:

mov eax, edi
mov edx, -1431655765
mul edx
shr edx
lea eax, [rdx+rdx*2]
cmp edi, eax
sete al
movzx eax, al
ret

What it should be:

imul eax, edi, 0xaaaaaaab
cmp eax, 0x55555555
setbe al
movzx eax, al
ret

How to approach and understand a math related DSA question

This is a very cool problem! It uses modular arithmetic and some basic number theory to devise the solution.

Let's say we have p = 11. What divisibility rule applies here? How many digits at once do we need to take, to have a divisibility rule?

Well, let's try a single digit at a time. That would mean, that if we have 121 and we sum its digits 1 + 2 + 1, then we get 4. However we see, that although 121 is divisible by 11, 4 isn't and so the rule doesn't work.

What if we take two digits at a time? With 121 we get 1 + 21 = 22. We see that 22 IS divisible by 11, so the rule might work here. And in fact, it does. For p = 11, we have r = 2.

This requires a bit of intuition which I am unable to convey in text (I really have tried) but it can be proven that for a given prime p other than 2 and 5, the divisibility rule works for tuples of digits of length r if and only if the number 99...9 (with r nines) is divisible by p. And indeed, for p = 3 we have 9 % 3 = 0, while for p = 11 we have 9 % 11 = 9 (this is bad) and 99 % 11 = 0 (this is what we want).

If we want to find such an r, we start with r = 1. We check if 9 is divisible by p. If it is, then we found the r. Otherwise, we go further and we check if 99 is divisible by p. If it is, then we return r = 2. Then, we check if 999 is divisible by p and if so, return r = 3 and so on. However, the 99...9 numbers can get very large. Thankfully, to check divisibility by p we only need to store the remainder modulo p, which we know is small (at least smaller than 999983). So the code in C++ would look something like this:

int r(int p) {
int result = 1;
int remainder = 9 % p;
while (remainder != 0) {
remainder = (remainder * 10 + 9) % p;
result++;
}
return result;
}

How to know if a binary number divides by 3?

Refer to this website: How to Tell if a Binary Number is Divisible by Three

Basically count the number of non-zero odd positions bits and non-zero even position bits from the right. If their difference is divisible by 3, then the number is divisible by 3.

For example:

15 = 1111 which has 2 odd and 2 even non-zero bits. The difference is 0. Thus 15 is divisible by 3.

185 = 10111001 which has 2 odd non-zero bits and 3 even non-zero bits. The difference is 1. Thus 185 is not divisible by 3.

Explanation

Consider the 2^n values. We know that 2^0 = 1 is congruent 1 mod 3. Thus 2^1 = 2 is congurent 2*1 = 2 mod 3. Continuing the pattern, we notice that for 2^n where n is odd, 2^n is congruent 1 mod 3 and for even it is congruent 2 mod 3 which is -1 mod 3. Thus 10111001 is congruent 1*1 + 0*-1 + 1*1 + 1*-1 + 1*1 + 0*-1 + 0*1 + 1*-1 mod 3 which is congruent 1 mod 3. Thus 185 is not divisible by 3.

Check if a number is divisible by 3

Subtract 3 until you either

a) hit 0 - number was divisible by 3

b) get a number less than 0 - number wasn't divisible

-- edited version to fix noted problems

while n > 0:
n -= 3
while n < 0:
n += 3
return n == 0


Related Topics



Leave a reply



Submit