Fast Way of Counting Non-Zero Bits in Positive Integer

Fast way of counting non-zero bits in positive integer

For arbitrary-length integers, bin(n).count("1") is the fastest I could find in pure Python.

I tried adapting Óscar's and Adam's solutions to process the integer in 64-bit and 32-bit chunks, respectively. Both were at least ten times slower than bin(n).count("1") (the 32-bit version took about half again as much time).

On the other hand, gmpy popcount() took about 1/20th of the time of bin(n).count("1"). So if you can install gmpy, use that.

To answer a question in the comments, for bytes I'd use a lookup table. You can generate it at runtime:

counts = bytes(bin(x).count("1") for x in range(256))  # py2: use bytearray

Or just define it literally:

counts = (b'\x00\x01\x01\x02\x01\x02\x02\x03\x01\x02\x02\x03\x02\x03\x03\x04'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07'
          b'\x04\x05\x05\x06\x05\x06\x06\x07\x05\x06\x06\x07\x06\x07\x07\x08')

Then it's counts[x] to get the number of 1 bits in x where 0 ≤ x ≤ 255.

Counting non-zero bits in a list of 32-bit integers

Your whole approach doesn't make sense in Python.

The number of 1 bits in -17621, aka -100010011010101 binary, is 7. If you're expecting 26, you're not asking for the number of 1 bits in the number, you're asking for the number of 1 bits in the C 32-bit 2's complement signed integer representation of the number interpreted as a C 32-bit unsigned integer. If you want that in Python, you have to ask for it explicitly. For example, you can use num % 0x100000000.

Meanwhile, whatever list of bit-twiddling tricks you got the (1 << 32) + num & -5 from, it's also assuming C int32 math rather than actual integer arithmetic, so it's going to be wrong again. Plus, I'm pretty sure you copied it wrong. Plus, there's no reason for these kinds of tricks in Python—odds are it'll actually be slower, rather than faster, but either way, it's going to be way, way too slow to do a zillion times in a tight loop, and more than fast enough to do a few times, so this kind of squeeze-out-the-last-5% optimization is far more pointless in Python than in C. (Although many of these old tricks actually slow things down in C as well, with modern compilers and CPUs…)

And knocking off the initial 1 bit by doing [3:] again assumes that you've got a 32-bit representation, which is again wrong. (That's why all your positive numbers were off by 1—you're knocking off the first 1 bit.)

So, let's just make it simple:

answer = []
for num in numbers:
    data = int(num) % 0x100000000
    bits = format(data, 'b')
    answer.append(str(bits.count('1')))
print(' '.join(answer))

Notice that I used format(data, 'b'), so I don't have to knock off the 0b at the beginning.

Or, if you want to make it more compact:

print(' '.join(str(format(int(num)%0x100000000, 'b').count('1')) for num in numbers))

Either way, you get:

8 12 10 26 25 27 8 0 8 18 20 28

… which does have two more numbers than your expected output, but then your input also had two more numbers than your expected output, so I think that's to be expected.

numpy vectorized way to count non-zero bits in array of integers

Assuming your array is a, you can simply do:

np.unpackbits(a.view('uint8')).sum()

example:

a = np.array([123, 44], dtype=np.uint8)
#bin(a) is [0b1111011, 0b101100]
np.unpackbits(a.view('uint8')).sum()
#9

Comparison using benchit:

#@Ehsan's solution
def m1(a):
  return np.unpackbits(a.view('uint8')).sum()

#@Valdi_Bo's solution
def m2(a):
  return sum([ bin(n).count('1') for n in a ])

in_ = [np.random.randint(100000,size=(n)) for n in [10,100,1000,10000,100000]]

m1 is significantly faster.

Sample Image

How to count the total number of set bits for the number

This would work in O(log(A)) so you shouldn't have Time Limit Exceeded :

def solve(A):
  count = 0
  n = A
  i = 0
  while n > 0:
    if (n & 1) == 1:
      f = ((1 << i) >> 1) * i
      g = (((1 << i) - 1) & A) + 1
      count += f + g
    n >>= 1
    i += 1
  return count

The total number of set bits between 0 and 2^n excluded is 2^(n-1)*n. Because in this particular case, 50% of bits are set in each column and other 50% are unset, and there is n columns.

For a number A which is not a power of 2, we can break down the calculation into several passes, one for each set bit in the number A in question, and use the expression for exact power of 2 (variable f). We must also take care of an additional column of 1 each time (variable g).

Schema to see why it works

How to skip trailing zero bits in built-it int?

You could use bitwise arithmetic (for positive and negative values, but excluding 0, for which the behavior is not defined as there is no 1 to trail):

(x & -x).bit_length() - 1

or:

(x ^ -x).bit_lenght() - 2

or, possibly with a more interesting 0 behavior (thanks to @TomKarzes comment):

(x ^ (x - 1)).bit_length() - 1

To understand how they work, let us consider this last expression. When we do x - 1, if x is odd, only that bit is flipped and then when we do the xor ^ that is the only bit that survives the operation. Similarly, when x is not odd, every trailing zero gets flipped to 1 and the trailed1 gets flipped to 0, all the other bits are untouched, and then we do the xor all the bits up to the trailed 1 gets to be set. Then, int.bit_length() counts how many significant bits are found, and the -1 is just a constant offset we need to have to get "number of trailing zeros".

Why the first methods also work is due to the two-complement representation of negative numbers. The exact details are left as an exercise to the reader.

To show that they work:

for i in range(-15, 16): 
     print(
         f'{i:5b}',
         f'{i ^ -i:3d}',
         f'{(i & -i).bit_length() - 1:2d}',
         f'{(i ^ -i).bit_length() - 2:2d}',
         f'{(i ^ (i - 1)).bit_length() - 1:2d}')

-1111  -2  0  0  0
-1110  -4  1  1  1
-1101  -2  0  0  0
-1100  -8  2  2  2
-1011  -2  0  0  0
-1010  -4  1  1  1
-1001  -2  0  0  0
-1000 -16  3  3  3
 -111  -2  0  0  0
 -110  -4  1  1  1
 -101  -2  0  0  0
 -100  -8  2  2  2
  -11  -2  0  0  0
  -10  -4  1  1  1
   -1  -2  0  0  0
    0   0 -1 -2  0
    1  -2  0  0  0
   10  -4  1  1  1
   11  -2  0  0  0
  100  -8  2  2  2
  101  -2  0  0  0
  110  -4  1  1  1
  111  -2  0  0  0
 1000 -16  3  3  3
 1001  -2  0  0  0
 1010  -4  1  1  1
 1011  -2  0  0  0
 1100  -8  2  2  2
 1101  -2  0  0  0
 1110  -4  1  1  1
 1111  -2  0  0  0

Note the different behavior of the two expressions for 0 input.
Particularly, (x ^ (x - 1)).bit_length() - 1 may lead to simpler expressions if you want 0 input to produce 0 output.

Speed-wise this should be brutally faster than any loop-based solution, and on-par with look-up based solutions (but without size limitation and without using additional memory):

def trail_zero_loop(x):
    if x != 0:
        k = -1
        r = 0
        while not r:
            # x, r = divmod(x, 2)  # ~15% slower
            r = x % 2
            x = x // 2
            k += 1
        return k
    else:
        return -1

def trail_zero_and(x):
    return (x & -x).bit_length() - 1

def trail_zero_xor(x):
    if x != 0:
        # return (x ^ -x).bit_length() - 2  # ~1% slower
        return (x ^ (x - 1)).bit_length() - 1
    else:
        return -1

_LOOKUP =[ 32, 0, 1, 26, 2, 23, 27, 0, 3, 16, 24, 30, 28, 11, 0, 13, 4, 7, 17, 0, 25, 22, 31, 15, 29, 10, 12, 6, 0, 21, 14, 9, 5, 20, 8, 19, 18 ]

def trail_zero_lookup(x):
    if x != 0:
        return _LOOKUP[(x & -x) % 37]
    else:
        return -1

n = 2 ** 30
%timeit trail_zero_loop(n)
# 3.15 µs ± 25.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 
%timeit trail_zero_and(n)
# 228 ns ± 9.87 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit trail_zero_xor(n)
# 253 ns ± 7.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit trail_zero_lookup(n)
# 294 ns ± 1.81 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

The eventual additional bit-shift operation takes only some nanoseconds so it should be irrelevant here.

Fastest way to sort bit sequence by toggling bits

Let Z(n) be the cost of setting the first n bits all 0.

Let X(n) be the cost of minimum cost of "sorting" the first n bits.

We have:

Z(0) = 0, X(0) = 0

if the ith bit is 0: Z(i) = Z(i-1), X(i) = min( Z(i-1), X(i-1)+1 )

if the ith bit is 1: Z(i) = Z(i-1)+1, X(i) = X(i-1)

The answer is X(n).

It's even easier in code:

z=0
x=0
for bit in sequence:
    if bit == 0:
       x = min(z,x+1)
    else:
       z = z+1
return x

How do I count the number of zero bits in an integer?

The easiest most naive way is to just iterate over the bits and count:

size_t num_zeroes = 0;

for(size_t i = 0; i < CHAR_BIT * sizeof value; ++i)
{
  if ((value & (1 << i)) == 0)
    ++num_zeroes;
}

There are all number of better (for different values of "better") ways, but this is quite clear, very terse (code-wise), and doesn't require a bunch of setup.

One micro-optimization that might be considered an improvement is to not compute the mask to test each bit, instead shift the value and always test the rightmost bit:

for(size_t i = 0; i < CHAR_BIT * sizeof value; ++i, value >>= 1)
{
  if ((value & 1) == 0)
    ++num_zeroes;
}

Fast Way of Counting Non-Zero Bits in Positive Integer