Fastest Way to Produce a Mask with N Ones Starting at Position I

Fastest way to produce a mask with n ones starting at position i

If by "starting at pos", you mean that the lowest-order bit of the mask is at the position corresponding with 2^pos (as in your example):

((UIntType(1) << len) - UIntType(1)) << pos

If it is possible that len is ≥ the number of bits in UIntType, avoid Undefined Behaviour with a test:

(((len < std::numeric_limits<UIntType>::digits)
     ? UIntType(1)<<len
     : 0) - UIntType(1)) << pos

(If it is also possible that pos is ≥ std::numeric_limits<UIntType>::digits, you'll need another ternary op test.)

You could also use:

(UIntType(1)<<(len>>1)<<((len+1)>>1) - UIntType(1)) << pos

which avoids the ternary op at the cost of three extra shift operators; I doubt whether it would be faster but careful benchmarking would be necessary to know for sure.

Bit manipulation -- generate mask for first N set bits

Some recent CPUs have the pdep instruction with which this is easy:

m = bitmask of n ones
return pdep(m, x)

Otherwise a stepwise approach as in @olegarch's solution is probably unavoidable. One with slightly fewer instructions is as follows:

unsigned getmask(unsigned x, int n) {
    unsigned x1 = -x;
    for (int i = 0; i < n - 1; ++i)
        x1 = x1 - (x ^ x1);
    return x & x1;
}

What is the best practice way to create a bitmask for a range of bits?

You could try a lookup table approach:

static const char LUT[][] = { // index like this LUT[bot][top]
//top:    0     1     2     3     4     5     6     7     8
       0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F, 0xFF, // bot: 0
       0x00, 0x00, 0x02, 0x06, 0x0E, 0x1E, 0x3E, 0x7E, 0xFE, // bot: 1
       0x00, 0x00, 0x00, 0x04, 0x0C, 0x1C, 0x3C, 0x7C, 0xFC, // bot: 2
       0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x38, 0x78, 0xF8, // bot: 3
       0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x30, 0x70, 0xF0, // bot: 4
       0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x60, 0xE0, // bot: 5
       0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0xC0, // bot: 6
       0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80, // bot: 7
       0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  // bot: 8
};

char mask = LUT[bot][top];

Also: If for whatever reason you go with bit manipulation this solution requires less ops. In addition a superscalar processor should evaluate the left and right side of the xor in parallel.

char mask = (0xFF << top) ^ (0xFF << bot);

Algorithm to generate bit mask

One thing to notice about bitmasks like that is that they are always one less than a power of two.

The expression 1 << n is the easiest way to get the n-th power of two.

You don't want Zero to provide a bitmask of 00000001, you want it to provide zero. So you need to subtract one.

mask = (1 << param) - 1;

Edit:

If you want a special case for param > 32:

int sizeInBits = sizeof(mask) * BITS_PER_BYTE; // BITS_PER_BYTE = 8;
mask = (param >= sizeInBits ? -1 : (1 <<  param) - 1);

This method should work for 16, 32, or 64 bit integers, but you may have to explicitly type the '1'.

Create mask by first positions only

You can use np.cumsum.

Assumption: you are looking for zeros only at the start of each row.

a = np.array([[ 0,  1,  2,  0,  0,  0],
              [ 0,  4,  1, 35,  0, 10],
              [ 0,  0,  5,  4,  0,  4]])

a.cumsum(axis=1) == 0
array([[ True, False, False, False, False, False],
       [ True, False, False, False, False, False],
       [ True,  True, False, False, False, False]], dtype=bool)

Basis: holds True for as long as the cumulative sum is 0 along each row.

Error-prone: an array with negative ints would cause this to fail. I.e. for [-1, 1], this would evaluate to True at position 1.

N permutations of binary mask with order

You can take advantage of the fact that itertools.combinations returns the combinations in the definite order that you're looking for.

We can generate combinations of 0, 1, 2 ... 4 positions where the '1' bits have to be, and create your string accordingly.

A generator yielding your masks could be:

def masks(ideal):
    size = len(ideal)
    positions = list(range(size))  # will be for example [0, 1, 2, 3]

    # we start with 0 flipped bit, then one, up to all
    for nb_flipped in range(0, size+1):
        # we get all combinations of positions where to flip that many bits
        for flipped_positions in combinations(positions, r=nb_flipped):
            out = list(ideal)
            # and we flip the bits at these positions
            for pos in flipped_positions:
                out[pos] =  '1' if out[pos] == '0' else '0'
            yield ''.join(out)

And you can use it like this:

for m in masks('0000'):
    print(m)
    
# 0000
# 1000
# 0100
# 0010
# 0001
# 1100
# 1010
# 1001
# 0110
# 0101
# 0011
# 1110
# 1101
# 1011
# 0111 
# 1111

If you want a list of the n first ones, you could use a function like:

def list_of_masks(ideal, n):
    return list(islice(masks(ideal), n))

On your sample ideal mask, this gives:

print(list_of_masks('0101', 6))

# ['0101', '1101', '0001', '0111', '0100', '1001']

Numpy fastest way to create mask array around indexes

Find the elements larger than 2 then set the elements around them to True:

a = np.array([0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 2, 0])

N = 3
mask = a > 2
center = np.where(mask)[0]
mask[np.maximum(np.ravel(center - np.arange(1, 1 + N).reshape(-1, 1)), 0)] = True
mask[np.minimum(np.ravel(center + np.arange(1, 1 + N).reshape(-1, 1)), len(a)-1)] = True

Thanks @Michael Szczesny for pointing out the edge case. The maximum and minimum ensures that the index would not (unintentionally) go out of bounds.

Fastest Way to Produce a Mask with N Ones Starting at Position I