Why Doesn't Left Bit-Shift, "≪≪", For 32-Bit Integers Work as Expected When Used More Than 32 Times

Why doesn't left bit-shift, , for 32-bit integers work as expected when used more than 32 times?

This is caused due to a combination of an undefined behaviour in C and the fact that code generated for IA-32 processors has a 5 bit mask applied on the shift count. This means that on IA-32 processors, the range of a shift count is 0-31 only. ¹

From The C programming language ²

The result is undefined if the right operand is negative, or greater than or equal to the number of bits in the left expression’s type.

From IA-32 Intel Architecture Software Developer’s Manual ³

The 8086 does not mask the shift count. However, all other IA-32 processors (starting with the Intel 286 processor) do mask the shift count to 5 bits, resulting in a maximum count of 31. This masking is done in all operating modes (including the virtual-8086 mode) to reduce the maximum execution time of the instructions.

¹ http://codeyarns.com/2004/12/20/c-shift-operator-mayhem/

² A7.8 Shift Operators, Appendix A. Reference Manual, The C Programming Language

³ SAL/SAR/SHL/SHR – Shift, Chapter 4. Instruction Set Reference, IA-32 Intel Architecture Software Developer’s Manual

Left shift an integer by 32 bits

§5.8 Shift operators

The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1×2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

How to shift 32 bit int by 32 (yet again)

In Java it's guaranteed to work if those variables are of type int, since >> in Java does an arithmetic right shift and shifting more than 31 also has defined behavior. But beware of operator precedence

int lshift(int x, int n)
{
    return (x << n) & ((n-32) >> 5);
}

This will work for shift count up to 32. But it can be modified to include any int values with shift counts larger than 31 return 0

return (x << n) & ((n-32) >> 31);

However in C and C++ the size of int type and the behavior of >> operator is implementation defined. Most (if not all) modern implementations implement it as arithmetic shift for signed types though. Besides, the behavior of shifting more than the variable width is undefined. Worse yet, signed overflow invokes UB so even a left shift by 31 is also UB (until C++14). Therefore to get a well defined output you need to

Use a unsigned fixed-width type like uint32_t (so x << 31 isn't UB)
Use a compiler that emits arithmetic right shift instruction for >> and use a signed type for n, or implement the arithmetic shift yourself
Mask the shift amount to limit it to 5 bits for int32_t

The result would be

uint32_t lshift(uint32_t x, int32_t n)
{
    return (x << (n & 0x1F)) & ((n-32) >> 31);
}

If the architecture supports conditional instructions like x86 or ARM then the following way may be faster

return n < 32 ? x << n : 0;

On a 64-bit platform you can made this even simpler by shifting in a 64-bit type and then mask. Some 32-bit platforms like ARM does support shifting by 32 so this method is also efficient

return ((uint64_t)x << (n & 0x3F)) & 0xFFFFFFFFU;

You can see the output assembly here. I don't see how it can be improved further

Why doesn't left bit shift shift beyond 31 for long int datatype?

Although your x is of type long int, the 1 is not. 1 is an int, so 1<<63 is indeed undefined.

Try (static_cast<long int>(1) << 63), or 1L << 63 as suggested by Wojtek.

Bit shifting an int 32 times in C

It's undefined behavior

Why doesn't left bit-shift, "<<", for 32-bit integers work as expected when used more than 32 times?

which means you should avoid it like the plague. Here's some links including "What every C programmer should know about undefined behavior": http://lwn.net/Articles/511767/

How do I bit shift a long by more than 32 bits?

Re-try this using a variable of type uint64_t (from stdint.h) instead of long. uint64_t is guaranteed to be 64 bits long and should behave as you expect.

C strange behaviour when bit-shifting

Assuming int and unsigned int are 32 bit wide, the integer constant 0xffffffff is of type unsigned int.

Right shifting an integer with a value of greater or equal the width of the integer will result in undefined behavior¹.

This happens in both cases in your example.

Update:

Undefined behavior means that anything can happen. Getting the value 0xffffffff instead of 0 fits this behavior.

You cannot shift by the width of the integer type, because it says so in the standard. You must make a check if the value is greater or equal the width of the type your working with.

¹ (Quoted from: ISO/IEC 9899:201x 6.5.7 Bitwise shift operators 3)

If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

Why use only the lower five bits of the shift operand when shifting a 32-bit value? (e.g. (UInt32)1 33 == 2)

It basically boils down to the way the x86 handles the arithmetic shift opcodes: it only uses the bottom 5 bits of the shift count. See the 80386 programming guide, for example. In C/C++, it's technically undefined behavior to do a bit shift by more than 31 bits (for a 32-bit integer), going with the C philosophy of "you don't pay for what you don't need". From section 6.5.7, paragraph 3 of the C99 standard:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.

This allows compilers to omit a single shift instruction on x86 for shifts. 64-bit shifts cannot be done in one instruction on x86. They use the SHLD/SHRD instructions plus some additional logic. On x86_64, 64-bit shifts can be done in one instruction.

For example, gcc 3.4.4 emits the following assembly for a 64-bit left-shift by an arbitrary amount (compiled with -O3 -fomit-frame-pointer):

uint64_t lshift(uint64_t x, int r)
{
  return x << r;
}

_lshift:
    movl    12(%esp), %ecx
    movl    4(%esp), %eax
    movl    8(%esp), %edx
    shldl   %cl,%eax, %edx
    sall    %cl, %eax
    testb   $32, %cl
    je      L5
    movl    %eax, %edx
    xorl    %eax, %eax
L5:
    ret

Now, I'm not very familiar with C#, but I'm guessing it has a similar philosophy -- design the language to allow it to be implemented as efficiently as possible. By specifying that shift operations only use the bottom 5/6 bits of the shift count, it permits the JIT compiler to compile the shifts as optimally as possible. 32-bit shifts, as well as 64-bit shifts on 64-bit systems, can get JIT compiled into a single opcode.

If C# were ported to a platform that had different behavior for its native shift opcodes, then this would actually incur an extra performance hit -- the JIT compiler would have to ensure that the standard is respected, so it would have to add extra logic to make sure only the bottom 5/6 bits of the shift count were used.

Why Doesn't Left Bit-Shift, "≪≪", For 32-Bit Integers Work as Expected When Used More Than 32 Times