Is Right Shift Undefined Behavior If the Count Is Larger Than the Width of the Type

Is right shift undefined behavior if the count is larger than the width of the type?

The draft C++ standard in section 5.8 Shift operators in paragraph 1 says(emphasis mine):

The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.

So if unsigned int is 32 bits or less then this is undefined which is exactly the warning that gcc is giving you.

shift count greater than width of type

The problem is simple. You're using data_length as int when it should be unsigned as negative lengths hardly make sense. Also to be able to shift 56 bits the value must be at least 56 57 bits wide. Otherwise the behaviour is undefined.

In practice processors are known to do wildly different things. In one, shifting a 32-bit value right by 32 bits will clear the variable. In another, the value is shifted by 0 bits (32 % 32!). And then in some, perhaps the processor considers it invalid opcode and the OS kills the process.

Simple solution: declare uint64_t data_length.


If you really have limited yourself to 32-bit datatypes, then you can just assign 0 to these bytes that signify the most significant bytes. Or just cast to uint64_t or unsigned long long before the shift.

When can bit shifting cause undefined behaviour

The C99 standard says this about bitwise shift operators (emphasis added, and ^ is used to represent exponentiation):

§6.5.7.3: The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

§6.5.7.4: The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and non-negative value, and E1 × 2^E2
is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

**§6.5.7.5
: The result of E1 >> E2 is E1 right-shifted E2 bit positions. If
E1 has an unsigned type or if E1 has a signed type and a non-negative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

So, to summarise, behaviour is undefined if any of the following are true:

  • the right operand is signed and negative, or
  • the right operand is greater than or equal to the width of the left operand (after promotion), or
  • the left operand is signed and negative, or
  • a left shift is performed and the left operand is signed and the resultant value is not representable as a signed integer

Undefined behavior of right-shift in C++

One of the goals of C++ is to allow for fast, efficient code, "close to the hardware". And on most hardware, an integer right shift or left shift can be implemented by a single opcode. The trouble is, different CPUs have different behavior in this case where the shift magnitude is more than the number of bits.

So if C++ mandated a particular behavior for shift operations, when producing code for a CPU whose opcode behavior doesn't match all the Standard requirements, compilers would need to insert checks and logic to make sure the result is as defined by the Standard in all cases. This would need to happen to almost all uses of the built-in shift operators, unless an optimizer can prove the corner case won't actually happen. The added checks and logic would potentially slow down the program.

Why does shifting a variable by more than its width in bits zeroes out?

The linked questions specifically state that shifting by an amount greater than the bit width of the type being shifted invokes undefined behavior, which the standard defines as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"

When you invoke undefined behavior, anything can happen. The program may crash, it may output strange results, or it may appear to work properly. Also, how undefined behavior manifests itself can change if you use a different compiler or different optimization settings on the same compiler.

The C standard states the following regarding the bitwise shift operators in section 6.5.7p3:

The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If
the value of the right operand is negative or is greater than
or equal to the width of the promoted left operand, the behavior is
undefined.

In this case it's possible that the compiler could reduce the amount to shift modulo the bit width, as you suggested, or it could treat it as mathematically shifting by that amount resulting in all bits being 0. Either is a valid result because the standard does not specify the behavior.

Undefined behaviour of right shift (a b) when b is greater than the number of bits in a?

We can get an idea of why languages choose undefined behavior from Why Language Designers Tolerate Undefined Behavior and it says:

This answer came from two general design principles behind C:

  1. The language should impose no unnecessary overhead on the implementation.
  2. It should be as easy as possible to implement C on a wide variety of hardware.

in this specific case what happens when we use a shift count larger than the bit width will depend on the architecture for example as I explain in my answer here:

on some platforms the shift count will be masked to 5 bits for example on an x86 architecture we can see the Intel® 64 and IA-32 Architectures Software Developer’s Manual section SAL/SAR/SHL/SHR—Shift in the IA-32 Architecture Compatibility section says:

The 8086 does not mask the shift count. However, all other IA-32 processors (starting with the Intel 286 processor) do mask the shift count to 5 bits, resulting in a maximum count of 31. [...]

So implementing shift for an arbitrary count may be burdensome on some platforms and therefore it is better to leave it undefined behavior.

Why not unspecified behavior

If we look at the Rationale for International Standard—Programming Languages—C it says:

Unspecified behavior gives the implementor some latitude in translating programs. This latitude
does not extend as far as failing to translate the program, however, because all possible behaviors
are “correct” in the sense that they don’t cause undefined behavior in any implementation.

So there must have been a case or still exists a case where the behavior is not correct and would have bad issues.

bits shift exceeding width of type in C

The constant 511 has type int. Your system most likely has a 32-bit int, so this means you're shifting a value by an amount larger than its bit length. Doing so triggers undefined behavior.

This is dictated by section 6.5.7p3 of the C standard regarding bitwise shift operators:

The integer promotions are performed on each of the operands. The type
of the result is that of the promoted left operand. If the
value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined.

You can fix this by using the ULL suffix on the constant, which will give it type unsigned long long which is guaranteed to be at least 64 bits in length.

uint64_t nineMSB = (vpn & (511ULL << 36)) >> 36;

c++ - warning: right shift count = width of type on 32 bit machine

unsigned long is only guaranteed to have 32 bits. See here. You need to use unsigned long long to have 64 bits guaranteed.

Even better would be to use a fixed width integer, i.e. uint64_t. They are defined in header <cstdint> (or <stdint.h>).

Shifting unsigned int more than the size of it, undefined or not?

The paragraph above the one you quoted says the same thing, regardless of the signedness:

6.5.7 Bitwise shift operators / 3 The integer promotions are performed on each of the operands. The type of the result is
that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

So, it's UB, whether it is unsigned or signed.

64bit shift problem

When you shift a value by more bits than word size, it usually gets shifted by mod word-size. Basically, shifting it by 64 means shifting by 0 bits which is equal to no shifting at all. You shouldn't rely on this though as it's not defined by the standard and it can be different on different architectures.



Related Topics



Leave a reply



Submit