Why does shifting a variable by more than its width in bits zeroes out?
The linked questions specifically state that shifting by an amount greater than the bit width of the type being shifted invokes undefined behavior, which the standard defines as "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"
When you invoke undefined behavior, anything can happen. The program may crash, it may output strange results, or it may appear to work properly. Also, how undefined behavior manifests itself can change if you use a different compiler or different optimization settings on the same compiler.
The C standard states the following regarding the bitwise shift operators in section 6.5.7p3:
The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If
the value of the right operand is negative or is greater than
or equal to the width of the promoted left operand, the behavior is
undefined.
In this case it's possible that the compiler could reduce the amount to shift modulo the bit width, as you suggested, or it could treat it as mathematically shifting by that amount resulting in all bits being 0. Either is a valid result because the standard does not specify the behavior.
C strange behaviour when bit-shifting
Assuming int and unsigned int are 32 bit wide, the integer constant 0xffffffff
is of type unsigned int.
Right shifting an integer with a value of greater or equal the width of the integer will result in undefined behavior1.
This happens in both cases in your example.
Update:
Undefined behavior means that anything can happen. Getting the value 0xffffffff instead of 0 fits this behavior.
You cannot shift by the width of the integer type, because it says so in the standard. You must make a check if the value is greater or equal the width of the type your working with.
1 (Quoted from: ISO/IEC 9899:201x 6.5.7 Bitwise shift operators 3)
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
Why doesn't left bit shift shift beyond 31 for long int datatype?
Although your x
is of type long int
, the 1
is not. 1
is an int
, so 1<<63
is indeed undefined.
Try (static_cast<long int>(1) << 63)
, or 1L << 63
as suggested by Wojtek.
What happens if we bitwise shift an integer more than its size
Undefined behavior happens. That's standardized by the language standard (See section 6.5.7 Bitwise shift operators
of the C standard from 1999). The observable effects of UB are not standardized, however, and may vary.
As for shl eax, 22h
, the shift count is going to be truncated to 5 bits by the CPU (see the documentation) and this instruction will really behave as shl eax, 2
.
Why does it make a difference if left and right shift are used together in one expression or not?
This little test is actually more subtle than it looks as the behavior is implementation defined:
unsigned char x = 255;
no ambiguity here,x
is anunsigned char
with value255
, typeunsigned char
is guaranteed to have enough range to store255
.printf("%x\n", x);
This producesff
on standard output but it would be cleaner to writeprintf("%hhx\n", x);
asprintf
expects anunsigned int
for conversion%x
, whichx
is not. Passingx
might actually pass anint
or anunsigned int
argument.unsigned char tmp = x << 7;
To evaluate the expressionx << 7
,x
being anunsigned char
first undergoes the integer promotions defined in the C Standard 6.3.3.1: If anint
can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to anint
; otherwise, it is converted to anunsigned int
. These are called the integer promotions.So if the number of value bits in
unsigned char
is smaller or equal to that ofint
(the most common case currently being 8 vs 31),x
is first promoted to anint
with the same value, which is then shifted left by7
positions. The result,0x7f80
, is guaranteed to fit in theint
type, so the behavior is well defined and converting this value to typeunsigned char
will effectively truncate the high order bits of the value. If typeunsigned char
has 8 bits, the value will be128
(0x80
), but if typeunsigned char
has more bits, the value intmp
can be0x180
,0x380
,0x780
,0xf80
,0x1f80
,0x3f80
or even0x7f80
.If type
unsigned char
is larger thanint
, which can occur on rare systems wheresizeof(int) == 1
,x
is promoted tounsigned int
and the left shift is performed on this type. The value is0x7f80U
, which is guaranteed to fit in typeunsigned int
and storing that totmp
does not actually lose any information since typeunsigned char
has the same size asunsigned int
. Sotmp
would have the value0x7f80
in this case.unsigned char y = tmp >> 7;
The evaluation proceeds the same as above,tmp
is promoted toint
orunsigned int
depending on the system, which preserves its value, and this value is shifted right by 7 positions, which is fully defined because7
is less than the width of the type (int
orunsigned int
) and the value is positive. Depending on the number of bits of typeunsigned char
, the value stored iny
can be1
,3
,7
,15
,31
,63
,127
or255
, the most common architecture will havey == 1
.printf("%x\n", y);
again, it would be better t writeprintf("%hhx\n", y);
and the output may be1
(most common case) or3
,7
,f
,1f
,3f
,7f
orff
depending on the number of value bits in typeunsigned char
.unsigned char z = (x << 7) >> 7;
The integer promotion is performed onx
as described above, the value (255
) is then shifted left 7 bits as anint
or anunsigned int
, always producing0x7f80
and then right shifted by 7 positions, with a final value of0xff
. This behavior is fully defined.printf("%x\n", z);
Once more, the format string should beprintf("%hhx\n", z);
and the output would always beff
.
Systems where bytes have more than 8 bits are becoming rare these days, but some embedded processors, such as specialized DSPs still do that. It would take a perverse system to fail when passed an unsigned char
for a %x
conversion specifier, but it is cleaner to either use %hhx
or more portably write printf("%x\n", (unsigned)z);
Shifting by 8
instead of 7
in this example would be even more contrived. It would have undefined behavior on systems with 16-bit int
and 8-bit char
.
Bitshift on structures
Is shifting structures like this possible?
No, not really. Even if the x
and y
members are in adjacent memory locations, bit-shift operations on either are performed as integer operations on the individual variables. So, you can't shift bits "out of" one and "into" the other: bits that "fall off" during the shift will be lost.
Is there a better alternative?
You would have to implement such a multi-component bit-shift yourself – making copies of the bits that would otherwise be lost and somehow masking those back into the result, after shifting other bits internally to each 'component' variable. Exactly how to do this would largely depend on the use case.
Here's one possible implementation of a right-shift function for a structure comprising two uint64_t
members (I have not added any error-checking for the count
, and I assume that uint64_t
is exactly 64 bits wide):
#include <stdio.h>
#include <stdint.h>
typedef struct {
uint64_t hi;
uint64_t lo;
} ui128;
void Rshift(ui128* data, int count)
{
uint64_t mask = (1uLL << count) - 1; // Set low "count" bits to 1
uint64_t save = data->hi & mask; // Save bits that fall off hi
data->hi >>= count; // Shift the hi component
data->lo >>= count; // Shift the lo component
data->lo |= save << (64 - count); // Mask in the bits from hi
return;
}
int main()
{
ui128 test = { 0xF001F002F003F004, 0xF005F006F007F008 };
printf("%016llx%016llx\n", test.hi, test.lo);
Rshift(&test, 16);
printf("%016llx%016llx\n", test.hi, test.lo);
return 0;
}
A similar logic could be used for a left-shift function, but you would then need to save the relevant upper (most significant) bits from the lo
member and mask them into the shifted hi
value:
void Lshift(ui128* data, int count)
{
uint64_t mask = ((1uLL << count) - 1) << (64 - count);
uint64_t save = data->lo & mask;
data->hi <<= count;
data->lo <<= count;
data->hi |= save >> (64 - count);
return;
}
What are bitwise shift (bit-shift) operators and how do they work?
The bit shifting operators do exactly what their name implies. They shift bits. Here's a brief (or not-so-brief) introduction to the different shift operators.
The Operators
>>
is the arithmetic (or signed) right shift operator.>>>
is the logical (or unsigned) right shift operator.<<
is the left shift operator, and meets the needs of both logical and arithmetic shifts.
All of these operators can be applied to integer values (int
, long
, possibly short
and byte
or char
). In some languages, applying the shift operators to any datatype smaller than int
automatically resizes the operand to be an int
.
Note that <<<
is not an operator, because it would be redundant.
Also note that C and C++ do not distinguish between the right shift operators. They provide only the >>
operator, and the right-shifting behavior is implementation defined for signed types. The rest of the answer uses the C# / Java operators.
(In all mainstream C and C++ implementations including GCC and Clang/LLVM, >>
on signed types is arithmetic. Some code assumes this, but it isn't something the standard guarantees. It's not undefined, though; the standard requires implementations to define it one way or another. However, left shifts of negative signed numbers is undefined behaviour (signed integer overflow). So unless you need arithmetic right shift, it's usually a good idea to do your bit-shifting with unsigned types.)
Left shift (<<)
Integers are stored, in memory, as a series of bits. For example, the number 6 stored as a 32-bit int
would be:
00000000 00000000 00000000 00000110
Shifting this bit pattern to the left one position (6 << 1
) would result in the number 12:
00000000 00000000 00000000 00001100
As you can see, the digits have shifted to the left by one position, and the last digit on the right is filled with a zero. You might also note that shifting left is equivalent to multiplication by powers of 2. So 6 << 1
is equivalent to 6 * 2
, and 6 << 3
is equivalent to 6 * 8
. A good optimizing compiler will replace multiplications with shifts when possible.
Non-circular shifting
Please note that these are not circular shifts. Shifting this value to the left by one position (3,758,096,384 << 1
):
11100000 00000000 00000000 00000000
results in 3,221,225,472:
11000000 00000000 00000000 00000000
The digit that gets shifted "off the end" is lost. It does not wrap around.
Logical right shift (>>>)
A logical right shift is the converse to the left shift. Rather than moving bits to the left, they simply move to the right. For example, shifting the number 12:
00000000 00000000 00000000 00001100
to the right by one position (12 >>> 1
) will get back our original 6:
00000000 00000000 00000000 00000110
So we see that shifting to the right is equivalent to division by powers of 2.
Lost bits are gone
However, a shift cannot reclaim "lost" bits. For example, if we shift this pattern:
00111000 00000000 00000000 00000110
to the left 4 positions (939,524,102 << 4
), we get 2,147,483,744:
10000000 00000000 00000000 01100000
and then shifting back ((939,524,102 << 4) >>> 4
) we get 134,217,734:
00001000 00000000 00000000 00000110
We cannot get back our original value once we have lost bits.
Arithmetic right shift (>>)
The arithmetic right shift is exactly like the logical right shift, except instead of padding with zero, it pads with the most significant bit. This is because the most significant bit is the sign bit, or the bit that distinguishes positive and negative numbers. By padding with the most significant bit, the arithmetic right shift is sign-preserving.
For example, if we interpret this bit pattern as a negative number:
10000000 00000000 00000000 01100000
we have the number -2,147,483,552. Shifting this to the right 4 positions with the arithmetic shift (-2,147,483,552 >> 4) would give us:
11111000 00000000 00000000 00000110
or the number -134,217,722.
So we see that we have preserved the sign of our negative numbers by using the arithmetic right shift, rather than the logical right shift. And once again, we see that we are performing division by powers of 2.
Related Topics
How to Run Specific Test Cases in Googletest
What Is _Declspec and When Do I Need to Use It
Why Do You Use Typedef When Declaring an Enum in C++
Using Maven for C/C++ Projects
Removing Watermark Out of an Image Using Opencv
Initializing the Size of a C++ Vector
Do All Virtual Functions Need to Be Implemented in Derived Classes
Does Making a Struct Volatile Make All Its Members Volatile
How to Call on a Function Found on Another File
Is There an Online Name Demangler for C++
Determine If Map Contains a Value for a Key