What Is Going on with Bitwise Operators and Integer Promotion

What is going on with bitwise operators and integer promotion?

[expr.unary.op]

The operand of ~ shall have integral or unscoped enumeration type; the
result is the one’s complement of its operand. Integral promotions are
performed.

[expr.shift]

The shift operators << and >> group left-to-right. [...] The operands shall be of integral or unscoped enumeration type and integral promotions are performed.

What's the integral promotion of uint8_t (which is usually going to be unsigned char behind the scenes)?

[conv.prom]

A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.

So int, because all of the values of a uint8_t can be represented by int.

What is int(12) << 1 ? int(24).

What is ~int(12) ? int(-13).
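The two results above can be checked at compile time. The following is a minimal sketch, assuming a platform that provides uint8_t and uses two's complement; the name kValue and the function promotion_demo are illustrative and not part of the original question.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical sample value; any uint8_t behaves the same way.
constexpr std::uint8_t kValue = 12;

// Both operators promote the uint8_t operand to int before operating.
static_assert(std::is_same<decltype(kValue << 1), int>::value, "<< yields int");
static_assert(std::is_same<decltype(~kValue), int>::value, "~ yields int");

inline void promotion_demo() {
    assert((kValue << 1) == 24);
    assert(~kValue == -13);  // assumes two's complement
    // Converting back to uint8_t keeps only the low 8 bits: 0xF3.
    assert(static_cast<std::uint8_t>(~kValue) == 0xF3);
}
```

The last assertion shows why code that expects an 8-bit complement usually needs an explicit conversion back to uint8_t.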

How to avoid integral promotion for bitwise operations

Why is the value promoted to a larger type? Because the language spec says it is (on a platform with a 32-bit int, a 16-bit unsigned short will be converted to int). 16-bit ops on x86 actually incur a penalty over the corresponding 32-bit ones (due to a prefix opcode), so the 32-bit version may well run faster.
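The promotion itself cannot be switched off, but its effect can be undone by converting the result back to the narrow type. Here is a rough sketch of that idea; the helper names invert16 and shl16 are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: complement a 16-bit value and keep the result 16 bits wide.
inline std::uint16_t invert16(std::uint16_t x) {
    // ~x is evaluated in the promoted type; the cast discards the extra high bits.
    return static_cast<std::uint16_t>(~x);
}

// Hypothetical helper: shift left and truncate back to 16 bits.
inline std::uint16_t shl16(std::uint16_t x, unsigned n) {
    assert(n < 16);  // keep the shift count below the 16-bit width
    // Shifting as unsigned avoids signed overflow in the promoted int.
    return static_cast<std::uint16_t>(static_cast<unsigned>(x) << n);
}
```

Converting to unsigned before shifting is a design choice in this sketch: it keeps the intermediate arithmetic free of signed overflow regardless of how wide int is.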

Integer promotion with the operator

The phrase "the integer promotions" is a very specific thing, found in (for C99) section 6.3.1.1 Booleans, characters, and integers:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

So assuming your unsigned char can be held in an int, it will be promoted to an int. On those rare platforms where unsigned char is as wide as an int, it will promote to an unsigned int.

This is only changed slightly in C11:

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

If a specific compiler doesn't follow this behaviour, then it's not really conforming. However, given that the compiler you listed is for embedded systems, it's not really surprising.

Many are built for specific purposes and conformance is not always high on the list of requirements. There may be compiler flags that will allow it to more closely conform to the standard.

Looking at your particular environment, the M16C Series, R8C Family C Compiler Package V.5.45 C Compiler has, in section 2.1.4 nc30 Command Line Options, subsection f. Generated code modification options:

-fextend_to_int, -fETI: Performs operation after extending char-type data to the int type. Extended according to ANSI standards.

although I suspect -fansi is probably a better choice since it covers a few other things as well.

Conversion Warning with Bitwise Or and Casted Operands

Both the bitwise OR operator | and the bitwise AND operator & perform integer promotions on both operands. The fact that one of the operands is the result of a cast doesn't change this.

The -Wconversion flag tends to be a bit overenthusiastic regarding what it warns on. Given a 32-bit int, a conversion from uint16_t to int will not change its value.

The C standard does specify that the minimum range of an int is -32767 to 32767, so an implementation with this limit could potentially see a value change, though you'll be hard-pressed to find a system that runs gcc where this is the case.

This particular warning is silenced however by a cast. From the man page:

-Wconversion
Warn for implicit conversions that may alter a value. This includes
conversions between real and integer, like "abs (x)" when "x" is
"double"; conversions between signed and unsigned, like "unsigned
ui = -1"; and conversions to smaller types, like "sqrtf (M_PI)". Do
not warn for explicit casts like "abs ((int) x)" and "ui =
(unsigned) -1", or if the value is not changed by the conversion like
in "abs (2.0)". Warnings about conversions between signed and
unsigned integers can be disabled by using -Wno-sign-conversion.
For C++, also warn for confusing overload resolution for
user-defined conversions; and conversions that never use a type
conversion operator: conversions to "void", the same type, a base
class or a reference to them. Warnings about conversions between
signed and unsigned integers are disabled by default in C++ unless
-Wsign-conversion is explicitly enabled.
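
Since casting the individual operands does not stop the promotion, the explicit cast has to go around the whole expression. A small sketch of that pattern follows; make_word is a hypothetical helper, not code from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper combining two bytes into a 16-bit word.
inline std::uint16_t make_word(std::uint8_t hi, std::uint8_t lo) {
    // Each operand is promoted to int, so the | expression has type int.
    // The cast wraps the whole expression, making the narrowing explicit,
    // which is the kind of conversion the quoted man page says is not warned about.
    return static_cast<std::uint16_t>((hi << 8) | lo);
}

inline void make_word_demo() {
    assert(make_word(0x12, 0x34) == 0x1234);
}
```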

Why does bitwise left shift promote a uint8_t to a wider type

Pretty much every arithmetic operation performs what's called the usual arithmetic conversions.

This goes back decades.

First, integral promotions are performed.

No arithmetic operation takes uint8_t so both of your operands will always be promoted.

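After that, for most binary operators, a common type is found and conversions take place if necessary. The shift operators skip this step: the result simply has the type of the promoted left operand.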

You can prevent this by casting the right-hand-side to the type of i but, per the above rule, that doesn't get you anything in this case.

The upshot is that the result of your expression is never going to be uint8_t; it's just that in the case of j you've cast it back to uint8_t, with the wraparound that consequently ensues.
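
The original code is not shown here, so the following is only a sketch that reuses the names i and j from the answer with an assumed sample value of 0x80. It illustrates that the shift result is an int and that the wraparound only appears once the value is converted back to uint8_t.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

inline void shift_demo() {
    std::uint8_t i = 0x80;  // sample value chosen for illustration
    auto wide = i << 1;     // i is promoted first, so wide is an int
    static_assert(std::is_same<decltype(wide), int>::value, "shift result is int");
    assert(wide == 256);    // the top bit survives in the wider type

    // Casting back to uint8_t reduces the value modulo 256.
    std::uint8_t j = static_cast<std::uint8_t>(i << 1);
    assert(j == 0);
}
```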

Why would the sequence from the bitwise operator (~) be this? Is that broken?

According to the standard, the operand of ~ will undergo integral promotion. So here we will first promote a to int.

[expr.unary.op]: The operand of ~ shall have integral or unscoped enumeration type; the result is the ones' complement of its operand. Integral promotions are performed.

If int is 4 bytes (for example), the value of the promoted a is 0x00000064. The result of ~a is 0xFFFFFF9B, which is exactly -101 (if two's complement is used to represent integers).

Please note that although variadic arguments will undergo integral promotion, here ~a is of type int and no additional promotion is required.
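
A short sketch of the same arithmetic, assuming a is an unsigned char holding 100 (0x64), an 8-bit char, and two's complement; complement_of and complement_demo are names invented for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>

// Hypothetical helper returning the promoted complement of a byte.
inline int complement_of(unsigned char a) {
    static_assert(std::is_same<decltype(~a), int>::value, "~a has type int");
    return ~a;
}

inline void complement_demo() {
    unsigned char a = 100;  // 0x64, as in the text
    assert(complement_of(a) == -101);  // assumes two's complement
    // Narrowing back to unsigned char keeps the low byte, 0x9B.
    assert(static_cast<unsigned char>(~a) == 0x9B);
    std::printf("%d\n", ~a);  // ~a is already an int, so %d matches
}
```

If the 8-bit complement is what was wanted, the conversion back to unsigned char has to be written explicitly, as in the second assertion.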

Is left shifting unsigned int more than its bit field width, but less than its type size undefined?

There will be references to the integer promotion of the left operand. The following is the relevant promotion:

6.3.1.1.2 [...] If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; [...]

The promoted left operand is an int.

About shifting, the spec says

6.5.7.3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

The width of the promoted left operand — the width of an int — is at least 16. 5 is much less than 16.

No undefined behaviour yet.

The spec goes on:

6.5.7.4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

The "type of E1" refers to the type of bar.var after promotion.

E1 has a signed type. In this case, E1 can't possibly be negative, and no value of E1 multiplied by 2⁵ would exceed what an int can represent.

No undefined behaviour yet.

Finally, we have the assignment.

6.5.16.1.2 In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.

6.3.1.3.2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

No undefined behaviour there either.
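
The chain of steps above (promotion, shift, conversion on assignment) can be sketched as follows. The question's declaration of bar is not reproduced in this text, so the struct Bar and its 3-bit field width are assumptions; the quoted paragraphs are from the C standard, but C++ promotes such a bit-field to int in the same way.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical struct standing in for the question's bar; the 3-bit width is assumed.
struct Bar {
    unsigned int var : 3;
};

inline unsigned int shift_field(unsigned int field_value) {
    Bar bar{};
    bar.var = field_value & 0x7u;  // keep only the 3 bits the field can hold
    // bar.var promotes to int because int can represent every 3-bit value.
    static_assert(std::is_same<decltype(bar.var << 5), int>::value, "promoted to int");
    int shifted = bar.var << 5;    // at most 7 * 32 == 224, well within int
    assert(shifted >= 0 && shifted <= 224);
    // The conversion to unsigned is well defined for any int value.
    return static_cast<unsigned int>(shifted);
}
```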

Bit operations with integer promotion

If you use unsigned types, all will be OK. The standard mandates that for unsigned target integer types, narrowing is perfectly defined:

4.7 Integral conversions [conv.integral]

...

2 If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source
integer (modulo 2^n where n is the number of bits used to represent the unsigned type).

But if the target type is signed, the result is implementation-defined, per the next paragraph (emphasis mine):

3 If the destination type is signed, the value is unchanged if it can be represented in the destination type;
otherwise, the value is implementation-defined.

In common implementations everything will be OK, because it is simpler for the compiler to perform narrowing conversions by keeping only the low-order bytes, for both unsigned and signed types. But the standard only requires that the implementation define what will happen. An implementation could document that narrowing a value to a signed type when the original value cannot be represented in the target type gives 0, and still be conformant.
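
The well-defined unsigned case can be illustrated with a small sketch. low_byte is a hypothetical helper that relies only on the modulo rule quoted above, so it does not depend on implementation-defined behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep the low 8 bits of an int through a conversion
// to an unsigned type, whose result is the value modulo 2^8.
constexpr std::uint8_t low_byte(int value) {
    return static_cast<std::uint8_t>(value);
}

inline void low_byte_demo() {
    assert(low_byte(0x1FF) == 0xFF);  // 511 mod 256
    assert(low_byte(-1) == 0xFF);     // -1 is congruent to 255 modulo 256
    assert(low_byte(~0x64) == 0x9B);  // -101 is congruent to 155 modulo 256
}
```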

By the way, since C++ and C often handle conversions the same way, it should be noted that the C standard is slightly different here, because the last case could raise a signal:

6.3.1.3 [Conversions] Signed and unsigned integers

...
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.

Still a confirmation that C and C++ are different languages...
