How do promotion rules work when the signedness on either side of a binary operator differs?
This is outlined explicitly in §5/9:
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:
- If either operand is of type long double, the other shall be converted to long double.
- Otherwise, if either operand is double, the other shall be converted to double.
- Otherwise, if either operand is float, the other shall be converted to float.
- Otherwise, the integral promotions shall be performed on both operands.
- Then, if either operand is unsigned long the other shall be converted to unsigned long.
- Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
- Otherwise, if either operand is long, the other shall be converted to long.
- Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
[Note: otherwise, the only remaining case is that both operands are int]
In both of your scenarios, the result of operator+ is unsigned. Consequently, the second scenario is effectively:
int result = static_cast<int>(us + static_cast<unsigned>(neg));
Because in this case the value of us + neg is not representable by int, the value of result is implementation-defined (§4.7/3):
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
Strange type deduction
Due to the usual arithmetic conversions, if two operands have the same conversion rank and one of them has an unsigned integer type, then the expression has that same unsigned integer type.
From the C++ 17 Standard (5 Expressions, p.#10)
— Otherwise, if the operand that has unsigned integer type has rank
greater than or equal to the rank of the type of the other operand,
the operand with signed integer type shall be converted to the type of
the operand with unsigned integer type.
Note that the conversion rank of the type unsigned int is equal to the rank of the type int (signed int). From the C++ 17 Standard (4.13 Integer conversion rank, p.#1)
— The rank of any unsigned integer type shall equal the rank of the
corresponding signed integer type
A more interesting example is the following. Let's assume that there are two declarations
unsigned int x = 0;
long y = 0;
and that both types have the same width, for example 4 bytes. As is known, the rank of the type long is greater than the rank of the type unsigned int. The question arises: what is the type of the expression
x + y
The type of the expression is unsigned long. :)
Here is a demonstrative program, but instead of the types long and unsigned int it uses the types long long and unsigned long.
#include <iostream>
#include <iomanip>
#include <type_traits>

int main()
{
    unsigned long int x = 0;
    long long int y = 0;

    std::cout << "sizeof( unsigned long ) = "
              << sizeof( unsigned long )
              << '\n';
    std::cout << "sizeof( long long ) = "
              << sizeof( long long )
              << '\n';
    std::cout << std::boolalpha
              << std::is_same<unsigned long long, decltype( x + y )>::value
              << '\n';

    return 0;
}
The program output is
sizeof( unsigned long ) = 8
sizeof( long long ) = 8
true
That is, the type of the expression x + y is unsigned long long, though neither operand of the expression has this type.
C++ Integer Overflow and Promotion
a) From §5/9:
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:
- If either operand is of type long double, the other shall be converted to long double.
- Otherwise, if either operand is double, the other shall be converted to double.
- Otherwise, if either operand is float, the other shall be converted to float.
- Otherwise, the integral promotions (4.5) shall be performed on both operands.
- Then, if either operand is unsigned long the other shall be converted to unsigned long.
- Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
- Otherwise, if either operand is long, the other shall be converted to long.
- Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
[Note: otherwise, the only remaining case is that both operands are int]
Therefore, since j is unsigned, i is converted to unsigned and the addition is performed using unsigned int arithmetic.
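b) This is not UB, but the result is implementation-defined. The result of the addition is unsigned int (as per (a)), and when that value does not fit in the int being assigned to, the conversion yields an implementation-defined value (§4.7/3).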
c) From §4.5/1:
An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise, the source rvalue can be converted to an rvalue of type unsigned int.
Therefore, since a 4-byte int can represent any value in a 2-byte short or unsigned short, both are promoted to int (per §5/9's integral promotions rule), and then added as ints.
d) From §3.9.1/4:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
Therefore, UINT_MAX+1 is legal (not UB) and equal to 0.
Why is there a signedness issue when comparing uint16_t and unsigned int?
So what's going on here? Where does the int come from?
Integer promotion is going on here. On systems where std::uint16_t is smaller than int, it will be promoted to int when used as an operand (of most binary operations).
In a - b both operands are promoted to int and the result is int also. You compare this signed integer to 3u which is unsigned int. The signs differ, as the compiler warns you.
That warning doesn't show up if I write
if ( a - b < static_cast<uint16_t>(3u) )
instead.
Here, the right hand operand is also promoted to int. Both sides of comparison are signed so there is no warning.
Can this actually result in an incorrect behavior?
if ( a - b < static_cast<uint16_t>(3u) ) does have different behaviour than if ( a - b < 3u ). If one is correct, then presumably the other is incorrect.
Is there a less verbose way to silence it? (or a less verbose way to write a uint16_t literal?)
The correct solution depends on what behaviour you want to be correct.
P.S. You forgot to include the header that defines uint16_t.
Implicit type promotion rules
C was designed to implicitly and silently change the integer types of the operands used in expressions. There exist several cases where the language forces the compiler to either change the operands to a larger type, or to change their signedness.
The rationale behind this is to prevent accidental overflows during arithmetic, but also to allow operands with different signedness to co-exist in the same expression.
Unfortunately, the rules for implicit type promotion cause much more harm than good, to the point where they might be one of the biggest flaws in the C language. These rules are often not even known by the average C programmer and therefore cause all manner of very subtle bugs.
Typically you see scenarios where the programmer says "just cast to type x and it works" - but they don't know why. Or such bugs manifest themselves as rare, intermittent phenomena striking from within seemingly simple and straight-forward code. Implicit promotion is particularly troublesome in code doing bit manipulations, since most bit-wise operators in C come with poorly-defined behavior when given a signed operand.
Integer types and conversion rank
The integer types in C are char, short, int, long, long long and enum. _Bool/bool is also treated as an integer type when it comes to type promotions.
All integers have a specified conversion rank. C11 6.3.1.1, emphasis mine on the most important parts:
Every integer type has an integer conversion rank defined as follows:
— No two signed integer types shall have the same rank, even if they have the same representation.
— The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
— The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
— The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width.
— The rank of char shall equal the rank of signed char and unsigned char.
— The rank of _Bool shall be less than the rank of all other standard integer types.
— The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).
The types from stdint.h sort in here too, with the same rank as whatever type they happen to correspond to on the given system. For example, int32_t has the same rank as int on a 32 bit system.
Further, C11 6.3.1.1 specifies which types are regarded as the small integer types (not a formal term):
The following may be used in an expression wherever an int or unsigned int may be used:
— An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
What this somewhat cryptic text means in practice, is that _Bool, char and short (and also int8_t, uint8_t etc) are the "small integer types". These are treated in special ways and subject to implicit promotion, as explained below.
The integer promotions
Whenever a small integer type is used in an expression, it is implicitly converted to int which is always signed. This is known as the integer promotions or the integer promotion rule.
Formally, the rule says (C11 6.3.1.1):
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.
This means that all small integer types, no matter signedness, get implicitly converted to (signed) int when used in most expressions.
This text is often misunderstood as: "all small signed integer types are converted to signed int and all small, unsigned integer types are converted to unsigned int". This is incorrect. The unsigned part here only means that if we have for example an unsigned short operand, and int happens to have the same size as short on the given system, then the unsigned short operand is converted to unsigned int. As in, nothing of note really happens. But in case short is a smaller type than int, it is always converted to (signed) int, regardless of whether the short was signed or unsigned!
The harsh reality caused by the integer promotions means that almost no operation in C can be carried out on small types like char or short. Operations are always carried out on int or larger types.
This might sound like nonsense, but luckily the compiler is allowed to optimize the code. For example, an expression containing two unsigned char operands would get the operands promoted to int and the operation carried out as int. But the compiler is allowed to optimize the expression to actually get carried out as an 8-bit operation, as would be expected. However, here comes the problem: the compiler is not allowed to optimize out the implicit change of signedness caused by the integer promotion because there is no way for the compiler to tell if the programmer is purposely relying on implicit promotion to happen, or if it is unintentional.
This is why example 1 in the question fails. Both unsigned char operands are promoted to type int, the operation is carried out on type int, and the result of x - y is of type int. Meaning that we get -1 instead of 255 which might have been expected. The compiler may generate machine code that executes the code with 8 bit instructions instead of int, but it may not optimize out the change of signedness. Meaning that we end up with a negative result, that in turn results in a weird number when printf with the %u conversion is invoked. Example 1 could be fixed by casting the result of the operation back to type unsigned char.
With the exception of a few special cases like ++ and sizeof operators, the integer promotions apply to almost all operations in C, no matter if unary, binary (or ternary) operators are used.
The usual arithmetic conversions
Whenever a binary operation (an operation with 2 operands) is done in C, both operands of the operator have to be of the same type. Therefore, in case the operands are of different types, C enforces an implicit conversion of one operand to the type of the other operand. The rules for how this is done are named the usual arithmetic conversions (sometimes informally referred to as "balancing"). These are specified in C11 6.3.1.8:
(Think of this rule as a long, nested if-else if statement and it might be easier to read :) )
6.3.1.8 Usual arithmetic conversions
Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:
- First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
- Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
- Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
- Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
  - If both operands have the same type, then no further conversion is needed.
  - Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
  - Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  - Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
  - Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
Notable here is that the usual arithmetic conversions apply to both floating point and integer variables. In the case of integers, we can also note that the integer promotions are invoked from within the usual arithmetic conversions. And after that, when both operands have at least the rank of int, the operands are balanced to the same type, with the same signedness.
This is the reason why a + b in example 2 gives a strange result. Both operands are integers and they are at least of rank int, so the integer promotions do not apply. The operands are not of the same type - a is unsigned int and b is signed int. Therefore the operand b is temporarily converted to type unsigned int. During this conversion, it loses the sign information and ends up as a large value.
The reason why changing type to short in example 3 fixes the problem, is because short is a small integer type. Meaning that both operands are integer promoted to type int which is signed. After integer promotion, both operands have the same type (int), no further conversion is needed. And then the operation can be carried out on a signed type as expected.
unsigned to signed conversion, what happens at the bit level?
The bit pattern doesn't change at all (on most architectures you're likely to encounter in practice). The difference is in the instructions generated by the compiler to manipulate the values.
Integer Overflow and the difference between pow() and multiplication
Here (Linux 64bits, gcc 5.2.1), 55201 is an integer literal of size 4, so the expression 55201 * 55201 is evaluated in a 4-byte integer type, where it overflows, before the result is assigned to your long long int.
One option is storing the factor in another variable before multiplying, to increase the range.
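#include <stdio.h>

int main(){
    long long int x, factor;
    factor = 55201;          /* widen the operand before multiplying */
    x = factor * factor;     /* multiplication is done in long long */
    printf("%lld", x);
    return 0;
}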