Why must a short be converted to an int before arithmetic operations in C and C++?
If we look at the Rationale for International Standard—Programming Languages—C in section 6.3.1.8
Usual arithmetic conversions, it says (emphasis mine going forward):
The rules in the Standard for these conversions are slight
modifications of those in K&R: the modifications accommodate the added
types and the value preserving rules. Explicit license was added to
perform calculations in a “wider” type than absolutely necessary,
since this can sometimes produce smaller and faster code, not to
mention the correct answer more often. Calculations can also be
performed in a “narrower” type by the as if rule so long as the same
end result is obtained. Explicit casting can always be used to obtain
a value in a desired type.
Section 6.3.1.8 of the draft C99 standard covers the usual arithmetic conversions, which are applied to the operands of arithmetic expressions. For example, section 6.5.6 Additive operators says:
If both operands have arithmetic type, the usual arithmetic
conversions are performed on them.
We find similar text in section 6.5.5 Multiplicative operators as well. In the case of a short operand, the integer promotions from section 6.3.1.1 Boolean, characters, and integers are applied first. That section says:
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are
unchanged by the integer promotions.
The discussion of integer promotions in section 6.3.1.1 of the same Rationale is actually more interesting. I am going to quote it selectively because it is too long to quote in full:
Implementations fell into two major camps which may be characterized
as unsigned preserving and value preserving.[...]
The unsigned preserving approach calls for promoting the two smaller
unsigned types to unsigned int. This is a simple rule, and yields a
type which is independent of execution environment.
The value preserving approach calls for promoting those types to
signed int if that type can properly represent all the values of the
original type, and otherwise for promoting those types to unsigned
int. Thus, if the execution environment represents short as something
smaller than int, unsigned short becomes int; otherwise it becomes
unsigned int.
This can have some rather unexpected results in some cases, as Inconsistent behaviour of implicit conversion between unsigned and bigger signed types demonstrates, and there are plenty more examples like it. In most cases, though, the operations work as expected.
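To see the promotion rule in action, here is a minimal sketch that uses C11's _Generic to report the type an addition is evaluated in. The names IS_INT, short_sum_is_int, ushort_sum_is_int and int_holds_all_ushort are made up for this illustration and do not come from the standard or the Rationale.

```c
#include <limits.h>

/* Hypothetical helper: evaluates to 1 if the expression has type int,
   0 otherwise. Requires C11 (_Generic). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* short + short: both operands are promoted to int, so the sum is an int. */
int short_sum_is_int(void) {
    short a = 1, b = 2;
    return IS_INT(a + b);
}

/* unsigned short + unsigned short: the sum is an int only when int can
   represent every unsigned short value; otherwise it is an unsigned int. */
int ushort_sum_is_int(void) {
    unsigned short a = 1, b = 2;
    return IS_INT(a + b);
}

/* The value preserving condition itself, expressed with <limits.h>. */
int int_holds_all_ushort(void) {
    return USHRT_MAX <= INT_MAX;
}
```

On an implementation where short is narrower than int, both functions should return 1. Where unsigned short and int have the same width, ushort_sum_is_int() should return 0, matching the quoted value preserving rule.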
C uses different data type for arithmetic in the middle of an expression?
Welcome to integer promotions! One behavior of the C language (an often criticized one, I'd add) is that types like char and short are promoted to int before doing any arithmetic operation with them, and the result is also int. What does this mean?
#include <stdio.h>

unsigned char foo(unsigned char x) {
    /* x is promoted to int before either shift is applied */
    return (x << 4) >> 4;
}
int main(void) {
    if (foo(0xFF) == 0x0F) {
        printf("Yay!\n");
    }
    else {
        printf("... hey, wait a minute!\n");
    }
    return 0;
}
Needless to say, the above code prints "... hey, wait a minute!". Let's discover why:
// this line of code:
return (x << 4) >> 4;
// is converted to this (because of integer promotion):
return ((int) x << 4) >> 4;
Therefore, this is what happens:
- x is unsigned char (8-bit) and its value is 0xFF,
- x << 4 needs to be executed, but first x is converted to int (32-bit),
- x << 4 becomes 0x000000FF << 4, and the result 0x00000FF0 is also int,
- 0x00000FF0 >> 4 is executed, yielding 0x000000FF,
- finally, 0x000000FF is converted to unsigned char (because that's the return value of foo()), so it becomes 0xFF,
- and that's why foo(0xFF) yields 0xFF instead of 0x0F.
How to prevent this? Simple: convert the result of x << 4 to unsigned char. In the previous example, 0x00000FF0 would have become 0xF0.
unsigned char foo(unsigned char x) {
    return ((unsigned char) (x << 4)) >> 4;
}
With this change, foo(0xFF) == 0x0F.
NOTE: in the previous examples, it is assumed that unsigned char is 8 bits and int is 32 bits, but the examples work for basically any situation in which CHAR_BIT == 8 (because C17 requires that sizeof(int) * CHAR_BIT >= 16).
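The two behaviors can be put side by side in a small sketch. The function names foo_promoted, foo_truncated and foo_masked are chosen here for illustration, and foo_masked is an alternative that is not part of the answer above. As in the examples, CHAR_BIT == 8 is assumed.

```c
/* Assumes CHAR_BIT == 8, as in the examples above. */

/* Both shifts happen in int, so the high bits survive the round trip. */
unsigned char foo_promoted(unsigned char x) {
    return (x << 4) >> 4;
}

/* Converting to unsigned char after the left shift discards the high bits
   before the right shift is applied. */
unsigned char foo_truncated(unsigned char x) {
    return ((unsigned char) (x << 4)) >> 4;
}

/* Hypothetical alternative: keep only the low four bits with a mask. */
unsigned char foo_masked(unsigned char x) {
    return x & 0x0F;
}
```

Calling foo_promoted(0xFF) gives back 0xFF, while foo_truncated(0xFF) and foo_masked(0xFF) both give 0x0F. When the goal is only to keep the low nibble, the mask states that intent without relying on a conversion.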
P.S.: this answer is not as exhaustive as the C official standard document, of course. But you can find all the (valid and defined) behavior of C described in the latest draft of the ISO/IEC 9899:2018 standard (a.k.a. C17/C18).
Do arithmetic operators have to promote integral arguments to int? [duplicate]
You were one line away from it. From [expr]/11 (N4659):
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:
...
Otherwise, the integral promotions (7.6) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:
Emphasis added. [conv.prom] says that they can take place and how they work. [expr]/11 specifies one of the times when they will take place.
What does 'Natural Size' really mean in C++?
the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.
Not really. Consider the x64 architecture. Arithmetic on any size from 8 to 64 bits will be essentially the same speed. So why have all x64 compilers settled on a 32-bit int? Well, because there was a lot of code out there which was originally written for 32-bit processors, and a lot of it implicitly relied on ints being 32 bits. And given the near-uselessness of a type which can represent values up to nine quintillion, the extra four bytes per integer would have been virtually unused. So we've decided that 32-bit ints are "natural" for this 64-bit platform.
Compare the 80286 architecture. Only 16 bits in a register. Performing 32-bit integer addition on such a platform basically requires splitting it into two 16-bit additions. Doing virtually anything with a 32-bit value involves splitting it up, really, with an attendant slowdown. The 80286's "natural integer size" is most definitely not 32 bits.
So really, "natural" comes down to considerations like processing efficiency, memory usage, and programmer-friendliness. It is not an acid test. It is very much a matter of subjective judgment on the part of the architecture/compiler designer.
c = a + b and implicit conversion
First, you should know that C does not fix the precision (number of representable values) of the standard integer types. It only requires a minimum precision for each type. This results in the following minimum bit sizes (the standard allows for larger and more complex representations):
- char: 8 bits
- short: 16 bits
- int: 16 (!) bits
- long: 32 bits
- long long (since C99): 64 bits
Note: The actual limits (which imply a certain precision) of an implementation are given in limits.h.
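As a rough way to inspect what a given implementation provides, the sketch below derives storage widths from sizeof and CHAR_BIT. The macro BITS_OF and the function names are hypothetical. Note that sizeof counts any padding bits, so these figures are storage sizes rather than exact precisions.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: storage width of a type in bits (includes padding). */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

size_t short_bits(void)     { return BITS_OF(short); }
size_t int_bits(void)       { return BITS_OF(int); }
size_t long_bits(void)      { return BITS_OF(long); }
size_t long_long_bits(void) { return BITS_OF(long long); }
```

Whatever the platform, these should never report less than the minimums listed above, and the value ranges themselves are available as INT_MAX, LONG_MAX and so on from limits.h.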
Second, the type in which an operation is performed is determined by the types of the operands, not by the type of the left side of an assignment (because assignments are also just expressions). For this, the types given above are ordered by conversion rank. Operands with smaller rank than int are converted to int first. For other operands, the one with smaller rank is converted to the type of the other operand. These are the usual arithmetic conversions.
Your implementation seems to use a 16 bit unsigned int with the same size as unsigned short, so a and b are converted to unsigned int and the operation is performed with 16 bits. For unsigned types, the operation is performed modulo 65536 (2 to the power of 16); this is called wrap-around (it is not required for signed types!). The result is then converted to unsigned long and assigned to the variables.
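The wrap-around can be reproduced on any platform by modelling the 16-bit result explicitly. This is only a sketch: the function names are invented for illustration, and uint16_t and uint32_t are assumed to be available from stdint.h.

```c
#include <stdint.h>

/* Models the sum on an implementation whose unsigned int is 16 bits wide:
   the exact sum is reduced modulo 65536 by the conversion to uint16_t. */
uint16_t add_wrapping_16(uint16_t a, uint16_t b) {
    return (uint16_t)((uint32_t)a + b);
}

/* The same sum carried out in a 32-bit unsigned type: no wrap-around. */
uint32_t add_widened_32(uint16_t a, uint16_t b) {
    return (uint32_t)a + b;
}
```

With a = b = 60000, the exact sum is 120000. The 16-bit model yields 120000 - 65536 = 54464, while the widened version keeps 120000.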
For gcc, I assume this compiles for a PC or a 32 bit CPU. There, (unsigned) int typically has 32 bits, while (unsigned) long has at least 32 bits (required). So there is no wrap-around for the operations.
Note: For the PC, the operands are converted to int, not unsigned int. This is because int can already represent all values of unsigned short; unsigned int is not required. This can result in unexpected (actually: undefined) behaviour if the result of the operation overflows a signed int!
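Multiplication is where this tends to bite: with a 32-bit int, the product of two large 16-bit unsigned values does not fit in int, so writing a * b directly would overflow a signed type. A minimal sketch of one way around it, assuming stdint.h is available (the name mul_u16 is made up for illustration):

```c
#include <stdint.h>

/* Converting one operand to uint32_t first keeps the multiplication in an
   unsigned type. Written as plain a * b, both operands would be promoted
   to int on a typical PC, and 65535 * 65535 would overflow it. */
uint32_t mul_u16(uint16_t a, uint16_t b) {
    return (uint32_t)a * b;
}
```

For a = b = 65535 the product is 4294836225, which fits in 32 unsigned bits but exceeds the maximum of a 32-bit signed int.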
If you need types of defined size, see stdint.h (since C99) for uint16_t, uint32_t. These are typedefs to types with the appropriate size for your implementation.
You can also cast one of the operands (not the whole expression!) to the type of the result:
unsigned long c = (unsigned long)a + b;
or, using types of known size:
#include <stdint.h>
...
uint16_t a = 60000, b = 60000;
uint32_t c = (uint32_t)a + b;
Note that due to the conversion rules, casting one operand is sufficient.
Update (thanks to @chux):
The cast shown above works without problems. However, if a has a larger conversion rank than the type in the cast, the cast might truncate its value to the smaller type. While this can easily be avoided because all types are known at compile time (static typing), an alternative is to multiply by 1 of the wanted type:
unsigned long c = ((unsigned long)1U * a) + b;
This way the larger rank of the type given in the cast or of a (or b) is used. The multiplication will be eliminated by any reasonable compiler.
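The difference between the two forms only shows when the operand is wider than the type named in the cast. The sketch below makes that visible with fixed-width types, using uint16_t as the deliberately too-narrow cast target. The function names are hypothetical, and stdint.h is assumed to provide uint16_t and uint32_t.

```c
#include <stdint.h>

/* A plain cast to the narrower type truncates a before the addition. */
uint32_t sum_with_narrowing_cast(uint32_t a, uint16_t b) {
    return (uint16_t)a + b;
}

/* Multiplying by 1 of the narrower type lets the usual arithmetic
   conversions pick the operand type with the larger rank, so a keeps
   its full value. */
uint32_t sum_with_multiply_by_one(uint32_t a, uint16_t b) {
    return ((uint16_t)1U * a) + b;
}
```

With a = 70000 and b = 1, the cast version computes 4464 + 1 = 4465 (70000 modulo 65536 is 4464), while the multiply version computes 70001.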
Another approach, which avoids even needing to know the target type name, uses the typeof() gcc extension:
unsigned long c;
... many lines of code
c = ((typeof(c))1U * a) + b;
Why are integer types promoted during addition in C?
So it appears that the result of numberA + 1 was promoted to uint32_t
The operands of the addition were promoted to int before the addition took place, and the result of the addition is of the same type as the effective operands (int).
Indeed, if int is 32-bit wide on your compilation platform (meaning that the type that represents uint16_t has lower “conversion rank” than int), then numberA + 1 is computed as an int addition between 1 and a promoted numberA as part of the integer promotion rules, 6.3.1.1:2 in the C11 standard:
The following may be used in an expression wherever an int or unsigned int may be used: […] An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
[…]
If an int can represent all values of the original type […], the value is converted to an int
In your case, unsigned short, which is in all likelihood what uint16_t is defined as on your platform, has all its values representable as elements of int, so the unsigned short value numberA gets promoted to int when it occurs in an arithmetic operation.
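A small sketch of the consequence, assuming int is wider than 16 bits as in the case discussed here. The function names are invented for illustration, and numberA is borrowed from the question.

```c
#include <stdint.h>

/* numberA + 1 evaluated as written: numberA is promoted to int first, so
   (assuming int is wider than 16 bits) 65535 + 1 is 65536, not 0. */
long plus_one_promoted(uint16_t numberA) {
    return numberA + 1;
}

/* Converting the result back to uint16_t reduces it modulo 65536. */
uint16_t plus_one_truncated(uint16_t numberA) {
    return (uint16_t)(numberA + 1);
}
```

For numberA = 65535, plus_one_promoted returns 65536, while plus_one_truncated returns 0. The wrap-around appears only once the int result is converted back to the 16-bit type.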
Why auto is deduced to int instead of uint16_t
Addition will perform the usual arithmetic conversions on its operands, which in this case will result in the operands being promoted to int due to the integer promotions, and the result will also be int.
You can use uint16_t instead of auto to force a conversion back, or in the general case you can use static_cast.
For a rationale as to why types smaller than int are promoted to larger types, see Why must a short be converted to an int before arithmetic operations in C and C++?.
For reference, from the draft C++ standard section 5.7 Additive operators:
[...]The usual arithmetic conversions are performed for operands of
arithmetic or enumeration type[...]
and from section 5 Expressions:
[...]Otherwise, the integral promotions (4.5) shall be performed on
both operands. Then the following rules shall be applied
to the promoted operands[...]
and from section 4.5 Integral promotions (emphasis mine):
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank
of int can be converted to a prvalue of type int if int can represent
all the values of the source type; otherwise, the source prvalue can
be converted to a prvalue of type unsigned int.
Assuming int is larger than 16-bit.