Why Must a Short Be Converted to an Int Before Arithmetic Operations in C and C++

Why must a short be converted to an int before arithmetic operations in C and C++?

If we look at the Rationale for International Standard—Programming Languages—C in section 6.3.1.8 Usual arithmetic conversions it says (emphasis mine going forward):

The rules in the Standard for these conversions are slight
modifications of those in K&R: the modifications accommodate the added
types and the value preserving rules. Explicit license was added to
perform calculations in a “wider” type than absolutely necessary,
since this can sometimes produce smaller and faster code, not to
mention the correct answer more often. Calculations can also be
performed in a “narrower” type by the as if rule so long as the same
end result is obtained. Explicit casting can always be used to obtain
a value in a desired type.

Section 6.3.1.8 of the draft C99 standard covers the usual arithmetic conversions, which are applied to the operands of arithmetic expressions. For example, section 6.5.6 Additive operators says:

If both operands have arithmetic type, the usual arithmetic
conversions are performed on them.

We find similar text in section 6.5.5 Multiplicative operators as well. In the case of a short operand, first the integer promotions are applied from section 6.3.1.1 Boolean, characters, and integers which says:

If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are
unchanged by the integer promotions.
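The effect of this rule can be observed directly. The following is a minimal sketch, assuming a C11 compiler (it relies on _Generic); the macro IS_INT and both function names are hypothetical and exist only for this illustration:

```c
/* Evaluates to 1 if the expression has type int, 0 otherwise (C11 _Generic).
   The expression itself is not evaluated. IS_INT is a hypothetical helper. */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* A short operand on its own keeps its declared type. */
int short_value_is_int(void) {
    short a = 1;
    return IS_INT(a);
}

/* Adding two shorts promotes both operands, so the sum has type int. */
int short_sum_is_int(void) {
    short a = 1, b = 2;
    return IS_INT(a + b);
}
```

Because int can always represent every value of short, the second function should report int on any conforming implementation, regardless of the actual widths.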

The discussion in section 6.3.1.1 of the Rationale for International Standard—Programming Languages—C on integer promotions is actually more interesting. I am going to quote it selectively because it is too long to quote in full:

Implementations fell into two major camps which may be characterized
as unsigned preserving and value preserving.

[...]

The unsigned preserving approach calls for promoting the two smaller
unsigned types to unsigned int. This is a simple rule, and yields a
type which is independent of execution environment.

The value preserving approach calls for promoting those types to
signed int if that type can properly represent all the values of the
original type, and otherwise for promoting those types to unsigned
int. Thus, if the execution environment represents short as something
smaller than int, unsigned short becomes int; otherwise it becomes
unsigned int.

This can have some rather unexpected results in some cases, as the question Inconsistent behaviour of implicit conversion between unsigned and bigger signed types demonstrates, and there are plenty more examples like it. In most cases, though, it results in the operations working as expected.
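One such surprise can be sketched as follows. The sketch assumes that int is wider than unsigned short and that long long is wider than unsigned int, which holds on common desktop platforms but is not guaranteed by the standard. The function names are hypothetical:

```c
/* Value-preserving promotion: both unsigned short operands become int,
   so the subtraction is signed and can produce a negative result.
   Assumes int is wider than unsigned short. */
long long ushort_difference(unsigned short a, unsigned short b) {
    return a - b;
}

/* unsigned int operands are not promoted: the subtraction is unsigned
   and wraps around modulo UINT_MAX + 1, so the result is never negative.
   Assumes long long is wider than unsigned int. */
long long uint_difference(unsigned int a, unsigned int b) {
    return a - b;
}
```

Under those assumptions, ushort_difference(1, 2) yields -1, while uint_difference(1, 2) yields a large positive value, even though both functions subtract "unsigned" operands.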

C uses different data type for arithmetic in the middle of an expression?

Welcome to integer promotions! One behavior of the C language (an often criticized one, I'd add) is that types like char and short are promoted to int before doing any arithmetic operation with them, and the result is also int. What does this mean?

#include <stdio.h>

unsigned char foo(unsigned char x) {
    return (x << 4) >> 4;
}

int main(void) {
    if (foo(0xFF) == 0x0F) {
        printf("Yay!\n");
    }
    else {
        printf("... hey, wait a minute!\n");
    }

    return 0;
}

Needless to say, the above code prints ... hey, wait a minute!. Let's discover why:

// this line of code:
return (x << 4) >> 4;

// is converted to this (because of integer promotion):
return ((int) x << 4) >> 4;

Therefore, this is what happens:

  • x is unsigned char (8-bit) and its value is 0xFF,
  • x << 4 needs to be executed, but first x is converted to int (32-bit),
  • x << 4 becomes 0x000000FF << 4, and the result 0x00000FF0 is also int,
  • 0x00000FF0 >> 4 is executed, yielding 0x000000FF,
  • finally, 0x000000FF is converted to unsigned char (because that's the return value of foo()), so it becomes 0xFF,
  • and that's why foo(0xFF) yields 0xFF instead of 0x0F.

How to prevent this? Simple: convert the result of x << 4 to unsigned char. In the previous example, 0x00000FF0 would have become 0xF0.

unsigned char foo(unsigned char x) {
    return ((unsigned char) (x << 4)) >> 4;
}

foo(0xFF) == 0x0F
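An alternative to the cast, which is not part of the answer above, is to mask the intermediate value explicitly. The sketch below uses a hypothetical function foo_masked; unlike the cast, it does not depend on unsigned char being exactly 8 bits wide:

```c
/* Keeps only the low 8 bits of the promoted int before shifting back.
   This mimics the 8-bit truncation without relying on the width of
   unsigned char. foo_masked is a hypothetical name. */
unsigned char foo_masked(unsigned char x) {
    return (unsigned char)(((x << 4) & 0xFF) >> 4);
}
```

For the input 0xFF the intermediate values are 0xFF0, then 0xF0 after masking, then 0x0F after the right shift.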

NOTE: in the previous examples, it is assumed that unsigned char is 8 bits and int is 32 bits, but the examples work for basically any situation in which CHAR_BIT == 8 (because C17 requires that sizeof(int) * CHAR_BIT >= 16).

P.S.: this answer is not as exhaustive as the C official standard document, of course. But you can find all the (valid and defined) behavior of C described in the latest draft of the ISO/IEC 9899:2018 standard (a.k.a. C17/C18).

Do arithmetic operators have to promote integral arguments to int? [duplicate]

You were one line away from it. From [expr]/11 (N4659):

Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

...

Otherwise, the integral promotions (7.6) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:

Emphasis added. [conv.prom] says that they can take place and how they work. [expr]/11 specifies one of the times when they will take place.

What does 'Natural Size' really mean in C++?

the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.

Not really. Consider the x64 architecture. Arithmetic on any size from 8 to 64 bits will be essentially the same speed. So why have all x64 compilers settled on a 32-bit int? Well, because there was a lot of code out there which was originally written for 32-bit processors, and a lot of it implicitly relied on ints being 32 bits. And given the near-uselessness of a type which can represent values up to nine quintillion, the extra four bytes per integer would have been virtually unused. So we've decided that 32-bit ints are "natural" for this 64-bit platform.

Compare the 80286 architecture. Only 16 bits in a register. Performing 32-bit integer addition on such a platform basically requires splitting it into two 16-bit additions. Doing virtually anything with it involves splitting it up, really, with an attendant slowdown. The 80286's "natural integer size" is most definitely not 32 bits.

So really, "natural" comes down to considerations like processing efficiency, memory usage, and programmer-friendliness. It is not an acid test. It is very much a matter of subjective judgment on the part of the architecture/compiler designer.

c = a + b and implicit conversion

First, you should know that C does not fix a specific precision (number of representable values) for the standard integer types. The standard only requires a minimum precision for each type. This results in the following minimum bit sizes (the standard also allows wider and more complex representations):

  • char: 8 bits
  • short: 16 bits
  • int: 16 (!) bits
  • long: 32 bits
  • long long (since C99): 64 bits

Note: The actual limits (which imply a certain precision) of an implementation are given in limits.h.
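As a rough sketch of how limits.h can be used to inspect an implementation, the following hypothetical helpers count the value bits of unsigned int and check the minimum ranges listed above:

```c
#include <limits.h>

/* Counts the value bits of unsigned int by shifting UINT_MAX down to zero. */
int unsigned_int_bits(void) {
    int bits = 0;
    for (unsigned int v = UINT_MAX; v != 0; v >>= 1) {
        bits++;
    }
    return bits;
}

/* Checks the minimum ranges that every conforming implementation
   must provide for short, int and long. */
int minimum_ranges_hold(void) {
    return SHRT_MAX >= 32767 && INT_MAX >= 32767 && LONG_MAX >= 2147483647L;
}
```

unsigned_int_bits() returns at least 16 everywhere; on a typical PC it is expected to return 32, but that is a property of the platform, not of the language.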

Second, the type in which an operation is performed is determined by the types of the operands, not by the type of the left side of an assignment (because assignments are also just expressions). For this purpose the types given above are ordered by conversion rank. Operands with a smaller rank than int are first converted to int (or to unsigned int if int cannot represent all their values). For the remaining operands, the one with the smaller rank is converted to the type of the other operand. These are the usual arithmetic conversions.

Your implementation seems to use a 16-bit unsigned int with the same size as unsigned short, so a and b are converted to unsigned int and the operation is performed in 16 bits. For unsigned types, the operation is performed modulo 65536 (2 to the power of 16); this is called wrap-around (it is not required for signed types!). The result is then converted to unsigned long and assigned to the variables.
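The 16-bit wrap-around can be reproduced on any platform by converting the sum back to a 16-bit type. This is only a sketch; the function name add_wrapping_16 is hypothetical:

```c
#include <stdint.h>

/* Emulates a 16-bit unsigned addition: converting the sum back to
   uint16_t reduces it modulo 65536, whatever the width of int. */
uint16_t add_wrapping_16(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}
```

For example, 60000 + 60000 is 120000 mathematically, and 120000 modulo 65536 is 54464, which is what add_wrapping_16(60000, 60000) returns.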

For gcc, I assume this compiles for a PC or a 32-bit CPU. There, (unsigned) int typically has 32 bits, while (unsigned) long has at least 32 bits (required). So there is no wrap-around for the operations.

Note: For the PC, the operands are converted to int, not unsigned int. This is because int can already represent all values of unsigned short; unsigned int is not required. This can result in unexpected (actually: undefined) behaviour if the result of the operation overflows a signed int!
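Multiplication is the usual way to run into this. The sketch below assumes int is at most 32 bits wide, so that uint32_t is not itself promoted to int; the function name mul_u16 is hypothetical:

```c
#include <stdint.h>

/* Risky when int is 32 bits: both operands are promoted to signed int,
   and 65535 * 65535 does not fit, which is signed overflow (undefined).
   Shown only as a comment for that reason:
       uint32_t bad = a * b;
*/

/* Converting one operand to uint32_t first makes the multiplication
   unsigned, so the full 32-bit product is well defined. */
uint32_t mul_u16(uint16_t a, uint16_t b) {
    return (uint32_t)a * b;
}
```

With this version, mul_u16(65535, 65535) yields 4294836225, which fits in 32 unsigned bits.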

If you need types of defined size, see stdint.h (since C99) for uint16_t, uint32_t. These are typedefs to types with the appropriate size for your implementation.

You can also cast one of the operands (not the whole expression!) to the type of the result:

unsigned long c = (unsigned long)a + b;

or, using types of known size:

#include <stdint.h>
...
uint16_t a = 60000, b = 60000;
uint32_t c = (uint32_t)a + b;

Note that due to the conversion rules, casting one operand is sufficient.

Update (thanks to @chux):

The cast shown above works without problems. However, if a has a larger conversion rank than the type named in the cast, the cast might truncate its value to the smaller type. While this can easily be avoided because all types are known at compile time (static typing), an alternative is to multiply by 1 of the wanted type:

unsigned long c = ((unsigned long)1U * a) + b;

This way the larger rank of the type given in the cast or a (or b) is used. The multiplication will be eliminated by any reasonable compiler.
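The difference between the two approaches only shows when an operand is wider than the type in the cast. A minimal sketch, assuming unsigned long long is wider than 32 bits and uint32_t has at least the rank of int; both function names are hypothetical:

```c
#include <stdint.h>

/* The cast narrows a to 32 bits before the addition, losing its high bits. */
unsigned long long sum_with_cast(unsigned long long a, uint32_t b) {
    return (uint32_t)a + b;
}

/* Multiplying by 1 of the wanted type never narrows: the usual arithmetic
   conversions pick the higher-ranked type of the two operands. */
unsigned long long sum_with_unit(unsigned long long a, uint32_t b) {
    return ((uint32_t)1 * a) + b;
}
```

With a = 0x100000005 and b = 1, the cast version returns 6, while the multiplication version returns 0x100000006.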

Another approach, which avoids even having to know the name of the target type, uses the typeof() gcc extension:

unsigned long c;

/* ... many lines of code ... */

c = ((typeof(c))1U * a) + b;

Why are integer types promoted during addition in C?

So it appears that the result of numberA + 1 was promoted to uint32_t

The operands of the addition were promoted to int before the addition took place, and the result of the addition is of the same type as the effective operands (int).

Indeed, if int is 32-bit wide on your compilation platform (meaning that the type that represents uint16_t has lower “conversion rank” than int), then numberA + 1 is computed as an int addition between 1 and a promoted numberA as part of the integer promotion rules, 6.3.1.1:2 in the C11 standard:

The following may be used in an expression wherever an int or unsigned int may be used: […] An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.

[…]

If an int can represent all values of the original type […], the value is converted to an int

In your case, unsigned short which is in all likelihood what uint16_t is defined as on your platform, has all its values representable as elements of int, so the unsigned short value numberA gets promoted to int when it occurs in an arithmetic operation.
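The following sketch illustrates the point for the boundary value 65535. It assumes int is wider than 16 bits, as in the case discussed above; the function names are hypothetical:

```c
#include <stdint.h>

/* n is promoted to int before the addition, so 65535 + 1 is 65536
   (assuming int is wider than 16 bits). */
int increment_promoted(uint16_t n) {
    return n + 1;
}

/* Converting the int result back to uint16_t reduces it modulo 65536. */
uint16_t increment_wrapped(uint16_t n) {
    return (uint16_t)(n + 1);
}
```

Under that assumption, increment_promoted(65535) gives 65536, while increment_wrapped(65535) gives 0.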

Why auto is deduced to int instead of uint16_t

Addition performs the usual arithmetic conversions on its operands, which in this case results in the operands being promoted to int due to the integer promotions, and the result is also int.

You can use uint16_t instead of auto to force a conversion back or in the general case you can use static_cast.

For a rationale as to why types smaller than int are promoted to larger types, see Why must a short be converted to an int before arithmetic operations in C and C++?.

For reference, from the draft C++ standard section 5.7 Additive operators:

[...]The usual arithmetic conversions are performed for operands of
arithmetic or enumeration type[...]

and from section 5 Expressions:

[...]Otherwise, the integral promotions (4.5) shall be performed on
both operands. Then the following rules shall be applied
to the promoted operands[...]

and from section 4.5 Integral promotions (emphasis mine):

A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank
of int can be converted to a prvalue of type int if int can represent
all the values of the source type; otherwise, the source prvalue can
be converted to a prvalue of type unsigned int.

Assuming int is larger than 16-bit.


