Unsigned to Signed Conversion

Convert unsigned int to signed int C

It seems like you are expecting int and unsigned int to be 16-bit integers. That's apparently not the case. Most likely, they are 32-bit integers, which are large enough to avoid the wrap-around that you're expecting.

Note that there is no fully portable way to do this with a cast, because converting an out-of-range value to a signed type is implementation-defined. But this will still work in most cases:

unsigned int x = 65529;
int y = (short) x; // If short is a 16-bit integer.

or alternatively:

unsigned int x = 65529;
int y = (int16_t) x; // This is defined in <stdint.h>
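
If the implementation-defined cast is a concern, the same result can be computed with ordinary arithmetic on values that are always in range. This is only a minimal sketch: the helper name to_signed16 is hypothetical (it is not part of the original answer), and it assumes INT_MIN is at most -32768, which holds on two's complement platforms.

```c
/* Hypothetical helper: interpret the low 16 bits of x as a 16-bit
   two's complement value without any out-of-range conversion. */
static int to_signed16(unsigned int x)
{
    x &= 0xFFFFu;                       /* keep only the low 16 bits */
    if (x >= 0x8000u) {
        /* x - 0x8000 is in [0, 32767], so the cast to int is in range.
           Subtracting another 32768 in two steps maps it to [-32768, -1]. */
        return (int)(x - 0x8000u) - 32767 - 1;
    }
    return (int)x;                      /* already in [0, 32767] */
}
```

For example, to_signed16(65529) evaluates to -7, the same value the casts above produce on typical platforms.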

How does C casting unsigned to signed work?

No part of the C standard guarantees that your code shall print -1 in general. As it says, the result of the conversion is implementation-defined. However, the GCC documentation does promise that if you compile with their implementation, then your code will print -1. It's nothing to do with bit patterns, just math.

The clearly intended reading of "reduced modulo 2^N" in the GCC manual is that the result should be the unique number in the range of signed int that is congruent mod 2^N to the input. This is a precise mathematical way of defining the "wrapping" behavior that you expect, which happens to coincide with what you would get by reinterpreting the bits.

Assuming 32 bits, UINT_MAX has the value 4294967295. This is congruent mod 4294967296 to -1. That is, the difference between 4294967295 and -1 is a multiple of 4294967296, namely 4294967296 itself. Moreover, this is necessarily the unique such number in [-2147483648, 2147483647]. (Any other number congruent to -1 would be at least -1 + 4294967296 = 4294967295, or at most -1 - 4294967296 = -4294967297). So -1 is the result of the conversion.

In other words, add or subtract 4294967296 repeatedly until you get a number that's in the range of signed int. There's guaranteed to be exactly one such number, and in this case it's -1.
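
The "add or subtract repeatedly" description can be written out literally. The sketch below assumes a 32-bit signed int as in the explanation above and does the arithmetic in long long, which is at least 64 bits wide, so nothing overflows. The function name reduce_to_int32_range is hypothetical, and a real implementation would use a single modulo step rather than a loop.

```c
/* Hypothetical helper: reduce v modulo 2^32 into the range
   [-2147483648, 2147483647] by repeated addition or subtraction. */
static long long reduce_to_int32_range(long long v)
{
    const long long modulus = 4294967296LL;   /* 2^32 */

    while (v > 2147483647LL)
        v -= modulus;                          /* too large: step down */
    while (v < -2147483648LL)
        v += modulus;                          /* too small: step up */
    return v;
}
```

Called with 4294967295 (UINT_MAX for 32 bits), the first loop runs once and the result is -1.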

Unsigned to signed conversion in C

The conversion of an out-of-range value to signed int is implementation-defined (as you correctly mentioned). On some systems, for example, the result could be INT_MAX (a saturating conversion).

For gcc the implementation behavior is defined here:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90, C99

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
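
To make the difference between the two implementation choices concrete, here is a rough sketch that spells both out with in-range arithmetic only. Both function names are hypothetical. convert_wrapping follows the modulo 2^N rule quoted from the GCC manual for N = 32, and convert_saturating models the saturating alternative mentioned earlier; neither is taken from any compiler's source.

```c
#include <stdint.h>

/* Wrapping choice: reduce the value modulo 2^32 into the int32_t range. */
static int32_t convert_wrapping(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;              /* representable: value unchanged */
    /* u - 2^31 is in [0, INT32_MAX]; subtracting 2^31 in two steps
       lands in [INT32_MIN, -1] without signed overflow. */
    return (int32_t)(u - 2147483648u) - 2147483647 - 1;
}

/* Saturating choice: clamp anything too large to INT32_MAX. */
static int32_t convert_saturating(uint32_t u)
{
    return u > (uint32_t)INT32_MAX ? INT32_MAX : (int32_t)u;
}
```

For the input 4294967295, the wrapping version yields -1 and the saturating version yields INT32_MAX. For any input that fits in int32_t, both return the value unchanged.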


C++ unsigned and signed conversion


unsigned int u = 10;
int i = -3;

Given these declarations, the evaluation of i + u proceeds by first converting i to unsigned int. For a 32-bit unsigned int, this conversion wraps modulo 2^32, which is 4,294,967,296. The result of this wrapping is −3 + 4,294,967,296 = 4,294,967,293.

After the conversion, we are adding 4,294,967,293 (converted i) and 10 (u). This would be 4,294,967,303. Since this exceeds the range of a 32-bit unsigned int, it is wrapped modulo 4,294,967,296. The result of this is 4,294,967,303 − 4,294,967,296 = 7.

Thus “7” is printed.
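
The two steps can be checked with a small sketch. It is written in C, where the usual arithmetic conversions behave the same way for this expression, and the helper names add_mixed and as_unsigned are hypothetical.

```c
/* Hypothetical helper: i is converted to unsigned int before the addition,
   so the sum is computed modulo 2^N, where N is the width of unsigned int. */
static unsigned int add_mixed(int i, unsigned int u)
{
    return i + u;               /* same as (unsigned int)i + u */
}

/* Hypothetical helper: the value i takes once converted to unsigned int. */
static unsigned int as_unsigned(int i)
{
    return (unsigned int)i;
}
```

With a 32-bit unsigned int, as_unsigned(-3) is 4294967293 and add_mixed(-3, 10) is 7. The final result is 7 for any width of unsigned int, because the wrap-around in the conversion and the wrap-around in the addition cancel.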

Proper way to perform unsigned-signed conversion

I know that signed types overflow is undefined behaviour,

True, but does not apply here.

a += 140; is not signed integer overflow and is not UB. It behaves like a = a + 140; and the sum a + 140 does not overflow, because a is promoted to int before the addition, whether a is an 8-bit signed char or unsigned char.

The issue is what happens when the sum a + 140 is out of char range and assigned to a char.

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C17dr § 3

It is implementation-defined behavior to assign a value outside the char range when char is signed and 8-bit.

Usually the implementation-defined behavior is a wrap that the implementation fully documents, so a += 140; is fine as is.

Alternatively, the implementation-defined behavior might be to cap the value to the char range when char is signed.

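char a = 42;
a += 140;   // 42 + 140 = 182 is computed as int, then converted to char
// With a capping implementation, this might act as if
// (max/min are illustrative here, not standard C functions):
a = max(min(42 + 140, CHAR_MAX), CHAR_MIN);
a = 127;    // when CHAR_MAX is 127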

To avoid implementation-defined behavior, perform the + or - on a accessed as an unsigned char:

*((unsigned char *)&a) += small_offset;

Or just use unsigned char a and avoid all this. unsigned char is defined to wrap.
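
The pointer-cast approach can be wrapped in a small function. This is a sketch only: the name add_offset_wrapping is hypothetical, and the expected value in the note below assumes CHAR_BIT is 8.

```c
/* Hypothetical helper: add offset to the byte stored in a, wrapping
   modulo 2^CHAR_BIT, by accessing a through an unsigned char pointer. */
static char add_offset_wrapping(char a, unsigned char offset)
{
    unsigned char *p = (unsigned char *)&a;  /* char objects may be accessed this way */

    /* *p + offset is computed as int; converting the sum back to
       unsigned char is well-defined and wraps. */
    *p = (unsigned char)(*p + offset);
    return a;
}
```

Starting from 42 and adding 140, the stored byte becomes 182. Read back as unsigned char, the value is 182 regardless of whether plain char is signed.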

Why is signed and unsigned addition converted differently for 16 and 32 bit integers?

When you do uint16_t(2)+int16_t(-3), both operands are types that are smaller than int. Because of this, each operand is promoted to an int and signed + signed results in a signed integer and you get the result of -1 stored in that signed integer.

When you do uint32_t(2)+int32_t(-3), both operands are the size of an int or larger, so no promotion happens. You now have unsigned + signed, which converts the signed integer to an unsigned integer. With a 32-bit int, -3 becomes 4294967293, and the sum, mathematically -1, wraps to the largest representable value, 4294967295.
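
A short sketch makes the difference visible. It is written in C, where the same promotion rules apply, and it assumes a 32-bit int. The function names sum16 and sum32 are hypothetical; returning long long keeps the result of each addition visible without a further narrowing conversion.

```c
#include <stdint.h>

/* Both operands are narrower than int, so each is promoted to int
   and the addition is signed. */
static long long sum16(uint16_t a, int16_t b)
{
    return a + b;
}

/* With a 32-bit int, no promotion happens: b is converted to uint32_t
   and the addition is unsigned, wrapping modulo 2^32. */
static long long sum32(uint32_t a, int32_t b)
{
    return a + b;
}
```

Under that assumption, sum16(2, -3) returns -1 while sum32(2, -3) returns 4294967295.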

C++ Converting unsigned to signed integer portability

Since C++20 finally got rid of ones' complement and sign-magnitude integers, conversion between signed and unsigned integers is well-defined and reversible. All standard integer types are now 2's complement and conversion between signed and unsigned does not alter any bits in the representation.

For versions of C++ prior to C++20, the original answer still applies. I'm leaving it as a historical remnant.

Conversion of an unsigned integer to a signed integer where the unsigned value is outside of the range of the signed type is implementation-defined. You cannot count on being able to round-trip a negative integer to unsigned and then back to signed. [1]

C++ standard, [conv.integral], § 4.7/3:

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

[1] It seems likely that it will work, but there are no guarantees.
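
As an illustration in C (the language used for the other sketches here), a round trip can avoid the implementation-defined conversion by copying the object representation instead of converting the value. This relies on int32_t being two's complement with no padding bits, which <stdint.h> requires wherever the type exists. The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Signed to unsigned is always well-defined: the value is reduced modulo 2^32. */
static uint32_t to_unsigned_bits(int32_t s)
{
    return (uint32_t)s;
}

/* Unsigned to signed by copying bytes rather than converting the value. */
static int32_t from_unsigned_bits(uint32_t u)
{
    int32_t s;
    memcpy(&s, &u, sizeof s);   /* same size, no padding bits in either type */
    return s;
}
```

With these helpers, from_unsigned_bits(to_unsigned_bits(-5)) gives back -5.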

