C++ Implicit Conversion (Signed + Unsigned)

Relevant quote from the Standard:

5 Expressions [expr]

10 Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:

[2 clauses about equal types or types of equal sign omitted]

— Otherwise, if the operand that has unsigned integer type has rank
greater than or equal to the rank of the type of the other operand,
the operand with signed integer type shall be converted to the type of
the operand with unsigned integer type.

— Otherwise, if the type of
the operand with signed integer type can represent all of the values
of the type of the operand with unsigned integer type, the operand
with unsigned integer type shall be converted to the type of the
operand with signed integer type.

— Otherwise, both operands shall be
converted to the unsigned integer type corresponding to the type of
the operand with signed integer type.

Let's consider one example for each of the 3 clauses above, on a system where sizeof(int) < sizeof(long) == sizeof(long long) (easily adaptable to other cases):

#include <iostream>

signed int s1 = -4;
unsigned int u1 = 2;

signed long int s2 = -4;
unsigned int u2 = 2;

signed long long int s3 = -4;
unsigned long int u3 = 2;

int main()
{
    std::cout << (s1 + u1) << "\n"; // 4294967294
    std::cout << (s2 + u2) << "\n"; // -2
    std::cout << (s3 + u3) << "\n"; // 18446744073709551614
}


First clause: the types have equal rank, so the signed int operand is converted to unsigned int. The conversion is done modulo 2^32 (for a 32-bit unsigned int), so -4 becomes 4294967292, and adding 2 gives the printed value.

Second clause: the signed type has higher rank and (on this platform!) can represent all values of the unsigned type, so the unsigned operand is converted to the signed type, and you get -2.

Third clause: the signed type again has higher rank, but (on this platform!) cannot represent all values of the unsigned type, so both operands are converted to unsigned long long. After the value transformation on the signed operand, you get the printed value.

Note that if the unsigned operand is large enough (e.g. 6 in these examples), the end result is 2 in all 3 examples: in the first and third the unsigned sum wraps around modulo 2^n, and in the second the signed arithmetic gives 2 directly.

(Added) Note that you get even more unexpected results when you compare these types. Let's consider example 1 above with <:

#include <iostream>

signed int s1 = -4;
unsigned int u1 = 2;
int main()
{
    std::cout << (s1 < u1 ? "s1 < u1" : "s1 !< u1") << "\n"; // "s1 !< u1"
    std::cout << (-4 < 2u ? "-4 < 2u" : "-4 !< 2u") << "\n"; // "-4 !< 2u"
}

Since 2u is made unsigned explicitly by the u suffix, the same rules apply. The result is probably not what you expect: writing -4 < 2u in C++ does not behave like the mathematical comparison -4 < 2.

C integer implicit conversion

Since the constant 0xffffffff, which (assuming int is 32 bits) has type unsigned int, is being used to initialize an object of type int, this involves a conversion from unsigned int to int.

Conversion between integer types is described in section 6.3.1.3 of the C standard:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new
type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than
the maximum value that can be represented in the new type
until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined
or an implementation-defined signal is raised

Paragraph 3 is what applies in this case. The value in question is outside the range of the destination type and the destination is signed. So an implementation-defined conversion happens.

If you compile with gcc using the -Wconversion flag, it will give you a warning:

x1.c:6:5: warning: conversion of unsigned constant value to negative integer [-Wsign-conversion]
int a = 0xffffffff;

Also:

This can be easily checked by doing printf("%s", 0xffffffff);

This invokes undefined behavior because the %s format specifier expects a char * which points to a null-terminated string. The value you're passing is not of this type, and likely isn't a valid memory address.

Integer promotions also don't apply here because there is no expression with a type of lower rank than int or unsigned int.

Getting warnings for implicit conversion overflow from unsigned to signed

The issue here is that -Wconversion only warns if the value, when cast back to the source type, may not be the same value. For example, if foo took an int instead, -Wconversion would issue a warning, because the value stored in the int might not convert back to the original uint64_t value. If we have

uint64_t u = some_value;
int64_t s = static_cast<int64_t>(u);
uint64_t check = static_cast<uint64_t>(s);

then check == u will always be true (so long as int64_t is also two's complement), so -Wconversion will not issue a warning because we get the source value back.

What you'll need in this case is

-Wsign-conversion

which will warn you that the signs mismatch.

C++ - Implicit conversion of unsigned long long to signed long long?

To show the actual types involved:

// Operator+ accepts difference type
// https://en.cppreference.com/w/cpp/iterator/move_iterator/operator_arith
// constexpr move_iterator operator+( difference_type n ) const;

#include <type_traits>
#include <vector>
#include <iterator>

int main()
{
    std::vector<int> v1;

    auto a = v1.begin();
    auto d = v1.size();

    // the difference type for the iterator is "long long" on this platform
    // (platform-specific: on 64-bit Linux, for example, it is typically "long")
    static_assert(std::is_same_v<long long, decltype(a)::difference_type>);

    // the type for size is not the same as the difference type
    static_assert(!std::is_same_v<decltype(d), decltype(a)::difference_type>);

    // it is std::size_t
    static_assert(std::is_same_v<decltype(d), std::size_t>);

    return 0;
}

Signed to unsigned conversion in C - is it always safe?

Short Answer

Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).

Long Answer

According to the C99 Standard:

6.3.1.8 Usual arithmetic conversions

  1. If both operands have the same type, then no further conversion is needed.
  2. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
  3. Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  4. Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
  5. Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

In your case, we have one unsigned int (u) and signed int (i). Referring to (3) above, since both operands have the same rank, your i will need to be converted to an unsigned integer.

6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:

6.2.5 (9)

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

Bonus: Arithmetic Conversion Semi-WTF

#include <stdio.h>

int main(void)
{
    unsigned int plus_one = 1;
    int minus_one = -1;

    if(plus_one < minus_one)
        printf("1 < -1");
    else
        printf("boring");

    return 0;
}

You can use this link to try this online: https://repl.it/repls/QuickWhimsicalBytes

Bonus: Arithmetic Conversion Side Effect

Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, ie:

unsigned int umax = -1; // umax set to UINT_MAX

This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?

c++ safeness of code with implicit conversion between signed and unsigned

In a code like the previous one, if I know for sure that n + x is positive, can I assume that the sum of unsigned int n and int x gives the expected value?

Yes.

First, the signed value is converted to unsigned, using modulo arithmetic:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).

Then two unsigned values will be added using modulo arithmetic:

Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.

This means that you'll get the expected answer.

Even if the result would be negative in the mathematical sense, the result in C++ would be a number that is congruent to that negative number modulo 2^n.

Note that I've supposed here that you add two same-sized integers.

Proper way to perform unsigned-signed conversion

I know that signed types overflow is undefined behaviour,

True, but does not apply here.

a += 140; is not signed integer overflow and not UB. It behaves like a = a + 140; where a is first promoted to int, so a + 140 does not overflow when a is an 8-bit signed char or unsigned char.

The issue is what happens when the sum a + 140 is out of char range and assigned to a char.

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C17dr § 6.3.1.3 3

It is implementation-defined behavior to assign a value outside the char range when char is signed and 8-bit.

Usually the implementation-defined behavior is a wrap and fully defined, so a += 140; is fine as is.

Alternatively, the implementation-defined behavior might have been to cap the value to the char range when char is signed.

char a = 42;
a += 140;
// Might act as if
a = max(min(a + 140, CHAR_MAX), CHAR_MIN);
a = 127;

To avoid implementation-defined behavior, perform the + or - on a accessed as an unsigned char:

*((unsigned char *)&a) += small_offset;

Or just use unsigned char a and avoid all this. unsigned char is defined to wrap.

C++ implicit datatype conversion from unsigned to signed

There is no "signed integer literal": 5u - 10 is actually the subtraction of 10 from 5u.

The result of the subtraction is unsigned and wraps around, giving "5 less than the wrapped-around 0" (4294967291 = 2^32 - 5).

The first statement initializes an int, hence the unsigned compile-time constant is converted to int. The result is the expected -5 because your hardware uses two's complement arithmetic (-5 and 4294967291 are the same 32-bit pattern).

The second statement initializes a variable whose type is deduced from the initializer expression, and that type is unsigned.


