Why Unsigned int 0xFFFFFFFF Is Equal to int -1

Why is unsigned int 0xFFFFFFFF equal to int -1?

C and C++ can run on many different architectures and machine types. Consequently, those machines can use different representations of signed numbers, two's complement and ones' complement being the most common. In general, you should not rely on a particular representation in your program.

For unsigned integer types (size_t being one of those), the C standard (and the C++ standard too, I think) specifies precise overflow rules. In short, if SIZE_MAX is the maximum value of the type size_t, then the expression

(size_t) (SIZE_MAX + 1)

is guaranteed to be 0, and therefore, you can be sure that (size_t) -1 is equal to SIZE_MAX. The same holds true for other unsigned types.

Note that the above holds true:

  • for all unsigned types,
  • even if the underlying machine doesn't represent numbers in Two's complement. In this case, the compiler has to make sure the identity holds true.

Also, the above means that you can't rely on specific representations for signed types.
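A minimal sketch of these guarantees (using SIZE_MAX and UINT_MAX from <stdint.h> and <limits.h>; every comparison below is required to hold on any conforming implementation):

#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void)
{
    size_t wrapped = (size_t) (SIZE_MAX + 1);  /* wraps to 0 */
    size_t max     = (size_t) -1;              /* wraps to SIZE_MAX */
    unsigned int u = (unsigned int) -1;        /* same identity, another unsigned type */

    printf("%d\n", wrapped == 0);    /* prints 1 */
    printf("%d\n", max == SIZE_MAX); /* prints 1 */
    printf("%d\n", u == UINT_MAX);   /* prints 1 */
    return 0;
}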

Edit: In order to answer some of the comments:

Let's say we have a code snippet like:

int i = -1;
long j = i;

There is a type conversion in the assignment to j. Assuming that int and long have different sizes (as on most, if not all, 64-bit systems), the bit patterns at the memory locations of i and j will differ, because the objects have different sizes. The compiler nevertheless makes sure that the values of i and j are both -1.

Similarly, when we do:

size_t s = (size_t) -1;

There is a type conversion going on. The -1 is of type int. It has a bit-pattern, but that is irrelevant for this example because when the conversion to size_t takes place due to the cast, the compiler will translate the value according to the rules for the type (size_t in this case). Thus, even if int and size_t have different sizes, the standard guarantees that the value stored in s above will be the maximum value that size_t can take.
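Putting both snippets together, here is a small sketch of what the conversions guarantee (the sizes shown in the comment are typical of 64-bit LP64 systems, but the value checks hold everywhere):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int i = -1;
    long j = i;             /* value-preserving conversion: j is -1 */
    size_t s = (size_t) -1; /* modular conversion: s is SIZE_MAX */

    printf("%zu %zu\n", sizeof i, sizeof j);  /* e.g. 4 8 on LP64 */
    printf("%d\n", j == -1);                  /* prints 1 */
    printf("%d\n", s == SIZE_MAX);            /* prints 1 */
    return 0;
}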

If we do:

long j = LONG_MAX;
int i = j;

If LONG_MAX is greater than INT_MAX, then the value in i is implementation-defined (C89, section 3.2.1.2).
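A sketch of that case; the -1 in the comment is what typical two's complement implementations such as GCC produce, not something the standard promises:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    long j = LONG_MAX;
    int i = j;         /* implementation-defined if LONG_MAX > INT_MAX */

    printf("%d\n", i); /* commonly -1 on LP64/two's complement, but not guaranteed */
    return 0;
}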

Why is an int variable valued 0xffffffff shifted right by 1 not equal to 0x7fffffff?

Because 0xffffffff on your platform is a value of type unsigned int.
Therefore 0xffffffff >> 1 equals 0xffffffffU >> 1 which is 0x7fffffffU.

When you assign a value of type unsigned int to a variable of type int, the value is converted to the target type. 0xffffffff, i.e. 4294967295, is outside the range of a 32-bit signed integer, so naturally it cannot be stored as is. But because your computer uses a two's complement system, the bits remain the same: all bits stay set, but the value is now interpreted as -1 instead of 4294967295.

The behaviour of right-shifting negative integers is implementation-defined in the C standard. GCC defines it as an arithmetic right shift, i.e. using sign extension. So the bit representation ffffffff shifted right by 1 again yields the bit representation ffffffff, which as a signed int signifies the value -1.

Finally, when you compare an int against an unsigned int using the == operator, the operands undergo the "usual arithmetic conversions": the int -1 is converted to an unsigned int, which on two's complement systems retains the bit pattern ffffffff, but the value is now interpreted as 4294967295 instead of -1. That value compares equal to 0xffffffff, yet unequal to 0x7fffffff (2147483647).


What is more confusing is that the value of c does not equal 0xffffffff even though the equality operator returns true! The value of c is -1, while the value of 0xffffffff is 0xffffffffu, i.e. 4294967295, and clearly the difference between -1 and 4294967295 is 4294967296! It is exactly because of the signed/unsigned comparison that distinct values can compare equal - and GCC, for example, will warn about this with -Wextra:

warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
5 | printf("%x\n", (c) == (0xffffffff));
| ^~

To see that the values indeed are distinct, you need to cast both sides to a type that is large enough to contain both of them - long long int is a portable one:

printf("%d\n", (long long int)c == (long long int)0xffffffff);

will not print 1.
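The whole discussion fits in one small program; this is a hedged reconstruction of the question's code, assuming 32-bit int and a two's complement implementation with arithmetic right shift (such as GCC on x86):

#include <stdio.h>

int main(void)
{
    int c = 0xffffffff;  /* implementation-defined conversion; typically c == -1 */

    printf("%d\n", (c >> 1) == 0x7fffffff);  /* typically 0: arithmetic shift keeps the sign */
    printf("%d\n", c == 0xffffffff);         /* typically 1: -1 converts to 4294967295u */
    printf("%d\n", (long long int) c == (long long int) 0xffffffff);  /* 0: the values differ */
    return 0;
}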

What is the right way to assign 0xFFFFFFFF to an unsigned integer variable?

size_t a = -1;

This will initialize a with the biggest value size_t can hold. That behavior is defined in terms of modular arithmetic, not bit patterns, so it holds regardless of whether signed integers use two's complement or something else.

Unsigned integers are required to be encoded directly in binary, so the largest value will always have the all-ones 0xFF...FF bit pattern.

To silence the warning, both of your solutions work. It's just a matter of personal taste which one you use.
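The question's original snippets aren't quoted here, but two common warning-free spellings look like this (a sketch; both store the same value):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    size_t a = (size_t) -1;  /* explicit cast silences conversion warnings */
    size_t b = SIZE_MAX;     /* named constant from <stdint.h> */

    printf("%d\n", a == b);  /* prints 1 */
    return 0;
}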

Why do C compilers not warn when assigning integer value too high for signed type?

Assume that int and unsigned int are 32 bits, which is the case on most platforms you're likely to be using (both 32-bit and 64-bit systems). Then the constant 0xFFFFFFFF is of type unsigned int, and has the value 4294967295.

This:

int n = 0xFFFFFFFF;

implicitly converts that value from unsigned int to int. The result of the conversion is implementation-defined; there is no undefined behavior. (In principle, it can also cause an implementation-defined signal to be raised, but I know of no implementations that do that).

Most likely the value stored in n will be -1.

printf("%u\n", n);

Here you use a %u format specifier, which requires an argument of type unsigned int, but you pass it an argument of type int. The standard says that values of corresponding signed and unsigned type are interchangeable as function arguments, but only for values that are within the range of both types, which is not the case here.

This call does not perform a conversion from int to unsigned int. Rather, an int value is passed to printf, which assumes that the value it received is of type unsigned int. The behavior is undefined. (Again, this would be a reasonable thing to warn about.)

The most likely result is that the int value of -1, which (assuming 2's-complement) has the same representation as 0xFFFFFFFF, will be treated as if it were an unsigned int value of 0xFFFFFFFF, which is printed in decimal as 4294967295.

You can get a warning on int n = 0xFFFFFFFF; by using the -Wconversion or -Wsign-conversion option. These options are not included in -Wextra or -Wall. (You'd have to ask the gcc maintainers why.)

I don't know of an option that will cause a warning on the printf call.

(Of course the fix is to define n as an unsigned int, which makes everything correct and consistent.)
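A sketch of that fixed version; with the type and the format specifier in agreement there is nothing left to warn about, and compiling the original with -Wconversion reproduces the assignment warning:

#include <stdio.h>

int main(void)
{
    unsigned int n = 0xFFFFFFFF;  /* no narrowing: the constant is already unsigned int */
    printf("%u\n", n);            /* matching specifier: prints 4294967295 */
    return 0;
}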

Dealing with unsigned integers

A lot of the "you shouldn't use unsigned integers" advice basically boils down to the fear that you will mix up signed and unsigned integers, causing a wrap-around, or to a wish to avoid the complex integer promotion rules.

But in your code, I see no reason not to use uint32_t and std::size_t, because m_X_AxisLen and m_Y_AxisLen should not contain negative values, and unsigned types make a lot more sense here.

So, I suggest changing m_X_AxisLen and m_Y_AxisLen to:

std::size_t m_Y_AxisLen;
std::size_t m_X_AxisLen; // for consistency

Change row and column to

std::size_t row = 0;
// and
std::size_t column = 0;

Make getX_AxisLen( ) return a std::size_t

And make the for loop:

for ( int column = 0; column < getX_AxisLen( ) - 1; ++column )

to:

for ( std::size_t column = 0; column + 1 < getX_AxisLen( ); ++column )

Because if getX_AxisLen() returns 0, getX_AxisLen( ) - 1 will cause a wrap-around.
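A sketch of the trap in isolation (get_len here is a hypothetical stand-in for getX_AxisLen, returning 0 to trigger the wrap-around):

#include <stdio.h>

static size_t get_len(void) { return 0; }  /* hypothetical stand-in */

int main(void)
{
    /* BAD: get_len() - 1 wraps to SIZE_MAX when get_len() == 0,
       so this loop would run (nearly) forever:
       for (size_t column = 0; column < get_len() - 1; ++column) ... */

    /* GOOD: the addition stays in range; the body never runs for length 0 */
    for (size_t column = 0; column + 1 < get_len(); ++column)
        printf("%zu\n", column);

    return 0;
}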

Basically, use the thing that makes sense. If a value cannot be negative, use the unsigned type.

Why does an unsigned int still print as signed?

Passing unsigned integers to %d is undefined behavior: a mismatched format specifier is UB.

If you assign a negative value to an unsigned variable, that's fine: the value is taken modulo UINT_MAX + 1 (or UCHAR_MAX + 1 for an unsigned char), so (-10) mod (UCHAR_MAX + 1) = 256 - 10 = 246, and b is 4294967296 - 10 = 4294967286. Unsigned integral overflow is required to wrap around.

When printf interprets these numbers, it finds that 246 is suitable for %d, the format specifier for signed int, while 4294967286 is reinterpreted as -10. That's all.
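A hedged reconstruction of the code being discussed (the question itself isn't quoted above; the output comment assumes a typical 32-bit two's complement platform, and passing b to %d is formally undefined):

#include <stdio.h>

int main(void)
{
    unsigned char a = -10;  /* (-10) mod 256 == 246 */
    unsigned int  b = -10;  /* (-10) mod 2^32 == 4294967286 */

    /* a promotes to int 246, which %d prints as-is;
       b is reinterpreted under %d and typically prints -10 */
    printf("%d %d\n", a, b);
    return 0;
}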

Signed vs Unsigned numbers 0xFFFFFFFF ambiguity?


I am really confused! How does the machine tell the difference at a low level? After all, those are 32 bits, all turned on.

The machine doesn't; the compiler does. It is the compiler that knows the types of signed and unsigned variables (ignoring for a moment the fact that both signed and unsigned are keywords in C and C++). Therefore, the compiler knows what instructions to generate and what functions to call based on those types.

The distinction between the types is made at compile time, changing the interpretation of possibly identical raw data based on its compile-time type.

Consider an example of printing the value of a variable. When you write

cout << mySignedVariable << " " << myUnsignedVariable << endl;

the compiler sees two overloads of the << operator being applied:

  • The first << makes a call to ostream& operator<< (int val);
  • The second << makes a call to ostream& operator<< (unsigned int val);

Once the code reaches the proper operator implementation, you are done: the code generated for the specific overload has the information on how to handle the signed or unsigned value "baked into" its machine code. The implementation that takes an int uses signed machine instructions for comparisons, divisions, multiplications, etc. while the unsigned implementation uses different instructions, producing the desired results.
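The same point can be made without iostreams. In this C sketch (assuming 32-bit two's complement int), the two variables hold the identical bit pattern, yet the compiler emits signed instructions for one and unsigned instructions for the other:

#include <stdio.h>

int main(void)
{
    int          s = -2;           /* bit pattern 0xFFFFFFFE */
    unsigned int u = 0xFFFFFFFEu;  /* the very same bit pattern */

    printf("%d\n", s / 2);   /* signed division:   -1 */
    printf("%u\n", u / 2);   /* unsigned division: 2147483647 */
    printf("%d\n", s < 0);   /* signed compare:    1 */
    printf("%d\n", u < 0u);  /* unsigned compare:  0 (never true) */
    return 0;
}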


