Allowing Signed Integer Overflows in C/C++

Is signed integer overflow still undefined behavior in C++?

Is overflow of these types still undefined behavior?

Yes. Per Paragraph 5/4 of the C++11 Standard (regarding any expression in general):

If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined. [...]

The fact that a two's complement representation is used for those signed types does not mean that arithmetic modulo 2^n is used when evaluating expressions of those types.

Concerning unsigned arithmetic, on the other hand, the Standard explicitly specifies that (Paragraph 3.9.1/4):

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number
of bits in the value representation of that particular size of integer

This means that the result of an unsigned arithmetic operation is always "mathematically defined", and the result is always within the representable range; therefore, 5/4 does not apply. Footnote 46 explains this:

46) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting
unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the
resulting unsigned integer type.
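
To make the contrast concrete, here is a small illustrative snippet of my own (not part of the Standard text): the unsigned increment must wrap, while the commented-out signed increment would be undefined behavior.

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    u = u + 1;          /* well defined: reduced modulo 2^n, so u == 0 */
    printf("%u\n", u);  /* prints 0 */

    int i = INT_MAX;
    /* i = i + 1; */    /* undefined behavior: no wrap-around is guaranteed */
    (void)i;
    return 0;
}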

Is signed integer overflow undefined behaviour or implementation defined?

Both references are correct, but they do not address the same issue.

int a = UINT_MAX; is not an instance of signed integer overflow; this definition involves a conversion from unsigned int to int with a value that exceeds the range of type int. As quoted from the École polytechnique site, the C Standard defines that behavior as implementation-defined.

#include <limits.h>

int main(){
    int a = UINT_MAX; // implementation defined behavior
    int b = INT_MAX + 1; // undefined behavior
    return 0;
}

Here is the text from the C Standard:

6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
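
As an illustration of the second rule (my own example, not taken from the Standard), converting -1 to an unsigned type is well defined and always yields the type's maximum value:

#include <assert.h>
#include <limits.h>

int main(void)
{
    unsigned int u = -1;   /* rule 2: -1 + (UINT_MAX + 1) == UINT_MAX */
    assert(u == UINT_MAX); /* guaranteed to hold */
    return 0;
}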

Some compilers have a command line option to change the behavior of signed arithmetic overflow from undefined behavior to implementation-defined: gcc and clang support -fwrapv to force integer computations to be performed modulo 2^32 or 2^64, depending on the width of the signed type. This prevents some useful optimisations, but also prevents some counterintuitive optimisations that may break innocent-looking code. See this question for some examples: What does -fwrapv do?
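
As a minimal sketch of the kind of innocent-looking code this affects (my own example, not taken from the linked question): with the default options a compiler may fold the comparison below to false, because it may assume signed overflow never happens; with -fwrapv the wrap-around is defined and the test has to be evaluated.

/* May be optimised to "return 0;" without -fwrapv. */
int wraps_on_increment(int x)
{
    return x + 1 < x;
}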

Signed Integer value overflow in C++?

Because signed overflow/underflow is classified as undefined behavior, compilers are allowed to cheat and assume it can't happen (this came up during a CppCon talk a year or two ago, but I forget which talk off the top of my head). Because you're doing the arithmetic and then checking the result, the optimizer gets to optimize away part of the check.

This is untested code, but you probably want something like the following:

if(b != 0) {
    auto max_a = std::numeric_limits<int64_t>::max() / b;
    if(max_a < a) {
        throw std::runtime_error{"overflow"};
    }
}
return a * b;

Note that this code doesn't handle underflow; if a * b can be negative, this check won't work (a sketch that also handles negative operands follows below).

Per Godbolt, you can see your version has the check completely optimized away.
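
Here is an untested sketch of a version that also covers negative operands (checked_mul is just an illustrative name), written in C style and using the usual pre-division tests so the multiplication itself can never overflow:

#include <stdbool.h>
#include <stdint.h>

/* Returns false if a * b would overflow int64_t; otherwise stores the
   product in *result and returns true. */
bool checked_mul(int64_t a, int64_t b, int64_t *result)
{
    if (a > 0) {
        if (b > 0) {
            if (a > INT64_MAX / b) return false;            /* a * b > MAX */
        } else {
            if (b < INT64_MIN / a) return false;            /* a * b < MIN */
        }
    } else {
        if (b > 0) {
            if (a < INT64_MIN / b) return false;            /* a * b < MIN */
        } else {
            if (a != 0 && b < INT64_MAX / a) return false;  /* a * b > MAX */
        }
    }
    *result = a * b;
    return true;
}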

How to detect integer overflow in C

You can predict signed int overflow, but attempting to detect it after the summation is too late. You have to test for possible overflow before you do a signed addition.

It's not possible to avoid undefined behaviour by testing for it after the summation. If the addition overflows then there is already undefined behaviour.

If it were me, I'd do something like this:

#include <limits.h>

int safe_add(int a, int b)
{
    if (a >= 0) {
        if (b > (INT_MAX - a)) {
            /* handle overflow */
        }
    } else {
        if (b < (INT_MIN - a)) {
            /* handle underflow */
        }
    }
    return a + b;
}

Refer to this paper for more information. The same paper also explains why unsigned integer overflow is not undefined behaviour and what the portability issues can be.

EDIT:

GCC and other compilers have some provisions for detecting overflow. For example, GCC provides the following built-in functions, which perform simple arithmetic operations together with checking whether the operations overflowed:

bool __builtin_add_overflow (type1 a, type2 b, type3 *res)
bool __builtin_sadd_overflow (int a, int b, int *res)
bool __builtin_saddl_overflow (long int a, long int b, long int *res)
bool __builtin_saddll_overflow (long long int a, long long int b, long long int *res)
bool __builtin_uadd_overflow (unsigned int a, unsigned int b, unsigned int *res)
bool __builtin_uaddl_overflow (unsigned long int a, unsigned long int b, unsigned long int *res)
bool __builtin_uaddll_overflow (unsigned long long int a, unsigned long long int b, unsigned long long int *res)

Visit this link.
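
For example (a usage sketch of my own, GCC/Clang only), the generic add variant reports overflow through its return value:

#include <stdio.h>

int main(void)
{
    int result;
    if (__builtin_add_overflow(2000000000, 2000000000, &result))
        puts("overflow detected");     /* taken: the sum does not fit in int */
    else
        printf("%d\n", result);
    return 0;
}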

EDIT:

Regarding the question asked by someone:

I think it would be nice and informative to explain why signed int overflow is undefined, whereas unsigned apparently isn't.

The answer depends on the compiler implementation. Historically, most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation they used.

In practice, the representations for signed values may differ (according to the implementation): one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).

Detecting signed overflow in C/C++

Your approach with subtraction is correct and well-defined. A compiler cannot optimize it away.

Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back:

#include <assert.h>
#include <limits.h>
#include <stdlib.h>

int sum(int a, int b)
{
    long long c;
    assert(LLONG_MAX > INT_MAX);   /* the wider type really must be wider */
    c = (long long)a + b;          /* performed in long long: cannot overflow */
    if (c < INT_MIN || c > INT_MAX) abort();
    return c;
}

A good compiler should convert the entire addition and if statement into an int-sized addition and a single conditional jump-on-overflow and never actually perform the larger addition.

Edit: As Stephen pointed out, I'm having trouble getting a (not-so-good) compiler, gcc, to generate the sane asm. The code it generates is not terribly slow, but certainly suboptimal. If anyone knows variants on this code that will get gcc to do the right thing, I'd love to see them.

Why is unsigned integer overflow defined behavior but signed integer overflow isn't?

The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation they used. C implementations usually used the same representation used by the CPU, so the overflow behavior followed from the integer representation used by the CPU.

In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).

Relevant quotes:

C99 6.2.6.1:3:

Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.

C99 6.2.6.2:2:

If the sign bit is one, the value shall be modified in one of the following ways:

— the corresponding value with sign bit 0 is negated (sign and magnitude);

— the sign bit has the value −(2^N) (two's complement);

— the sign bit has the value −(2^N − 1) (one's complement).


Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.

C/C++ unsigned integer overflow

It means the value "wraps around".

UINT_MAX + 1 == 0
UINT_MAX + 2 == 1
UINT_MAX + 3 == 2

.. and so on

As the link says, this behaves like the modulo operation: http://en.wikipedia.org/wiki/Modulo_operation
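
A tiny illustration of that equivalence (my own example): every unsigned result is the mathematical result reduced modulo UINT_MAX + 1, in both directions.

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    printf("%u\n", u + 3u);   /* prints 2: (UINT_MAX + 3) mod (UINT_MAX + 1) */
    printf("%u\n", 0u - 1u);  /* prints UINT_MAX: wrap-around works both ways */
    return 0;
}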

Generally, how do I prevent integer overflow from happening in the C language?

Whenever you declare an integer variable:

  1. Actually consider how large/small a number it will ever contain.
  2. Actually consider if it needs to be signed or unsigned. Unsigned is usually less problematic.
  3. Pick the smallest of the intN_t or uintN_t types from stdint.h that will satisfy the above (or the int_fastN_t/uint_fastN_t flavours if you wish).
  4. If needed, come up with integer constants that contain the maximum and/or minimum value the variable will hold and check against those whenever you do arithmetic.

That is, don't just aimlessly spam int all over your code without a thought.
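
For instance, a minimal sketch of that approach (the type, limit, and function name here are my own, hypothetical example): pick a fixed-width type that is known to be large enough and check against an explicit limit before doing the arithmetic.

#include <stdbool.h>
#include <stdint.h>

#define MAX_SENSOR_READING 4000u   /* hypothetical domain limit */

/* Accumulate a reading into a running total, refusing values that would
   push the total past the chosen limit. */
bool add_reading(uint16_t *total, uint16_t reading)
{
    if (reading > MAX_SENSOR_READING || *total > MAX_SENSOR_READING - reading)
        return false;              /* would exceed the intended range */
    *total += reading;
    return true;
}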

Signed types can be problematic for reasons other than overflow too, namely whenever you need to do bitwise arithmetic. To avoid over/underflow and accidental signed bitwise arithmetic, you also need to know the various implicit integer type promotion rules.
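
As a small illustration of those promotion rules (my own example): operands narrower than int are promoted to signed int before an operator is applied, which can make bitwise code behave unexpectedly.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t a = 0xFF;
    /* ~a promotes a to (signed) int first, so ~a == -256, not 0x00 */
    if (~a == 0)
        puts("never printed");
    printf("%d\n", ~a);   /* prints -256 */
    return 0;
}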



Is integer overflow going to get me hacked like buffer overflow, etc.?

Not really, but any bug can of course be exploited if someone is aware of it - as you can see in almost every single computer game.


