How Disastrous Is Integer Overflow in C++

How disastrous is integer overflow in C++?

As pointed out by @Xeo in the comments (I actually brought it up in the C++ chat first):

Undefined behavior really means it and it can hit you when you least expect it.

The best example of this is here: Why does integer overflow on x86 with GCC cause an infinite loop?

On x86, signed integer overflow is just a simple wrap-around. So normally, you'd expect the same thing to happen in C or C++. However, the compiler is allowed to assume that signed overflow never happens, and it can use that assumption as an opportunity to optimize.

In the example taken from that question:

#include <iostream>
using namespace std;

int main() {
    int i = 0x10000000;

    int c = 0;
    do {
        c++;
        i += i;
        cout << i << endl;
    } while (i > 0);

    cout << c << endl;
    return 0;
}

When compiled with optimizations, GCC removes the loop test entirely: since i starts positive and only doubles, i > 0 could only become false through signed overflow, which the compiler may assume never happens, so it turns this into an infinite loop.

Generally, how do I prevent integer overflow from happening in C?

Whenever you declare an integer variable:

  1. Actually consider how large/small a number it will ever contain.
  2. Actually consider if it needs to be signed or unsigned. Unsigned is usually less problematic.
  3. Pick the smallest of the intN_t or uintN_t types from stdint.h that satisfies the above (or the int_fastN_t/int_leastN_t flavours if you wish).
  4. If needed, come up with integer constants that contain the maximum and/or minimum value the variable will hold and check against those whenever you do arithmetic (a sketch of such a check follows below).

That is, don't just aimlessly spam int all over your code without a thought.
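
As a minimal sketch of point 4 above (the helper name add_would_overflow and the choice of int32_t are my own illustration, not from the original answer), a pre-check before a signed addition might look like this:

#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: returns 1 if a + b would overflow int32_t.
// The check happens *before* the addition, so no undefined behaviour occurs.
static int add_would_overflow(int32_t a, int32_t b) {
    if (b > 0 && a > INT32_MAX - b) return 1;  // would exceed the maximum
    if (b < 0 && a < INT32_MIN - b) return 1;  // would go below the minimum
    return 0;
}

int main(void) {
    int32_t x = INT32_MAX;
    if (add_would_overflow(x, 1))
        printf("overflow would occur\n");
    else
        printf("%d\n", (int)(x + 1));
    return 0;
}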

Signed types can be problematic for reasons other than overflow too, namely whenever you need to do bitwise arithmetic. To avoid overflow/underflow and accidental signed bitwise arithmetic, you also need to be aware of the implicit integer promotion rules; the example below shows one way they can bite.
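
For example (a sketch assuming a platform with 32-bit int; the function names are mine): an unsigned 8-bit operand is promoted to signed int before a shift, so what looks like purely unsigned bit manipulation can still run into signed-arithmetic trouble.

#include <stdint.h>

// Assumes 32-bit int. The uint8_t operand is promoted to (signed) int before
// the shift, so with b == 0x80 the expression tries to produce 0x80000000,
// which does not fit in int -- undefined behaviour in C (the exact rules for
// signed shifts differ a little between C and C++ revisions), even though
// every variable in sight "looked" unsigned.
uint32_t pack_high_byte_bad(uint8_t b) {
    return b << 24;
}

// Casting to a sufficiently wide unsigned type first keeps the whole
// expression unsigned and well-defined.
uint32_t pack_high_byte_ok(uint8_t b) {
    return (uint32_t)b << 24;
}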



Is integer overflow going to get me hacked like buffer overflow or etc?

Not really, but any bug can of course be exploited if someone is aware of it - as you can see in almost every single computer game.

C/C++ unsigned integer overflow

It means the value "wraps around".

UINT_MAX + 1 == 0
UINT_MAX + 2 == 1
UINT_MAX + 3 == 2

... and so on

As the link says, this is like the modulo operator: http://en.wikipedia.org/wiki/Modulo_operation
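
A short, well-defined demonstration of the wrap-around (UINT_MAX comes from limits.h):

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned int u = UINT_MAX;
    printf("%u\n", u + 1u);   // prints 0: the result is reduced modulo UINT_MAX + 1
    printf("%u\n", u + 2u);   // prints 1
    printf("%u\n", u + 3u);   // prints 2
    return 0;
}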

Integer overflow in intermediate arithmetic expression

According to the ISO C standard, §6.2.5 ¶9:

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

This means that the would-be positive and negative overflows that seem to occur in your addition and subtraction never actually happen: the unsigned char operands are promoted to int, the intermediate results fit comfortably in an int, and so both operations are well-defined. After the expression is evaluated, the result is converted back to unsigned char, since that is the type of the left-hand side of the assignment.
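
A concrete sketch of that (assuming unsigned char operands and the usual case where int is wider than char):

#include <stdio.h>

int main(void) {
    unsigned char a = 200;
    unsigned char b = 100;

    // a and b are promoted to int, so the addition is computed as
    // 200 + 100 = 300 in int, with no overflow anywhere.
    int wide = a + b;

    // Assigning the same expression to an unsigned char converts 300
    // modulo 256, giving 44.
    unsigned char narrow = a + b;

    printf("%d %u\n", wide, (unsigned)narrow);   // prints "300 44"
    return 0;
}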

Integer overflow/underflow

I think what you're looking for is

signed char i = -1;
unsigned char j = i;
printf("%u\n", j);

In 8 bits, the signed number -1 "wraps around" to the unsigned value 255.

You asked about size_t: yes, it's an unsigned type, but it's typically 32 or even 64 bits wide. At those sizes the number 255 is representable (and has the same representation) in both the signed and unsigned variants, so there isn't a negative number that corresponds to 255. But you can certainly see similar effects using different values. For example, on a machine with 32-bit ints, this code:

unsigned int i = 4294967041;
int j = i;
printf("%d\n", j);

is likely to print -255. This value comes about because 2^32 - 255 = 4294967041.

Signed Integer value overflow in C++?

Because signed overflow/underflow is classified as undefined behavior, compilers are allowed to cheat and assume it can't happen (this came up in a CppCon talk a year or two ago, but I forget which one off the top of my head). Because you do the arithmetic first and only check the result afterwards, the optimizer is allowed to optimize away part of the check.

This is untested code, but you probably want something like the following:

if (b != 0) {
    auto max_a = std::numeric_limits<int64_t>::max() / b;
    if (max_a < a) {
        throw std::runtime_error{"overflow"};
    }
}
return a * b;

Note that this code doesn't handle underflow; if a * b can be negative, this check won't work.

Per Godbolt, you can see your version has the check completely optimized away.
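
If you are on GCC or Clang, another option (not from the original answer) is the overflow-checking builtins, which handle all sign combinations for you; the wrapper name checked_mul below is my own. A minimal sketch:

#include <cstdint>
#include <stdexcept>

// __builtin_mul_overflow is a GCC/Clang extension: it returns true if the
// mathematically correct product does not fit in the destination, and
// otherwise stores the product there.
std::int64_t checked_mul(std::int64_t a, std::int64_t b) {
    std::int64_t result;
    if (__builtin_mul_overflow(a, b, &result)) {
        throw std::runtime_error{"overflow"};
    }
    return result;
}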

Is signed integer overflow undefined behaviour or implementation defined?

Both references are correct, but they do not address the same issue.

int a = UINT_MAX; is not an instance of signed integer overflow. This definition involves a conversion from unsigned int to int with a value that exceeds the range of type int, and, as quoted from the École polytechnique's site, the C Standard defines the behavior of that conversion as implementation-defined.

#include <limits.h>

int main() {
    int a = UINT_MAX;    // implementation-defined behavior
    int b = INT_MAX + 1; // undefined behavior
    return 0;
}

Here is the text from the C Standard:

6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Some compilers have a command line option to change the behavior of signed arithmetic overflow from undefined behavior to implementation-defined: gcc and clang support -fwrapv to force integer computations to be performed modulo 2^32 or 2^64, depending on the width of the signed type. This prevents some useful optimisations, but also prevents some counterintuitive optimisations that may break innocent-looking code. See this question for some examples: What does -fwrapv do?
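
A small example of the kind of test this affects (assuming GCC or Clang; the function name is mine): without -fwrapv the compiler may assume x + 1 never wraps and fold the comparison to false, whereas with -fwrapv the addition wraps in two's complement and the test behaves as written.

#include <limits.h>
#include <stdio.h>

// Without -fwrapv, a compiler may optimize this to "return 0" because it
// assumes signed overflow cannot happen. With -fwrapv, x + 1 wraps and the
// function returns 1 when x == INT_MAX.
int increment_overflows(int x) {
    return x + 1 < x;
}

int main(void) {
    printf("%d\n", increment_overflows(INT_MAX));
    return 0;
}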

How is integer overflow exploitable?

It is definitely exploitable, but depends on the situation of course.

Old versions of SSH had an integer overflow which could be exploited remotely. The exploit caused the SSH daemon to create a hash table of size zero and then overwrite memory when it tried to store some values in there.

More details on the ssh integer overflow: http://www.kb.cert.org/vuls/id/945216

More details on integer overflow: http://projects.webappsec.org/w/page/13246946/Integer%20Overflows
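
The pattern behind many such bugs is an allocation size computed with arithmetic that can wrap. This is only a simplified sketch of that pattern (not the actual ssh code; the function names and types are mine):

#include <stdint.h>
#include <stdlib.h>

// If count is attacker-controlled and larger than SIZE_MAX / sizeof(uint32_t),
// the multiplication wraps around (well-defined for unsigned types, but still
// wrong here), malloc succeeds with a tiny size, and the loop then writes far
// past the end of the buffer.
int store_values_bad(size_t count, const uint32_t *values) {
    uint32_t *table = (uint32_t *)malloc(count * sizeof(uint32_t));
    if (table == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        table[i] = values[i];       // heap overflow when the size wrapped
    free(table);
    return 0;
}

// The fix is to reject sizes that would wrap before allocating:
int store_values_ok(size_t count, const uint32_t *values) {
    if (count > SIZE_MAX / sizeof(uint32_t))
        return -1;
    uint32_t *table = (uint32_t *)malloc(count * sizeof(uint32_t));
    if (table == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        table[i] = values[i];
    free(table);
    return 0;
}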

Why don't languages raise errors on integer overflow by default?

In C#, it was a question of performance. Specifically, out-of-box benchmarking.

When C# was new, Microsoft was hoping a lot of C++ developers would switch to it. They knew that many C++ folks thought of C++ as being fast, especially faster than languages that "wasted" time on automatic memory management and the like.

Both potential adopters and magazine reviewers were likely to get a copy of the new C#, install it, build a trivial app that no one would ever write in the real world, run it in a tight loop, and measure how long it took. Then they'd make a decision for their company or publish an article based on that result.

The fact that their test showed C# to be slower than natively compiled C++ is the kind of thing that would turn people off C# quickly. The fact that your C# app is going to catch overflow/underflow automatically is the kind of thing that they might miss. So, it's off by default.

I think it's obvious that 99% of the time we want /checked to be on. It's an unfortunate compromise.

Why is unsigned integer overflow defined behavior but signed integer overflow isn't?

The historical reason is that most C implementations (compilers) just used whatever overflow behaviour was easiest to implement with the integer representation they used. C implementations usually used the same representation as the CPU, so the overflow behavior followed from the integer representation used by the CPU.

In practice, it is only the representations for signed values that may differ according to the implementation: one's complement, two's complement, sign-magnitude. For an unsigned type there is no reason for the standard to allow variation because there is only one obvious binary representation (the standard only allows binary representation).

Relevant quotes:

C99 6.2.6.1:3:

Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.

C99 6.2.6.2:2:

If the sign bit is one, the value shall be modified in one of the following ways:

— the corresponding value with sign bit 0 is negated (sign and magnitude);

— the sign bit has the value −(2^N) (two's complement);

— the sign bit has the value −(2^N − 1) (one's complement).


Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See for instance this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.


