Signed vs. Unsigned Integers for Lengths/Counts

C++ uses unsigned values for sizes because it needs the full range. On a 32-bit system, the language should make it possible to have a 4 GB vector, not just a 2 GB one. (The OS might not allow you to use all 4 GB, but the language itself doesn't want to get in your way.)

In .NET, unsigned integers aren't CLS-compliant. You can use them (in some .NET languages), but doing so limits portability and compatibility. So the base class library uses only signed integers.

However, these are both edge cases. For most purposes, a signed int is big enough. So as long as both offer the range you need, you can use either.

One advantage that signed integers sometimes have is that they make it easier to detect underflow. Suppose you're computing an array index, and because of some bad input, or perhaps a logic error in your program, you end up trying to access index -1.

With a signed integer, that is easy to detect. With an unsigned integer, the value would wrap around and become UINT_MAX. That makes the error much harder to detect, because you expected a non-negative number, and you got a non-negative number.
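A minimal sketch of the difference, assuming a hypothetical index computation "position minus one"; the function names are made up for illustration and are not from any library:

```cpp
#include <climits>

// Hypothetical helper (signed): a bad input such as pos == 0 yields -1.
constexpr int previous_index_signed(int pos) { return pos - 1; }

// A simple range check rejects the negative result.
constexpr bool is_valid_index(int idx, int size) {
    return idx >= 0 && idx < size;
}

// Hypothetical helper (unsigned): 0 - 1 wraps around to UINT_MAX,
// which is just another non-negative number.
constexpr unsigned previous_index_unsigned(unsigned pos) { return pos - 1u; }
```

With the signed version, `is_valid_index(previous_index_signed(0), size)` is false for any size. With the unsigned version there is no "negative" value left to test for; the only symptom is an implausibly large index.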

So really, it depends. C++ uses unsigned because it needs the range. .NET uses signed because it needs to work with languages which don't have unsigned.

In most cases, both will work, and sometimes, signed may enable your code to detect errors more robustly.

Should signed or unsigned integers be used for sizes?

"In a talk Carruth showed how a uint8_t as a for loop index in bzip2 creates many more machine instructions on x86 than int8_t because it has to explicitly simulate the overflow with masks and shifts."

Well, if you can use either type, the for-range must be limited to [0, 127]. Just use int as the index type, then. It is by definition the natural type for basic math operations, and typically maps well to CPU registers.

Using types optimized for minimal storage will not generate the fastest math, no. That is not a surprise. You can't draw conclusions about signed versus unsigned based on such flawed setups.

"size_t - size_t gives ambiguous values"

Well, it doesn't, but it does use modular arithmetic. size_t(1)-size_t(2)==size_t(-1), but size_t(-1) is the largest possible value. This follows directly from the definition of modular math: x-1 < x, except when x-1 wraps around because x==0. (Or equivalently x+1>x except when x+1==0)
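A short sketch of that wrap-around, plus one possible way to get a non-wrapping distance. Both function names are hypothetical and chosen only for this example:

```cpp
#include <cstddef>
#include <cstdint>

// Plain size_t subtraction is modular: 1 - 2 wraps to the largest size_t value.
constexpr std::size_t wrapped_difference(std::size_t a, std::size_t b) {
    return a - b;
}

// Hypothetical helper: the distance between two sizes, never wrapping,
// because the smaller value is always subtracted from the larger one.
constexpr std::size_t distance_between(std::size_t a, std::size_t b) {
    return a > b ? a - b : b - a;
}
```

Here `wrapped_difference(1, 2)` equals both `size_t(-1)` and `SIZE_MAX`, while `distance_between(1, 2)` is 1. Whether the distance is what you actually want depends on the case at hand.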

Calling abs(size_t(x)) is therefore also pointless, since every size_t value is non-negative. Comparing signed integers against size_t is equally fraught with unintended consequences. Explicit casts are good, as they make the consequences clear.
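As an illustration of the mixed comparison, here is a sketch with two hypothetical helpers. The first spells out the conversion that `i < n` typically performs when `size_t` is at least as wide as `int`; the second decides explicitly what a negative value should mean before casting:

```cpp
#include <cstddef>

// The signed value is converted to size_t first, so -1 becomes a huge number.
constexpr bool less_converted(int i, std::size_t n) {
    return static_cast<std::size_t>(i) < n;
}

// Explicit handling: a negative number is treated as smaller than every size;
// only non-negative values are cast.
constexpr bool less_checked(int i, std::size_t n) {
    return i < 0 || static_cast<std::size_t>(i) < n;
}
```

`less_converted(-1, 1)` is false, although -1 is mathematically less than 1; `less_checked(-1, 1)` is true. In C++20, `std::cmp_less` from `<utility>` performs this kind of value-preserving comparison.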

But there is no universal solution to automatically figure out which cast should be applied. If a mechanical rule could be invented, we could have left that rule to the compiler. We haven't, because we can't. You, as a programmer, will have to consider each case numerically.

When should I use UNSIGNED and SIGNED INT in MySQL?

UNSIGNED only stores positive numbers (or zero). On the other hand, signed can store negative numbers (i.e., may have a negative sign).

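Here's a table of the ranges of values each INTEGER type can store:

MySQL INTEGER types and lengths

Type       Bytes  Signed range                 Unsigned range
TINYINT    1      -128 to 127                  0 to 255
SMALLINT   2      -32768 to 32767              0 to 65535
MEDIUMINT  3      -8388608 to 8388607          0 to 16777215
INT        4      -2147483648 to 2147483647    0 to 4294967295
BIGINT     8      -2^63 to 2^63-1              0 to 2^64-1

Source: http://dev.mysql.com/doc/refman/5.6/en/integer-types.html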

UNSIGNED ranges from 0 to n, while signed ranges from about -n/2 to n/2.

In this case, you have an AUTO_INCREMENT ID column, so you would not have negatives. Thus, use UNSIGNED. If you do not use UNSIGNED for the AUTO_INCREMENT column, your maximum possible value will be half as high (and the negative half of the value range would go unused).
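The halving is easy to check outside the database. The sketch below is C++, not MySQL, and assumes that a 4-byte INT behaves like `std::int32_t` and INT UNSIGNED like `std::uint32_t`; the constant names are invented for the example:

```cpp
#include <cstdint>
#include <limits>

// Assumption: 4-byte signed INT modelled as std::int32_t.
constexpr std::int64_t signed_int_max =
    std::numeric_limits<std::int32_t>::max();    // 2147483647

// Assumption: 4-byte INT UNSIGNED modelled as std::uint32_t.
constexpr std::int64_t unsigned_int_max =
    std::numeric_limits<std::uint32_t>::max();   // 4294967295

// The unsigned maximum is twice the signed maximum, plus one.
static_assert(unsigned_int_max == 2 * signed_int_max + 1,
              "unsigned offers about twice the positive range");
```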

Should I use unsigned integers for counting members?

No, definitely not. The Delphi idiom is to use Integer here. Don't fight the language. In a 32-bit environment you won't have more elements in the list than an Integer can count anyway, unless you are trying to build something like a bitmap.

Let's be clear: every programmer who is going to have to use your code is going to hate you for using a Cardinal instead of an integer.

Could type punning signed to unsigned integers make bounds checking faster by eliminating the need for a >= comparison?

Yes, it's a perfectly valid optimization when you're testing a signed integer and the lower bound is zero. In fact it's such a common optimization that your compiler will almost certainly do it automatically; obfuscating your code by doing it yourself is very likely to be a pointless premature optimization.

I just tested this on GCC 4.9, and confirmed by inspecting the generated assembly code that it performs this optimization automatically at -O1 and above. I would expect all modern compilers to do the same.
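For reference, the two forms being compared look roughly like this. It is a sketch with hypothetical function names, and it assumes the upper bound `n` is itself non-negative:

```cpp
// Straightforward check: two comparisons.
constexpr bool in_bounds_plain(int i, int n) {
    return i >= 0 && i < n;
}

// The trick: converted to unsigned, a negative i becomes a very large value
// and fails the single `<` comparison. Assumes n >= 0.
constexpr bool in_bounds_unsigned(int i, int n) {
    return static_cast<unsigned>(i) < static_cast<unsigned>(n);
}
```

Both functions agree for every `i` as long as `n` is non-negative, which is why a compiler is free to emit the second form when you write the first.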


