Why Is Size_T Unsigned

Why is size_t unsigned?

size_t is unsigned for historical reasons.

On an architecture with 16-bit pointers, such as "small" model DOS programming, it would be impractical to limit strings to 32 KB.

For this reason, the C standard requires (via minimum required ranges) that ptrdiff_t, the signed counterpart to size_t and the result type of pointer subtraction, be effectively at least 17 bits wide.

Those reasons can still apply in parts of the embedded programming world.

However, they do not apply to modern 32-bit or 64-bit programming, where a much more important consideration is that the unfortunate implicit conversion rules of C and C++ make unsigned types into bug attractors when they're used for numbers (and hence for arithmetic operations and magnitude comparisons). With 20-20 hindsight we can now see that the decision to adopt those particular conversion rules, where e.g. string( "Hi" ).length() < -3 is practically guaranteed to be true, was rather silly and impractical. However, that decision means that in modern programming, adopting unsigned types for numbers has severe disadvantages and no advantages, except for satisfying the feelings of those who find unsigned to be a self-descriptive type name and fail to think of typedef int MyType.

Summing up, it was not a mistake. It was a decision made for what were then very rational, practical programming reasons. It had nothing to do with transferring expectations from bounds-checked languages like Pascal to C++ (which is a fallacy, but a very common one, even if some of those who do it have never heard of Pascal).

Is size_t always unsigned?

Yes. It's usually defined as something like the following (on 32-bit systems):

typedef unsigned int size_t;

Reference:

C++ Standard Section 18.1 states that size_t is defined in <cstddef>, which corresponds to <stddef.h> in the C Standard.

C Standard Section 4.1.5 defines size_t as the unsigned integral type of the result of the sizeof operator.

Difference between size_t and unsigned int?

If it is used to represent non-negative values, why don't we use unsigned int instead of size_t?

Because unsigned int is not the only unsigned integer type. size_t could be any of unsigned char, unsigned short, unsigned int, unsigned long or unsigned long long, depending on the implementation.

The second question is whether size_t and unsigned int are interchangeable, and if not, why?

They aren't interchangeable, for the reason explained above.

And can anyone give me a good example of size_t and its brief working ?

I don't quite get what you mean by "its brief working". It works like any other unsigned type (in particular, like the type it's typedeffed to). You are encouraged to use size_t when you are describing the size of an object. In particular, the sizeof operator and various standard library functions, such as strlen(), return size_t.

Bonus: here's a good article about size_t (and the closely related ptrdiff_t type). It reasons very well why you should use it.

What's the difference between SIZE_T and unsigned long?

SIZE_T is a Windows datatype, not a standard type. As for the difference, it is that SIZE_T may not be an unsigned long. Take a look at this page which lists Windows datatypes. The entry for SIZE_T says:

The maximum number of bytes to which a pointer can point. Use for a
count that must span the full range of a pointer.
This type is declared in BaseTsd.h as follows:
typedef ULONG_PTR SIZE_T;

And ULONG_PTR has the following entry:

An unsigned LONG_PTR.
This type is declared in BaseTsd.h as follows:
#if defined(_WIN64)
  typedef unsigned __int64 ULONG_PTR;
#else
  typedef unsigned long ULONG_PTR;
#endif

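So it could be unsigned long, or it could be unsigned __int64. In your case ULONG_PTR, and in turn SIZE_T, is defined as unsigned long, but this may not always be the case.

In your specific case, ULONG_PTR is defined as _W64 unsigned long. The _W64 annotation only marks the type as one that becomes 64 bits wide in a 64-bit build, for the benefit of the compiler's 64-bit portability warnings; in a 32-bit build the type is still a 32-bit unsigned long, not unsigned __int64.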

Why are size_t and unsigned int slower than int?

Inspecting the generated assembly for all 3 variants (int, unsigned, size_t), the big difference is that in the int case the loop in the sort function is unrolled and uses SSE instructions (working on 8 ints at a time), while in the other 2 cases it does neither. Interestingly enough, the sort function is called in the int case, while it is inlined into main in the other two (likely due to the increased size of the function due to the loop unrolling).

I'm compiling from the command line using cl /nologo /W4 /MD /EHsc /Zi /Ox, using dumpbin to get the disassembly, with toolset Microsoft (R) C/C++ Optimizing Compiler Version 19.12.25830.2 for x64.

I get execution times of around 30 seconds for int and 100 seconds for the other two.

Can I just use unsigned int instead of size_t?

size_t is the most correct type to use when describing the sizes of arrays and objects. It's guaranteed to be unsigned and is supposedly "large enough" to hold any object size for the given system. Therefore it is more portable to use for that purpose than unsigned int, which is in practice either 16 or 32 bits on all common computers.

So the most canonical form of a for loop when iterating over an array is actually:

for(size_t i=0; i<sizeof array/sizeof *array; i++)
{
  do_something(array[i]);
}

And not with int i=0, which is perhaps more commonly seen, even in some C books.

size_t is also the type returned from the sizeof operator. Using the right type might matter in some situations, for example printf("%u", sizeof obj); is formally undefined behavior, so it might in theory crash printf or print gibberish. You have to use %zu for size_t.

It is quite possible that size_t happens to be the very same type as unsigned long or unsigned long long or uint32_t or uint64_t though.

unsigned int vs. size_t

The size_t type is the unsigned integer type that is the result of the sizeof operator (and of the offsetof macro), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GB).

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.

You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.

Is size_t always unsigned int?

x86-64 and aarch64 (arm64) Linux, OS X and iOS all have size_t ultimately defined as unsigned long. (This is the LP64 model. This kind of thing is part of the platform's ABI, which also defines things like the function calling convention. Other architectures may vary.) On 32-bit x86 and ARM the underlying type depends on the OS: it is unsigned long on OS X and iOS, but typically unsigned int on Linux. Either way it is 32 bits wide there, because long has the same representation as int on those targets.

I'm fairly sure it's an unsigned __int64/unsigned long long on Win64. (which uses the LLP64 model)
