size_t vs int in C++ and/or C

size_t vs int in C++ and/or C

In general, size_t should be used whenever you are measuring the size of something. It is really strange that size_t is only required to represent values between 0 and SIZE_MAX, and SIZE_MAX is only required to be at least 65,535...

The other interesting constraints from the C++ and C Standards are:

  • the result of the sizeof operator has type size_t, which is an unsigned integer type
  • operator new() takes the number of bytes to allocate as a size_t parameter
  • size_t is defined in <cstddef>
  • SIZE_MAX is defined in <stdint.h> in C99 but not mentioned in C++98?!
  • size_t is not included in the list of fundamental integer types, so I have always assumed that size_t is a type alias for one of the fundamental unsigned types: unsigned char, unsigned short, unsigned int, or unsigned long.

If you are counting bytes, then you should definitely be using size_t. If you are counting the number of elements, then you should probably use size_t since this seems to be what C++ has been using. In any case, you don't want to use int - at the very least use unsigned long or unsigned long long if you are using TR1. Or... even better... typedef whatever you end up using to size_type or just include <cstddef> and use std::size_t.

Difference between size_t and unsigned int?

If it is used to represent a non-negative value, why do we not use unsigned int instead of size_t?

Because unsigned int is not the only unsigned integer type. size_t could be any of unsigned char, unsigned short, unsigned int, unsigned long or unsigned long long, depending on the implementation.

Second question: are size_t and unsigned int interchangeable or not, and if not, why?

They aren't interchangeable, for the reason explained above ^^.

And can anyone give me a good example of size_t and its brief working?

I don't quite get what you mean by "its brief working". It works like any other unsigned type (in particular, like the type it's typedeffed to). You are encouraged to use size_t when you are describing the size of an object. In particular, the sizeof operator and various standard library functions, such as strlen(), return size_t.
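For illustration, here is a minimal C sketch (the variable and message names are just made up for the example) showing that both sizeof and strlen() yield size_t values:

#include <stddef.h>  /* size_t */
#include <stdio.h>
#include <string.h>  /* strlen */

int main(void)
{
    const char *msg = "hello";

    /* both the sizeof operator and strlen() give a size_t, not an int */
    size_t object_size = sizeof(double);
    size_t string_length = strlen(msg);

    /* %zu is the printf conversion specifier for size_t (C99 and later) */
    printf("sizeof(double) = %zu, strlen(msg) = %zu\n", object_size, string_length);
    return 0;
}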

Bonus: here's a good article about size_t (and the closely related ptrdiff_t type). It explains very well why you should use it.

What's the difference between size_t and int in C++?

From the friendly Wikipedia:

The stdlib.h and stddef.h header files define a datatype called size_t which is used to represent the size of an object. Library functions that take sizes expect them to be of type size_t, and the sizeof operator evaluates to size_t.

The actual type of size_t is platform-dependent; a common mistake is to assume size_t is the same as unsigned int, which can lead to programming errors, particularly as 64-bit architectures become more prevalent.

Also, check Why size_t matters

unsigned int vs. size_t

The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GB).

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.
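A quick way to see how the two types compare on your own platform is a sketch like the following (the exact numbers printed obviously depend on the implementation):

#include <stdint.h>  /* SIZE_MAX */
#include <stdio.h>

int main(void)
{
    /* on a typical x86-64 system size_t is 8 bytes while unsigned int is 4,
       but the standard allows them to be equal or even the other way around */
    printf("sizeof(unsigned int) = %zu\n", sizeof(unsigned int));
    printf("sizeof(size_t)       = %zu\n", sizeof(size_t));
    printf("SIZE_MAX             = %zu\n", (size_t)SIZE_MAX);
    return 0;
}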

You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.

What is size_t in C?

From Wikipedia:

According to the 1999 ISO C standard (C99), size_t is an unsigned integer type of at least 16 bits (see sections 7.17 and 7.18.3).

size_t is an unsigned data type defined by several C/C++ standards, e.g. the C99 ISO/IEC 9899 standard, that is defined in stddef.h. It can be further imported by inclusion of stdlib.h, as this file internally sub-includes stddef.h.

This type is used to represent the size of an object. Library functions that take or return sizes expect them to be of type, or have the return type of, size_t. Further, the most frequently used compiler-based operator sizeof should evaluate to a constant value that is compatible with size_t.

As an implication, size_t is a type guaranteed to hold any array index.

Can I just use unsigned int instead of size_t?

size_t is the most correct type to use when describing the sizes of arrays and objects. It's guaranteed to be unsigned and is supposedly "large enough" to hold any object size for the given system. Therefore it is more portable to use for that purpose than unsigned int, which is in practice either 16 or 32 bits on all common computers.

So the most canonical form of a for loop when iterating over an array is actually:

for (size_t i = 0; i < sizeof array / sizeof *array; i++)
{
    do_something(array[i]);
}

And not int i=0; which is perhaps more commonly seen even in some C books.

size_t is also the type returned from the sizeof operator. Using the right type might matter in some situations, for example printf("%u", sizeof obj); is formally undefined behavior, so it might in theory crash printf or print gibberish. You have to use %zu for size_t.
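As a sketch of that difference (the %u call is the formally undefined one mentioned above, so it is left in a comment):

#include <stdio.h>

int main(void)
{
    int obj = 42;

    /* printf("%u\n", sizeof obj);    formally undefined: %u expects an
                                      unsigned int, but sizeof obj is a size_t */

    printf("%zu\n", sizeof obj);   /* correct: %zu matches size_t */
    return 0;
}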

It is quite possible that size_t happens to be the very same type as unsigned long or unsigned long long or uint32_t or uint64_t though.

Why do we specify size_t array size in C rather than just using integers? And what are the advantages to do so?

size_t can store the maximum size of a theoretically possible object of any type (including array).

size_t is commonly used for array indexing and loop counting. Programs that use other types, such as unsigned int, for array indexing may fail on, e.g., 64-bit systems when the index exceeds UINT_MAX or if it relies on 32-bit modular arithmetic.

(source)

What this means is that size_t is guaranteed to be able to hold any size/index count on any platform you compile for.
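As a sketch of the wrap-around problem the quote alludes to: on a platform where unsigned int is 32 bits but size_t is 64 bits (a common but not guaranteed combination), an unsigned int index silently wraps to 0 once it passes UINT_MAX, while a size_t index keeps counting:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* assumes unsigned int is 32 bits and size_t is 64 bits (typical x86-64) */
    unsigned int ui = UINT_MAX;
    size_t       sz = UINT_MAX;

    ui++;   /* wraps around to 0, so elements past index UINT_MAX are unreachable */
    sz++;   /* keeps going: 4294967296 */

    printf("ui = %u, sz = %zu\n", ui, sz);
    return 0;
}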

It is usually defined based on the target platform you compile for. For example the intsafe.h header included in the Windows SDK defines it as follows:

#ifdef _WIN64
typedef __int64 ptrdiff_t;
typedef unsigned __int64 size_t;
#else
typedef int ptrdiff_t;
typedef unsigned int size_t;
#endif

Using size_t directly means that you don't have to worry about changing the data type you use for indexing or holding object sizes when you recompile for a different architecture (for example, x86 vs x86-64).

EDIT: As @Eric Postpischil mentions in the comments, the actual wording of the C standard is different from the explanation I linked from cppreference.

Looking at the standard we can see a few mentions regarding size_t:

6.5.3.4 The sizeof and alignof operators

[...]

5 The value of the result of both operators is implementation-defined, and its type (an unsigned integer type) is size_t, defined in <stddef.h> (and other headers)

[...]

7.19 Common definitions <stddef.h>

[...]

size_t
which is the unsigned integer type of the result of the sizeof operator;

The fact that size_t is the type of the value returned by the sizeof operator is the reason for which we can assume that "size_t can store the maximum size of a theoretically possible object of any type", but nowhere in the standard is this actually mandated. The standard also says the following about sizeof:

The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

My interpretation (which may be wrong) is that you can't have an object for which the size does not fit into a size_t because in that case you can't have a working sizeof operator.

EDIT 2:

I'll just quote Eric Postpischil here:

There are simply limits as to what we can do with computers, and sometimes calculations go out of bounds. It is generally easy on modern hardware to define size_t so that sizeof always works, and you obviously cannot malloc a larger object, but you might construct one with a static array, in some esoteric C implementation with 16-bit size_t but 22-bit addresses.

comparing int with size_t

It's safe provided the int is zero or positive. If it's negative, and size_t is of equal or higher rank than int, then the int will be converted to size_t and so its negative value will instead become a positive value. This new positive value is then compared to the size_t value, which may (in a staggeringly unlikely coincidence) give a false positive. To be truly safe (and perhaps overcautious) check that the int is nonnegative first:

/* given int i; size_t s; */
if (i>=0 && i == s)

and to suppress compiler warnings:

if (i>=0 && (size_t)i == s)
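Wrapped up as a small helper function (just a sketch; the function name is made up for the example):

#include <stdbool.h>
#include <stddef.h>

/* hypothetical helper: compare an int and a size_t without the
   signed-to-unsigned conversion pitfall described above */
static bool int_equals_size(int i, size_t s)
{
    return i >= 0 && (size_t)i == s;
}

Because of short-circuit evaluation, the cast is only reached when i is nonnegative, so a negative int can never be mistaken for a huge size_t value.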

Difference in results when using int and size_t

1. I am unable to understand a couple of things. First, how can adding a signed and an unsigned number convert the entire result to an unsigned type?

This is governed by the usual arithmetic conversions and integer conversion rank.

6.3.1.8 p1: Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

In this case the unsigned type has rank greater than or equal to that of int, therefore the int is converted to the unsigned type.

The conversion of the int (-2) to unsigned is performed as described:

6.3.1.3 p2: Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
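A minimal sketch of that conversion rule, assuming a 32-bit unsigned int (so "one more than the maximum value" is 4294967296):

#include <stdio.h>

int main(void)
{
    int a = -2;
    unsigned int b = a;   /* -2 + 4294967296 = 4294967294 = 0xFFFFFFFE */

    printf("%u\n", b);    /* prints 4294967294 with a 32-bit unsigned int */
    return 0;
}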

2. If the result is indeed 0xFFFFFFFF of unsigned type, why, on a 32-bit system, is adding it to ptr interpreted as ptr - 1, given that the number is actually of unsigned type and the leading 1 should not signify a sign?

This is undefined behavior and should not be relied on, since C doesn't define pointer arithmetic overflow.

6.5.6 p8: If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

3. Why is the result different on a 64-bit system?

(This assumes, as does the picture, that int and unsigned are 4 bytes.)

The result of A + B is the same as described in 1.; that result is then added to the pointer. Since the pointer is 8 bytes, and assuming the addition doesn't overflow (it still could if ptr had a large address, giving the same undefined behavior as in 2.), the result is an address.

This is undefined behavior because the pointer points way outside of the bounds of the array.


