Why Does the C++ Standard Algorithm "Count" Return a Difference_Type Instead of Size_T

Why do std::count(_if) return iterator::difference_type instead of size_t?

In theory you may have a tremendous sequence whose number of elements can only be represented with 128 bits. Assuming the implementation supports a corresponding integer type, it is quite likely that size_t use a 64 bit type. However, the iterator for this huge sequence could use a 128 bit integer. Note, that the sequence isn't necessary represented in the memory of any individual computer. It may be split across multiple huge databases.

You might also have a relatively small computer supporting only a 32 bit type with reasonable performance. For full standard conformance it might have to support a 64 bit type but it can be desirable to rather support faster computations using a representation natively supported by the platform. That is, size_t may not be the optimal choice. When creating iterators you generally know what size needs to be supported.

Why does std::count return a signed integer?

It's actually a std::ptrdiff_t, which has to be a signed integer. It has to be signed because it can be used as the difference between two iterators, and that can of course be negative.

When to use std::size_t?

A good rule of thumb is for anything that you need to compare in the loop condition against something that is naturally a std::size_t itself.

std::size_t is the type of any sizeof expression and as is guaranteed to be able to express the maximum size of any object (including any array) in C++. By extension it is also guaranteed to be big enough for any array index so it is a natural type for a loop by index over an array.

If you are just counting up to a number then it may be more natural to use either the type of the variable that holds that number or an int or unsigned int (if large enough) as these should be a natural size for the machine.

size_t' vs 'container::size_type'

The standard containers define size_type as a typedef to Allocator::size_type (Allocator is a template parameter), which for std::allocator<T>::size_type is typically defined to be size_t (or a compatible type). So for the standard case, they are the same.

However, if you use a custom allocator a different underlying type could be used. So container::size_type is preferable for maximum generality.

C++ for-loop - size_type vs. size_t

The C++ Standard says,

 size_type  |  unsigned integral type  |  a type that can represent the size of the largest object in the
allocation model

Then it adds,

Implementations of containers
described in this International
Standard are permitted to assume that
their Allocator template parameter
meets the following two additional
requirements beyond those in Table 32.

  • The typedef members pointer, const_pointer, size_type, and
    difference_type are
    required to be T*,T const*, size_t, and ptrdiff_t, respectively

So most likely, size_type is a typedef of size_t.

And the Standard really defines it as,

template <class T> 
class allocator
{
public:
typedef size_t size_type;
//.......
};

So the most important points to be noted are :

  • size_type is unsigned integral, while int is not necessarily unsigned. :-)
  • it can represent the largest index, because it's unsigned.

The use of size_t in an array iterator

You should keep in mind the automatic conversion rules of the language.

Does this imply the iterator variable i should simply be a size_t?

Yes it does, because if size_t is larger than unsigned int and your array is actually larger than can be indexed with an unsigned int, then your variable (i) can never reach the size of the array.

Does this imply that any integer in any program must become functionally identified as to whether it will ever be used as an array index?

You try to make it sound drastic, while it's not. Why do you choose a variable as double and not float? Why would you make a variable as unsigned and one not? Why would you make a variable short while another is int? Of course, you always know what your variables are going to be used for, so you decide what types they should get. The choice of size_t is one among many and it's similarly decided.

In other words, every variable in a program should be functionally identified and given the correct type.

Does it imply any code using logic that develops the index programmatically should then create a new result value of type size_t, particularly if the logic relies on potentially signed integer values?

Not at all. First, if the variable can never have negative values, then it could have been unsigned int or size_t in the first place. Second, if the variable can have negative values during computation, then you should definitely make sure that in the end it's non-negative, because you shouldn't index an array with a negative number.

That said, if you are sure your index is non-negative, then casting it to size_t doesn't make any difference. C11 at 6.5.2.1 says (emphasis mine):

A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2th element of E1 (counting from zero).

Which means whatever type of index for which some_pointer + index makes sense, is allowed to be used as index. In other words, if you know your int has enough space to contain the index you are computing, there is absolutely no need to cast it to a different type.

Implementations of count_until and accumulate_until?

I was thinking of a combination of std::find_if with a state predicate:
(Pred is normal user predicate.)

template<class InputIt, class T, class Pred>
typename iterator_traits<InputIterator>::difference_type
count_until(InputIt begin, InputIt end, const T& value, Pred pred)
{
typename iterator_traits<InputIterator>::difference_type count = 0;
auto internal_pred = [&count, &value, &pred](decltype(*begin) elem) {
return elem == value && pred(++count);
};

std::find_if(begin, end, internal_pred);
return count;
}

template<class InputIt, class T, class Pred>
T accumulate_until(InputIt begin, InputIt end, T value, Pred pred)
{
auto internal_pred = [&value, &pred] (const T& t) {
value += t;
return pred(value);
};
std::find_if(begin, end, internal_pred);
return value;
}


Related Topics



Leave a reply



Submit