How to Rely on the Reduce-Capacity Trick

Can we rely on the reduce-capacity trick?

I've always been taught that there is no guaranteed standard way to lower the capacity. All methods have been (and still are) implementation defined.

§ 23.2.1\8 says:

The expression a.swap(b), for containers a and b of a standard
container type other than array, shall exchange the values of a and b
without invoking any move, copy, or swap operations on the individual
container elements...

This guarantees that the internal pointers of vectors must be swapped.

However, I cannot find anything that guarantee on the capacity of a newly created vector.

§ 21.4.2\1 says that one of the basic_string default constructor's post conditions is that capacity() returns an unspecified value.

§ 21.4.2\3 says that one of the basic_string copy constructor's post conditions is that capacity() returns a value at least as big as size().

§ 21.4.6.8\2 says that string::swap runs in constant time, which (effectively) requires that the internal pointers are swapped.

As far as I can tell, a conforming implementation could have string::max_size() { return 4;}, and swapping all internals from one buffer to another would therefore be constant time. (vector can't do that though)

Obviously, take this all with a grain of salt. I'm quoting from the C++ draft from Feb28,'11, and I can't find specifications for the vector's copy constructor. Also, not finding evidence for is not the same as finding evidence against.

Swapping a vector with a copy of itself

That's a pattern for shrink-to-fit in C++03, where there is no such operation in the interface of the vector class. What the code does is creating a copy (hopefully the capacity of the vector will be close to the number of available elements) and then swaps it with the original vector. After the expression completes, the temporary (which now holds the original buffers) is discarded and the memory is released.

Consider:

std::vector<int> large;
large.reserve( 10000000 ); // might be the result of multiple push_back/erase
// large.capacity() >= 10000000
large.push_back( 1 ); // Make more explicit that 'large' might not be empty
std::vector<int>( large ).swap( large );
// large.capacity() is hopefully closer to 1

In C++11 the vector type has been modified to provide a shrink_to_fit operation that takes on that role. It is important to note that neither the old pattern nor shrink_to_fit are binding operations, that is, there is no guarantee on the capacity of the vector after the operation other than capacity() >= size().

C++ vector reduce allocation size

It will never shrink the allocated memory in the absence of explicit direction to do so.

In C++11 there is a shrink_to_fit call that will ask the implementation to do this, but it may not reduce the allocated memory. In prior versions you have to create a new copy and swap away the old one.

Why doesn't std::vector::resize() deallocate the memory when you want to reduce the size of the vector?

Reducing the capacity of a vector involves

  • allocating new storage
  • copying over the existing elements
  • deallocating old storage

None of this is for free. Really shrinking a vector has a cost associated to it, and it isn't clear that users should pay it by default.

Swapping a vector with a copy of itself

That's a pattern for shrink-to-fit in C++03, where there is no such operation in the interface of the vector class. What the code does is creating a copy (hopefully the capacity of the vector will be close to the number of available elements) and then swaps it with the original vector. After the expression completes, the temporary (which now holds the original buffers) is discarded and the memory is released.

Consider:

std::vector<int> large;
large.reserve( 10000000 ); // might be the result of multiple push_back/erase
// large.capacity() >= 10000000
large.push_back( 1 ); // Make more explicit that 'large' might not be empty
std::vector<int>( large ).swap( large );
// large.capacity() is hopefully closer to 1

In C++11 the vector type has been modified to provide a shrink_to_fit operation that takes on that role. It is important to note that neither the old pattern nor shrink_to_fit are binding operations, that is, there is no guarantee on the capacity of the vector after the operation other than capacity() >= size().

Is shrink_to_fit the proper way of reducing the capacity a `std::vector` to its size?

Do the arguments in the quotation hold?

Measure and you will know. Are you constrained in memory? Can you figure out the correct size up front? It will be more efficient to reserve than it will be to shrink after the fact. In general I am inclined to agree on the premise that most uses are probably fine with the slack.

If yes, what's the proper way of shrinking an STL container's capacity to its size (at least for std::vector).

The comment does not only apply to shrink_to_fit, but to any other way of shrinking. Given that you cannot realloc in place, it involves acquiring a different chunk of memory and copying over there regardless of what mechanism you use for shrinking.

And if there's a better way to shrink a container, what's the reason for the existence of shrink_to_fit after-all?

The request is non-binding, but the alternatives don't have better guarantees. The question is whether shrinking makes sense: if it does, then it makes sense to provide a shrink_to_fit operation that can take advantage of the fact that the objects are being moved to a new location. I.e., if the type T has a noexcept(true) move constructor, it will allocate the new memory and move the elements.

While you can achieve the same externally, this interface simplifies the operation. The equivalent to shrink_to_fit in C++03 would have been:

std::vector<T>(current).swap(current);

But the problem with this approach is that when the copy is done to the temporary it does not know that current is going to be replaced, there is nothing that tells the library that it can move the held objects. Note that using std::move(current) would not achieve the desired effect as it would move the whole buffer, maintaining the same capacity().

Implementing this externally would be a bit more cumbersome:

{
std::vector<T> copy;
if (noexcept(T(std::move(declval<T>())))) {
copy.assign(std::make_move_iterator(current.begin()),
std::make_move_iterator(current.end()));
} else {
copy.assign(current.begin(), current.end());
}
copy.swap(current);
}

Assuming that I got the if condition right... which is probably not what you want to write every time that you want this operation.

By how much does vector::resize increase capacity?

The standard makes no guarantees about the asymptotics of repeated calls to resize(). It is entirely feasible that the container will simply grow the capacity to precisely the required target size. In fact, this would probably be the desirable behaviour (i.e. the least wasteful) in the majority of standard use cases for resize() (e.g. where it's just used once).

If you're worried, just implement your own geometric growth:

if (index + 1 > v.size()
{
if (v.capacity() < index + 1)
{
v.reserve(2 * (index + 1)); // I had 2 * capacity() here first, but
// I think this version is better
}
v.resize(index + 1);
}


Related Topics



Leave a reply



Submit