When Does a Std::Vector Reallocate Its Memory Array

When does a std::vector reallocate its memory array?

From C++ standard 23.2.4.2:

size_type capacity() const;

Returns: The total number of elements that the vector can hold without requiring reallocation.

Also from Standard

Notes: Reallocation invalidates all the references, pointers, and iterators referring to the elements in the
sequence. It is guaranteed that no reallocation takes place during insertions that happen after a call to
reserve() until the time when an insertion would make the size of the vector greater than the size
specified in the most recent call to reserve().

So yes, you can be sure.

Edit: As @Bo Persson mentioned there is a catch. Standard doesn't say anything if we never call reserve() . However in practice it works well, because no implementation will care to remember if you called reserve, or not. I believe that this is bug. And as @Martin mentioned in his answer in C++0x draft it is corrected.

c++ Vector, what happens whenever it expands/reallocate on stack?

You wrote

[...] copy itself to a new location [...]

which is not the way a vector works. The vector data is copied to a new location, not the vector itself.

My answer should give you an idea of how a vector is designed.

The common std::vector layout*

Note: The std::allocator is actually likely to be an empty class and std::vector will probably not contain an instance of this class. This may not be true for an arbitrary allocator.

std::vector layout

In most implementations it consists of three pointers where

  • begin points to the start of the data memory of the vector on the heap (always on the heap if not nullptr)
  • end points one memory location past the last element of the vector data
    -> size() == end-begin
  • capacity points on memory location past the last element of the vector memory -> capacity() == capacity-begin

A vector on the stack

We declare a variable of type std::vector<T,A> where T is any type and A is an allocator type for T (i.e. std::allocator<T>).

std::vector<T, A> vect1;

How does this look like in memory?

std::vector on the stack

As we see: Nothing happens on the heap but the variable occupies the memory that is necessary for all of its members on the stack.
There it is and it will stay there until vect1 goes out of scope, since vect1 is just an object like any other object of type double, int or whatever. It will sit there on its stack position and wait to get destroyed, regardless of how much memory it handles itself on the heap.

The pointers of vect1 do not point anywhere, since the vector is empty.

A vector on the heap

Now we need a pointer to a vector and use some dynamic heap allocation to create the vector.

std::vector<T, A> * vp = new std::vector<T, A>;

Let's again look at the memory.

std::vector on the heap

We have our vp variable on the stack and our vector is on the heap now. Again the vector itself will not move on the heap since its size is constant. Only the pointers (begin, end, capacity) will move to follow the data position in memory if a reallocation takes place. Let's have a look at that.

Pushing elements to a vector

Now we can start pushing elements to a vector. Let's look at vect1.

T a;
vect1.push_back(a);

std::vector after single push_back

The variable vect1 is still where it has been but memory on the heap was allocated to contain one element of T.

What happens if we add one further element?

vect1.push_back(a);

std::vector after second push

  • The space allocated on the heap for the data elements will not be enough (since it is only one memory positions, yet).
  • A new memory block will be allocated for two elements
  • The first element will be copied/moved to the new storage.
  • The old memory will be deallocated.

We see: The new memory location is different.

To have additional insight let's look at the situation if we destroy the last element.

vect1.pop_back();

The memory allocated won't change but the last element will have its destructor called and the end pointer moves one position down.

std::vector after 2x push and 1x pop

As you can see: capacity() == capacity-begin == 2 while size() == end-begin == 1

When std::vector reallocate its memory array, is copy constructor or move constructor used?

If the move-constructor exists and is noexcept then it is used. Otherwise the copy-constructor is used.

Using a move-constructor that might throw is undesirable as it might happen that some objects are moved to the new storage and then an exception prevents the rest of the objects being moved.

The cppreference.com site does say that if the object is non-copyable , but has a non-noexcept move constructor, then it will use that move-constructor, with "unspecified behaviour" if an exception is thrown. I guess that means elements may be lost from the vector.

How does std::vector reallocates every time an item is inserted using a loop?

std::vector does not reallocate every time an element is added. It reallocates when it runs out of capacity. And when it reallocates, it doesn't allocate space for just 1 more element. It typically allocates by some factor of the the current capacity. I believe VS uses a factor of 1.5, and some others use 2. It has to do this in order to ensure that push_back has amortized O(1) complexity, which is a requirement of the standard.

If you know for certain exactly how many elements you are going to add to the vector over its lifetime, it is still a good idea to reserve though, imo. Some might consider that premature optimization. But it is such a simple thing to do, I consider not doing it to be premature pessimization.

Does declaring a vector with size offer any improvements over using push_back in C++

With

vector<int> Array(n);

you create a vector that contains n elements, all memory needed for those elements is allocated immediately.

When you use e.g.

Array.push_back(value);

then the vector needs to be resized, which could mean the memory have to be reallocated and all the contents have to be copied to the new memory.


Instead of creating an array with a set size, you could instead preallocate (or reserve) memory:

vector<int> Array;  // An empty vector
Array.reserve(n); // Reserve memory, but keep the size as zero (it's still empty)
Array.push_back(value); // No reallocation needed, size is now one

This is useful when you have a vector of objects that can't be default-constructed.

Important concepts to learn: The vector size and its capacity and what the difference between them is.

The capacity is the number of elements the vector have allocated memory for.

The size is the current number of elements in the vector.

It's quite common for the capacity to be different from the size. And it must always be true that capacity >= size.

Why does a vector have to move its data members when reallocating

It would not be safe to merely copy the bytes. Imagine, for example, that your object has two members, p and d, and p is a pointer that points to d. If you just copy the bytes, you'd copy the value of p that points to the old location of d, which has been destroyed.

This is a simple example, but in general, the reason for C++ constructors, destructors, copy and move constructors, is to allow your object to be "smarter" than just a sequence of bytes would be. Member variables have meaning, and that meaning is understood by your code, not by the compiler.

If std::vector reallocates objects to new memory by using move constructor then why does it have to call the destructor on the original objects?

To end an object's lifetime, its destructor must be called.
Also, what happens when an object it moved is up to the implementation of the class. You have two objects and the move constructor is allowed to move resources around how it sees fit.

An example of a simple string class which, when moving, leaves the moved-from object in a valid state:

class string {
public:
string(const char* s) : m_len(std::strlen(s)), m_data(new char[m_len + 1]) {
std::copy_n(s, m_len + 1, m_data);
}
string(string&& rhs) :
m_len(std::exchange(rhs.m_len, 0)),
m_data(std::exchange(rhs.m_data, new char[1]{}))
{}
~string() {
delete[] m_data;
}

private:
size_t m_len;
char* m_data;
};

Without calling the destructor of the moved-from object, it would leak. An implementation like above is not uncommon. Just because an object has been move-from, it doesn't mean that it's void of resources to free.

Does std::vector::reserve reallocate the internal array?

Don't do it; it's undefined behaviour and simply not allowed. Instead, you should do the almost equally expensive resize() operation and then pass the data() pointer.

The only added cost comes from the zeroing out of the memory. Unfortunately there is no standard library container that handles uninitialized dynamic storage short of a std::unique_ptr<int[]>(new int[m]). The cost of the zeroing out is very small, though (but it may be conceptually annoying because you know that you are going to overwrite the data). I guess in a high-performance context you could give the unique-pointer approach a try. (Note that array-new[] for fundamental types is typically entirely equivalent to ::operator new() or malloc()).

What does std::vector look like in memory?

It roughly looks like this (excuse my MS Paint masterpiece):

vector memory layout

The std::vector instance you have on the stack is a small object containing a pointer to a heap-allocated buffer, plus some extra variables to keep track of the size and and capacity of the vector.


So it seems as though when I push_back() to the numbers vector, its older elements change their location.

The heap-allocated buffer has a fixed capacity. When you reach the end of the buffer, a new buffer will be allocated somewhere else on the heap and all the previous elements will be moved into the new one. Their addresses will therefore change.


Does it maybe store them together, but moves them all together, when more space is needed?

Roughly, yes. Iterator and address stability of elements is guaranteed with std::vector only if no reallocation takes place.


I am aware, that std::vector is a contiguous container only since C++17

The memory layout of std::vector hasn't changed since its first appearance in the Standard. ContiguousContainer is just a "concept" that was added to differentiate contiguous containers from others at compile-time.



Related Topics



Leave a reply



Submit