What Happen to Pointers When Vectors Need More Memory and Realocate Memory

What happen to pointers when vectors need more memory and realocate memory?

Short answer: Everything will be fine. Don't worry about this and get back to work.

Medium answer: Adding elements to or removing them from a vector invalidates all iterators and references/pointers (possibly with the exception of removing from the back). Simple as that. Don't refer to any old iterators and obtain new ones after such an operation. Example:

std::vector<int> v = get_vector();

int & a = v[6];
int * b = &v[7];
std::vector<int>::iterator c = v.begin();
std::advance(it, 8);

v.resize(100);

Now a, b and c are all invalid: You cannot use a, and you cannot dereference b or c.

Long answer: The vector keeps track of dynamic memory. When the memory is exhausted, it allocates a new, larger chunk elsewhere and copies (or moves) all the old elements over (and then frees up the old memory, destroying the old objects). Memory allocation and deallocation is done by the allocator (typically std::allocator<T>), which in turn usually invokes ::operator new() to fetch memory, which in turn usually calls malloc(). Details may vary and depend on your platform. In any event, any previously held references, pointers or iterators are no longer valid (presumably because they refer to the now-freed memory, though it's not specified in the standard why they're invalid).

When vectors are allocated, do they use memory on the heap or the stack?

vector<Type> vect;

will allocate the vector, i.e. the header info, on the stack, but the elements on the free store ("heap").

vector<Type> *vect = new vector<Type>;

allocates everything on the free store.

vector<Type*> vect;

will allocate the vector on the stack and a bunch of pointers on the free store, but where these point is determined by how you use them (you could point element 0 to the free store and element 1 to the stack, say).

What happens when I access a pointer when it's stored in a vector which is on the stack?

There's a few things to unpack here.

First off, let's say you have a vector<object*> as you state, and that it is declared with automatic storage duration.

void f()
{
// declared with automatic storage duration "on the stack"
std::vector<object*> my_objects;
}

A vector by it's nature stores a contiguous block of memory, which is essentially an array of n objects of which it can store, in our example the vector can contain 0..n object*, and the implementation will do that by having a pointer to the first element, and that contiguous block of memory is stored "on the heap".

void f()
{
// declared with automatic storage duration "on the stack"
std::vector<object*> my_objects;

// now, my_objects holds 10 object*, and the storage for them
// is allocated "on the heap"
my_objects.resize(10);
}

It gets interesting because when we're storing object* we can't know if it was allocated on the heap or not. Take the following example:

void f()
{
// declared with automatic storage duration "on the stack"
std::vector<object*> my_objects;

// now, my_objects holds 2 object*, and the storage for them
// is allocated "on the heap"
my_objects.resize(2);

auto dyn_obj = std::make_unique<object>();
object auto_obj;

my_objects[0] = dyn_obj.get();
my_objects[1] = &auto_obj;
}

Above, we have a situation where the storage for my_objects.data() is allocated on the heap, the object pointed to by my_objects[0] is allocated on the heap, while the object pointed to by my_objects[1] is not.

As in your example:

std::vector<object*> my_objects; // automatic storage duration
object* o = new object; // o has automatic storage duration
// while what it points to is "on the heap"

my_objects.push_back(o); // an allocation will happen
// because my_objects has to
// allocate storage to hold o

my_objects is "on the stack", as is o. When the scope containing these things is exited, they will be "destroyed". my_objects will run it's destructor, while o will "just go away".

The call to my_objects.push_back() will allocate memory "on the heap" to hold 1 * sizeof(object*) ( at least ), and will copy the value of o into that storage space.

Stack vs Heap - Should objects inside *vector be declared as pointers?

The std:vector as any other Standard Library container copies elements into itself, so it owns them. Thus, if you have a dynamically allocated std::vector the elements that you .push_back() will be copied into the memory managed by the std::vector, thus they will be copied onto the heap.

As a side note, in some cases std::vector may move elements if it is safe to do so, but the effect is the same - in the end, all the elements are under std::vector's jurisdiction.

why vector use less memory than pointers in this code?

I am going to guess this is an allocation issue. Allocation from the OS seems to be quite time consuming from what I have seen.

Just a guess but maybe the std::vector default allocator is grabbing a much larger contiguous block of memory from the OS and is drawing from that to satisfy smaller vector allocations?

This answer may provide some insight:

https://stackoverflow.com/a/29659791/3807729

I managed to reduce the time taken to run a test program simply by allocating, then deallocating a large std::vector before running the timing operations.

I am speculating that the C++ runtime system (in some implementations) may hold on to memory it has received from the OS even after it has been deallocated because grabbing small chunks from the OS each time is much more expensive.

Memory management when using vector

You shouldn't worry about performance of std::vector when you access its element only 60 times per second. By the way, in Release compilation mode std::vector::operator[] is being converted to a single lea opcode. In Debug mode it is decorated by some runtime range checks though.

c++ Vector, what happens whenever it expands/reallocate on stack?

You wrote

[...] copy itself to a new location [...]

which is not the way a vector works. The vector data is copied to a new location, not the vector itself.

My answer should give you an idea of how a vector is designed.

The common std::vector layout*

Note: The std::allocator is actually likely to be an empty class and std::vector will probably not contain an instance of this class. This may not be true for an arbitrary allocator.

std::vector layout

In most implementations it consists of three pointers where

  • begin points to the start of the data memory of the vector on the heap (always on the heap if not nullptr)
  • end points one memory location past the last element of the vector data
    -> size() == end-begin
  • capacity points on memory location past the last element of the vector memory -> capacity() == capacity-begin

A vector on the stack

We declare a variable of type std::vector<T,A> where T is any type and A is an allocator type for T (i.e. std::allocator<T>).

std::vector<T, A> vect1;

How does this look like in memory?

std::vector on the stack

As we see: Nothing happens on the heap but the variable occupies the memory that is necessary for all of its members on the stack.
There it is and it will stay there until vect1 goes out of scope, since vect1 is just an object like any other object of type double, int or whatever. It will sit there on its stack position and wait to get destroyed, regardless of how much memory it handles itself on the heap.

The pointers of vect1 do not point anywhere, since the vector is empty.

A vector on the heap

Now we need a pointer to a vector and use some dynamic heap allocation to create the vector.

std::vector<T, A> * vp = new std::vector<T, A>;

Let's again look at the memory.

std::vector on the heap

We have our vp variable on the stack and our vector is on the heap now. Again the vector itself will not move on the heap since its size is constant. Only the pointers (begin, end, capacity) will move to follow the data position in memory if a reallocation takes place. Let's have a look at that.

Pushing elements to a vector

Now we can start pushing elements to a vector. Let's look at vect1.

T a;
vect1.push_back(a);

std::vector after single push_back

The variable vect1 is still where it has been but memory on the heap was allocated to contain one element of T.

What happens if we add one further element?

vect1.push_back(a);

std::vector after second push

  • The space allocated on the heap for the data elements will not be enough (since it is only one memory positions, yet).
  • A new memory block will be allocated for two elements
  • The first element will be copied/moved to the new storage.
  • The old memory will be deallocated.

We see: The new memory location is different.

To have additional insight let's look at the situation if we destroy the last element.

vect1.pop_back();

The memory allocated won't change but the last element will have its destructor called and the end pointer moves one position down.

std::vector after 2x push and 1x pop

As you can see: capacity() == capacity-begin == 2 while size() == end-begin == 1

C++ : Vector Allocator behavior, memory allocation and smart pointers

Q: Is my understanding correct?

A: Your understanding is partially correct.

  • p1 and p2 are created on the stack, using the default no-argument constructor, that you've defined.
  • The Default allocator may be used to allocate more memory for p1 and p2 when you call push_back(), but will not always do so. It will never create default construct a new instance of Point though.

Q: If new Point objects are created by the Allocator, why I see only two lines of "Points created"?

A: New objects are not being created by the allocator - the allocator only allocates more memory, if needed. The objects that you insert in the vector are copy constructed. Because you have not created a copy constructor, the compiler has generated one for you.

Q: How does the Allocator assign original values to x,y fields of the newly created objects? Does it using raw memory copy?

A: As stated in the previous question, the allocator only allocates memory and does not create or destroy objects. The behavior of copying the fields is done by the copy constructor that gets invoked when you do a push_back. The automatically generated copy constructor will do a member-wise copy construction of each of the class' members. In your case, x and y are primitive types, so they'll be just raw memory copied. If members were complex objects, their copy constructors would be invoked.

Q: Is the shared pointer the recommended way to return a vector from a method?

A: This would depend on your use case, and is opinion based. My personal advice, and this is for all kind of objects is:

  • If your use cases allows it, return by value (that is, std::vector<Point> getPoints())
  • If you need dynamically allocated storage, or the object that you want to return can be nothing, because construction failed, return by std::unique_ptr. This applies to pretty much all factory functions that you might want to create. Even if you later want to share the ownership (see point 3), you can construct a shared_ptr by moving from a unique_ptr (std::shared_ptr<T> shared = std::move(unique) );
  • Avoid using shared_ptr unless you really need shared ownership of the ptr. shared_ptr are more complex to reason about, can create hard to debug cycles, leading to memory leaks, and are heavier in terms of performance (because of atomic operations relating to their refcount and additional allocated memory for a control block). If you think you need a shared_ptr, reconsider your design and think if you can use a unique_ptr instead.

How this works:

Internally, a std::vector is using memory allocated using the default allocator (or custom user provided, if you provided one) on the heap. This allocation happens behind the scenes, and is independent of the vector's size, and from the number of elements in the vector (but is always >= size()). You can get how many elements the vector has allocated storage for by using the capacity() function. When you call push_back(), what happens:

  1. If there is enough storage (as determined by capacity()) to hold one more element, the argument that you passed to push_back is copy constructed, using the copy constructor if using the push_back( const T& value )variant or moved from if using push_back( T&& value ), by using the move constructor .
  2. If there is no more memory (i.e the new size() > capacity), more memory is allocated that will be sufficient to hold the new elements. How much memory will be allocated is implementation defined. A common pattern is to double the amount of capacity that the vector previously had until a threshold, after which memory is allocated in buckets. You may use reserve() before inserting elements, to make sure that your vector will have enough capacity to hold at least as many elements without new allocations. After new memory has been allocated, the vector reallocates all existing elements into the new storage by either copying them, or moving them if they are not copy-insertable. This reallocation will invalidate all iterators and references to elements in the vector (caveat: the rules for when exactly copy vs move will be used when reallocating is a bit more complex, but this is the general case)

Allocating a vector vs. a pointer to a vector

What is the difference between the two other than the fact that you will have a larger chunk of memory allocated with the pointer to the vector?

  1. 'will have a larger chunk of memory allocated'
    This isn't necessarily true! The std::vector might choose a much larger default initial size for the internally managed data array than 10.
  2. 'What is the difference between the two'
    The main difference is that the 1st one is allocated on the local scopes stack,
    and the 2nd one (usually) goes to the heap. Note: The internally managed data array goes to the heap anyway!!

To ensure proper memory management when you really have to use a std::vector<float>* pointer allocated from the heap, I'd recommend the use of c++ smart pointers, e.g.:

std::unique_ptr<std::vector<float> > pV2(new std::vector<float>(10));

For more details have a look at the documentation of <memory>.

What do I need to do before deleting elements in a vector of pointers to dynamically allocated objects?

  1. Yes
  2. Vectors are implemented using template memory allocators that take care of the memory management for you, so they are somewhat special. But as a general rule of thumb, you don't have to call delete on variables that aren't declared with the new keyword because of the difference between stack and heap allocation. If stuff is allocated on the heap, it must be deleted (freed) to prevent memory leaks.
  3. No. You explicitly have to call delete myVec[index] as you iterate over all elements.

Ex:

for(int i = 0; i < myVec.size(); ++i)
delete myVec[i];

With that said, if you're planning on storing pointers in a vector, I strongly suggest using boost::ptr_vector which automatically takes care of the deletion.



Related Topics



Leave a reply



Submit