Does std::vector *have* to move objects when growing capacity? Or, can allocators reallocate ?
When std::vector<T>
runs out of capacity it has to allocate a new block. You have correctly covered the reasons.
IMO it would make sense to augment the allocator interface. Two of us tried to for C++11 and we were unable to gain support for it: [1] [2]
I became convinced that in order to make this work, an additional C-level API would be needed. I failed in gaining support for that as well: [3]
Why does a vector have to move its data members when reallocating
It would not be safe to merely copy the bytes. Imagine, for example, that your object has two members, p and d, and p is a pointer that points to d. If you just copy the bytes, you'd copy the value of p that points to the old location of d, which has been destroyed.
This is a simple example, but in general, the reason for C++ constructors, destructors, copy and move constructors, is to allow your object to be "smarter" than just a sequence of bytes would be. Member variables have meaning, and that meaning is understood by your code, not by the compiler.
c++ Vector, what happens whenever it expands/reallocate on stack?
You wrote
[...] copy itself to a new location [...]
which is not the way a vector works. The vector data is copied to a new location, not the vector itself.
My answer should give you an idea of how a vector is designed.
The common std::vector layout*
Note: The std::allocator
is actually likely to be an empty class and std::vector
will probably not contain an instance of this class. This may not be true for an arbitrary allocator.
In most implementations it consists of three pointers where
begin
points to the start of the data memory of the vector on the heap (always on the heap if notnullptr
)end
points one memory location past the last element of the vector data
->size() == end-begin
capacity
points on memory location past the last element of the vector memory ->capacity() == capacity-begin
A vector on the stack
We declare a variable of type std::vector<T,A>
where T
is any type and A
is an allocator type for T
(i.e. std::allocator<T>
).
std::vector<T, A> vect1;
How does this look like in memory?
As we see: Nothing happens on the heap but the variable occupies the memory that is necessary for all of its members on the stack.
There it is and it will stay there until vect1
goes out of scope, since vect1
is just an object like any other object of type double
, int
or whatever. It will sit there on its stack position and wait to get destroyed, regardless of how much memory it handles itself on the heap.
The pointers of vect1
do not point anywhere, since the vector is empty.
A vector on the heap
Now we need a pointer to a vector and use some dynamic heap allocation to create the vector.
std::vector<T, A> * vp = new std::vector<T, A>;
Let's again look at the memory.
We have our vp variable on the stack and our vector is on the heap now. Again the vector itself will not move on the heap since its size is constant. Only the pointers (begin
, end
, capacity
) will move to follow the data position in memory if a reallocation takes place. Let's have a look at that.
Pushing elements to a vector
Now we can start pushing elements to a vector. Let's look at vect1
.
T a;
vect1.push_back(a);
The variable vect1
is still where it has been but memory on the heap was allocated to contain one element of T
.
What happens if we add one further element?
vect1.push_back(a);
- The space allocated on the heap for the data elements will not be enough (since it is only one memory positions, yet).
- A new memory block will be allocated for two elements
- The first element will be copied/moved to the new storage.
- The old memory will be deallocated.
We see: The new memory location is different.
To have additional insight let's look at the situation if we destroy the last element.
vect1.pop_back();
The memory allocated won't change but the last element will have its destructor called and the end pointer moves one position down.
As you can see: capacity() == capacity-begin == 2
while size() == end-begin == 1
How does std::vector reallocates every time an item is inserted using a loop?
std::vector
does not reallocate every time an element is added. It reallocates when it runs out of capacity. And when it reallocates, it doesn't allocate space for just 1 more element. It typically allocates by some factor of the the current capacity. I believe VS uses a factor of 1.5, and some others use 2. It has to do this in order to ensure that push_back
has amortized O(1) complexity, which is a requirement of the standard.
If you know for certain exactly how many elements you are going to add to the vector over its lifetime, it is still a good idea to reserve though, imo. Some might consider that premature optimization. But it is such a simple thing to do, I consider not doing it to be premature pessimization.
Can std::vector avoid copying if allocator provides realloc semantics?
There's no provision in the standard library allocator interface for resizing an existing block of memory. There's just "allocate a block of size N" and "deallocate a block" semantics.
So, the answer to your question is "No, you can't do this with std::vector
"
You can, of course, write your own vector-like class that does this - and if this is something that is important to you, you should do so. Writing a container class is not really that hard.
Does declaring a vector with size offer any improvements over using push_back in C++
With
vector<int> Array(n);
you create a vector that contains n
elements, all memory needed for those elements is allocated immediately.
When you use e.g.
Array.push_back(value);
then the vector needs to be resized, which could mean the memory have to be reallocated and all the contents have to be copied to the new memory.
Instead of creating an array with a set size, you could instead preallocate (or reserve) memory:
vector<int> Array; // An empty vector
Array.reserve(n); // Reserve memory, but keep the size as zero (it's still empty)
Array.push_back(value); // No reallocation needed, size is now one
This is useful when you have a vector of objects that can't be default-constructed.
Important concepts to learn: The vector size and its capacity and what the difference between them is.
The capacity is the number of elements the vector have allocated memory for.
The size is the current number of elements in the vector.
It's quite common for the capacity to be different from the size. And it must always be true that capacity >= size
.
What does the default allocator do when a std::vector is resized (via either reserve() or resize())?
C++ allocators do not support anything like C's realloc
. Whenever vector
needs more memory, it has to allocate new storage, move from old to new, and deallocate the old.
Either way, realloc
wouldn't suit vector
. With typical allocators, realloc
will only save you a heavy copy operation if you are shrinking its size, or in some cases growing by only a few bytes. vector
doesn't ever shrink, and it only grows in very large steps.
Note that move support is a new behavior in C++ 2011. Previous versions will copy.
How do I use use std::allocator in place of realloc?
Let's say I'm writing a custom vector using the std::allocator to wrap new and delete.
In the general case (excluding specializations for PODs), I don't think you could use realloc
in any case. An arbitrary object, constructed at a specific memory location, might have internal pointers pointing to very specific addresses in relation to the address in which it was constructed. Simply moving it around (in the byte-copying sense) could break invariants.
Thus the alternative you mention is in general necessary. You would have to allocate a new array, move
(or possibly even copy
!) the objects to the new location, then deallocate the old array. Of course, this includes more than a single stage that can fail - another reason why you can't really reallocate
in the general case. Perhaps this is the reason that allocators never had this functionality in the first case - for array-based containers, you can't really use them in general (although you might be able to use them for POD specializations).
Related Topics
Why Can't I Define a Function Inside Another Function
How to Forward Declare a Class Which Is in a Namespace
Does Moving a Vector Invalidate Iterators
What's Up with the Thousands of Warnings in Standard Headers in Msvc -Wall
Removing a Non Empty Directory Programmatically in C or C++
Std::Vector, Default Construction, C++11 and Breaking Changes
How to Get Rid of the _Imp_ Prefix in the Linker in Vc++
Are Move Constructors Produced Automatically
Accessing a Protected Member of a Base Class in Another Subclass
Building a 32-Bit Float Out of Its 4 Composite Bytes
Tell Gdb to Skip Standard Files
Dll Redirection Using Manifests
Determining If a Number Is Prime
Why Does Printf("%F",0); Give Undefined Behavior