Std::Vector::Resize() Vs. Std::Vector::Reserve()

std::vector::resize() vs. std::vector::reserve()

There are two different methods for a reason:

std::vector::reserve will allocate the memory but will not resize your vector, which will have a logical size the same as it was before.

std::vector::resize will actually modify the size of your vector and will fill any space with objects in their default state. If they are ints, they will all be zero.

After reserve, in your case, you will need a lot of push_backs to write to element 5.
If you don't wish to do that then in your case you should use resize.

One thing about reserve: if you then add elements with push_back, until you reach the capacity you have reserved, any existing references, iterators or pointers to data in your vector will remain valid. So if I reserve 1000 and my size is 5, the &vec[4] will remain the same until the vector has 1000 elements. After that, I can call push_back() and it will work, but the stored pointer of &vec[4] earlier may no longer be valid.

Why is resizing a vector faster than reserve and push_back?

push_back is costlier operation than index based access even if allocation has been taken care of before hand by reserve.

  1. push_back will need to take care of end pointer so that vector size can be computed correctly
  2. push_back will check for realloaction need. Essentially a branch prediction.
  3. push_back will cause copy (or move) of value to be pushed back. In case of int, it shouldn't cause performance difference.

If you see assembly conversion (Taken from godbolt link given in question), index operation is non branching sequence of few moves and shift operation while push_back is much more involved. In long running loop (1000000 in given example), this difference will matter. Compiler optimization level can definitely impact the difference.

For index operator []

    push    rbp
mov rbp, rsp
mov QWORD PTR [rbp-8], rdi
mov QWORD PTR [rbp-16], rsi
mov rax, QWORD PTR [rbp-8]
mov rax, QWORD PTR [rax]
mov rdx, QWORD PTR [rbp-16]
sal rdx, 2
add rax, rdx
pop rbp
ret

For push_back

    push    rbp
mov rbp, rsp
sub rsp, 16
mov QWORD PTR [rbp-8], rdi
mov QWORD PTR [rbp-16], rsi
mov rax, QWORD PTR [rbp-8]
mov rdx, QWORD PTR [rax+8]
mov rax, QWORD PTR [rbp-8]
mov rax, QWORD PTR [rax+16]
cmp rdx, rax
je .L73
mov rax, QWORD PTR [rbp-8] // When allocation is not needed
mov rcx, QWORD PTR [rax+8]
mov rax, QWORD PTR [rbp-8]
mov rdx, QWORD PTR [rbp-16]
mov rsi, rcx
mov rdi, rax
call void std::allocator_traits<std::allocator<int> >::construct<int, int const&>(std::allocator<int>&, int*, int const&)
mov rax, QWORD PTR [rbp-8]
mov rax, QWORD PTR [rax+8]
lea rdx, [rax+4]
mov rax, QWORD PTR [rbp-8]
mov QWORD PTR [rax+8], rdx
jmp .L75
.L73: // When allocation is needed
mov rax, QWORD PTR [rbp-8]
mov rdi, rax
call std::vector<int, std::allocator<int> >::end()
mov rcx, rax
mov rdx, QWORD PTR [rbp-16]
mov rax, QWORD PTR [rbp-8]
mov rsi, rcx
mov rdi, rax
.L75:
nop
leave
ret

Choice between vector::resize() and vector::reserve()

The two functions do vastly different things!

The resize() method (and passing argument to constructor is equivalent to that) will insert or delete appropriate number of elements to the vector to make it given size (it has optional second argument to specify their value). It will affect the size(), iteration will go over all those elements, push_back will insert after them and you can directly access them using the operator[].

The reserve() method only allocates memory, but leaves it uninitialized. It only affects capacity(), but size() will be unchanged. There is no value for the objects, because nothing is added to the vector. If you then insert the elements, no reallocation will happen, because it was done in advance, but that's the only effect.

So it depends on what you want. If you want an array of 1000 default items, use resize(). If you want an array to which you expect to insert 1000 items and want to avoid a couple of allocations, use reserve().

EDIT: Blastfurnace's comment made me read the question again and realize, that in your case the correct answer is don't preallocate manually. Just keep inserting the elements at the end as you need. The vector will automatically reallocate as needed and will do it more efficiently than the manual way mentioned. The only case where reserve() makes sense is when you have reasonably precise estimate of the total size you'll need easily available in advance.

EDIT2: Ad question edit: If you have initial estimate, then reserve() that estimate. If it turns out to be not enough, just let the vector do it's thing.

Why does std::vector reserve not double its capacity, while resize does?

As far as I can tell, neither resize nor reserve is required to have the demonstrated behaviour. Both are however allowed such behaviour although both could either allocate the exact amount, and both could multiply the previous allocation as far as the standard is concerned.

Each allocation strategies have their advantages. The advantage of allocating exact amount is that it has no memory overhead when the maximum allocation is known beforehand. The advantage of multiplying is that it maintains the constant amortized property when mixed with end-insertion operations.

The approach chosen by the tested implementations has the advantage that it allows both strategies when resizing. To use one strategy, one can reserve and then resize. To use the other, just resize. Of course, one has to be aware of the unspecified behaviour to take advantage of this. This advantage may or might not be the reasoning behind the choice of these implementations.

One might consider it a failure of the vector API, as specified in the standard, that expressing the intended reallocation behaviour is not possible (in a way that is guaranteed by the standard).

resize and reserve in vector

The specification of std::vectors operations probably provide most of the answer to this question.

Firstly, resize() is specified as being void resize(size_type n, value_type val = value_type()), with requirements;

  • If n is less than the current size, the additional elements then the additional elements are removed and destroyed (i.e. if elements have a destructor, the destructor is invoked).
  • If n is greater than the current container size, the content is expanded by inserting at the end as many elements as needed to reach the size of n. If val is specified, the new elements are initialized as copies of val, otherwise, they are value-initialized.
  • If n is also greater (emphasis mine) than the current container capacity, an automatic reallocation of the allocated storage space takes place.

Note that nothing whatsoever is said about reducing the container capacity if its size is reduced.

Now, reserve(size_type n) is specified as requesting that the vector capacity be n or more elements, with requirements

  • If n is greater than the current vector capacity, the function causes
    the container to reallocate its storage increasing its capacity to n
    (or greater).
  • In all other cases, the function call does not cause a reallocation and the vector capacity is not affected.
  • This function has no effect on the vector size and cannot alter its elements.

The second point here may be interpreted as saying that reserve() cannot be used to reduce capacity.

Note: I'm paraphrasing in the above, but it is essentially a summary of what the standard requires of resize() and reserve().

Putting those points together: The only circumstance in which resize() (or any other operation that increases the size of the vector) is required to increase capacity of the vector is if the new size exceeds the current capacity.

Beyond that, what happens is a quality of implementation concern. However, one logical manner of implementation would be;

  • resize() (and other operations that increase size of the vector) never increases capacity unless the new size exceeds current capacity
  • reserve() never reduces capacity

In such an implementation, calling reserve() before doing operations that increase size of the vector would give a performance advantage (eliminating need for actual reallocation unless the size ever grows to exceed the capacity specified using reserve()).

Things would potentially be different if an implementation used a different approach, such as reducing capacity of a container at any stage (e.g. if a call of resize() reduces the size, it also reduces the capacity). This might be done in an attempt to minimise total memory usage of the container (i.e. not keeping additional capacity allocated for as long as the vector itself exists). This would give a performance hit if the size of the vector oscillates in size (e.g. increases in some circumstances, decreases in others).

what's wrong with vector::reserve?

This is a common mistake. std::vector::reserve does not create elements or change the size of the container; you're actually causing undefined behavior. reserve changes just the capacity. You are looking for std::vector::resize to change the size. Here's an example for clarity:

#include <iostream>
#include <vector>

int main() {
std::vector<int> ivec;
std::cout << ivec.size() << " - " << ivec.capacity() << '\n'; // 0 - 0
ivec.reserve(100);
std::cout << ivec.size() << " - " << ivec.capacity() << '\n'; // 0 - 100
ivec.resize(30);
std::cout << ivec.size() << " - " << ivec.capacity() << '\n'; // 30 - 100
}

resize versus push_back in std::vector : does it avoid an unnecessary copy assignment?

At least with GCC, it doesn't matter which you use (Results below). However, if you get to the point where you are having to worry about it, you should be using pointers or (even better) some form of smart pointers.. I would of course recommend the ones in the boost library.

If you wanted to know which was better to use in practice, I would suggest either push_back or reserve as resize will resize the vector every time it is called unless it is the same size as the requested size. push_back and reserve will only resize the vector if needed. This is a good thing as if you want to resize the vector to size+1, it may already be at size+20, so calling resize would not provide any benefit.

Test Code

#include <iostream>
#include <vector>

class Elem{
public:
Elem(){
std::cout << "Construct\n";
}
Elem(const Elem& e){
std::cout << "Copy\n";
}
~Elem(){
std::cout << "Destruct\n";
}
};


int main(int argc, char* argv[]){
{
std::cout << "1\n";
std::vector<Elem> v;
v.push_back(Elem());
}

{
std::cout << "\n2\n";
std::vector<Elem> v;
v.resize(v.size()+1);
}
}

Test Output

1
Construct
Copy
Destruct
Destruct

2
Construct
Copy
Destruct
Destruct

std::vectorT::resize(n) vs reserve(n) with operator[]

By definition "Undefined Behavior" means that the result that you see on the execution of that line is not defined and can/will change with different runs.

Is this just my compiler (g++ v5.2.0) giving me undefined, but nice,
behaviour?

The nice behavior can be a mix of how std::vector is implemented in the version you are compiling and the state of memory when your program was executed. The compiler has almost no role to play in showing a "nice behavior".

One line answer: What you are noticing is indeed undefined behavior. The runtime is free to give any output/behavior including shooting monkeys out of your monitor, on hitting an UB.

Will `resize` have any risk to reduce the vector capacity?

No, the vector's capacity will not be reduced by reserve or resize.

std::vector::reserve:

If new_cap is [not greater than capacity] ... no iterators or references are invalidated.

std::vector::resize

Vector capacity is never reduced when resizing to smaller size because that would invalidate all iterators, rather than only the ones that would be invalidated by the equivalent sequence of pop_back() calls.

Relevant part of the standard

std::vector reserve() and push_back() is faster than resize() and array index, why?

Does resize initialize the newly allocated vector where reserve just allocates but does not construct?

Yes.



Related Topics



Leave a reply



Submit