Choice Between Vector::Resize() and Vector::Reserve()

Choice between vector::resize() and vector::reserve()

The two functions do vastly different things!

The resize() method (and passing argument to constructor is equivalent to that) will insert or delete appropriate number of elements to the vector to make it given size (it has optional second argument to specify their value). It will affect the size(), iteration will go over all those elements, push_back will insert after them and you can directly access them using the operator[].

The reserve() method only allocates memory, but leaves it uninitialized. It only affects capacity(), but size() will be unchanged. There is no value for the objects, because nothing is added to the vector. If you then insert the elements, no reallocation will happen, because it was done in advance, but that's the only effect.

So it depends on what you want. If you want an array of 1000 default items, use resize(). If you want an array to which you expect to insert 1000 items and want to avoid a couple of allocations, use reserve().

EDIT: Blastfurnace's comment made me read the question again and realize, that in your case the correct answer is don't preallocate manually. Just keep inserting the elements at the end as you need. The vector will automatically reallocate as needed and will do it more efficiently than the manual way mentioned. The only case where reserve() makes sense is when you have reasonably precise estimate of the total size you'll need easily available in advance.

EDIT2: Ad question edit: If you have initial estimate, then reserve() that estimate. If it turns out to be not enough, just let the vector do it's thing.

std::vector::resize() vs. std::vector::reserve()

There are two different methods for a reason:

std::vector::reserve will allocate the memory but will not resize your vector, which will have a logical size the same as it was before.

std::vector::resize will actually modify the size of your vector and will fill any space with objects in their default state. If they are ints, they will all be zero.

After reserve, in your case, you will need a lot of push_backs to write to element 5.
If you don't wish to do that then in your case you should use resize.

One thing about reserve: if you then add elements with push_back, until you reach the capacity you have reserved, any existing references, iterators or pointers to data in your vector will remain valid. So if I reserve 1000 and my size is 5, the &vec[4] will remain the same until the vector has 1000 elements. After that, I can call push_back() and it will work, but the stored pointer of &vec[4] earlier may no longer be valid.

Vector Resize vs Reserve for nested vectors

Function reserve() simply allocates a contiguous region of memory big enough to hold the number of items you specify and moves the vector's old content into this new block, which makes sure no more reallocations for the vectors' storage will be done upon insertions as long as the specified capacity is not exceeded. This function is used to reduce the number of reallocations (which also invalidate iterators), but does not insert any new items at the end of your vector.

From the C++11 Standard, Paragraph 23.3.6.3/1 about reserve():

A directive that informs a vector of a planned change in size, so that it can manage the storage
allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve(). If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.

Notice that by doing prob.cols[i].push_back(idx, value); you are likely to get undefined behavior, since i is probably an out-of-bounds index.

On the other hand, function resize() does insert items at the end of your vector, so that the final size of the vector will be the one you specified (this means it can even erase elements, if you specify a size smaller than the current one). If you specify no second argument to a call to resize(), the newly inserted items will be value-initialized. Otherwise, they will be copy-initialized from the value you provide.

From the C++11 Standard, Paragraph 23.3.6.3/9 about resize():

If sz <= size(), equivalent to erase(begin() + sz, end());. If size() < sz, appends
sz - size() value-initialized elements to the sequence.

So to sum it up, the reason why accessing your vector after invoking resize() gives the expected result is that items are actually being added to the vector. On the other hand, since the call to reserve() does not add any item, subsequent accesses to non-existing elements will give you undefined behavior.

resize and reserve in vector

The specification of std::vectors operations probably provide most of the answer to this question.

Firstly, resize() is specified as being void resize(size_type n, value_type val = value_type()), with requirements;

If n is less than the current size, the additional elements then the additional elements are removed and destroyed (i.e. if elements have a destructor, the destructor is invoked).
If n is greater than the current container size, the content is expanded by inserting at the end as many elements as needed to reach the size of n. If val is specified, the new elements are initialized as copies of val, otherwise, they are value-initialized.
If n is also greater (emphasis mine) than the current container capacity, an automatic reallocation of the allocated storage space takes place.

Note that nothing whatsoever is said about reducing the container capacity if its size is reduced.

Now, reserve(size_type n) is specified as requesting that the vector capacity be n or more elements, with requirements

If n is greater than the current vector capacity, the function causes
the container to reallocate its storage increasing its capacity to n
(or greater).
In all other cases, the function call does not cause a reallocation and the vector capacity is not affected.
This function has no effect on the vector size and cannot alter its elements.

The second point here may be interpreted as saying that reserve() cannot be used to reduce capacity.

Note: I'm paraphrasing in the above, but it is essentially a summary of what the standard requires of resize() and reserve().

Putting those points together: The only circumstance in which resize() (or any other operation that increases the size of the vector) is required to increase capacity of the vector is if the new size exceeds the current capacity.

Beyond that, what happens is a quality of implementation concern. However, one logical manner of implementation would be;

resize() (and other operations that increase size of the vector) never increases capacity unless the new size exceeds current capacity
reserve() never reduces capacity

In such an implementation, calling reserve() before doing operations that increase size of the vector would give a performance advantage (eliminating need for actual reallocation unless the size ever grows to exceed the capacity specified using reserve()).

Things would potentially be different if an implementation used a different approach, such as reducing capacity of a container at any stage (e.g. if a call of resize() reduces the size, it also reduces the capacity). This might be done in an attempt to minimise total memory usage of the container (i.e. not keeping additional capacity allocated for as long as the vector itself exists). This would give a performance hit if the size of the vector oscillates in size (e.g. increases in some circumstances, decreases in others).

Why is resizing a vector faster than reserve and push_back?

push_back is costlier operation than index based access even if allocation has been taken care of before hand by reserve.

push_back will need to take care of end pointer so that vector size can be computed correctly
push_back will check for realloaction need. Essentially a branch prediction.
push_back will cause copy (or move) of value to be pushed back. In case of int, it shouldn't cause performance difference.

If you see assembly conversion (Taken from godbolt link given in question), index operation is non branching sequence of few moves and shift operation while push_back is much more involved. In long running loop (1000000 in given example), this difference will matter. Compiler optimization level can definitely impact the difference.

For index operator []

    push    rbp
    mov     rbp, rsp
    mov     QWORD PTR [rbp-8], rdi
    mov     QWORD PTR [rbp-16], rsi
    mov     rax, QWORD PTR [rbp-8]
    mov     rax, QWORD PTR [rax]
    mov     rdx, QWORD PTR [rbp-16]
    sal     rdx, 2
    add     rax, rdx
    pop     rbp
    ret

For push_back

    push    rbp
    mov     rbp, rsp
    sub     rsp, 16
    mov     QWORD PTR [rbp-8], rdi
    mov     QWORD PTR [rbp-16], rsi
    mov     rax, QWORD PTR [rbp-8]
    mov     rdx, QWORD PTR [rax+8]
    mov     rax, QWORD PTR [rbp-8]
    mov     rax, QWORD PTR [rax+16]
    cmp     rdx, rax
    je      .L73
    mov     rax, QWORD PTR [rbp-8] // When allocation is not needed
    mov     rcx, QWORD PTR [rax+8]
    mov     rax, QWORD PTR [rbp-8]
    mov     rdx, QWORD PTR [rbp-16]
    mov     rsi, rcx
    mov     rdi, rax
    call    void std::allocator_traits<std::allocator<int> >::construct<int, int const&>(std::allocator<int>&, int*, int const&)
    mov     rax, QWORD PTR [rbp-8]
    mov     rax, QWORD PTR [rax+8]
    lea     rdx, [rax+4]
    mov     rax, QWORD PTR [rbp-8]
    mov     QWORD PTR [rax+8], rdx
    jmp     .L75
.L73:   // When allocation is needed
    mov     rax, QWORD PTR [rbp-8]
    mov     rdi, rax
    call    std::vector<int, std::allocator<int> >::end()
    mov     rcx, rax
    mov     rdx, QWORD PTR [rbp-16]
    mov     rax, QWORD PTR [rbp-8]
    mov     rsi, rcx
    mov     rdi, rax
.L75:
    nop
    leave
    ret

When should we use reserve() of vector?

A vector has a capacity (as returned by capacity() and a size (as returned by size(). The first states how many elements a vector can hold, the second how many he does currently hold.

resize changes the size, reserve only changes the capacity.

See also the resize and reserve documentation.

As for the use cases:
Let's say you know beforehand how many elements you want to put into your vector, but you don't want to initialize them - that's the use case for reserve. Let's say your vector was empty before; then, directly after reserve(), before doing any insert or push_back, you can, of course, not directly access as many elements as you reserved space for - that would trigger the mentioned error (subscript out of range) - since the elements you are trying to access are not yet initialized; the size is still 0. So the vector is still empty; but if you choose the reserved capacity in such a way that it's higher or equal to the maximum size your vector will get, you are avoiding expensive reallocations; and at the same time you will also avoid the (in some cases expensive) initialization of each vector element that resize would do.

With resize, on the other hand, you say: Make the vector hold as many elements as I gave as an argument; initialize those whose indices are exceeding the old size, or remove the ones exceeding the given new size.

Note that reserve will never affect the elements currently in the vector (except their storage location if reallocation is needed - but not their values or their number)! Meaning that if the size of a vector is currently greater than what you pass to a call to the reserve function on that same vector, reserve will just do nothing.

See also the answer to this question: Choice between vector::resize() and vector::reserve()

std::vectorT::resize(n) vs reserve(n) with operator[] [duplicate]

By definition "Undefined Behavior" means that the result that you see on the execution of that line is not defined and can/will change with different runs.

Is this just my compiler (g++ v5.2.0) giving me undefined, but nice,
behaviour?

The nice behavior can be a mix of how std::vector is implemented in the version you are compiling and the state of memory when your program was executed. The compiler has almost no role to play in showing a "nice behavior".

One line answer: What you are noticing is indeed undefined behavior. The runtime is free to give any output/behavior including shooting monkeys out of your monitor, on hitting an UB.

std::vector resize vs reserve during fread

If the vector is initially empty, then output[0] invokes undefined behaviour (regardless of how much space you have reserved), and on some platforms it may throw an exception.

Choice Between Vector::Resize() and Vector::Reserve()