Size VS Capacity of a Vector

size vs capacity of a vector?

Size is not allowed to differ between multiple compilers. The size of a vector is the number of elements that it contains, which is directly controlled by how many elements you put into the vector.

Capacity is the amount of total space that the vector has. Under the hood, a vector just uses an array. The capacity of the vector is the size of that array. This is always equal to or larger than the size. The difference between them is the number of elements that you can add to the vector before the array under the hood needs to be reallocated.

You should almost never care about the capacity. It exists to let people with very specific performance and memory constraints do exactly what they want.

Why Vector's size() and capacity() is different after push_back()

The Standard mandates that std::vector<T>::push_back() has amortized O(1) complexity. This means that the expansion has to be geometrically, say doubling the amount of storage each time it has been filled.

Simple example: sequentially push_back 32 ints into a std::vector<int>. You will store all of them once, and also do 31 copies if you double the capacity each time it runs out. Why 31? Before storing the 2nd element, you copy the 1st; before storing the 3rd, you copy elements 1-2, before storing the 5th, you copy 1-4, etc. So you copy 1 + 2 + 4 + 8 + 16 = 31 times, with 32 stores.

Doing the formal analysis shows that you get O(N) stores and copies for N elements. This means amortized O(1) complexity per push_back (often only a store without a copy, sometimes a store and a sequence of copies).

Because of this expansion strategy, you will have size() < capacity() most of the time. Lookup shrink_to_fit and reserve to learn how to control a vector's capacity in a more fine-grained manner.

Note: with geometrical growth rate, any factor larger than 1 will do, and there have been some studies claiming that 1.5 gives better performance because of less wasted memory (because at some point the reallocated memory can overwrite the old memory).

size() and capacity() of c++ vectors

I know that after each push_back() the capacity of the vector changes in exponential powers but in the above OUTPUT the capacity is still remaining same sometimes even after insertion.

When capacity is greater than the size after insertion, the capacity doesn't need to, and is guaranteed not to change.

This strategy allows sequential push back to have constant complexity (amortised).

Capacity of Vectors?

The capacity() of a vector is better expressed as "the amount of space currently set aside that can be used for storage", expressed in terms of the number of potential elements that can be stored. Note that like size(), this is the number of elements, not the number of bytes.

For example, a vector might have a size() of 3 and a capacity() of 4: this says that it is currently storing 3 elements, and has room for a maximum of 4 elements (total) before it will need to reallocate.

On the other hand, sizeof returns the number of bytes required for an object of the given type, and is determined at compile time. For example, if you want the number of bytes currently allocated by the vector for storage, you could do vector.capacity() * sizeof(vector[0]).

Why vector has different capacity and other than the size?

The reason is related to the very essence of the extension algorithm of the vector.
When initializing a vector, the number of extra capacity applied is 0.
In the i-th time an extension is needed, the vector copies its contain to a new vector, with capacity doubled then its current size.
This method makes the whole idea of size-changing array very efficient, since in amortized time (meaning the average time over N operations), we get O(1) insertion complexity.
You can see that after we add one more integer to the first vector, we get a capacity of 6. http://coliru.stacked-crooked.com/a/f084820652f025b8

What's the difference between len() and capacity()?

Growable vectors reserve space for future additions, hence the difference between allocated space (capacity) and actually used space (length).

This is explained in the standard library's documentation for Vec:

The capacity of a vector is the amount of space allocated for any future elements that will be added onto the vector. This is not to be confused with the length of a vector, which specifies the number of actual elements within the vector. If a vector's length exceeds its capacity, its capacity will automatically be increased, but its elements will have to be reallocated.

For example, a vector with capacity 10 and length 0 would be an empty vector with space for 10 more elements. Pushing 10 or fewer elements onto the vector will not change its capacity or cause reallocation to occur. However, if the vector's length is increased to 11, it will have to reallocate, which can be slow. For this reason, it is recommended to use Vec::with_capacity whenever possible to specify how big the vector is expected to get.

Does declaring a vector with size offer any improvements over using push_back in C++

With

vector<int> Array(n);

you create a vector that contains n elements, all memory needed for those elements is allocated immediately.

When you use e.g.

Array.push_back(value);

then the vector needs to be resized, which could mean the memory have to be reallocated and all the contents have to be copied to the new memory.


Instead of creating an array with a set size, you could instead preallocate (or reserve) memory:

vector<int> Array;  // An empty vector
Array.reserve(n); // Reserve memory, but keep the size as zero (it's still empty)
Array.push_back(value); // No reallocation needed, size is now one

This is useful when you have a vector of objects that can't be default-constructed.

Important concepts to learn: The vector size and its capacity and what the difference between them is.

The capacity is the number of elements the vector have allocated memory for.

The size is the current number of elements in the vector.

It's quite common for the capacity to be different from the size. And it must always be true that capacity >= size.

How does std::vector::reserve actually work?

vector::reserve does allocate memory, so your question about reserving memory without allocating is incorrect. The point is that reserving memory can be done without changing the vectors size. Basically a vector has two sizes, it's size and it's capacity. reserve allocates memory and changes the capacity, but not the size.

At any given time the following is true 0 <= size <= capacity. The capacity reflects the amount of memory allocated, the size reflects the number of constructed elements in that memory.

C++ access vector beyond size() and under capacity()

It isn't valid to access data()[n] if n >= size(). Per [vector.data] std::vector::data

Returns: A pointer such that [data(), data() + size()) is a valid range. For a non-empty vector, data() == addressof(front()).

So it is only valid to access data with a value in the range of [0, size()).


Generally then memory between data() + size() - 1 and data + capacity() is uninitialized. If you read from that uninitialized memory that is undefined behavior. If you have an object that has non trivial initialization then you can't even assign a value to it since there isn't an object actually in that spot, just the space for one. You could probably get away with doing things in the uninitialized range, but your violating the contract with std::vector and it might get angry if you do ;)



Related Topics



Leave a reply



Submit