Amortized analysis of std::vector insertion

Assuming you mean push_back and not insertion, I believe the important part is multiplying the capacity by some constant factor (as opposed to grabbing N more elements each time); as long as you do this you'll get amortized constant time. Changing the factor changes the average-case and worst-case performance.

Concretely:
If your constant factor is too large, you'll have good average-case performance but bad worst-case performance, especially as the arrays get big. For instance, imagine doubling (2x) a 10000-element vector just because the 10001st element is pushed. EDIT: As Michael Burr indirectly pointed out, the real cost here is probably that you'll grow your memory much larger than you need it to be. I would add that there are cache issues that affect speed if your factor is too large. Suffice it to say that there are real costs (memory and computation) if you grow much larger than you need.

However, if your constant factor is too small, say 1.1x, then you're going to have good worst-case performance but bad average performance, because you're going to incur the cost of reallocating too many times.

Also, see Jon Skeet's answer to a similar question previously. (Thanks @Bo Persson)

A little more about the analysis: say you have n items to push back and a multiplication factor of M. Then the number of reallocations will be roughly log base M of n (log_M(n)), and the i-th reallocation will cost proportional to M^i (M to the i-th power). The total time of all the push_backs is then M^1 + M^2 + ... + M^(log_M(n)). The number of push_backs is n, so you get this series (which is a geometric series, and reduces to roughly (nM)/(M-1) in the limit) divided by n. This is roughly a constant, M/(M-1).
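
Spelling that out (under the same simplification as above, that the i-th reallocation costs on the order of M^i element copies and the last one lands near n), the per-push cost is the geometric series divided by n:

\[
  \frac{1}{n}\sum_{i=1}^{\log_M n} M^i
  \;=\; \frac{1}{n}\cdot\frac{M\left(M^{\log_M n}-1\right)}{M-1}
  \;\approx\; \frac{1}{n}\cdot\frac{nM}{M-1}
  \;=\; \frac{M}{M-1}.
\]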

For large values of M you will overshoot a lot and allocate much more than you need reasonably often (which I mentioned above). For small values of M (close to 1) this constant M/(M-1) becomes large. This factor directly affects the average time.
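
Here is a minimal simulation (not std::vector itself; the growth policy and loop below are a hypothetical stand-in) that counts element writes per push_back for a few growth factors, so the constant M/(M-1) from the analysis above can be seen empirically:

#include <algorithm>
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t n = 1000000;
    for (double M : {1.1, 1.5, 2.0, 4.0}) {
        std::size_t size = 0, capacity = 1, writes = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (size == capacity) {
                writes += size;                      // reallocation copies every element
                capacity = std::max(capacity + 1,
                                    static_cast<std::size_t>(capacity * M));
            }
            ++size;
            ++writes;                                // writing the new element itself
        }
        std::printf("M = %.1f  writes/push = %.2f  M/(M-1) = %.2f\n",
                    M, static_cast<double>(writes) / n, M / (M - 1));
    }
}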

Does std::vector::insert allocate spare capacity?

The actual language is

[vector.overview] A vector is a sequence container that supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time

and more specifically

[vector.modifiers 1] Complexity: If reallocation happens, linear in the number of elements of the resulting vector; otherwise, linear in the number of elements inserted plus the distance to the end of the vector.

but we both agree insert is broadly linear.

When reallocation happens, the options are:

  1. reuse the same internal mechanism for growing the vector as in push_back():

    this is obviously still amortized constant time, so the linear-time element move still dominates

  2. grow the vector by just one element:

    since this is still linear time, it's still within the complexity allowed. However, it is actually more effort for the implementer, and objectively worse

I don't think I've ever seen an implementation that didn't just reuse some internal grow() method for both of these, but it would be technically legal to spend more effort doing something worse.
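
A quick way to see option 1 in practice (the exact capacities are implementation-defined, so treat this as an observation rather than a guarantee): watch capacity() around a single-element insert that has to reallocate; on common implementations it jumps geometrically, just as with push_back.

#include <cstdio>
#include <vector>

int main() {
    std::vector<int> v(8);
    v.shrink_to_fit();             // non-binding request to make capacity == size
    std::printf("before: size=%zu capacity=%zu\n", v.size(), v.capacity());
    v.insert(v.begin() + 4, 42);   // middle insert that triggers a reallocation
    std::printf("after:  size=%zu capacity=%zu\n", v.size(), v.capacity());
}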

The exception to this reasoning is the range overload insert(iterator pos, InputIterator begin, InputIterator end) since, for a strict LegacyInputIterator, you can only do one pass.

If you can't pre-calculate the number of elements to grow by, any growth must be amortized constant time, or the overall complexity would be governed by N * distance(begin,end), which could be O(N²) and thus non-conformant.
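
To make the single-pass case concrete (the stream contents and values below are invented for the example), here is the range overload fed from std::istream_iterator, a LegacyInputIterator whose length cannot be known up front, so the vector has to grow as it consumes the range:

#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

int main() {
    std::istringstream in("1 2 3 4 5 6 7 8");
    std::vector<int> v{0, 9};
    v.insert(v.begin() + 1,
             std::istream_iterator<int>(in),   // single pass, unknown length
             std::istream_iterator<int>());
    for (int x : v) std::cout << x << ' ';     // prints 0 1 2 3 4 5 6 7 8 9
    std::cout << '\n';
}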


tl;dr

  • push_back must allocate extra capacity to remain amortized constant time
  • insert(pos, begin, end) must use amortized constant-time growth for each element of [begin, end) to remain amortized linear time overall
  • insert(pos, value) does not need to allocate extra capacity in order to meet its complexity requirement, but it's probably more effort for the implementer to get a worse result

Amortized time of insertion in sorted array is O(n) and deletion is O(1)?

The idea is to associate with each entry in the array a Boolean called deleted. Deleting an item consists of setting deleted to true. When there are too many deleted items, compact them out. If you make your compaction threshold a fraction of the total size, you can pay for the compaction from all the deletions required to reach the compaction point.

Here's a sketch. It's incomplete but demonstrates the insertion and deletion algorithms.

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

class sorted_array
{
public:
    typedef std::vector<std::pair<int, bool>>::iterator iterator;

    iterator insert(int value)
    {
        // keep the array sorted; new entries start out not deleted
        auto item = std::make_pair(value, false);
        return vec.insert(std::lower_bound(vec.begin(), vec.end(), item), item);
    }

    void erase(iterator pos)
    {
        pos->second = true; // deleted = true
        deleted_count++;
        if (deleted_count * 2 > vec.size())
        {
            // more than half of the entries are deleted: compact them out
            vec.erase(std::remove_if(vec.begin(), vec.end(),
                                     [](const std::pair<int, bool>& p) { return p.second; }),
                      vec.end());
            deleted_count = 0;
        }
    }

private:
    std::size_t deleted_count = 0;
    std::vector<std::pair<int, bool>> vec;
};

Insertion is O(n) as usual. When we insert the element, we also mark it as not deleted.

To delete an element, we merely mark it as deleted and bank two credits.

When more than half of the elements in the vector are deleted, we have at least as many credits as there are elements in the vector. That means we can afford to run the O(n) compaction.

To find an element, you run a traditional binary search, and then skip over deleted elements. Since at most half of the elements are deleted, the binary search operates on at most 2n elements, which means that it runs in O(log 2n) = O(log n) steps. There's a little bit of extra cost skipping forward past deleted items after the binary search completes, but some more cleverness in the data structure can reduce that to a constant. (Left as an exercise.)
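
For concreteness, a possible find member one could add to the sketch above (it is not part of the original class; it assumes the same vec member and pair layout) could do exactly that: binary search by value, then step over entries still marked deleted.

    iterator find(int value)
    {
        auto it = std::lower_bound(vec.begin(), vec.end(),
                                   std::make_pair(value, false));
        while (it != vec.end() && it->second)      // skip entries marked deleted
            ++it;
        if (it == vec.end() || it->first != value)
            return vec.end();                      // no live entry with this value
        return it;
    }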

Similarly, iterating over the collection will take at most 2n steps (because at most half of the elements are deleted), which is still O(n).

How does deque have amortized constant time complexity?

The usual implementation of a deque is basically a vector of pointers to fixed-size nodes.

Allocating the fixed-size node clearly has constant complexity, so that's pretty easy to handle--you just amortize the cost of allocating a single node across the number of items in that node to get a constant complexity for each.

The vector of pointers part is what's (marginally) more interesting. When we allocate enough of the fixed-size nodes that the vector of pointers is full, we need to increase the size of that vector. Like std::vector, we need to copy its contents to the newly allocated vector, so its growth must follow a geometric (rather than arithmetic) progression. This means that although we have more pointers to copy each time, we do the copying less and less frequently, so the total time devoted to copying pointers remains amortized constant per insertion.

As a side note: the "vector" part is normally treated as a circular buffer, so if you're using your deque as a queue, constantly adding to one end and removing from the other does not result in re-allocating the vector--it just means moving the head and tail pointers that keep track of which of the pointers are "active" at a given time.
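
The node-map idea can be sketched in a few lines. This is a toy model, not a real std::deque: toy_block_map, B, and start are made-up names, and real implementations store raw pointers to separately allocated blocks rather than nested vectors. The point is only the index arithmetic that makes random access O(1) while growth touches nothing but the small table.

#include <cstddef>
#include <cstdio>
#include <vector>

template <typename T, std::size_t B = 4>
struct toy_block_map {
    std::vector<std::vector<T>> blocks;   // stand-in for the "vector of pointers to nodes"
    std::size_t start = 0;                // offset of the first live element

    T& operator[](std::size_t i) {
        std::size_t pos = start + i;
        return blocks[pos / B][pos % B];  // block index, then slot within the block
    }
};

int main() {
    toy_block_map<int> d;
    d.blocks = {{0, 1, 2, 3}, {4, 5, 6, 7}};     // two fixed-size blocks of B = 4
    d.start = 2;                                 // pretend two elements were popped from the front
    std::printf("%d %d %d\n", d[0], d[1], d[2]); // prints "2 3 4"
}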

What is Constant Amortized Time?

Amortised time explained in simple terms:

If you do an operation say a million times, you don't really care about the worst-case or the best-case of that operation - what you care about is how much time is taken in total when you repeat the operation a million times.

So it doesn't matter if the operation is very slow once in a while, as long as "once in a while" is rare enough for the slowness to be diluted away. Essentially amortised time means "average time taken per operation, if you do many operations". Amortised time doesn't have to be constant; you can have linear and logarithmic amortised time or whatever else.

Let's take mats' example of a dynamic array, to which you repeatedly add new items. Normally adding an item takes constant time (that is, O(1)). But each time the array is full, you allocate twice as much space, copy your data into the new region, and free the old space. Assuming allocates and frees run in constant time, this enlargement process takes O(n) time where n is the current size of the array.

So each time you enlarge, you take about twice as much time as the last enlarge. But you've also waited twice as long before doing it! The cost of each enlargement can thus be "spread out" among the insertions. This means that in the long term, the total time taken for adding m items to the array is O(m), and so the amortised time (i.e. time per insertion) is O(1).
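
A bare-bones sketch of that doubling scheme (int_array is a made-up toy type, with no destructor or error handling): each enlargement copies all current elements, but the number of pushes since the previous enlargement is proportional to that count, so the cost per push averages out to O(1).

#include <cstddef>
#include <cstring>

struct int_array {
    int* data = nullptr;
    std::size_t size = 0, capacity = 0;

    void push_back(int x) {
        if (size == capacity) {                                  // full: enlarge
            std::size_t new_cap = capacity ? capacity * 2 : 1;   // allocate twice as much space
            int* bigger = new int[new_cap];
            if (size)
                std::memcpy(bigger, data, size * sizeof(int));   // the O(n) copy
            delete[] data;
            data = bigger;
            capacity = new_cap;
        }
        data[size++] = x;                                        // the ordinary O(1) append
    }
};

int main() {
    int_array a;
    for (int i = 0; i < 1000; ++i)
        a.push_back(i);     // amortized O(1) per push despite the occasional O(n) copy
    delete[] a.data;
}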

How can I ensure amortized O(n) concatenation from Data.Vector?

I don't know the vector library well either, so I have to stick mostly to general principles. Benchmark it: run a sequence of adds similar to what you expect in your application and see what performance you get. If it's 'good enough', great, stick with the simple code. If not, before storing a (forall s. MVector s Int) in the state (which you can't do directly, since tuples can't hold forall-types, so you'd need to wrap it), try improving the add behaviour by converting to mutable vectors, performing the concatenation there, and then freezing to get an immutable vector again, roughly:

add v1 = do
    S v2 <- get
    let v3 = runST $ do
            m1 <- unsafeThaw v2
            m2 <- unsafeGrow m1 (length v1)
            -- copy contents of v1 behind contents of v2
            unsafeFreeze m2
    put (S v3)

You may need to insert some strictness there. However, if unsafeGrow needs to copy, that will not guarantee amortized O(n) behaviour.

You can get amortized O(n) behaviour by

  1. storing the number of used slots in the state too
  2. if the new vector fits in the free space at the end, thaw, copy, freeze without growing
  3. if it doesn't fit in the free space, grow by at least a factor of 2; that guarantees that each element is copied at most twice on average

