What the Heque Is Going on with the Memory Overhead of Std::Deque

What the heque is going on with the memory overhead of std::deque?

Look at the code for _DEQUESIZ (number of elements per block):

#define _DEQUESIZ   (sizeof (_Ty) <= 1 ? 16 \
    : sizeof (_Ty) <= 2 ? 8 \
    : sizeof (_Ty) <= 4 ? 4 \
    : sizeof (_Ty) <= 8 ? 2 : 1)    /* elements per block (a power of 2) */

It gets smaller if the element is larger. Only for elements larger than 8 bytes will you get the expected behavior (percentual decrease of overhead with increase of element size).

C++ STL queue memory usage compared to vector?

The std::queue is a container adaptor, not a container itself. So let's compare the overhead of some actual containers:

std::vector is very memory-efficient, it uses almost zero overhead. A std::vector<int> uses about 4 bytes per item, on most platforms.
std::list is very inefficient with memory, it will likely use two pointers of overhead per item. A std::list<int> uses about 24 bytes per item, on 64-bit platforms, and 12 bytes on 32-bit platforms.
std::deque is between the two, and it is the default container for std::queue. According to "what the heck is going on with the memory overhead of std::deque", the MSVC deque is a list of blocks each containing around 16 bytes, which is quite a bit of overhead if your queues contain one or two int each and you have a lot of queues.

Another factor that affects overhead is the efficiency of the allocator on your platform, which will color your results unless you can account for it. A 15x difference between two implementations is so large it's downright suspicious — it makes me wonder how you got those numbers.

In general, if your queues are very short, there is a lot of room for improvement over the other implementations. If you're okay with writing your own container, you could write a circular buffer container or use Boost's circular_buffer. A circular buffer combines the memory efficiency of std::vector with the CPU efficiency of std::deque for deque type operations. Kind of makes me wish it were in the STL to begin with. Oh well.

Footnote

The actual amounts of overhead will vary with the implementation.

What's the Big-O of a stack, queue, set, and deque?

This really depends on the implementation of each data structure. I'll go over a few, so you can get the idea of how to determine this.

Let's assume the Stack class is implemented using a Node (it can be implemented with an LinkedList, ArrayList, arrray, etc. as well).

Node top, bottom;
public Stack(Node n){
    top = bottom = n;
}

Stack has 3 primary methods: peek, push, and pop.

public int peek(){
    return top.value; //only return value
}

There wasn't much processing involved. It just returned a primitive value. This is O(1) for both time and space.

public void push(Node n){
    n.next = top;
    top = n;
}

Still no real processing. This is O(1) for both time and space. Let's skip pop() and try something more interesting. Let's try a method called contains(int v). It will search the stack to see if it contains a Node which contains a value equal to v.

public bool contains(int v){
    Node current = top;
    while(current != null){
        if(current.value == v){
            return true;
        }
        current = current.next;
    }
    return false;
}

Basically we'll move through the Node references until we find the value or we reach the end. In some cases, you'll find value early and in some cases later down the road. However, we care about the worst case! The worst possible case would be we have to check every single Node. Say there are n Nodes, then we have O(n).

You can apply these analysis skills to the other data structures, so you can solve the rest yourself. It's not too bad. Good luck. :)

How much memory does a stack take?

The most obvious answer is: it all depends. In reality, there are some things that can be said.

All of the containers incur a modest amount of overhead, of the order of (perhaps) a couple of pointers and a couple of integers, at most. You can find out how much if you really want to know, but it will be constant and quite small, depending on the implementation.

Where the C++ standard helps you is that it places requirements on the storage part of the container. Because iterators are (more or less) equivalent to pointers, stack elements are placed at consecutive memory locations with the same alignment as an array of the same items. You can predict this part.

And where it lets you down again is that there is no limit on the amount of 'slack' space. The container might contain 5 elements and have enough space for 500. This will again depend on the implementation.

So it all depends. You'll just have to try it and see.

Can Java 8 streams cause a O(1) memory reduction on unbounded data to become O(n) memory because of the underlying ForkJoin implementation

Thanks to Marko Topolnik and Holger for coming to the correct answer. Though neither posted an answer for me to accept, so I'll try to tie this up for posterity :)

The .skip(1) is very expensive on a parallel stream because it requires ordering to skip exactly the first entry, as per the Javadoc for Stream.skip()

Reading the first line of the BufferedReader before calling .lines() on it does successfully skip the first line in my implementation.

Then removing the .skip() solves the memory problem, and it is observed in JConsole to bounce around nicely and return to < 1GB on every garbage collection even if the program processes 1 billion lines. This is desired behaviour and is close enough to O(1) memory for my purposes.

Contrary to a suggestion above, the relative locations of .parallel() and .skip(1) do not matter, you cannot re-order them to make the .skip(1) happen "before" the .parallel(). The builder pattern suggests that ordering is important, and it is for other intermediate operations, but not for this one. I remember this subtlety from my OCP certification materials, but it doesn't appear to be in the Javadoc, hence no reference. I have, however, confirmed this experimentally by making the isolated change and observing the regression in JConsole, and associated OOM.

Do C++ Standard Library types implement exception-safe copy-assignment?

The standard specifies explicitely in 'verse' 21.4.1.2. that any other exceptions other than std::bad_length "shall have no effect".

What the Heque Is Going on with the Memory Overhead of Std::Deque