Why Are the Terms "Automatic" and "Dynamic" Preferred Over the Terms "Stack" and "Heap" in C++ Memory Management

Why are the terms automatic and dynamic preferred over the terms stack and heap in C++ memory management?

Automatic tells me something about the lifetime of an object: specifically that it is bound automatically to the enclosing scope, and will be destroyed automatically when that scope exits.

Dynamic tells me that the lifetime of an object is not controlled automatically by the compiler, but is under my direct control.
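To make the distinction concrete, here is a minimal sketch. `Tracer` and `lifetime_demo` are hypothetical names introduced only for this illustration; the type simply records when it is constructed and destroyed.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: records its own construction and destruction.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("ctor " + name);
    }
    ~Tracer() { log.push_back("dtor " + name); }
};

std::vector<std::string> lifetime_demo() {
    std::vector<std::string> log;
    Tracer* dyn = nullptr;
    {
        Tracer automatic(log, "automatic");  // lifetime bound to this scope
        dyn = new Tracer(log, "dynamic");    // lifetime under our control
    }  // "automatic" is destroyed here; "dynamic" is not
    log.push_back("scope exited");
    delete dyn;  // "dynamic" ends only when we say so
    return log;
}
```

The returned log shows the automatic object being destroyed at the closing brace, while the dynamic object survives until the explicit `delete`.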

Stack is an overloaded name for a type of container, and for the related popular instruction pointer protocol supported by common call and ret instructions. It doesn't tell me anything about the lifetime of an object, except through a historical association to object lifetimes in C, due to popular stack frame conventions.
Note also that in some implementations, thread-local storage is on the stack of a thread, but is not limited to the scope of any single function.

Heap is again an overloaded name, indicating either a type of partially ordered container or a free-store management system. It is not the only free store available on all systems, nor does it tell me anything concrete about the lifetime of an object allocated with new.

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
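The "adjusting one pointer" idea can be sketched with a toy arena. This is only an illustration under simplifying assumptions: `StackArena` is a hypothetical class, it grows upward (real stacks usually grow downward), and it ignores alignment of individual blocks.

```cpp
#include <cstddef>

// Hypothetical fixed-size arena that mimics stack-style (LIFO) allocation.
class StackArena {
public:
    // Reserve `bytes` at the top; returns nullptr when the arena is full.
    void* push(std::size_t bytes) {
        if (bytes > kCapacity - top_) return nullptr;
        void* block = buffer_ + top_;
        top_ += bytes;  // allocation: move one offset forward
        return block;
    }

    // Release the most recently reserved `bytes` (LIFO order only).
    void pop(std::size_t bytes) {
        top_ -= (bytes <= top_) ? bytes : top_;  // deallocation: move it back
    }

    std::size_t used() const { return top_; }

private:
    static constexpr std::size_t kCapacity = 1024;
    alignas(std::max_align_t) unsigned char buffer_[kCapacity];
    std::size_t top_ = 0;
};
```

Pushing 16 and then 32 bytes and popping the 32 again leaves the arena exactly where it was after the first push, so the next push of 32 bytes gets the same block back.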

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently, which means it tends to stay in the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be thread-safe, i.e. each allocation and deallocation typically needs to be synchronized with all other heap accesses in the program.

A clear demonstration (image not reproduced here; source: vikashazrati.wordpress.com).

What is the difference between the given two lines of dynamic memory allocation in C++? Do they both create a 10-element array?

For a simple example, let's talk about what happens on the stack.

int x(10);

This declares an int named x and initializes it with the value 10.

int x[10];

This declares an array named x of 10 elements.

So, when it comes to dynamic memory, it's the same thing.

int* x = new int(10);

This creates a single integer on the heap and initializes it to 10.

int* x = new int[10];

This creates an array of 10 integers on the heap.
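The two forms also need different cleanup. The following is a small sketch; the function names `single_value` and `array_sum` are made up for this example.

```cpp
#include <cstddef>

// Returns the value stored in a single dynamically allocated int.
int single_value() {
    int* x = new int(10);  // one int, initialized to 10
    int value = *x;
    delete x;              // single object: plain delete
    return value;
}

// Returns the sum 0 + 1 + ... + (n - 1) stored in a dynamic array.
int array_sum(std::size_t n) {
    int* x = new int[n]();  // n ints; the trailing () zero-initializes them
    for (std::size_t i = 0; i < n; ++i) x[i] = static_cast<int>(i);
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i];
    delete[] x;             // array: must use delete[]
    return sum;
}
```

Note that a plain new int[10] without the trailing parentheses leaves the elements uninitialized, and that mixing up delete and delete[] is undefined behavior.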

A bit confused on exact meaning of dynamic memory allocation for C++

I will try to clear up the confusion as much as I can. First of all, learn to separate low-level memory model concepts (stack, heap) from C++-level memory concepts. In the world of C++, stack and heap do not mean anything remotely resembling the stack or heap of the low-level model.

Low-level memory model

First, let's talk about the low-level memory model. Traditionally, memory is split between 'stack' and 'heap' memory, which I will cover next.

Stack

The stack is managed by the so-called 'stack pointer' CPU register, which always indicates the top of the stack. On most common architectures the stack grows from high memory addresses toward low memory addresses. Since the top of the stack is always pointed to by the register, there is no need for any real memory management associated with the stack: when you need more memory, you just decrease the value stored in the pointer, and that memory is now yours and is considered allocated. When you no longer need the memory, you increase the value, and the memory is 'free' again. Obviously, the problem with that approach is that it is inflexible: you cannot free (or allocate) memory in the middle of the block. So if you allocated memory for 3 objects, A, B and C, and you no longer need object B, there is no way to say that the memory occupied by B is free to be used; a single stack pointer simply does not have the capability to do so.

That limits the usage of stack memory to the case of 'close-reach', short-lived objects: when you know that you do not need to selectively free any memory associated with objects allocated within this scope, and can simply free all of them soon enough. This makes stack memory an ideal storage for variables defined within a function - all of them are freed together when the function exits. What's even better is that the compiler can do this automatically for you - you do not have to explicitly tell the compiler when to free the memory for each variable - it is going to be freed automatically once code execution leaves its scope.
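In C++ terms, the "freed together when the scope exits" behavior shows up as destructors running in the reverse order of construction. A minimal sketch, where `Noisy` and `destruction_order` are hypothetical names used only here:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends its name to a log when it is destroyed.
struct Noisy {
    std::vector<std::string>& log;
    std::string name;
    ~Noisy() { log.push_back(name); }
};

std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Noisy a{log, "A"};
        Noisy b{log, "B"};
        Noisy c{log, "C"};
    }  // all three are destroyed here, last constructed first: C, B, A
    return log;
}
```

Nothing in the inner scope asks for cleanup explicitly; the compiler emits the destructor calls at the closing brace.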

It is also worth noting that stack allocation and freeing are uberfast - they only require a single register arithmetic operation.

However, as I said before, the stack has limitations. Heap memory is here to overcome those, and will be described next.

Heap

Unlike the stack (which is managed only by a simple register), heap memory is supported by complex structures and logic. You can request memory from the heap, and you can return memory back to the heap, and you can do it independently for every object. So, going back to my original example, when you have requested memory for objects A, B and C (all the same size), and no longer need object B, you can return the memory for B and still retain A and C. If you need to create another object, D, of the same size as those before and ask for memory for it, the heap can give you the memory you returned from B. While this is not guaranteed (heap algorithms are very complex), it is a good enough simplification.
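The A, B, C, D scenario can be written down directly. This is just a sketch with a made-up function name; whether D actually lands in B's old block depends on the allocator, so the code does not rely on it.

```cpp
// Returns true if A and C are still intact after B was freed and D allocated.
bool independent_free_demo() {
    int* a = new int(1);
    int* b = new int(2);
    int* c = new int(3);

    delete b;             // give back only B; A and C stay valid
    b = nullptr;          // avoid keeping a dangling pointer around
    int* d = new int(4);  // may or may not reuse B's old block

    bool ok = (*a == 1) && (*c == 3) && (*d == 4);

    delete a;
    delete c;
    delete d;
    return ok;
}
```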

Unlike stack memory, managing heap memory has its costs, which are actually comparatively quite high (especially in a multithreaded environment). That's why heap memory should not be used if one can help it, but this is a huge topic on its own, which I am not going to dwell on now.

One very important property of heap memory is that it has to be explicitly managed by the user. You need to request memory when you need it, give it back when you no longer need it, and never use the memory you've given back. Failure to observe those rules has one of two consequences. If you do not give memory back, your program leaks memory - that is, it consumes memory without returning it, which can eventually cause the program to run out of memory. If you use memory before requesting it or after giving it back, the program behaves incorrectly, because you are accessing memory which is not yours.

C/C++ memory model

For better or worse, C and C++ shield the programmer from those low-level memory concepts. Instead, the language specifies that every variable lives in a certain type of storage, and its lifetime is defined by the storage type. There are 3 main types of storage, outlined below (C11 and C++11 add a fourth, thread storage duration, which is not covered here).

Automatic storage

This storage is managed by the compiler 'automatically' (hence the name) and does not require the programmer to do anything about it. An example of an automatic variable is one defined inside a function body:

void foo() {
    int a;
}

a here is automatic. You do not need to worry about allocating memory for it or cleaning it up when it is no longer needed, and the compiler guarantees that it will be there when you enter function foo(), and will no longer be there when you exit foo(). While it might be allocated on the stack, there is absolutely no guarantee about it - it might just as well be put in a register. Registers are much faster than any memory, so compilers will make use of them whenever they can.

Static storage

Variables put in static storage live until the program exits. Again, the developer does not need to worry about their lifetime or about cleaning up the memory - the memory will be cleaned up after the program exits, and not before. Examples of static duration variables are variables defined outside of any function (global variables), static local variables of a function, and static members of a class. In the code below var1, var2 and var3 are all variables within static storage:

Code:

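int var1;               // global variable: static storage

void foo() {
    static int var2;    // static local: static storage
}

class A {
    static int var3;    // static member declaration
};

int A::var3;            // definition of the static member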
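A quick way to see static storage at work is a function-local counter. The names `next_id` and `always_one` below are invented for this sketch.

```cpp
// Hypothetical counter: `calls` has static storage, so it is initialized
// once and keeps its value between invocations.
int next_id() {
    static int calls = 0;
    return ++calls;
}

// Contrast: `calls` here is automatic and is recreated on every call.
int always_one() {
    int calls = 0;
    return ++calls;
}
```

Successive calls to next_id() return 1, 2, 3 and so on, while always_one() returns 1 every time.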

Dynamic storage

Dynamic storage variables are controlled by the developer. When you need them, you request the memory (usually with malloc in C or new in C++), and you must give it back when you no longer need it (with free in C, delete in C++). As a developer, you should pay close attention to how you allocate, use and delete those, and make sure the sequence is never broken. Failure to observe the sequence is a major cause of the great program bugs making the news :). Luckily, C++ has special features and classes that simplify this task, but if you develop in C, you are on your own. In the example below, the memory that var4 points to is dynamically allocated.

Code:

void foo() {
    int* var4;
    // Here is the major source of confusion. var4 itself is **automatic**:
    // you do not need to allocate or free var4's own memory, so you can use it
    // like this:
    var4 = nullptr; // Not an error!!!
    // However, you can't use the memory var4 points to yet!
    // The following line would cause incorrect behavior of the program:
    // *var4 = 42; // NEVER EVER!!!
    // Instead, you need to allocate the memory first (let's assume we are in C++):
    var4 = new int();
    // Now that the memory has been allocated, we can use it
    *var4 = 42; // Correct!
    // We no longer need this memory, so let's free it:
    delete var4;
    // This did not change var4 itself (unless there is a special case),
    // so technically it still points to the memory which formerly
    // belonged to you. But the memory is no longer yours!!!
    // You can't read or write it!
    // The following code is bad-bad-bad:
    // int x = *var4; // NEVER EVER!
}

As you've seen, using dynamic memory comes with plenty of caution and warning signs. This is why C++ has special facilities to make this easier, and no one is expected to write the code I have written above. However, my post is already way too long, so proper memory management in C++ will be left for another occasion :)
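Just as a brief pointer to what those facilities look like, here is a minimal sketch using std::unique_ptr and std::vector from the standard library. The function names are hypothetical, and std::make_unique assumes C++14 or later.

```cpp
#include <memory>
#include <vector>

int smart_pointer_demo() {
    // The int lives in dynamic storage; p itself is an automatic variable.
    std::unique_ptr<int> p = std::make_unique<int>(0);
    *p = 42;
    return *p;
}  // p's destructor deletes the int for us; no manual delete needed

int vector_demo() {
    std::vector<int> values(10, 1);  // 10 ints, each set to 1, on the free store
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}  // the vector releases its storage when it goes out of scope
```

In both cases the dynamic allocation is tied to an automatic object, so the cleanup happens at scope exit without any explicit delete.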

Huge memory allocation: stack vs heap

The size of the stack is usually around one to a few megabytes by default on typical desktop systems, and probably less on embedded devices.

If you allocate more memory than fits on the stack, the operating system will typically terminate the program as soon as you attempt to access the memory.

Does it mean it's recommended to store big amounts of data in the heap?

It is recommended to use the free store (dynamic allocation) for large amounts of data, because large amounts of data would overflow the stack.
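As a sketch of that recommendation, a std::vector keeps its elements in dynamic storage even when the vector object itself is a local variable. The constant `kElements` and the function name are assumptions chosen for this example; the 16 MiB figure assumes a 4-byte int.

```cpp
#include <cstddef>
#include <vector>

// 4 Mi ints is typically 16 MiB, more than a common default stack size.
constexpr std::size_t kElements = 4u * 1024u * 1024u;

std::size_t big_buffer_demo() {
    // int big[kElements];            // automatic: would likely overflow the stack
    std::vector<int> big(kElements);  // elements live in dynamic storage
    big.front() = 1;                  // touch both ends of the buffer
    big.back() = 2;
    return big.size();
}
```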

In the Application Manager I saw that it is not using that amount of memory (just a few KBs).

Typically, an operating system allocates a page of memory for a process when that memory is accessed. Since your program didn't crash due to stack overflow, I suspect that you never accessed the memory, and therefore no memory was allocated for the data.


