Proper Stack and Heap Usage in C++

Proper stack and heap usage in C++?

No, the difference between stack and heap isn't performance. It's lifespan: any local variable inside a function (anything you do not malloc() or new) lives on the stack. It goes away when you return from the function. If you want something to live longer than the function that declared it, you must allocate it on the heap.

class Thingy;

Thingy* foo( ) 
{
  int a; // this int lives on the stack
  Thingy B; // this thingy lives on the stack and will be deleted when we return from foo
  Thingy *pointerToB = &B; // this points to an address on the stack
  Thingy *pointerToC = new Thingy(); // this makes a Thingy on the heap.
                                     // pointerToC contains its address.

  // this is safe: C lives on the heap and outlives foo().
  // Whoever you pass this to must remember to delete it!
  return pointerToC;

  // this is NOT SAFE: B lives on the stack and will be deleted when foo() returns. 
  // whoever uses this returned pointer will probably cause a crash!
  return pointerToB;
}

For a clearer understanding of what the stack is, come at it from the other end -- rather than try to understand what the stack does in terms of a high level language, look up "call stack" and "calling convention" and see what the machine really does when you call a function. Computer memory is just a series of addresses; "heap" and "stack" are inventions of the compiler.

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration:
Sample Image

_{Image source: vikashazrati.wordpress.com}

C: Is my understanding about the specifics of heap and stack allocation correct?

My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning)…

Yes, but people who call it “the heap” are being sloppy with terminology. A heap is a kind of data structure, like a linked list, a binary tree, or a hash table. Heaps can be used for things other than tracking available memory, and available memory can be tracked using data structures other than a heap.

I do not actually know of a specific term for the memory that the memory management routines manage. There are actually several different sets of memory we might want terms for:

all the memory they have acquired from the operating system so far and are managing, including both memory that is currently allocated to clients and memory that has been freed (and not yet returned to the operating system) and is available for reuse;
the memory that is currently allocated to clients;
the memory that is currently available for reuse; and
the entire range of memory that is being managed, including portions of the virtual address space reserved for future mapping when necessary to request more memory from the operating system.

I have seen “pool” used to describe such memory but have not seen a specific definition of it.

… which is assigned to struct node *newnode which is itself on the stack.

struct node *newnode is indeed nominally on the stack in common C implementations. However, the C standard only classifies it as automatic storage duration, meaning its memory is automatically managed by the C implementation. The stack is the most common way to implement that, but specialized C implementations may do it in other ways. Also, once the compiler optimizes the program, newnode might not be on the stack; the compiler might generate code that just keeps it in a register, and there are other possibilities too.

A complication here is when we are talking about memory use in a C program, we can talk about the memory use in a model computer the C standard uses to describe the semantics of programs or the memory use in actual practice. For example, as the C standard describes it, every object has some memory reserved for it during its lifetime. However, when a program is compiled, the compiler can produce any code it wants that gets the same results as required by the C standard. (The output of the program has to be the same, and certain other interactions have to behave the same.) So a compiler might not use memory for an object at all. After optimization, an object might be in memory at one time and in registers at another, or it might always be in a register and never in memory, and it might be in different registers at different times, and it might not be any particular place because it might have been incorporated into other things. For example, in int x = 3; printf("%d\n", 4*x+2);, the compiler might eliminate x completely and just print “14”. So, when asking about where things are in memory, you should be clear about whether you want to discuss the semantics in the model computer that the C standard uses or the actual practice in optimized programs.

When the function is first called, *nodetoaddto is a pointer to struct node first, both of which are on the stack.

nodetoaddto may be on the stack, per above, but it also may be in a register. It is common that function arguments are passed in registers.

It points to a struct node. By itself, struct node is a type, so it is just a concept, not an object to point to. In contrast, “a struct node” is an object of that type. That object might or might not be on the stack; addnodeto would not care; it could link to it regardless of where it is in memory. Your main routine does create its first and last nodes with automatic storage duration, but it could use static just as well, and then the nodes would likely be located in a different part of memory rather than the stack, and addnodeto would not care.

Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.

Yes: In main, last is initialized to pointer to first. Then &last is passed to addnodeto, so nodeaddto is a pointer to last. So *nodeaddto is a pointer to first. So (*nodeaddto)->next is the next member in `first.

When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap?

newnode is an object with automatic storage duration inside addnodeto, so its memory is automatically released when addnodeto ends.

*newnode is a struct node with allocated storage duration, so its memory is not released when a function ends. Its memory is released when free is called, or possibly some other routine that may release memory, like realloc.

If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed [to]?

There are no variable names in the stack or in the heap. Variable names exist only in source code (and in the compiler while compiling and in debugging information associated with the compiled program, but that debugging information is generally separate from the normal execution of the program). When we work with allocated memory, we generally work with it only by pointers to it.

Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap,…

Yes, subject to the caveats about stack and “heap” above.

… and that just before main() returns 0 there are no structs/variables on the stack other than struct node first?

All of the automatic objects in main are on the stack (or otherwise automatically managed): first, last, and num_nodes.

Or is/are there 1/more than 1 *newnode still on the stack?

No.

Size of stack and heap memory

"Stack is created in the top-level-address and the heap at the
low-level-address" Please elobarate this

This is a myth. It may have a basis in historical truth. It might sometimes resonate with things you see in real life. But it is not literally true.

It's easy enough to explore, though:

#include <stdlib.h>
#include <stdio.h>

void check(int depth) {
    char c;
    char *ptr = malloc(1);
    printf("stack at %p, heap at %p\n", &c, ptr);
    if (depth <= 0) return;
    check(depth-1);
}

int main() {
    check(10);
    return 0;
}

On my machine I see:

stack at 0x22ac3b, heap at 0x20010240
stack at 0x22ac0b, heap at 0x200485b0
stack at 0x22abdb, heap at 0x200485c0
stack at 0x22abab, heap at 0x200485d0
stack at 0x22ab7b, heap at 0x200485e0
stack at 0x22ab4b, heap at 0x200485f0
stack at 0x22ab1b, heap at 0x20048600
stack at 0x22aaeb, heap at 0x20048610
stack at 0x22aabb, heap at 0x20048620
stack at 0x22aa8b, heap at 0x20048630
stack at 0x22aa5b, heap at 0x20048640

So, the stack is going downwards and the heap is going upwards (as you might expect based on the myth), but the stack has the smaller address, and they are not growing toward each other (myth busted).

Btw, my check function is tail-recursive, and on some implementations with some compiler options you might see the stack not moving at all. Which tells you something about why the standard doesn't mandate how all this works -- if it did it might inadvertently forbid useful optimizations.

When is it best to use the stack instead of the heap and vice versa?

Use the stack when your variable will not be used after the current function returns. Use the heap when the data in the variable is needed beyond the lifetime of the current function.

C best practices, stack vs heap allocation

You should use malloc if:

You're passing pointers to non-static const data up the call stack, or
you're allocating variable or simply large amounts of data (else you risk stack overflow).

In other cases, stack allocation should be fine.

Does C need a stack and a heap in order to run?

No, it does not. Let's cover the heap first, that's easy.

An implementation that does not provide a heap of any sort just needs to return NULL whenever you try to call malloc (or any other memory allocation function). That's perfectly acceptable behaviour according to the standard.

In terms of the stack, it also doesn't need to provide one. ISO C11 mentions the word "stack" exactly zero times.

What an implementation does need to do is simply be a correct "virtual machine" for all the things specified in the standard. Granted that will be very difficult without a stack but it's not impossible. As an extreme case, there's nothing that says you can't simply inline every single function call recursively. That would use rather a large amount of code and function-specific data space, but it's certainly doable.

However, it's probably something that would convince me to move to another architecture, one that did have a stack (and heap, for that matter).

Having said that, even if an architecture provides neither a heap nor a stack, both of those can be built out of basic memory I/O operations. In fact, one of the earliest computers I ever had as a teen sported an RCA 1802 CPU which had no dedicated stack. It didn't even have a call or ret instruction.

Yet it could handle subroutines and a stack quite well (for some definition of the word "well") using its SCRT (standard call and return technique). See here for some more detail on how this thing of beauty (or monstrosity, depending on your viewpoint) worked, along with some other unusual architectures.

The IBM Z (a.k.a. System z, zSeries, whatever they're calling it this week) actually has a heap (of sorts, in that you can allocate memory from the OS) but no stack. It actually implements a linked-list stack by using this heap memory along with certain registers (similar to the RCA chip referenced in the above link), meaning that a function prolog allocates local function memory using STORAGE OBTAIN and the epilog releases it with STORAGE RELEASE.

Needless to say that puts quite a bit of extra code into the prolog and epilog for each function.

Global memory management in C++ in stack or heap?

Since I wasn't satisfied with the answers, and hope that the sameer karjatkar wants to learn more than just a simple yes/no answer, here you go.

Typically a process has 5 different areas of memory allocated

Code - text segment
Initialized data – data segment
Uninitialized data – bss segment
Heap
Stack

If you really want to learn what is saved where then read and bookmark these:

COMPILER, ASSEMBLER, LINKER AND LOADER: A BRIEF STORY (look at Table w.5)

Anatomy of a Program in Memory

alt text

Proper Stack and Heap Usage in C++