How Heap and Stack Memory Are Managed, Implemented, and Allocated

How are heap and stack memory managed, implemented, and allocated?

One could easily fill several chapters of an operating-systems book answering this question. I suggest reading Tanenbaum's Modern Operating Systems.

The main difference between the heap and the stack is that the heap is a per-process resource, while each thread gets its own stack. Initially, when a program starts, it gets a minimal heap and a stack segment. The heap can grow; the stack is of fixed size (per thread). If you write a recursive function that does not terminate (endless recursion) you will get a stack overflow ;) Every function call puts a stack frame on the stack segment; when the function returns, the stack is unwound and the frame is free to be used by the next call. The stack is a contiguous, linear structure. On Linux you can configure the stack size for a process via ulimit/setrlimit; on Windows (at least with MS Visual C++) you can pass the stack size as a linker flag. A stack overflow can also be produced by declaring a large local array (its size is fixed at compile time, but the allocation happens on the stack at function entry):

char test[1000000];

The heap is a different story. When a process starts up, the heap has some default size, which can vary from OS to OS or with the configuration used on that OS (e.g. on Windows it is 2MB by default, as far as I remember). If you need more heap, to allocate more space for variables etc., it will grow. If the program never frees heap memory, it eventually runs out of heap space. Allocators are built on various data structures - free lists, size-segregated bins, tree derivatives. (Despite the shared name, the memory heap is unrelated to the heap data structure, such as a binary or Fibonacci heap.) You can read articles on how to write a memory allocator; whatever the structure, it must be optimized for finding the chunk a pointer belongs to when an allocated chunk is de-allocated, and for finding a free chunk when new heap space is needed.

Each process on a 32-bit OS has 4GB of virtual address space. As you can imagine, there is not enough RAM for all processes to hold their full 4GB of virtual address space at once. OS memory is organized in pages, which are swapped out to disk when no longer needed or when memory runs short; this is where paging comes into play. Everything is mapped onto pages: a process's stack as well as its growing heap. Because the heap grows dynamically, it can be spread across many pages. This is why heap access can be very expensive: if a page is not in memory, a page fault happens and the OS has to load the page from disk (and that can be orders of magnitude slower). The stack frame of the thread currently executing, by contrast, tends to sit in the processor cache, which is much faster than RAM.

Different heap types are possible: there are heaps that are very fast for small objects, or heaps that are very efficient in multi-threaded environments. Alexandrescu describes in "Modern C++ Design" how to develop a small-object allocator and a heap that manages small objects; this implementation is available in his Loki C++ library. Some embedded systems offer physically distinct memory regions, on top of which different heap types can be implemented. Writing your own allocator (heap manager etc.) is a hard job if you want to beat a well-tuned general-purpose one.

Regards,

Ovanes

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte on the stack tends to be reused very frequently, which means it tends to stay mapped in the processor's cache, making access very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration:

[Image: diagram of stack vs. heap layout - source: vikashazrati.wordpress.com]

How is heap usually implemented?

The OS allocates pages to the process; the malloc/free implementation then breaks those pages up into blocks of the requested sizes. The OS can hand out any pages in the user's address space that aren't already in use, so there is no fixed heap segment: the allocated memory sits wherever the OS decides to place it.

Why did heap memory come into existence?

With stack allocation there is a strict LIFO (last-in, first-out) policy for the lifetimes of allocated objects. That is, if you allocate object A, then object B, then object C, they will always be freed in the opposite order.

That's useful in a lot of cases, but there are also cases where you e.g. want object C to continue to exist after B and A have been destroyed. There's no way to do that with a stack, so to handle those cases the heap was introduced. It's harder to use correctly than stack allocation, but also more flexible/powerful.

C++ stack vs heap allocation

Use automatic (stack) allocation whenever the function scope - or the scope of a control block such as a for, while, if etc. inside the function - is a good match for the lifetime the object needs. That way, if the object owns/controls any resources, such as dynamically allocated memory, file handles etc., they will be freed during the destructor call as that scope is left (not at some unpredictable later time, as when a garbage collector eventually gets around to it).

Only use new if there's a clear need, such as:

  • needing the object to live longer than the function scope,

  • to hand over ownership to some other code

  • to have a container of pointers to base classes that you can then process polymorphically (i.e. using virtual dispatch to derived-class function implementations), or

  • an especially large allocation that would eat up much of the stack (your OS/process will have "negotiated" a limit, usually in the 1-8+ megabyte range)

    • if this is the only reason you're using dynamic allocation, and you do want the lifetime of the object tied to a scope in your function, you should use a local std::unique_ptr<> to manage the dynamic memory, and ensure it is released no matter how you leave the scope: by return, throw, break etc. (You may also use a std::unique_ptr<> data member in a class/struct to manage any memory the object owns.)

Mathieu Van Nevel comments below about C++11 move semantics - the relevance is that if you have a small management object on the stack that controls a large amount of dynamically allocated (heap) memory, move semantics grant extra assurances and fine-grained control of when the management object hands over its resources to another management object owned by other code (often the caller, but potentially some other container/register of objects). This handover can avoid data on the heap being copied/duplicated even momentarily. Additionally, elision and return-value-optimisation often allow nominally automatic/stack-hosted variables to be constructed directly in some memory they're eventually being assigned/returned-to, rather than copied there later.

Who is responsible for the stack and heap in C++?

Who or what is responsible for the invention of the stack and heap?

As far as inventing a stack and heap, you would have better luck searching the web. Those concepts have been around for many decades.

Are these inventions of the C++ compiler?

Perhaps invention is the wrong term here. They are data structures. The compiler and OS (if present) are in charge of organizing and utilizing memory.

Does the OS specify memory sections in RAM designated "stack" and "heap"?

This is OS specific and can vary by OS. Some OSes reserve stack and heap areas, others don't.

In the embedded system I am working on, there are two heap areas: 1) The area specified in the linker, 2) A portion of the memory allocated to the OS. Both of these areas are set to zero size, so we don't have any heaps.

The stack areas are set up by initialization code that runs before the C language Run-Time Library is initialized. The RTL code may also create some stack areas as well. Our RTOS also creates stack areas (one for each task).

So, there is not one single area called the stack. Some platforms don't use a stack concept at all (especially those whose memory capacity is severely restricted).

I'm pretty sure they are not built into the hardware but I could be wrong.

Depends on the hardware. Simple and cheap hardware only allocates an area of RAM (read/write memory). More complex and expensive hardware may allocate separate areas for stacks, heaps, executables and data. Constants may be placed into ROM (read-only memory, such as Flash). There is no one-size or one-configuration that supports everything. Desktop computers are different animals than smaller embedded systems.

Also, is the compiler responsible for generating assembly code that specify which local or function data will be stored on the stack vs CPU registers?

That task can fall to the linker, the compiler, or both.

Many compiler tool-chains utilize both stack and CPU registers. Many variables and data can be on the stack, in registers, in RAM or in ROM. A compiler is designed to make best use of the platform's resources, including memory and registers.

A good example to study is the assembly language generated by your compiler. Also look at the linker instruction file. The use of registers versus stack memory is so dependent on the data structures (and types) involved that it may differ from function to function. Another factor is the amount and kind of memory available. If the processor has few registers available, the compiler may pass variables using the stack. Larger data (that doesn't fit in a register) may be passed on the stack, or a pointer to the data may be passed instead. There are too many options and combinations available to enumerate here.

Summary

In order for the C and C++ languages to be very portable, many concepts are delegated to the implementation (compiler/toolchain). Two of these concepts are commonly referred to as the stack and the heap. The C and C++ language standards use a simple model as the environment for the languages; there are also terms such as "hosted" and "freestanding" which indicate the degree to which a platform supports the language requirements. The stack and heap are data structures the platform is not required to provide in order to support the languages; they do, however, help make implementations efficient.

If stacks and heaps are supported, their location and management are the responsibility of the implementation (toolchain). A compiler is free to use its own memory management functions or the OS's (if present). Managing stacks and heaps may require hardware support (such as virtual memory management, paging, or fences). There is no requirement for the stack to grow towards the heap, nor for the stack to grow in a positive direction. These choices are all up to the implementation (toolchain), which can implement and locate stacks however and wherever it likes. Note: most likely, it won't place variables in read-only memory and won't locate stacks beyond the memory capacity.

Java stack and heap memory management

Where are the methods of s stored?

They are stored in the String class object, which is loaded by a ClassLoader when String is first referenced in the program. None of the JVM implementations that existed when I last read about this ever deallocated the memory for a class object once it was loaded. It lives on the heap.

Had I created another object of MemoryClass inside myMethod, would JVM allocate memory for the same methods again inside the stack memory?

No; methods and object data are kept separately, precisely because the JVM never needs more than one copy of a class's methods.

Would the JVM free the memory allocated to myMethod as soon as its execution completes? If so, how would it manage the situation mentioned in question 2 (only applicable if the JVM allocates memory multiple times for the same method)?

No. Java doesn't generally "immediately free memory" of things stored on the heap. It would make things run too slowly. It only frees memory when the garbage collector runs, and it does that only when its algorithm for running the garbage collector decides it is time.

What would have been the case if I had only declared s and did not initialize it? Would the JVM still allocate memory for all the methods of the java.lang.String class? If so, why?

This depends on the JVM implementation, I think, and maybe the compiler. If you declare a variable and never use it, it is quite possible (and common) for the compiler to notice that there is no use for it and to leave it out of the class file. If it isn't in the class file, it is never referenced, and therefore the class and its methods are not loaded. If the compiler puts it in anyway but it is never referenced, then the ClassLoader wouldn't have any reason to load it, but I'm a little vague on whether it would get loaded or not. It might depend on the JVM implementation: does it load classes because there are variables of that type, or only when they are referenced? How many ClassLoader algorithms can dance on the head of a 4-digit PIN?

I encourage you to read about the JVM and ClassLoaders and such; you will gain so much more by reading an explanation of how it works rather than poking at it with examples you can think up.

What decides where on the heap memory is allocated?

The memory is managed by the OS, so the answer depends on the OS/platform in use. The C++ specification does not specify how memory is allocated or freed at a lower level; it specifies behavior in terms of object lifetime.

While multi-user desktop/server/phone OSes (like Windows, Linux, macOS, Android, …) manage memory in broadly similar ways, it can be completely different on embedded systems.

What is in control of choosing where on the heap memory is stored and how does it choose that?

It's the OS that is responsible for that. How exactly depends - as already said - on the OS. The "OS" could also be a thin layer combining the runtime library with a minimal OS, as in IncludeOS.

Does this mean that the heap is shared between all processes?

Depends on the point of view. The address space is - for multiuser systems - in general not shared between processes. The OS ensures that one process cannot access memory of another process, which is ensured through virtual address spaces. But the OS can distribute the whole RAM among all processes.

For embedded systems it could even be the case that each process has a fixed amount of preallocated memory - not shared between processes - with no way to allocate new memory or free it. It is then up to the developer to manage that preallocated memory themselves, by providing custom allocators to the stdlib's objects and by constructing objects in already-allocated storage.

I want to learn more about memory fragmentation

Fragmentation happens at two levels: in the address space the OS exposes to the C++ runtime, and on the hardware/OS side (the two can coincide on an embedded system). How and in what form the memory managed by the OS is fragmented cannot be determined using the functions provided by the stdlib. How the fragmentation of the process's own address space behaves depends, again, on the OS and also on the stdlib implementation in use.


