Are Global Variables in C++ Stored on the Stack, Heap or Neither of Them

Are global variables in C++ stored on the stack, heap or neither of them?

Here is what the book says on page 205:

If you’re familiar with operating system architecture, you might be interested to know that local variables and function arguments are stored on the stack, while global and static variables are stored on the heap.

This is definitely an error in the book. First, one should discuss storage in terms of storage duration, the way the C++ standard does: "stack" corresponds to automatic storage duration, while "heap" corresponds to dynamic storage duration. Both "stack" and "heap" are allocation strategies commonly used to implement objects with those respective storage durations.

Global variables have static storage duration. They are stored in an area that is separate from both the "heap" and the "stack". Global constant objects are usually stored in the "code" segment (or in a read-only data section next to it), while non-constant global objects are stored in the "data" segment.
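To make the three storage durations concrete, here is a minimal sketch. The names (g_counter, g_zeroed, g_limit, calls_so_far, heap_roundtrip, within_limit) are made up for illustration, and the segment names in the comments describe what typical toolchains do, not something the standard requires.

```cpp
#include <cassert>
#include <memory>

int g_counter = 42;       // static storage duration, initialised (typically the data segment)
int g_zeroed;             // static storage duration, zero-initialised (typically bss)
const int g_limit = 100;  // constant; may be placed in read-only memory or folded away

// Hypothetical helper: returns how many times it has been called so far.
int calls_so_far() {
    static int calls = 0;  // static storage duration despite its local scope
    int step = 1;          // automatic storage duration (typically the stack)
    calls += step;
    return calls;
}

// Hypothetical helper: the int lives in dynamic storage (typically the heap),
// while the unique_ptr that owns it is itself an automatic object.
int heap_roundtrip(int value) {
    auto boxed = std::make_unique<int>(value);
    return *boxed;
}

// Hypothetical helper that reads the constant global.
bool within_limit(int v) { return v < g_limit; }
```

Calling calls_so_far() twice returns 1 and then 2, because the static local keeps its value between calls even though its name is only visible inside the function.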

Where in memory are my variables stored in C?

You got some of these right, but whoever wrote the questions tricked you on at least one question:

  • global variables -------> data (correct)
  • static variables -------> data (correct)
  • constant data types -----> code and/or data. Consider string literals: the literal itself may be stored in the data segment, while references to it are embedded in the code
  • local variables(declared and defined in functions) --------> stack (correct)
  • variables declared and defined in main function -----> stack, not heap (the teacher was trying to trick you)
  • pointers (ex: char *arr, int *arr) -------> data or stack, not heap, depending on the context. C lets you declare a global or a static pointer, in which case the pointer itself would end up in the data segment.
  • dynamically allocated space (using malloc, calloc, realloc) --------> heap, not stack

It is worth mentioning that what is informally called the "stack" is officially "automatic storage duration" in the C standard.

C++ variables and where they are stored in memory (stack, heap, static)

Mostly right.

Any variable that is accessed with a pointer is stored on the heap.

This isn't true. You can have pointers to stack-based or global variables.
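A small sketch of that point: the same pointer type can refer to a global, a local, or a heap object. The names g_value, set_to and demo_pointer_targets are made up for this example.

```cpp
#include <cassert>
#include <memory>

int g_value = 1;  // static storage

// Hypothetical helper: writes through whichever int it is handed.
void set_to(int* target, int v) { *target = v; }

int demo_pointer_targets() {
    int local = 2;                          // automatic storage
    auto boxed = std::make_unique<int>(3);  // dynamic storage

    set_to(&g_value, 10);      // pointer to a global
    set_to(&local, 20);        // pointer to a local
    set_to(boxed.get(), 30);   // pointer to a heap object

    return g_value + local + *boxed;  // 10 + 20 + 30
}
```

The function set_to cannot tell where its target lives; only the way the object was created decides that.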

Also it's worth pointing out that global variables are generally unified by the linker (i.e. if two modules have "int i" at global scope, you'll only have one global variable called "i"). Dynamic libraries complicate that slightly; on Windows, DLLs don't have that behaviour (i.e. an "int i" in a Windows DLL will not be the same "int i" as in another DLL in the same process, or as in the main executable), while on most other platforms dynamic libraries do. There are some additional complications on Darwin (iOS/macOS), which has a hierarchical namespace for symbols; as long as you're linking with the flat_namespace option, what I just said will hold.

Additionally, it's worth talking about initialisation behaviour; global variables are initialised automatically by the runtime (typically either using special linker features or by means of a call that is inserted into the code for your main function). The order of initialisation of globals defined in different translation units isn't guaranteed. However, static variables declared at function scope are initialised when that function is first executed, and not at program start-up as you might suppose, and that feature is commonly used by C++ programmers to do lazy initialisation.
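The lazy initialisation idiom mentioned above can be sketched like this. Config, config() and g_constructions are hypothetical names used only to show when the constructor runs.

```cpp
#include <cassert>
#include <string>

int g_constructions = 0;  // counts how many times a Config is built

struct Config {  // hypothetical type
    std::string name;
    Config() : name("default") { ++g_constructions; }
};

// The function-local static is constructed the first time control passes
// through its declaration, not at program start-up. Since C++11 that
// first-time initialisation is also thread-safe.
Config& config() {
    static Config instance;
    return instance;
}
```

Before the first call to config() the counter is still 0; after any number of calls it is 1. Because the object is built on demand, this also sidesteps the ordering problem between globals in different files.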

(Similar concerns apply to destructors for global objects; those are best avoided entirely IMO, not least because on some platforms there are fast termination features that simply won't call them.)

const keyword means you can't change the variable.

Almost. const affects the type, and there is a difference depending on where you write it exactly. For example

const char *foo;

should be read as foo is a pointer to a const char, i.e. foo itself is not const, but the thing it points at is. Contrast with

char * const foo;

which says that foo is a const pointer to char.
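A short sketch of the two declarations side by side; the lines that would not compile are left commented out. The function name demo_const_placement and the arrays a and b are made up for the example.

```cpp
#include <cassert>

bool demo_const_placement() {
    char a[] = "abc";
    char b[] = "xyz";

    const char* p = a;  // pointer to const char
    p = b;              // OK: the pointer itself can be reseated
    // *p = 'Q';        // error: the pointee is const when accessed through p

    char* const q = a;  // const pointer to char
    *q = 'Q';           // OK: the pointee can be modified
    // q = b;           // error: the pointer itself is const

    return p == b && a[0] == 'Q';
}
```

A common way to remember it is to read the declaration from right to left, starting at the variable name.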

Finally, you've missed out volatile, the point of which is to tell the compiler not to make assumptions about the thing to which it applies (e.g. it can't assume that it's safe to cache a volatile value in a register, or to optimise away accesses, or in general to optimise across any operation that affects a volatile value). Hopefully you'll never need to use volatile; it's most often useful if you're doing really low-level things that frankly a lot of people have no need to go anywhere near.

Global memory management in C++ in stack or heap?

Since I wasn't satisfied with the answers, and hope that sameer karjatkar wants to learn more than just a simple yes/no answer, here you go.

Typically a process has 5 different areas of memory allocated

  1. Code - text segment
  2. Initialized data – data segment
  3. Uninitialized data – bss segment
  4. Heap
  5. Stack

If you really want to learn what is saved where then read and bookmark these:

COMPILER, ASSEMBLER, LINKER AND LOADER: A BRIEF STORY (look at Table w.5)

Anatomy of a Program in Memory


Static and global variable in memory

Variables stored on the stack are temporary in nature. They belong to a function call, and when the function returns and the corresponding stack frame is popped off, the stack variables disappear with it. Since globals are designed to be accessible everywhere, they must not go out of context, and thus they are stored in a dedicated data section of the binary (the data or bss segment) rather than on the stack or the heap. The same goes for static variables; since they must hold their value between invocations of a function, they cannot disappear when the function returns, and thus they cannot be allocated on the stack.

As far as protection of static variables goes, IIRC this is done by the compiler. Although the variable lives in the data section for the whole run, the compiler knows the limited scope in which its name is valid, and any attempt to name the static from outside that scope will result in an "unknown identifier" or similar compile-time error. The only other way to reach the static from outside is to obtain its address and dereference a pointer to it; nothing at run time prevents that, so the protection is purely a matter of name visibility.

In a multi-threaded environment, it is still okay to use globals and static variables. However, you have to be a lot more careful. You must guarantee that only one thread can access the variable at a time (typically through some kind of locking mechanism such as a mutex). In the case of static local variables inside a function, you must ensure that the function still behaves as expected when calls from multiple threads are interleaved (that is, called from thread 1, then from thread 2, then thread 1, then thread 2, and so on). This is generally harder to do, and many functions that rely on static local variables are not thread-safe because of this (strtok is a notable example).
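The interleaving problem can be reproduced without any threads at all. The sketch below uses two made-up functions, next_char_static (hidden static state, in the style of strtok) and next_char (the caller owns the state), and lets two logical callers take turns. With real threads the static version would additionally need locking to avoid a data race.

```cpp
#include <cassert>

// Non-reentrant: the scan position is hidden static state shared by all callers.
char next_char_static(const char* start) {
    static const char* pos = "";
    if (start) pos = start;  // a non-null argument restarts the scan
    return *pos ? *pos++ : '\0';
}

// Reentrant variant: each caller passes in and owns its own position.
char next_char(const char*& pos) {
    return *pos ? *pos++ : '\0';
}

// Caller A scans "ab", caller B scans "xy", and their calls are interleaved.
bool static_version_gets_confused() {
    char a1 = next_char_static("ab");     // A starts: 'a'
    char b1 = next_char_static("xy");     // B starts and overwrites A's position: 'x'
    char a2 = next_char_static(nullptr);  // A continues but reads B's string: 'y'
    return a1 == 'a' && b1 == 'x' && a2 == 'y';
}

bool reentrant_version_is_fine() {
    const char* a = "ab";
    const char* b = "xy";
    char a1 = next_char(a);  // 'a'
    char b1 = next_char(b);  // 'x'
    char a2 = next_char(a);  // 'b', unaffected by B
    return a1 == 'a' && b1 == 'x' && a2 == 'b';
}
```

This is the same design choice that separates strtok from reentrant tokenizers that take an explicit state argument.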

Allocated memory is in the stack or heap

Possibly neither. The precise terms are

  • static storage: for data that exists as long as the process exists;
  • automatic storage: for data that is allocated and freed as the process enters/exits different scopes;
  • dynamic storage: for data that must be explicitly requested and exists until it is explicitly freed.

Usually automatic storage lives on the stack and dynamic storage lives on the heap. But the compiler is completely free to implement all those storage types in whatever way it wants, as long as it respects the rules for lifespan.

So:

static vector1Int hello;

is in the file scope and creates an object of type vector1Int in static storage.

And this

hello = vector1Int(8,12);

will cause std::vector to create room for at least 8 integers. We can usually assume that this will be taken from dynamic storage. However, this is not a rule. For instance, you could easily make std::vector use static or automatic memory by implementing your own allocator (not general purpose memory allocator, but STL allocator).
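As a rough sketch of that last point, here is a minimal STL-style allocator that hands out memory from a static buffer. ArenaAllocator, g_arena and g_used are made-up names, the buffer size is arbitrary, and the allocator never reuses memory, so it is an illustration and not something to use as is.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

alignas(std::max_align_t) static unsigned char g_arena[1024];  // static storage
static std::size_t g_used = 0;                                 // bytes handed out so far

template <class T>
struct ArenaAllocator {
    using value_type = T;

    ArenaAllocator() = default;
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (g_used + align - 1) / align * align;  // round up
        const std::size_t bytes = n * sizeof(T);
        if (start + bytes > sizeof(g_arena)) throw std::bad_alloc();
        g_used = start + bytes;
        return reinterpret_cast<T*>(g_arena + start);
    }

    void deallocate(T*, std::size_t) noexcept {}  // bump allocator: nothing is reclaimed
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>&, const ArenaAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const ArenaAllocator<T>&, const ArenaAllocator<U>&) { return false; }

// Builds a vector of 8 ints with value 12 and checks that its elements
// sit inside the static buffer instead of the heap.
bool vector_lives_in_arena() {
    std::vector<int, ArenaAllocator<int>> v(8, 12);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(v.data());
    return p >= g_arena && p < g_arena + sizeof(g_arena) && v.size() == 8 && v[7] == 12;
}
```

The vector object itself still has whatever storage duration its declaration gives it; only the element buffer is redirected by the allocator.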

When your program reaches the end of the main function, the destructor of std::vector will be called for hello, and any dynamic memory that hello had requested will be given back to the memory manager.

The memory for the object hello itself is not freed because it is static. Instead, it is given back to the OS together with anything else the process used when the process terminates.

Now, if hello had been declared as a local variable of create, then the destructor would be called at the end of that function. In that case, hello would have been allocated in automatic storage, and would be freed at the end of create.

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
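The LIFO behaviour can be observed through object lifetimes. In this sketch, Tracer, g_log, inner, outer and trace_calls are made-up names; each Tracer appends to a log when it is created and when it is destroyed.

```cpp
#include <cassert>
#include <string>

std::string g_log;  // records construction (+) and destruction (-) events in order

struct Tracer {  // hypothetical helper
    char id;
    explicit Tracer(char c) : id(c) { g_log += '+'; g_log += id; }
    ~Tracer() { g_log += '-'; g_log += id; }
};

void inner() {
    Tracer c('c');  // lives in inner's stack frame
}

void outer() {
    Tracer a('a');
    Tracer b('b');
    inner();  // inner's frame is pushed and popped while a and b are still alive
}

std::string trace_calls() {
    g_log.clear();
    outer();
    return g_log;
}
```

trace_calls() returns "+a+b+c-c-b-a": the most recently created local is always the first one destroyed, which is why releasing stack memory needs no bookkeeping beyond moving one pointer.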

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration: [image not reproduced; source: vikashazrati.wordpress.com]


