How to Find If a Variable Is Allocated in Stack or Heap

How to find if a variable is allocated in stack or heap?

No, not in general.

Do you know about GCC's -fsplit-stack option?

It is up to the implementation to decide whether to allocate a contiguous stack or a stack where blocks are interleaved with heap blocks in memory. Good luck figuring out whether a block was allocated for the heap or the stack when the latter is split.

How does a compiler know if something is allocated on the stack or heap?

A compiler builds a syntax tree from which it is able to analyze each part of the source code.

It builds a symbol table which associates each defined symbol with some information. This is required for many purposes:

  • finding undeclared identifiers
  • checking that types are convertible
  • and so on

Once you have this symbol table it is quite easy to know if you are trying to return the address of a local variable since you end up having a structure like

ReturnStatement
+ UnaryOperator (&)
  + Identifier (z)

So the compiler can easily check if the identifier is a local stack variable or not.
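As a rough illustration of that check, here is a minimal sketch in C++. It is not how any particular compiler is implemented: the Expr node type, the Storage enum, the SymbolTable alias and the helper functions are all hypothetical names invented for this example. It only looks at the direct shape "return &identifier", which is why returning a pointer variable that happens to hold a local's address slips through.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical storage classes a symbol table might record per identifier.
enum class Storage { Local, Global };

// Hypothetical, heavily simplified expression node.
struct Expr {
    enum class Kind { Identifier, AddressOf } kind;
    std::string name;               // used when kind == Identifier
    std::unique_ptr<Expr> operand;  // used when kind == AddressOf
};

using SymbolTable = std::unordered_map<std::string, Storage>;

// Build an Identifier node.
std::unique_ptr<Expr> identifier(const std::string& name) {
    return std::unique_ptr<Expr>(new Expr{Expr::Kind::Identifier, name, nullptr});
}

// Build a UnaryOperator (&) node wrapping another expression.
std::unique_ptr<Expr> address_of(std::unique_ptr<Expr> operand) {
    return std::unique_ptr<Expr>(
        new Expr{Expr::Kind::AddressOf, "", std::move(operand)});
}

// True when the returned expression has the exact shape &identifier and the
// symbol table says that identifier is a local variable.
bool returns_address_of_local(const Expr& returned, const SymbolTable& symbols) {
    if (returned.kind != Expr::Kind::AddressOf || !returned.operand) return false;
    const Expr& target = *returned.operand;
    if (target.kind != Expr::Kind::Identifier) return false;
    auto it = symbols.find(target.name);
    return it != symbols.end() && it->second == Storage::Local;
}
```

With a table that marks z and pz as Local, the tree for "return &z;" is flagged, while the tree for "return pz;" is not, because the check never follows what pz was assigned.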

Mind that this information could in theory propagate along assignments, but in practice I don't think many compilers do it. For example, if you do

int* something() {
    int z = 21;
    int* pz = &z;
    return pz;
}

the warning goes away. With static code flow analysis you could prove that pz can only refer to a local variable, but in practice that doesn't happen.

How to determine if returned pointer is on the stack or heap

Distinguishing between malloc/free and new/delete is generally not possible, at least not in a reliable and/or portable way. Even more so as new simply wraps malloc in many implementations.

None of the following alternatives to distinguish heap/stack have been tested, but they should all work.

Linux:

  1. Solution proposed by Luca Tettananti: parse /proc/self/maps to get the address range of the stack.
  2. As the first thing at startup, clone your process. This implies supplying a stack, and since you supply it, you automatically know where it is.
  3. Call GCC's __builtin_frame_address function with an increasing level parameter until it returns 0. You then know the depth. Now call __builtin_frame_address again with the maximum level, and once with a level of 0. Anything that lives on the stack must necessarily be between these two addresses. Be aware that GCC requires the level to be a constant integer and documents calls with a nonzero level as potentially unpredictable, so this is fragile.
  4. sbrk(0) as the first thing at startup, and remember the value. Whenever you want to know if something is on the heap, sbrk(0) again -- something that's on the heap must be between the two values. Note that this will not work reliably with allocators that use memory mapping for large allocations.

Knowing the location and size of the stack (alternatives 1 and 2), it's trivial to find out if an address is within that range. If it's not, it is necessarily "heap" (unless someone tries to be a super smart-ass and gives you a pointer to a static global, or a function pointer, or such...).
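Alternative 1 could look roughly like the following sketch. It assumes Linux, assumes the usual "begin-end perms ... [stack]" line layout of /proc/self/maps, and only finds the main thread's stack, since that is the mapping the kernel tags as [stack]. The StackRange type and find_main_thread_stack function are names made up for this example.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>

// Address range [begin, end) of the mapping; valid stays false if not found.
struct StackRange {
    std::uintptr_t begin = 0;
    std::uintptr_t end = 0;
    bool valid = false;

    bool contains(const void* p) const {
        auto a = reinterpret_cast<std::uintptr_t>(p);
        return valid && a >= begin && a < end;
    }
};

// Linux-only: scan /proc/self/maps for the line tagged "[stack]".
// Stacks of other threads are not tagged this way and are not found.
StackRange find_main_thread_stack() {
    StackRange r;
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        if (line.find("[stack]") == std::string::npos) continue;
        // The first field is the address range in hex, e.g. "7ffc...-7ffd...".
        std::istringstream in(line);
        std::string range;
        in >> range;
        auto dash = range.find('-');
        if (dash == std::string::npos) break;
        r.begin = std::stoull(range.substr(0, dash), nullptr, 16);
        r.end = std::stoull(range.substr(dash + 1), nullptr, 16);
        r.valid = true;
        break;
    }
    return r;
}
```

Called from the main thread, find_main_thread_stack().contains(&some_local) should be true, while a pointer from new or malloc, or the address of a global, should fall outside the range. If /proc is not available, valid stays false and contains() always answers false.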

Windows:

  1. Use CaptureStackBackTrace, anything living on the stack must be between the returned pointer array's first and last element.
  2. Use GCC-MinGW (and __builtin_frame_address, which should just work) as above.
  3. Use GetProcessHeaps and HeapWalk to check every allocated block for a match. If no block in any of the heaps matches, it's consequently allocated on the stack (... or a memory mapping, if someone tries to be super-smart with you).
  4. Use HeapReAlloc with HEAP_REALLOC_IN_PLACE_ONLY and with exactly the same size. If this fails, the memory block starting at the given address is not allocated on the heap. If it "succeeds", it is a no-op.
  5. Use GetCurrentThreadStackLimits (Windows 8 / Server 2012 and later).
  6. Call NtCurrentTeb() (or read fs:[18h]) and use the fields StackBase and StackLimit of the returned TEB.

How to find out if the memory belongs to heap or stack?

In general? No.

Not every implementation has a heap, or has one used by malloc(). And not every local variable is on a "stack". These are hard implementation details.

It may be possible, using the documentation for your specific system, to determine a ruleset to satisfy your goal, but since you are programming in C++ it would be much better to not do this at all. Instead, focus on the high-level semantics of your program. Let the compiler and implementation take care of the rest; indeed, that is their job.

What and where are the stack and heap?

The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
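To make the "adjusting one pointer" point concrete, here is a toy model. It is only a sketch: ToyStack is a made-up class, it grows upward for simplicity (real thread stacks on most platforms grow downward), and it ignores per-frame alignment and everything else an ABI specifies.

```cpp
#include <cstddef>

// Toy fixed-size "stack" region. Real thread stacks are set up by the OS and
// managed by compiler-generated code; this only models the bookkeeping.
class ToyStack {
public:
    // Reserve `bytes` for a new frame by moving the top-of-stack offset.
    // Returns nullptr when the region is exhausted (a "stack overflow").
    void* push_frame(std::size_t bytes) {
        if (bytes > sizeof(storage_) - top_) return nullptr;
        void* frame = storage_ + top_;
        top_ += bytes;
        return frame;
    }

    // Release the most recently reserved frame: one adjustment of the offset.
    void pop_frame(std::size_t bytes) {
        top_ = (bytes > top_) ? 0 : top_ - bytes;
    }

    std::size_t used() const { return top_; }

private:
    alignas(std::max_align_t) unsigned char storage_[1024];
    std::size_t top_ = 0;
};
```

Pushing a 64-byte frame and then a 32-byte frame leaves 96 bytes in use. Popping the 32-byte frame and pushing another 32-byte frame hands back the same address, which is the LIFO reuse described above.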

The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).

To answer your questions directly:

To what extent are they controlled by the OS or language runtime?

The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.

What is their scope?

The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.

What determines the size of each of them?

The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).

What makes one faster?

The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.

A clear demonstration (image not reproduced here; image source: vikashazrati.wordpress.com).

Determining variables stack or heap in C?

All your variables have automatic scope. They come from the "stack", in that the variables are no longer valid once the function returns.

Named function variables can never come from the "heap" in the sense that you mean it. The memory for a named function variable is always tied to the function scope (or the innermost block scope within the function in which the variable is declared).

A variable can be assigned a value obtained by malloc() or similar dynamic allocation function. The variable then points to an object that exists in the "heap". However, the named pointer variable itself is not in the "heap".

Sometimes the "stack" itself is dynamically allocated, such as for a thread. Then, the memory used for the local variables of functions running within that thread is in the "heap". However, the variables themselves are still automatic, in that they are invalid once the function returns.
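The difference between the named pointer variable and the object it points to can be seen in a small sketch like this one. The Observation struct and the function name are invented for the example, and the addresses are recorded as integers only so they can be compared or printed after the function has returned.

```cpp
#include <cstdint>
#include <cstdlib>
#include <new>

struct Observation {
    std::uintptr_t variable_address;  // where the pointer variable p lived
    std::uintptr_t pointee_address;   // where the malloc'ed int lived
    int value;                        // what the malloc'ed int held
};

Observation observe_pointer_and_pointee() {
    // p is a named local variable: automatic storage, gone when we return.
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    if (p == nullptr) throw std::bad_alloc();

    // The int that p points to lives in dynamically allocated memory.
    *p = 42;

    Observation o{reinterpret_cast<std::uintptr_t>(&p),
                  reinterpret_cast<std::uintptr_t>(p), *p};

    std::free(p);  // the dynamic object must be released explicitly
    return o;      // p itself stops existing here, with no action from us
}
```

The two recorded addresses differ: &p is the location of the variable, p is the location of the allocated block. On typical systems they also fall into visibly different address ranges, but that is an implementation detail and not something the language promises.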

Allocated memory is in the stack or heap

Possibly neither. The precise terms are

  • static storage: for data that exists as long as the process exists;
  • automatic storage: for data that is allocated and freed as the process enters/exits different scopes;
  • dynamic storage: for data that must be explicitly requested and exists until it is explicitly freed.

Usually automatic storage lives on the stack and dynamic storage lives in the heap. But the compiler is completely free to implement all those storage types in whatever way it wants, as long as it respects the rules for lifetime.

So:

static vector1Int hello;

is in the file scope and creates an object of type vector1Int in static storage.

And this

hello = vector1Int(8,12);

will cause std::vector to create room for at least 8 integers. We can usually assume that this will be taken from dynamic storage. However, this is not a rule. For instance, you could easily make std::vector use static or automatic memory by implementing your own allocator (not general purpose memory allocator, but STL allocator).
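As a sketch of that last point, the allocator below hands out memory from a fixed array in static storage. Everything here is hypothetical and deliberately minimal: g_arena, StaticArenaAllocator and in_static_arena are names invented for the example, the arena is never reused after deallocation, and nothing is thread-safe. It also assumes vector1Int is a typedef for a std::vector of int, which the text above suggests but does not show.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical fixed arena in static storage, shared by all allocator copies.
alignas(std::max_align_t) unsigned char g_arena[4096];
std::size_t g_arena_used = 0;

template <class T>
struct StaticArenaAllocator {
    using value_type = T;

    StaticArenaAllocator() noexcept = default;
    template <class U>
    StaticArenaAllocator(const StaticArenaAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Round the current offset up to T's alignment, then bump it.
        std::size_t start =
            (g_arena_used + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start > sizeof(g_arena) ||
            n > (sizeof(g_arena) - start) / sizeof(T)) {
            throw std::bad_alloc();
        }
        g_arena_used = start + n * sizeof(T);
        return reinterpret_cast<T*>(g_arena + start);
    }

    // Sketch only: released blocks are simply abandoned.
    void deallocate(T*, std::size_t) noexcept {}
};

template <class T, class U>
bool operator==(const StaticArenaAllocator<T>&,
                const StaticArenaAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const StaticArenaAllocator<T>&,
                const StaticArenaAllocator<U>&) noexcept { return false; }

// True when p points into the static arena.
bool in_static_arena(const void* p) {
    auto a = reinterpret_cast<std::uintptr_t>(p);
    auto lo = reinterpret_cast<std::uintptr_t>(g_arena);
    return a >= lo && a < lo + sizeof(g_arena);
}
```

A std::vector<int, StaticArenaAllocator<int>> constructed with (8, 12) then keeps its eight elements inside g_arena, so in_static_arena(v.data()) holds even though the container itself behaves like any other vector.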

When your program reaches the end of the main function, the destructor of std::vector will be called for hello, and any dynamic memory that hello had requested will be given back to the memory manager.

The memory for the object hello itself is not freed because it is static. Instead, it is given back to the OS together with anything else the process used when the process terminates.

Now, if hello had been declared as a local variable of create, then the destructor would be called at the end of that function. In that case, hello would have been allocated in automatic storage, and would be freed at the end of create.
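The lifetime rules can be observed with a small tracker type. This is only an illustrative sketch: Tracker, event_log and this version of create are made up for the example and are not the code from the question.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log. The vector object has static storage duration, so
// it stays alive until the process shuts down.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Records its own construction and destruction in the log.
struct Tracker {
    std::string name;
    explicit Tracker(std::string n) : name(std::move(n)) {
        event_log().push_back("construct " + name);
    }
    ~Tracker() { event_log().push_back("destroy " + name); }
};

Tracker* create() {
    Tracker local("automatic");                 // automatic storage
    Tracker* dynamic = new Tracker("dynamic");  // dynamic storage
    return dynamic;  // `local` is destroyed here; `*dynamic` lives on
}
```

After calling create(), the log holds "construct automatic", "construct dynamic" and "destroy automatic", in that order. "destroy dynamic" only appears once the caller runs delete on the returned pointer.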


