Reference Types Live on the Heap, Value Types Live on the Stack

https://learn.microsoft.com/en-us/archive/blogs/ericlippert/the-stack-is-an-implementation-detail-part-one

The whole "reference types on the heap, value types on the stack" idea is not only a bad way to look at it; it is also wrong.

Where does a value-type variable that is returned by ref live? Stack or heap?

I feel like you already understand why it does not work. You cannot return a local variable by reference from a method (unless it is a ref local), because in most cases the lifetime of a local variable is the method itself, so a reference to it has no meaning outside the method (by then the variable is dead, and the location it occupied might contain anything). As the documentation states:

The return value must have a lifetime that extends beyond the
execution of the method. In other words, it cannot be a local variable
in the method that returns it

In practice, some local variables might live longer than the execution of the method they are declared in. For example, variables captured by a closure:

int myLocal = 5;
SomeMethodWhichAcceptsDelegate(() => DoStuff(myLocal));
return ref myLocal; // compile-time error: a local cannot be returned by reference

However, allowing this would introduce additional complications without any benefit, so it is also forbidden, even though the lifetime of myLocal might be much longer than that of the containing method.
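To make the "captured locals can outlive their method" point concrete, here is a minimal sketch. The names ClosureDemo, MakeCounter and Run are hypothetical and not from the question. It relies on the behaviour of current C# compilers, which move a captured local into a compiler-generated heap object when the delegate escapes the method.

```csharp
using System;

public static class ClosureDemo
{
    // Returns a delegate that keeps using 'count' after MakeCounter has returned.
    public static Func<int> MakeCounter()
    {
        int count = 0;        // captured by the lambda, so it must outlive this call
        return () => ++count; // each invocation increments the same captured variable
    }

    public static int Run()
    {
        Func<int> next = MakeCounter(); // MakeCounter has finished, 'count' has not
        next();                         // count == 1
        next();                         // count == 2
        return next();                  // count == 3
    }

    public static void Main() => Console.WriteLine(Run());
}
```

The local keeps its value across calls, so its lifetime is clearly not tied to the method's stack frame, yet it still may not be returned by ref.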

It's better not to think about it in terms of stack and heap. For example, you might think that you cannot return a reference to something allocated on the stack from a method via ref return. That's not true, for example:

private void Test() {
    int myLocal = 4;
    GetX(ref myLocal);
}

private ref int GetX(ref int i) {
    return ref i;
}

Here myLocal is clearly on the stack; we pass it by reference to GetX and then return this (stack-allocated) variable with return ref.

So just think about it in terms of variable lifetimes, not stack/heap.

In your second example, the lifetime of the _myInt field is clearly longer than the execution of GetX, so there is no problem returning it by reference.

Note also that whether you return a value type or a reference type with return ref makes no difference in the context of this question.
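The asker's second example is not reproduced here, so the following is only a sketch of the lifetime rule, using a hypothetical Counter class with a _myInt field. Both ref returns compile because the variable being returned outlives the method that returns it.

```csharp
using System;

public class Counter
{
    private int _myInt = 1;

    // Allowed: the field lives as long as the Counter instance,
    // which outlasts this method call.
    public ref int GetField() => ref _myInt;

    // Allowed: the variable behind 'i' belongs to the caller,
    // so it is still alive after this method returns.
    public static ref int PassThrough(ref int i) => ref i;

    public int Value => _myInt;
}

public static class RefReturnDemo
{
    public static int Run()
    {
        var counter = new Counter();
        ref int alias = ref counter.GetField(); // alias for the field itself
        alias = 10;                             // writes through to _myInt

        int local = 4;
        ref int same = ref Counter.PassThrough(ref local);
        same++;                                 // local is now 5

        return counter.Value + local;           // 10 + 5
    }

    public static void Main() => Console.WriteLine(Run());
}
```

One returned reference points into a heap object and the other at a stack local, and the compiler accepts both for the same reason: lifetime, not location.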

How is value type array allocated in a heap?

How can array require only a single heap allocation?... or what does it mean by single heap allocation

First of all, let's clarify what we mean by "heap" vs "stack".

Most programming environments today are stack-based. As you run a program, each time you call a method a new entry is pushed onto a special stack provided for your program. This stack entry (or frame) tells the system where to look for the method's executable code, what arguments were passed, and exactly where to return to in the calling code after the method exits. When a method finishes, its entry is removed (popped) from the stack, so the program can go back to the previous method. When the stack is empty, the program has finished (see note 1). There is often support for this special stack directly in the CPU.

The memory for the stack is allocated when the program is first launched, which means the stack itself has a fixed (limited) size. This is where "Stack Overflows" come from; get too deep down too many method calls, and the stack will run out of space. Each frame on the stack also has a certain amount of space for local variables for the method, and this is the memory we're talking about when we say value types live on the stack. The local variables stored on the stack do not require new allocations; the memory is already there. Just remember: this only applies in the context of local variables in a method.

The heap, on the other hand, is memory not automatically granted to the program. It is memory the program must request above and beyond its core allocation. Heap memory has to be managed more carefully (so it doesn't leak, though we have a garbage collector to help with this), but there is (usually) much more of it available. Because it has to be granted by the operating system on request, initial allocations for the heap are also a little slower than memory used from the stack (see note 2). For reference types, you can think of the new keyword as requesting a new heap memory allocation.

As a broad generalization, we say reference types are allocated on the heap, and value types are allocated on the stack (though there are plenty of exceptions to this; see note 3).


Now that we understand this much, we can start to look at how .NET handles arrays.

The core array type itself is a reference type. What I mean is, for any given type T, the T may (or may not) be a value type, but an array of T (T[]) is always a reference type. In the "stack vs heap" context, this means creating a new array is a heap allocation, even if T is a value type. Arrays in .NET also have a fixed size (see note 4).

An additional attribute of value types is that they also have a known, fixed size, based on their members. Therefore, an array of value types has a fixed number of elements, each with a known fixed size. That's enough information for the allocation of a new array of value types to get all the space for the array object and its elements in a single heap allocation. The value of each item (not just a reference) is held right there with the array's core memory. Now we have a bunch of value-type objects, but their memory is on the heap, rather than the stack.

This can be further complicated by a value type with one or more reference-type members. In this situation, the space for the value type is allocated as normal, but the part of the value for the reference members is just a reference. It still requires separate allocations or assignments to populate those reference members.

For an array holding reference types, the initial heap allocation for the array still allocates space for all the elements, but space reserved for each element is only enough for the reference. That is, initially each element in the array is still null. To populate the array you must still set those references to objects in memory, either by assigning existing objects or allocating new ones.
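The difference between the two kinds of array can be sketched as follows. PointValue, PointObject and ArrayDemo are hypothetical names chosen for illustration.

```csharp
using System;

public struct PointValue { public int X; public int Y; }
public class PointObject { public int X; public int Y; }

public static class ArrayDemo
{
    public static (int structX, bool classElementWasNull) Run()
    {
        // One heap allocation: the array object holds three PointValue
        // instances inline, each immediately usable with zeroed fields.
        var values = new PointValue[3];
        values[0].X = 42; // writes directly into the array's own storage

        // One heap allocation for the array, but it holds only three null
        // references; every PointObject needs its own allocation.
        var objects = new PointObject[3];
        bool wasNull = objects[0] == null;
        objects[0] = new PointObject { X = 42 };

        return (values[0].X, wasNull);
    }

    public static void Main()
    {
        var (x, wasNull) = Run();
        Console.WriteLine($"{x} {wasNull}");
    }
}
```

The struct array can be written to without any further new, while the class array starts out as a row of nulls.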

Finally, just as we were able to use arrays to get a value-type onto the heap, instead of the stack, there are also ways to force reference types to allocate from the stack. However, generally you should not do this unless you really understand the full implications.


1) There are different conventions on exactly when a frame is pushed/popped for each method call, depending on the platform, compiler configuration, and more, so only look at this paragraph for the general idea; the exact specifics will be incorrect in some particulars on any given platform.

2) For further reading, it is also useful to understand how programs handle addressing for heap memory.

3) Eric Lippert has an excellent write-up of this topic.

4) That is, arrays in .NET are true arrays in the full formal computer science sense, unlike the associative array-like collection types in many other platforms. .NET has these, too, but it calls them what they are: collections rather than arrays.

Value types in object stored in heap as well?

They are stored in the heap, inside of the memory allocated for the reference type. In addition, value types are often stored in places other than "the stack". However, the CLI spec does not specify where the memory pool that stores value types resides - it's an implementation detail that should not matter.

Why are reference types stored in heap

You generally can't store reference types on the stack because the stack frame is destroyed when the method returns. If you saved a reference to an object so it could be dereferenced after the method completes, you'd be dereferencing a non-existent stack location.

The HotSpot JVM can perform escape analysis and, if it determines that an object cannot possibly escape the method scope, it will in fact allocate it on the stack.

Stack and heap misunderstanding in Swift

I've always known that reference type variables are stored in the heap while value type variables are stored in the stack.

This is only partially true in Swift. In general, Swift makes no guarantees about where objects and values are stored, except that:

  1. Reference types have a stable location in memory, so that all references to the same object point to exactly the same place, and
  2. Value types are not guaranteed to have a stable location in memory, and can be copied arbitrarily as the compiler sees fit

This technically means that object types can be stored on the stack if the compiler knows that an object is created and destructed within the same stack frame with no escaping references to it, but in practice, you can basically assume that all objects are allocated on the heap.

For value types, the story is a little more complicated:

  • Unless a location-based reference is required of a value (e.g., taking a reference to a struct with &), a struct may be located entirely in registers: operating on small structs may place its members in CPU registers so it never even lives in memory. (This is especially the case for small, possibly short-lived value types like Ints and Doubles, which are guaranteed to fit in registers)
  • Large value types do actually get heap-allocated: although this is an implementation detail of Swift that theoretically could change in the future, structs which are larger than 3 machine words (e.g., larger than 12 bytes on a 32-bit machine, or 24 bytes on a 64-bit machine) are pretty much guaranteed to be allocated and stored on the heap. This doesn't conflict with the value-ness of a value type: it can still be copied arbitrarily as the compiler wishes, and the compiler does a really good job of avoiding unnecessary allocations where it can

So where are ints, doubles, strings, etc. kept when they are defined inside a class, aka reference type?

This is an excellent question that gets at the heart of what a value type is. One way to think of the storage of a value type is inline, wherever it needs to be. Imagine a

struct Point {
    var x: Double
    var y: Double
}

structure, which is laid out in memory. Ignoring the fact that Point itself is a struct for a second, where are x and y stored relative to Point? Well, inline wherever Point goes:

┌───────────┐
│   Point   │
├─────┬─────┤
│  x  │  y  │
└─────┴─────┘

When you need to store a Point, the compiler ensures that you have enough space to store both x and y, usually one immediately following the other. If a Point is stored on the stack, then x and y are stored on the stack, one after the other; if Point is stored on the heap, then x and y live on the heap as part of Point. Wherever Swift places a Point, it always ensures you have enough space, and when you assign to x and y, they are written to that space. It doesn't terribly matter where that is.

And when Point is part of another object? e.g.

class Location {
    var name: String
    var point: Point
}

Then Point is also laid out inline wherever it is stored, and its values are laid out inline as well:

┌──────────────────────┐
│       Location       │
├──────────┬───────────┤
│          │   Point   │
│   name   ├─────┬─────┤
│          │  x  │  y  │
└──────────┴─────┴─────┘

In this case, when you create a Location object, the compiler ensures that there's enough space to store a String and two Doubles, and lays them out one after another. Where that is, again, doesn't matter, but in this case, it's all on the heap (because Location is a reference type, which happens to contain values).


As for the other way around, object storage has two components:

  1. The variable you use to access the object, and
  2. The actual storage for the object

Let's say that we changed Point from being a struct to being a class. Whereas before Location stored the contents of Point directly, it now stores only a reference to their actual storage in memory:

┌──────────────────────┐      ┌───────────┐
│       Location       │ ┌───▶│   Point   │
├──────────┬───────────┤ │    ├─────┬─────┤
│   name   │   point ──┼─┘    │  x  │  y  │
└──────────┴───────────┘      └─────┴─────┘

Before, when Swift laid out space to create a Location, it was storing one String and two Doubles; now, it stores one String and one pointer to a Point. Unlike in languages like C or C++, you don't actually need to be aware of the fact that Location.point is now a pointer, and it doesn't actually change how you access the object; but under the hood, the size and "shape" of Location has changed.

The same goes for storing all other reference types, including closures. A variable holding a closure is largely just a pointer to some metadata for the closure, and a way to execute the closure's code (though the specifics of this are out of scope for this answer):

┌───────────────────────────────┐     ┌───────────┐
│           MyStruct            │     │  closure  │
├─────────┬─────────┬───────────┤ ┌──▶│  storage  │
│  prop1  │  prop2  │  closure ─┼─┘   │  + code   │
└─────────┴─────────┴───────────┘     └───────────┘

clearing doubts about value and reference types

The first picture is better, l and k are different variables, occupying different places in memory.

value types may be allocated on heap, depending upon the how jitter sees it fit

Actually it depends more on the context and the way a value is used. A value-type field of a class is always allocated on the heap, as part of the object that contains it; boxing and capture by a closure are other reasons.

However, the 2nd picture applies when l is a ref parameter:

MyClass k = new ...;
M(ref k);

void M(ref MyClass l) { /* Here l is an alias for k */ }
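As a small sketch of two points made above, the following shows boxing copying a value into a new heap object, and a ref parameter acting as an alias for the caller's variable. BoxingDemo and Reassign are hypothetical names; k and l follow the naming used in the answer.

```csharp
using System;

public static class BoxingDemo
{
    // 'l' is an alias for the caller's variable, not a copy of it.
    static void Reassign(ref string l) => l = "changed";

    public static (int boxedValue, string k) Run()
    {
        int n = 1;
        object boxed = n; // boxing: copies the value into a new heap object
        n = 2;            // the boxed copy is unaffected by this write

        string k = "original";
        Reassign(ref k);  // the callee's assignment is visible through k

        return ((int)boxed, k);
    }

    public static void Main()
    {
        var (boxedValue, k) = Run();
        Console.WriteLine($"{boxedValue} {k}");
    }
}
```

The boxed integer keeps the old value because it is a separate copy, while k changes because l and k name the same variable.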

then 2. Can we force a reference type to be allocated on stack?

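C# does have stackalloc, but it only allocates blocks of unmanaged value-type elements, not reference-type objects. A runtime may also place a non-escaping object on the stack as an optimization, but that is 'invisible' to a C# programmer.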

The simple and most useful answer is: No.

How long do C# reference types live inside a method?

Objects that are not reachable are marked as collectable. When the object is collected depends on the GC; if there is no memory pressure it might never be collected until the application ends.

It's important to note that the rule is "the object is not reachable", not "there are no references pointing at it"; the two are not the same:

void Foo() {
    var a = new A();
    var b = new B();
    a.b = b;
    b.a = a;
}

Both a and b become unreachable when Foo exits, even though each still holds a reference to the other, so under a reference-counting scheme both would have a count greater than 0.
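A rough way to observe this is to keep only a weak reference to one of the objects and force a collection. This is a sketch, not a guarantee: CycleDemo and MakeCycle are hypothetical names, the A and B classes are assumed to have the fields used in Foo above, and whether the object is actually gone after GC.Collect depends on the runtime and build configuration.

```csharp
using System;
using System.Runtime.CompilerServices;

public class A { public B b; }
public class B { public A a; }

public static class CycleDemo
{
    // Builds the a <-> b cycle and hands back only a weak reference,
    // which does not keep its target alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeCycle()
    {
        var a = new A();
        var b = new B();
        a.b = b;
        b.a = a;
        return new WeakReference(a);
    }

    public static void Main()
    {
        WeakReference weak = MakeCycle();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        // If the cycle was collected, IsAlive is false even though
        // a and b still referenced each other.
        Console.WriteLine(weak.IsAlive);
    }
}
```

A pure reference-counting collector could never free this pair, whereas a tracing collector only asks whether anything reachable still points at them.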


