Can Structs Contain Fields of Reference Types

Can structs contain fields of reference types

Yes, they can. Is it a good idea? Well, that depends on the situation. Personally I rarely create my own structs in the first place... I would treat any new user-defined struct with a certain degree of scepticism. I'm not suggesting that it's always the wrong option, just that it needs more of a clear argument than a class.

It would be a bad idea for a struct to have a reference to a mutable object though... otherwise you can have two values which look independent but aren't:

MyValueType foo = ...;
MyValueType bar = foo; // Value type, hence copy...

foo.List.Add("x");
// Eek, bar's list has now changed too!

Mutable structs are evil. Immutable structs with references to mutable types are sneakily evil in different ways.
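The aliasing problem above can be reproduced with a small, self-contained sketch. The definition of MyValueType here is an assumption made for illustration (the original does not show it); only the copy-then-mutate pattern comes from the snippet above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct standing in for MyValueType in the snippet above.
public struct MyValueType
{
    public List<string> List;

    public MyValueType(List<string> list)
    {
        List = list;
    }
}

public static class SharedListDemo
{
    // Copies a struct, mutates the list through the original,
    // and returns the list size seen through the copy.
    public static int CountSeenThroughCopy()
    {
        MyValueType foo = new MyValueType(new List<string>());
        MyValueType bar = foo;   // copies the reference, not the list

        foo.List.Add("x");       // mutates the one shared list
        return bar.List.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountSeenThroughCopy()); // prints 1
    }
}
```

The two struct values are distinct, yet both hold a reference to the same List<string>, which is why the change shows up through bar.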

Why do reference types inside structs behave like value types?

strings are reference types that have pointers stored on stack while their actual contents stored on heap

No no no. First off, stop thinking about stack and heap. This is almost always the wrong way to think in C#. C# manages storage lifetime for you.

Second, though references may be implemented as pointers, references are not logically pointers. References are references. C# has both references and pointers. Don't mix them up. There is no pointer to string in C#, ever. There are references to string.

Third, a reference to a string could be stored on the stack but it could also be stored on the heap. When you have an array of references to string, the array contents are on the heap.

Now let's come to your actual question.

Person person_1 = new Person();
person_1.name = "Person 1";
Person person_2 = person_1; // This is the interesting line
person_2.name = "Person 2";

Let's illustrate what the code does logically. Your Person struct is nothing more than a string reference, so your program is the same as:

string person_1_name = null; // That's what new does on a struct
person_1_name = "Person 1";
string person_2_name = person_1_name; // Now they refer to the same string
person_2_name = "Person 2"; // And now they refer to different strings

When you say person_2 = person_1, that does not mean that the variable person_2 is now an alias for the variable person_1. (There is a way to do that in C#, but this is not it.) It means "copy the contents of person_1 to person_2". The reference to the string is the value that is copied.

If that's not clear try drawing boxes for variables and arrows for references; when the struct is copied, a copy of the arrow is made, not a copy of the box.
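As a runnable sketch of the same point: the Person definition below is an assumption (the question's struct is not shown here), reduced to the single string field the answer describes.

```csharp
using System;

// Hypothetical definition; the original Person struct is not shown,
// so a single public string field is assumed.
public struct Person
{
    public string name;
}

public static class PersonCopyDemo
{
    public static (string First, string Second) Run()
    {
        Person person_1 = new Person();
        person_1.name = "Person 1";
        Person person_2 = person_1;   // copies the string reference
        person_2.name = "Person 2";   // rebinds only person_2's field
        return (person_1.name, person_2.name);
    }

    public static void Main()
    {
        var (first, second) = Run();
        Console.WriteLine(first);   // prints "Person 1"
        Console.WriteLine(second);  // prints "Person 2"
    }
}
```

Nothing here depends on string being special: assigning to person_2.name replaces the arrow stored in person_2 and leaves the arrow stored in person_1 alone.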

Struct containing reference types

You have correctly understood that with structs, address1 and address2 are not the same object. The values were copied. However, for the field, this is a simple case of reassignment. It has nothing to do with the fact that string is a reference type or any special rules or any suggestion of immutability. You have simply reassigned a property or field with another value.

someStruct.SomeString = "A";
anotherStruct = someStruct;
anotherStruct.SomeString = "B"; // would never affect someStruct

You have overwritten the reference in this example. The fact that for a brief moment, both structs' fields contained the same reference is of no importance. In your second example, you did something very different.

someStruct.IP.SomeString = "A";
anotherStruct = someStruct;
anotherStruct.IP.SomeString = "B";

In this case, the value of IP has not changed. Part of IP's state has changed. Each struct's field is still referencing the same IP.
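The two cases can be put side by side in one small sketch. The type names IpInfo and SomeStruct are hypothetical, chosen to mirror the snippets above; the original types are not shown.

```csharp
using System;

// Hypothetical types modeled on the snippets above.
public class IpInfo
{
    public string SomeString = "";
}

public struct SomeStruct
{
    public string SomeString;
    public IpInfo IP;
}

public static class ReassignVersusMutateDemo
{
    public static (string Reassigned, string Mutated) Run()
    {
        var someStruct = new SomeStruct
        {
            SomeString = "A",
            IP = new IpInfo { SomeString = "A" }
        };
        var anotherStruct = someStruct;     // copies both references

        anotherStruct.SomeString = "B";     // replaces the reference held by the copy only
        anotherStruct.IP.SomeString = "B";  // changes the object both structs refer to

        return (someStruct.SomeString, someStruct.IP.SomeString);
    }

    public static void Main()
    {
        var (reassigned, mutated) = Run();
        Console.WriteLine(reassigned); // prints "A"
        Console.WriteLine(mutated);    // prints "B"
    }
}
```

The first assignment overwrites a field of the copy; the second follows the shared reference and changes state inside the one IpInfo object.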

Put in simpler terms

var foo = new Foo(); // Foo is class
var other = foo;
// other and foo contain same value, a reference to an object of type Foo
other = new Foo(); // was foo modified? no!

int x = 1;
int y = x;
y = 2; // was x modified? of course not.

string s = "S";
string t = s;
t = "T"; // is s "T"? (again, no)

Variables and fields hold values. For classes, those values are references to objects. Two variables or fields can hold the same reference, but that does not mean those variables themselves are linked. They are not connected in any way; they simply hold a common value. When you replace the value of one variable or field, the other is not affected.


Not on the specific topic, but it is worth noting that mutable structs are viewed by many as evil. Others don't quite hold the same view, or at least not as religiously. (However, it is also worth noting that had Address been a class, then address1 and address2 would hold the same value (the reference to the Address object), and modification to the state of address1 would be visible via address2 as long as neither address1 nor address2 is itself reassigned.)

If this is an actual representation of your code, it would be worth doing some research on mutable structs so you at least have a full understanding of various pitfalls you may encounter.

Does it make sense to define a struct with a reference type member?

Nine times out of ten, you should be creating a class rather than a structure in the first place. Structures and classes have very different semantics in C#, compared to what you might find in C++, for example. Most programmers who use a structure should have used a class, making questions like this one quite frankly irrelevant.

Here are some quick rules about when you should choose a structure over a class:

  1. Never.

    ...Oh, you're still reading? You're persistent. Okay, fine.
  2. When you have an explicit need for value-type semantics, as opposed to reference type semantics.
  3. When you have a very small type (the rule of thumb is a memory footprint less than 16 bytes).
  4. When objects represented by your struct will be short-lived and immutable (won't change).
  5. And occasionally, for interop purposes with native code that uses structures.

But if you've made an informed decision and are truly confident that you do, in fact, need a structure rather than a class, you need to revisit point number 2 and understand what value type semantics are. Jon Skeet's article here should go a long way towards clarifying the distinction.

Once you've done that, you should understand why defining a reference type inside of a value type (struct) is not a problem. Reference types are like pointers. The field inside of the structure doesn't store the actual type; rather, it stores a pointer (or a reference) to that type. There's nothing contradictory or wrong about declaring a struct with a field containing a reference type. It will neither "slow the object" nor will it "call GC", the two concerns you express in a comment.
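For illustration, a small immutable struct with a reference-type field might look like the sketch below. The Label type and its members are hypothetical; the point is only that the struct stores a reference to the string next to the int, and that copying the struct copies that reference.

```csharp
using System;

// Hypothetical small immutable value type with a reference-type field.
public readonly struct Label
{
    public Label(int id, string text)
    {
        Id = id;
        Text = text;
    }

    public int Id { get; }

    // string is itself immutable, so copies of Label cannot affect each other.
    public string Text { get; }
}

public static class LabelDemo
{
    public static bool CopyEqualsOriginal()
    {
        var original = new Label(1, "draft");
        Label copy = original;          // copies the int and the string reference
        return copy.Equals(original);   // field-by-field value equality
    }

    public static void Main()
    {
        Console.WriteLine(CopyEqualsOriginal()); // prints True
    }
}
```

Because the only reference field points at an immutable type, this avoids the shared-mutable-state trap while keeping value-type semantics.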

Reference type in struct in C#

The "kind" of type (value/reference) has little to do with how instances are allocated. It's all about lifetime, and there are more ways to allocate than "heap" and "stack". Read The Truth About Value Types.

But insofar as your question makes sense: a struct's member types do not affect how struct instances are allocated, because they do not affect the lifetime of the object. The same goes for classes, by the way.

The member e will be a part of the value type object and allocated wherever that object is. This member is a reference, and hence any actual Employee object e refers to will be allocated somewhere else [1]. Though it sounds like one, this is not a special rule; locals, class members and array items behave the same way. It does not defeat the point of value types; rather, it maintains the benefits of both value and reference types. The value type instances are still separate values instead of being aliased, and they still have a simpler and shorter lifetime, allowing better allocation choices with less effort. The reference type instances are still shared and (potentially) long-lived.

[1] At least conceptually and in the current implementations; in very simple cases optimizations (escape analysis + allocation sinking) could merge these allocations, but no CLR I'm aware of does that.

Struct with reference types and GC?

Point includes a reference to an object on the heap. This will become eligible for collection as soon as no more copies of that Point exist with that reference. Noting that:

Point p1 = new Point(1,2);
Point p2 = p1;

is 2 copies, each with a reference to the same object on the heap. If those points are stored as fields on an object somewhere, then obviously the lifetime of the object will be at least as long as the object with those fields. If those points are only variables on the stack, then it gets more complex, because the GC may take into account whether a variable is ever read again. If it isn't, the variable might not effectively exist (or: it might).

The path can be very indirect, but it essentially comes down to: can GC get to the object, starting from the GC roots.

Are Structs with Struct-Arrays value or reference-based?

A structure is a value type and a class is a reference type. That never changes.

If you have a local variable in a method in your code, when that code is executed, space is allocated for that variable on the stack. If that variable is a value type then the structure instance will be stored in the variable itself while, if the variable is a reference type, space will be allocated for the object on the heap and a reference to that object will be stored in the variable.

When an object is created, whether on the stack or the heap, that object contains its member variables. If the object is created on the stack then the member variables exist on the stack and if the object is created on the heap then the member variables exist on the heap. Whether those member variables exist on the stack or the heap, they still behave exactly as value types and reference types always do, i.e. the value type variables contain the objects and the reference type variables contain references to objects created on the heap.

If you have a structure with a member variable that is an array then the structure will behave like value types always do, i.e. the object will be stored in the variable, wherever that variable happens to be. The array field will contain a reference to an array created on the heap. If the array is of a value type then the array will contain the element objects themselves while an array that is of a reference type will contain references to objects stored elsewhere on the heap.

It's pretty simple really:

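  • Local variables are typically stored on the stack (locals captured by a lambda, or declared in an iterator or async method, can end up on the heap).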
  • Member variables are stored within the object, wherever that is stored.
  • Value type objects are stored in the variable, wherever that is stored.
  • Reference type objects are stored on the heap and a reference to them is stored in the variable, wherever that is stored.
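The rules for array fields can be checked with a short sketch. Point2D and Holder are hypothetical names introduced for this example.

```csharp
using System;

public struct Point2D   // hypothetical value type
{
    public int X;
}

public struct Holder    // hypothetical struct with an array field
{
    public Point2D[] Points;
}

public static class ArrayFieldDemo
{
    public static (int ViaOriginal, int LocalCopy) Run()
    {
        var first = new Holder { Points = new Point2D[1] };
        Holder second = first;              // copies the array reference only

        second.Points[0].X = 42;            // writes into the shared array element in place

        Point2D element = first.Points[0];  // copies the element out of the array
        element.X = 7;                      // changes the local copy only

        return (first.Points[0].X, element.X);
    }

    public static void Main()
    {
        var (viaOriginal, localCopy) = Run();
        Console.WriteLine(viaOriginal); // prints 42
        Console.WriteLine(localCopy);   // prints 7
    }
}
```

Copying the struct copies the reference to the array, so both Holder values see the write; the array itself holds Point2D values directly, so reading an element out produces an independent copy.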

Storing reference types in Struct

If you need the capabilities that a class offers, such as inheritance, then switch to a class. If not, a struct can be a bit "lighter," but unless you anticipate performance problems, such as garbage collection inside a tight loop with a LOT of iterations, the chores that come with structs (passing them around with ref whenever you want a method to make modifications, and so on) can create unnecessary work. (Though, in that example, discarding a struct that has reference-type properties still leaves the referenced objects for the GC to collect.)

The practical upshot being: whether to use a struct or a class is a matter of your use case, not the number of properties that you have.

For a good explanation of the differences between and relative strengths and weakness of classes and structs, see this MSDN article.

For Eric Lippert's excellent note on garbage collection, structs and classes, see his response to this question.

If a struct holds a class reference, is it stored on the stack or the heap in C#?

struct is a value type, and is therefore allocated on the stack

This is incorrect. A more correct statement would be "value types can be stored on the stack"; see The Truth About Value Types. The value type will probably be stored on the stack if given as a parameter, even if the jitter is free to store it wherever it damn pleases. But if the struct is part of a class, it will be stored on the heap, with all the other fields of the class.

Now, the reference to A will be stored as part of the struct, i.e. 8 bytes on x64. But the actual object it points to, including the int Value, will be stored on the heap.
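One way to observe this is to compare the size of a struct that holds only a reference with the platform's pointer size. This is a sketch under assumptions: it uses Unsafe.SizeOf<T>() from System.Runtime.CompilerServices, assumes .NET 5 or later, and the Wrapper struct and the shape of class A are stand-ins for the question's types.

```csharp
using System;
using System.Runtime.CompilerServices;

public class A   // stand-in for the class from the question
{
    public int Value;
}

public struct Wrapper   // hypothetical struct holding only a reference to A
{
    public A Reference;
}

public static class ReferenceSizeDemo
{
    // True when the struct occupies exactly one reference
    // (8 bytes in a 64-bit process, 4 bytes in a 32-bit one).
    public static bool StructIsOneReferenceWide()
    {
        return Unsafe.SizeOf<Wrapper>() == IntPtr.Size;
    }

    public static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<Wrapper>());
        Console.WriteLine(StructIsOneReferenceWide());
    }
}
```

The struct's size does not grow with the size of A; the A instance, including its int Value, lives in its own heap allocation.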

Are structs value types because their structure is known at compile time?

Structs can be treated as value types (as in, they are copied when passed as arguments, and stored inline in an array, etc) because their size is known at the jitter's compile time (or, to be more precise, because their size can be inferred from a variable/field's declared type).

Why is their size known at compile time? It's not because "each struct contains only values for value types, and/or references to reference types, which are of a known size." The same could be said for reference types.

Their size is known at compile time because inheritance is disallowed. If you could subclass demo, then a variable of type demo could point to an object much larger than demo.

Pretend for a second you could subclass structs, and remember that value types are stored inline in an array:

// Hypothetical: C# does not allow struct inheritance, so this does not compile.

// 4 bytes (1 integer)
public struct A { int x; }

// 8 bytes (2 integers)
public struct B : A { int y; }

// An array of A's with enough space for 1 instance of A, that is, 4 bytes
A[] array = new A[1];

// Instances of B don't fit in the array
array[0] = new B();

As Matthew points out in the comments, C++ allows you to do this, but field B.y will be trimmed and lost.


