Reference Type Still Needs Pass by Ref

Reference type still needs pass by ref?

Everything is passed by value in C# by default. However, when you pass a reference type, the reference itself is passed by value, i.e., a copy of the original reference is passed. So you can change the state of the object that the reference copy points to, but if you assign a new value to the reference, you are only changing what the copy points to, not the original reference.

When you use the 'ref' keyword it tells the compiler to pass the original reference, not a copy, so you can modify what the reference points to inside of the function. However, the need for this is usually rare and is most often used when you need to return multiple values from a method.

An example:

class Foo
{
    // Must be public so that Main and ChangeId can access it
    public int ID { get; set; }

    public Foo( int id )
    {
        ID = id;
    }
}

void Main( )
{
    Foo f = new Foo( 1 );
    Console.WriteLine( f.ID ); // prints "1"
    ChangeId( f );
    Console.WriteLine( f.ID ); // prints "5"
    ChangeRef( f );
    Console.WriteLine( f.ID ); // still prints "5", only changed what the copy was pointing to
}

static void ChangeId( Foo f )
{
    f.ID = 5;
}

static void ChangeRef( Foo f )
{
    f = new Foo( 10 );
}
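For contrast, here is a minimal sketch of what happens when the same reassignment is done through a ref parameter. The names RefDemo and ChangeRefByRef are made up for this illustration and do not come from the original answer.

```csharp
using System;

public class Foo
{
    public int ID { get; set; }

    public Foo(int id)
    {
        ID = id;
    }
}

// Hypothetical helper class, introduced only for this sketch.
public static class RefDemo
{
    // By value: only the method's own copy of the reference is reassigned.
    public static void ChangeRef(Foo f)
    {
        f = new Foo(10);
    }

    // By ref: the parameter aliases the caller's variable,
    // so the caller observes the new object.
    public static void ChangeRefByRef(ref Foo f)
    {
        f = new Foo(10);
    }
}

public static class Program
{
    public static void Main()
    {
        Foo f = new Foo(1);
        RefDemo.ChangeRef(f);
        Console.WriteLine(f.ID); // prints "1"
        RefDemo.ChangeRefByRef(ref f);
        Console.WriteLine(f.ID); // prints "10"
    }
}
```

Note that ref has to appear both in the method signature and at the call site, which makes the possibility of reassignment visible to whoever reads the calling code.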

What is the use of ref for reference-type variables in C#?

You can change what foo points to using y:

Foo foo = new Foo("1");

void Bar(ref Foo y)
{
    y = new Foo("2");
}

Bar(ref foo);
// foo.Name == "2"

Passing a reference type as a parameter by ref

When you pass by reference, you are effectively creating an alias for a variable, so in Switcharoo, pValue is an alias for x in the Go method. As a result assigning to pValue is an assignment to x.

The type of x in Go is Thing, and at runtime this is initially pointing to an instance of the class Animal. After calling Switcharoo, x is pointing to an instance of the Vegetable class instead. The original Animal instance is now unreachable and can be collected.

When using ref, it is the variable which is passed by reference, so it works the same way for reference types and value types like int. In Go, x will (probably) exist on the stack, and before calling Switcharoo its value will be the address of the Animal instance. Inside Switcharoo, pValue is an alias for the variable x. This may be implemented as a pointer to the variable in Go, but the semantics of ref do not require using pointers.

The specification describes the semantics of ref parameters:

5.1.5 Reference parameters

A reference parameter does not create a new storage location. Instead,
a reference parameter represents the same storage location as the
variable given as the argument in the function member or anonymous
function invocation. Thus, the value of a reference parameter is always
the same as the underlying variable.
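The code being discussed is not reproduced here, so the following is only a guess at its shape. The names Thing, Animal, Vegetable, Switcharoo, pValue, Go and x come from the text above; the method bodies, the AliasDemo class and the Increment helper are assumptions added to show that the aliasing works the same way for an int.

```csharp
using System;

public class Thing { }
public class Animal : Thing { }
public class Vegetable : Thing { }

// Hypothetical container class for this sketch.
public static class AliasDemo
{
    // pValue is an alias for whatever variable the caller passes,
    // so assigning to pValue assigns to that variable.
    public static void Switcharoo(ref Thing pValue)
    {
        pValue = new Vegetable();
    }

    // Same mechanism for a value type: n aliases the caller's int variable.
    public static void Increment(ref int n)
    {
        n = n + 1;
    }

    public static void Go()
    {
        Thing x = new Animal();
        Switcharoo(ref x);
        Console.WriteLine(x.GetType().Name); // prints "Vegetable"

        int count = 41;
        Increment(ref count);
        Console.WriteLine(count); // prints "42"
    }
}

public static class Program
{
    public static void Main()
    {
        AliasDemo.Go();
    }
}
```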

ByVal and ByRef with reference type

Since you declared TypeTest as a Class, that makes it a reference type (as opposed to Structure which is used to declare value types). Reference-type variables act as pointers to objects whereas value-type variables store the object data directly.

You are correct in your understanding that ByRef allows you to change the value of the argument variable whereas ByVal does not. When using value types, the difference between ByVal and ByRef is very clear, but when you're using reference types, the behavior is a little less expected. The reason that you can change the property values of a reference-type object, even when it's passed ByVal, is that the value of the variable is the pointer to the object, not the object itself. Changing a property of the object isn't changing the value of the variable at all. The variable still contains the pointer to the same object.

That might lead you to believe that there is no difference between ByVal and ByRef for reference-types, but that's not true. There is a difference. The difference is, when you pass a reference-type argument to a ByRef parameter, the method that you're calling is allowed to change the object to which the original variable is pointing. In other words, not only is the method able to change the properties of the object, but it's also able to point the argument variable to a different object altogether. For instance:

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    Dim t1 As TypeTest = New TypeTest
    t1.Variable1 = "Thursday"
    TestByVal(t1)
    MsgBox(t1.Variable1) ' Displays "Thursday"
    TestByRef(t1)
    MsgBox(t1.Variable1) ' Displays "Friday"
End Sub

Public Sub TestByVal(ByVal t1 As TypeTest)
    t1 = New TypeTest()
    t1.Variable1 = "Friday"
End Sub

Public Sub TestByRef(ByRef t1 As TypeTest)
    t1 = New TypeTest()
    t1.Variable1 = "Friday"
End Sub

pass by reference without the ref keyword

Your confusion is a very common one. The essential point is realising that "reference types" and "passing by reference" (the ref keyword) are totally independent. In this specific case, since byte[] is a reference type (as are all arrays), the object is not copied when you pass it around, hence you are always referring to the same object.

I strongly recommend that you read Jon Skeet's excellent article on Parameter passing in C#, and all should become clear...
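As a rough illustration of that independence, the sketch below passes a byte[] three ways. The class ArrayDemo and its method names are invented for this example and are not taken from the question.

```csharp
using System;

// Hypothetical helper class for this sketch.
public static class ArrayDemo
{
    // Mutates the shared array object; the caller sees this without ref.
    public static void FillFirst(byte[] buffer)
    {
        buffer[0] = 0xFF;
    }

    // Reassigns only the local copy of the reference;
    // the caller's variable still refers to the old array.
    public static void Reallocate(byte[] buffer)
    {
        buffer = new byte[8];
    }

    // Reassigns the caller's variable, because the variable itself is passed by ref.
    public static void ReallocateByRef(ref byte[] buffer)
    {
        buffer = new byte[8];
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] data = new byte[4];

        ArrayDemo.FillFirst(data);
        Console.WriteLine(data[0]);     // prints "255"

        ArrayDemo.Reallocate(data);
        Console.WriteLine(data.Length); // prints "4"

        ArrayDemo.ReallocateByRef(ref data);
        Console.WriteLine(data.Length); // prints "8"
    }
}
```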

List passed by ref - help me explain this behaviour

You are passing a reference to the list, but you aren't passing the list variable by reference. So when you call ChangeList, the value of the variable (i.e. the reference - think "pointer") is copied, and changes to the value of the parameter inside ChangeList aren't seen by TestMethod.

Try:

private void ChangeList(ref List<int> myList) {...}
...
ChangeList(ref myList);

This then passes a reference to the local variable myList (as declared in TestMethod); now, if you reassign the parameter inside ChangeList, you are also reassigning the variable inside TestMethod.
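The bodies of ChangeList and TestMethod are not shown in the question, so the following is only an assumed reconstruction of the behaviour being described. ListDemo and ChangeListByRef are names made up for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; the method bodies are assumptions.
public static class ListDemo
{
    public static void ChangeList(List<int> myList)
    {
        myList.Add(99);                  // mutates the shared list: visible to the caller
        myList = new List<int> { 1, 2 }; // reassigns the local copy only: not visible
    }

    public static void ChangeListByRef(ref List<int> myList)
    {
        myList = new List<int> { 1, 2 }; // reassigns the caller's variable: visible
    }

    public static void TestMethod()
    {
        var myList = new List<int> { 10 };

        ChangeList(myList);
        Console.WriteLine(string.Join(",", myList)); // prints "10,99"

        ChangeListByRef(ref myList);
        Console.WriteLine(string.Join(",", myList)); // prints "1,2"
    }
}

public static class Program
{
    public static void Main()
    {
        ListDemo.TestMethod();
    }
}
```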

Does it make sense to pass a reference type to a method as a parameter with 'ref' key?

It lets you change the reference variable itself, in addition to the object it's pointing to.

It makes sense if you think you might make the variable point to a different object (or to null) inside your method.

Otherwise, no.

C# pass by value vs. pass by reference

Re: OP's Assertion

It is universally acknowledged (in C# at least) that when you pass by reference, the method contains a reference to the object being manipulated, whereas when you pass by value, the method copies the value being manipulated ...

TL;DR

There's more to it than that. Unless you pass variables with the ref or out keywords, C# passes variables to methods by value, irrespective of whether the variable is a value type or a reference type.

  • If passed by reference, then the called function may reassign the caller's variable itself (i.e. change what the original calling function's variable refers to or contains).

  • If a variable is passed by value:

    • if the called function re-assigns the variable, this change is local to the called function only, and will not affect the original variable in the calling function
    • however, if changes are made to the variable's fields or properties by the called function, it will depend on whether the variable is a value type or a reference type in order to determine whether the calling function will observe the changes made to this variable.

Since this is all rather complicated, I would recommend avoiding passing by reference if possible (instead, if you need to return multiple values from a function, use a composite class, struct, or Tuples as a return type instead of using the ref or out keywords on parameters)
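To make that recommendation concrete, here is a small sketch comparing an out parameter with a tuple return. The DivisionDemo class and its methods are hypothetical, and the tuple syntax assumes C# 7.0 or later.

```csharp
using System;

// Hypothetical example class for this sketch.
public static class DivisionDemo
{
    // Style 1: the second result is delivered through an out parameter.
    public static int DivideWithOut(int dividend, int divisor, out int remainder)
    {
        remainder = dividend % divisor;
        return dividend / divisor;
    }

    // Style 2: both results are returned together as a value tuple.
    public static (int Quotient, int Remainder) Divide(int dividend, int divisor)
    {
        return (dividend / divisor, dividend % divisor);
    }
}

public static class Program
{
    public static void Main()
    {
        int q = DivisionDemo.DivideWithOut(17, 5, out int r);
        Console.WriteLine($"{q} {r}"); // prints "3 2"

        var result = DivisionDemo.Divide(17, 5);
        Console.WriteLine($"{result.Quotient} {result.Remainder}"); // prints "3 2"
    }
}
```

With the tuple version, everything the method produces is in its return value, so the caller does not need to declare a variable just to receive a result.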

Also, when passing reference types around, a lot of bugs can be avoided by not changing (mutating) fields and properties of an object passed into a method (for example, use get-only or init-only properties to prevent changes after construction, and strive to assign properties only once, during construction).

In Detail

The problem is that there are two distinct concepts:

  • Value Types (e.g. int) vs Reference Types (e.g. string, or custom classes)
  • Passing by Value (default behaviour) vs Passing by Reference(ref, out)

Unless you explicitly pass (any) variable by reference, by using the out or ref keywords, parameters are passed by value in C#, irrespective of whether the variable is a value type or reference type.

When passing value types (such as int, float or structs like DateTime) by value (i.e. without out or ref), the called function gets a copy of the entire value type (via the stack).

Any change to the value type, and any changes to any properties / fields of the copy will be lost when the called function is exited.

However, when passing reference types (e.g. custom classes like your MyPoint class) by value, it is the reference to the same, shared object instance which is copied and passed on the stack.

This means that:

  • If the passed object has mutable (settable) fields and properties, any changes to those fields or properties of the shared object are permanent (i.e. any changes to x or y are seen by anyone observing the object)
  • However, during method calls, the reference itself is still copied (passed by value), so if the parameter variable is reassigned, this change is made only to the local copy of the reference, so the change will not be seen by the caller. This is why your code doesn't work as expected

What happens here:

void Replace<T>(T a, T b) // Both a and b are passed by value
{
    a = b; // reassignment is localized to method `Replace`
}

for reference types T, means that the local variable (stack) reference to the object a is reassigned to the local stack reference b. This reassign is local to this function only - as soon as scope leaves this function, the re-assignment is lost.

If you really want to replace the caller's references, you'll need to change the signature like so:

void Replace<T>(ref T a, T b) // a is passed by reference
{
    a = b; // a is reassigned, and is also visible to the calling function
}

This changes the call to call by reference - in effect we are passing the address of the caller's variable to the function, which then allows the called method to alter the calling method's variable.
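Putting the two signatures side by side, a runnable sketch might look like the following. MyPoint is assumed here to be a class with public int fields x and y, and the by-ref version is given the made-up name ReplaceByRef purely so that both variants are easy to tell apart at the call site.

```csharp
using System;

// Assumed shape of MyPoint; the question's actual definition is not shown.
public class MyPoint
{
    public int x;
    public int y;
}

// Hypothetical container class for this sketch.
public static class ReplaceDemo
{
    // Both a and b are passed by value: the reassignment stays local.
    public static void Replace<T>(T a, T b)
    {
        a = b;
    }

    // a is passed by reference: the reassignment updates the caller's variable.
    public static void ReplaceByRef<T>(ref T a, T b)
    {
        a = b;
    }
}

public static class Program
{
    public static void Main()
    {
        MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
        MyPoint b = new MyPoint { x = 3, y = 4 }; // point2

        ReplaceDemo.Replace(a, b);
        Console.WriteLine(a.x); // prints "1"

        ReplaceDemo.ReplaceByRef(ref a, b);
        Console.WriteLine(a.x); // prints "3"
        Console.WriteLine(object.ReferenceEquals(a, b)); // prints "True"
    }
}
```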

However, nowadays:

  • Passing by reference is generally regarded as a bad idea - instead, we should either pass return data in the return value, and if there is more than one variable to be returned, then use a Tuple or a custom class or struct which contains all such return variables.
  • Changing ('mutating') a shared value (and even reference) variable in a called method is frowned upon, especially by the Functional Programming community, as this can lead to tricky bugs, especially when using multiple threads. Instead, give preference to immutable variables, or if mutation is required, then consider changing a (potentially deep) copy of the variable. You might find topics around 'pure functions' and 'const correctness' interesting further reading.

Edit

These two diagrams may help with the explanation.

Pass by value (reference types):

In your first instance (Replace<T>(T a,T b)), a and b are passed by value. For reference types, this means the references are copied onto the stack and passed to the called function.

Sample Image

  1. Your initial code (I've called this main) allocates two MyPoint objects on the managed heap (I've called these point1 and point2), and then assigns two local variable references a and b, to reference the points, respectively (the light blue arrows):
MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
MyPoint b = new MyPoint { x = 3, y = 4 }; // point2

  2. The call to Replace<MyPoint>(a, b) then pushes a copy of the two references onto the stack (the red arrows). Method Replace sees these as the two parameters also named a and b, which still point to point1 and point2, respectively (the orange arrows).

  3. The assignment, a = b; then changes the Replace method's local variable a such that a now points to the same object as referenced by b (i.e. point2). However, note that this change is only to Replace's local (stack) variables, and this change will only affect subsequent code in Replace (the dark blue line). It does NOT affect the calling function's variable references in any way, NOR does this change the point1 and point2 objects on the heap at all.

Pass by reference:

If, however, we change the signature to Replace<T>(ref T a, T b) and then change main to pass a by reference, i.e. Replace(ref a, b):

Sample Image

  1. As before, two point objects allocated on the heap.

  2. Now, when Replace(ref a, b) is called, main's reference b (pointing to point2) is still copied during the call, but a is now passed by reference, meaning that the "address" of main's a variable is passed to Replace.

  3. Now when the assignment a = b is made ...

  4. It is the calling function main's variable a which is now updated to reference point2. The change made by the re-assignment to a is now seen by both main and Replace. There are now no references to point1.

Changes to (heap allocated) object instances are seen by all code referencing the object

In both scenarios above, no changes were actually made to the heap objects, point1 and point2, it was only local variable references which were passed and re-assigned.

However, if any changes were actually made to the heap objects point1 and point2, then all variable references to these objects would see these changes.

So, for example:

void main()
{
    MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
    MyPoint b = new MyPoint { x = 3, y = 4 }; // point2

    // Passed by value, but the properties x and y are being changed
    DoSomething(a, b);

    // a and b have been changed!
    Assert.AreEqual(53, a.x);
    Assert.AreEqual(21, b.y);
}

public void DoSomething(MyPoint a, MyPoint b)
{
    a.x = 53;
    b.y = 21;
}

Now, when execution returns to main, all references to point1 and point2, including main's variables a and b, will 'see' the changes when they next read the values for x and y of the points. Note also that the variables a and b were still passed by value to DoSomething.

Changes to value types affect the local copy only

Value types (built-in types like System.Int32 and System.Double, and structs like System.DateTime or your own structs) are copied in their entirety when passed by value into a call; as local variables and arguments they typically live on the stack rather than the heap. This leads to a major difference in behaviour, since changes made by the called function to a value type field or property will only be observed locally by the called function, because it will only be mutating its local copy of the value type.

e.g. Consider the following code with an instance of the mutable struct, System.Drawing.Rectangle

public void SomeFunc(System.Drawing.Rectangle aRectangle)
{
    // Only the local SomeFunc copy of aRectangle is changed:
    aRectangle.X = 99;
    // Passes - the changes last for the scope of the copied variable
    Assert.AreEqual(99, aRectangle.X);
} // The copy aRectangle will be lost when the stack is popped.

// Which when called:
var myRectangle = new System.Drawing.Rectangle(10, 10, 20, 20);
// A copy of `myRectangle` is passed on the stack
SomeFunc(myRectangle);
// Test passes - the caller's struct has NOT been modified
Assert.AreEqual(10, myRectangle.X);

The above can be quite confusing and highlights why it is good practice to create your own custom structs as immutable.
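One possible way to write such an immutable struct is sketched below. ImmutablePoint and WithX are invented names, and the readonly struct modifier assumes C# 7.2 or later.

```csharp
using System;

// Hypothetical immutable struct: all state is set once, in the constructor.
public readonly struct ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // "Changing" a value produces a new instance; the original is untouched.
    public ImmutablePoint WithX(int newX)
    {
        return new ImmutablePoint(newX, Y);
    }
}

public static class Program
{
    public static void Main()
    {
        var p = new ImmutablePoint(10, 20);
        var moved = p.WithX(99);

        Console.WriteLine(p.X);     // prints "10"
        Console.WriteLine(moved.X); // prints "99"
    }
}
```

Because there is no setter to call, a method that receives a copy cannot appear to modify it and then silently lose the change.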

The ref keyword works similarly for value types: the 'address' of the caller's value type variable is passed to the called function, which can then assign directly to the caller's variable or mutate it in place.
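A minimal sketch of that difference, using a made-up mutable struct named Box instead of System.Drawing.Rectangle so that the example has no extra dependencies:

```csharp
using System;

// Hypothetical mutable struct, used only for this illustration.
public struct Box
{
    public int X;
}

public static class StructDemo
{
    // By value: the method mutates its own copy of the struct.
    public static void SetX(Box box)
    {
        box.X = 99;
    }

    // By ref: the method mutates the caller's variable directly.
    public static void SetXByRef(ref Box box)
    {
        box.X = 99;
    }
}

public static class Program
{
    public static void Main()
    {
        var b = new Box { X = 10 };

        StructDemo.SetX(b);
        Console.WriteLine(b.X); // prints "10"

        StructDemo.SetXByRef(ref b);
        Console.WriteLine(b.X); // prints "99"
    }
}
```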


