C# Pass by Value VS. Pass by Reference

Re: OP's Assertion

It is universally acknowledged (in C# at least) that when you pass by reference, the method contains a reference to the object being manipulated, whereas when you pass by value, the method copies the value being manipulated ...

TL;DR

There's more to it than that. Unless you pass variables with the ref or out keywords, C# passes variables to methods by value, irrespective of whether the variable is a value type or a reference type.

  • If passed by reference, then the called function may reassign the caller's variable at the call site (i.e. change what the original calling function's variable is assigned to).

  • If a variable is passed by value:

    • if the called function re-assigns the variable, this change is local to the called function only, and will not affect the original variable in the calling function
    • however, if the called function changes the variable's fields or properties, whether the calling function observes those changes depends on whether the variable is a value type (it will not) or a reference type (it will).

Since this is all rather complicated, I would recommend avoiding passing by reference if possible. Instead, if you need to return multiple values from a function, use a composite class, a struct, or a tuple as the return type rather than ref or out parameters.

Also, when passing reference types around, a lot of bugs can be avoided by not changing (mutating) fields and properties of an object passed into a method (for example, use C#'s immutable properties to prevent changes to properties, and strive to assign properties only once, during construction).
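
As a minimal sketch of the advice to return several values instead of using ref or out: the MinMax helper below is hypothetical (it is not from the original question) and assumes C# 7.0 or later for value tuples.

```csharp
using System;

static class TupleReturnSketch
{
    // Hypothetical helper: returns both results in a value tuple
    // instead of writing them through out parameters.
    public static (int Min, int Max) MinMax(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("values must not be empty", nameof(values));

        int min = values[0];
        int max = values[0];
        foreach (int v in values)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return (min, max);
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        var (min, max) = MinMax(new[] { 3, 9, 1, 7 });
        Console.WriteLine($"{min} {max}"); // prints "1 9"
    }
}
```

The caller receives both values through the return value, so no variable at the call site is reassigned behind its back.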

In Detail

The problem is that there are two distinct concepts:

  • Value Types (e.g. int) vs Reference Types (e.g. string, or custom classes)
  • Passing by Value (default behaviour) vs Passing by Reference (ref, out)

Unless you explicitly pass (any) variable by reference, by using the out or ref keywords, parameters are passed by value in C#, irrespective of whether the variable is a value type or reference type.

When passing value types (such as int, float or structs like DateTime) by value (i.e. without out or ref), the called function gets a copy of the entire value (typically via the stack).

Any change to the value type, and any changes to any properties / fields of the copy will be lost when the called function is exited.

However, when passing reference types (e.g. custom classes like your MyPoint class) by value, it is the reference to the same, shared object instance which is copied and passed on the stack.

This means that:

  • If the passed object has mutable (settable) fields and properties, any changes to those fields or properties of the shared object are permanent (i.e. any changes to x or y are seen by anyone observing the object)
  • However, during method calls, the reference itself is still copied (passed by value), so if the parameter variable is reassigned, this change is made only to the local copy of the reference, and the change will not be seen by the caller. This is why your code doesn't work as expected.

What happens here:

void Replace<T>(T a, T b) // Both a and b are passed by value
{
    a = b; // reassignment is localized to method `Replace`
}

For reference types T, this means that the local variable a, which holds a reference to the first object, is reassigned to the reference held by the local variable b. This reassignment is local to this function only: as soon as execution leaves the function, the reassignment is lost.

If you really want to replace the caller's references, you'll need to change the signature like so:

void Replace<T>(ref T a, T b) // a is passed by reference
{
    a = b; // a is reassigned, and is also visible to the calling function
}

This changes the call to call by reference - in effect we are passing the address of the caller's variable to the function, which then allows the called method to alter the calling method's variable.
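
The two signatures can be compared side by side in a small sketch. The MyPoint definition below is an assumption (the question's class is not shown here), and the method names ReplaceByValue and ReplaceByRef are made up so that both versions can live in one class.

```csharp
using System;

// Assumed shape of the question's MyPoint class
class MyPoint
{
    public int x;
    public int y;
}

static class ReplaceSketch
{
    // a and b are copies of the caller's references
    public static void ReplaceByValue<T>(T a, T b)
    {
        a = b; // only the local copy of the reference changes
    }

    // a is an alias for the caller's variable
    public static void ReplaceByRef<T>(ref T a, T b)
    {
        a = b; // the caller's variable now refers to b's object
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        var a = new MyPoint { x = 1, y = 2 }; // point1
        var b = new MyPoint { x = 3, y = 4 }; // point2

        ReplaceByValue(a, b);
        Console.WriteLine(a.x); // prints 1: the caller's a is unchanged

        ReplaceByRef(ref a, b);
        Console.WriteLine(a.x); // prints 3: the caller's a now refers to point2
        Console.WriteLine(ReferenceEquals(a, b)); // prints True
    }
}
```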

However, nowadays:

  • Passing by reference is generally regarded as a bad idea. Instead, return data in the return value, and if more than one value must be returned, use a Tuple or a custom class or struct which contains all of them.
  • Changing ('mutating') a shared value (and even reference) variable in a called method is frowned upon, especially by the Functional Programming community, as this can lead to tricky bugs, especially when using multiple threads. Instead, give preference to immutable variables, or if mutation is required, then consider changing a (potentially deep) copy of the variable. You might find topics around 'pure functions' and 'const correctness' interesting further reading.
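
One way to follow the immutability advice is sketched below. ImmutablePoint, WithX and MoveRight are hypothetical names chosen for illustration; the idea is only that a method returns a modified copy rather than changing the object it was given.

```csharp
using System;

// Hypothetical immutable point: get-only properties, set once in the constructor
sealed class ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Returns a modified copy instead of mutating this instance
    public ImmutablePoint WithX(int newX) => new ImmutablePoint(newX, Y);
}

static class ImmutableSketch
{
    // A pure function: it only reads its input and returns a new object
    public static ImmutablePoint MoveRight(ImmutablePoint p, int dx) => p.WithX(p.X + dx);

    // Call this from your own Main to see the output.
    public static void Run()
    {
        var original = new ImmutablePoint(1, 2);
        var moved = MoveRight(original, 10);
        Console.WriteLine(original.X); // prints 1: the input is untouched
        Console.WriteLine(moved.X);    // prints 11
    }
}
```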

Edit

These two diagrams may help with the explanation.

Pass by value (reference types):

In your first instance (Replace<T>(T a,T b)), a and b are passed by value. For reference types, this means the references are copied onto the stack and passed to the called function.

[Diagram: pass by value with reference types]

  1. Your initial code (I've called this main) allocates two MyPoint objects on the managed heap (I've called these point1 and point2), and then assigns two local variable references a and b, to reference the points, respectively (the light blue arrows):
MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
MyPoint b = new MyPoint { x = 3, y = 4 }; // point2

  2. The call to Replace<MyPoint>(a, b) then pushes a copy of the two references onto the stack (the red arrows). Method Replace sees these as the two parameters also named a and b, which still point to point1 and point2, respectively (the orange arrows).

  3. The assignment, a = b; then changes the Replace method's local variable a such that a now points to the same object as referenced by b (i.e. point2). However, note that this change is only to Replace's local (stack) variables, and this change will only affect subsequent code in Replace (the dark blue line). It does NOT affect the calling function's variable references in any way, NOR does this change the point1 and point2 objects on the heap at all.

Pass by reference:

If, however, we change the signature to Replace<T>(ref T a, T b) and then change main to pass a by reference, i.e. Replace(ref a, b):

[Diagram: pass by reference]

  1. As before, two point objects allocated on the heap.

  2. Now, when Replace(ref a, b) is called, while main's reference b (pointing to point2) is still copied during the call, a is now passed by reference, meaning that the "address" of main's a variable is passed to Replace.

  3. Now when the assignment a = b is made ...

  4. It is the calling function main's variable a which is now updated to reference point2. The change made by the re-assignment to a is now seen by both main and Replace. There are now no references to point1.

Changes to (heap allocated) object instances are seen by all code referencing the object

In both scenarios above, no changes were actually made to the heap objects, point1 and point2, it was only local variable references which were passed and re-assigned.

However, if any changes were actually made to the heap objects point1 and point2, then all variable references to these objects would see these changes.

So, for example:

void main()
{
    MyPoint a = new MyPoint { x = 1, y = 2 }; // point1
    MyPoint b = new MyPoint { x = 3, y = 4 }; // point2

    // Passed by value, but the properties x and y are being changed
    DoSomething(a, b);

    // a and b have been changed!
    Assert.AreEqual(53, a.x);
    Assert.AreEqual(21, b.y);
}

public void DoSomething(MyPoint a, MyPoint b)
{
    a.x = 53;
    b.y = 21;
}

Now, when execution returns to main, all references to point1 and point2, including main's variables a and b, will 'see' the changes when they next read the values of x and y. Note that the variables a and b were still passed by value to DoSomething.

Changes to value types affect the local copy only

Value types (primitives like System.Int32 and System.Double, structs like System.DateTime, or your own structs) are copied in their entirety when passed into a call. (Local variables of value types typically live on the stack rather than the heap, although this is an implementation detail, and a value type that is a field of a class lives inside that object.) This leads to a major difference in behaviour, since changes made by the called function to a value type's fields or properties will only be observed locally by the called function, because it is only mutating its local copy of the value type.

e.g. Consider the following code with an instance of the mutable struct, System.Drawing.Rectangle

public void SomeFunc(System.Drawing.Rectangle aRectangle)
{
    // Only the local SomeFunc copy of aRectangle is changed:
    aRectangle.X = 99;
    // Passes - the changes last for the scope of the copied variable
    Assert.AreEqual(99, aRectangle.X);
} // The copy aRectangle will be lost when the stack is popped.

// Which when called:
var myRectangle = new System.Drawing.Rectangle(10, 10, 20, 20);
// A copy of `myRectangle` is passed on the stack
SomeFunc(myRectangle);
// Test passes - the caller's struct has NOT been modified
Assert.AreEqual(10, myRectangle.X);

The above can be quite confusing and highlights why it is good practice to create your own custom structs as immutable.

The ref keyword works similarly for value types: the 'address' of the caller's value type variable is passed to the called function, which can then assign directly to the caller's variable.
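
A short sketch of the difference for a mutable struct follows. Box is a hypothetical struct standing in for System.Drawing.Rectangle so that the example has no external dependencies.

```csharp
using System;

// Hypothetical mutable struct standing in for System.Drawing.Rectangle
struct Box
{
    public int X;
    public int Width;
}

static class RefStructSketch
{
    public static void MoveByValue(Box box)
    {
        box.X = 99; // changes the callee's local copy only
    }

    public static void MoveByRef(ref Box box)
    {
        box.X = 99; // changes the caller's variable
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        var myBox = new Box { X = 10, Width = 20 };

        MoveByValue(myBox);
        Console.WriteLine(myBox.X); // prints 10

        MoveByRef(ref myBox);
        Console.WriteLine(myBox.X); // prints 99
    }
}
```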

Pass reference by reference vs pass reference by value - C#

I suggest that you check out this link. It's quite useful and contains very simple examples about Parameter passing in C#.

Reference parameters don't pass the values of the variables used in the function member invocation - they use the variables themselves. Rather than creating a new storage location for the variable in the function member declaration, the same storage location is used, so the value of the variable in the function member and the value of the reference parameter will always be the same. Reference parameters need the ref modifier as part of both the declaration and the invocation - that means it's always clear when you're passing something by reference. Let's look at our previous examples, just changing the parameter to be a reference parameter:

void Foo(ref StringBuilder x)
{
    x = null;
}

...

StringBuilder y = new StringBuilder();
y.Append("hello");
Foo(ref y);
Console.WriteLine(y == null); // will write True

IN YOUR EXAMPLE

int[] myArray = {1,2,3};
PassByVal(myArray);
PassByRef(ref myArray);

void PassByVal(int[] array)
{
    // The function copies the value of the pointer into a new memory location.
    // The copied pointer still points to the array {1,2,3}.

    // Here you reassign THE COPY of the pointer:
    // the original pointer still points to the array {1,2,3},
    // while the copy now points to the array {7,8,9}.
    array = new int[] {7,8,9};
} // the caller's myArray is unchanged

void PassByRef(ref int[] array)
{
    // Here the pointer is passed without copying it into a
    // new memory location.

    // There is no original pointer plus a copied pointer:
    // there is only the original pointer, which we now point at the array {10,11,12}.
    array = new int[] {10,11,12};
} // the caller's myArray now refers to {10,11,12}
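
For completeness, here is a sketch that also covers the third case: changing an element through the copied reference. The method names are made up for illustration. Writing to an element is visible to the caller even without ref, because both references point at the same array.

```csharp
using System;

static class ArraySketch
{
    // The reference is copied, but both copies point at the same array
    public static void ChangeElement(int[] array)
    {
        array[0] = 99; // visible to the caller
    }

    public static void ReassignByValue(int[] array)
    {
        array = new int[] { 7, 8, 9 }; // not visible to the caller
    }

    public static void ReassignByRef(ref int[] array)
    {
        array = new int[] { 10, 11, 12 }; // visible to the caller
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        int[] myArray = { 1, 2, 3 };

        ChangeElement(myArray);
        Console.WriteLine(string.Join(",", myArray)); // prints 99,2,3

        ReassignByValue(myArray);
        Console.WriteLine(string.Join(",", myArray)); // prints 99,2,3

        ReassignByRef(ref myArray);
        Console.WriteLine(string.Join(",", myArray)); // prints 10,11,12
    }
}
```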

Performance of pass by value vs. pass by reference in C# .NET

Only use ref if the method needs to alter the parameters, and these changes need to be passed on to the calling code. You should only use it as an optimization if you have run the code through a profiler and determined that the bottleneck is indeed the CLR copying the method parameters onto the stack.

Bear in mind the CLR is heavily optimized for calling methods with parameters, so I shouldn't think this would be the issue.

Passing Objects By Reference or Value in C#

Objects aren't passed at all. By default, the argument is evaluated and its value is passed, by value, as the initial value of the parameter of the method you're calling. Now the important point is that the value is a reference for reference types - a way of getting to an object (or null). Changes to that object will be visible from the caller. However, changing the value of the parameter to refer to a different object will not be visible when you're using pass by value, which is the default for all types.

If you want to use pass-by-reference, you must use out or ref, whether the parameter type is a value type or a reference type. In that case, effectively the variable itself is passed by reference, so the parameter uses the same storage location as the argument - and changes to the parameter itself are seen by the caller.

So:

public void Foo(Image image)
{
    // This change won't be seen by the caller: it's changing the value
    // of the parameter.
    image = Image.FromStream(...);
}

public void Foo(ref Image image)
{
    // This change *will* be seen by the caller: it's changing the value
    // of the parameter, but we're using pass by reference
    image = Image.FromStream(...);
}

public void Foo(Image image)
{
    // This change *will* be seen by the caller: it's changing the data
    // within the object that the parameter value refers to.
    image.RotateFlip(...);
}

I have an article which goes into a lot more detail in this. Basically, "pass by reference" doesn't mean what you think it means.

can somebody explain me what does passing by value and Passing by reference mean in C#?

In simple terms...

"Passing by value" means that you pass the actual value of the variable into the function. So, in your example, it would pass the value 9.

"Passing by reference" means that you pass the variable itself into the function (not just the value). So, in your example, it would pass the variable that currently holds 9, and the function could change what that variable holds.

This has various consequences, and each is useful in different situations.

This answer has more thorough information:
What's the difference between passing by reference vs. passing by value?

Passing objects by reference vs value

What I don't understand is what happens when I invoke a method, what actually happens. Does new() get invoked? Does it just automagically copy the data? Or does it actually just point to the original object? And how does using ref and out affect this?

The short answer:

The empty constructor will not be called automatically, and it actually just points to the original object.

Using ref and out does not affect this.

The long answer:

I think this becomes easier to understand once you know how C# handles passing arguments to a function.

Actually, everything is passed by value (unless you use ref or out)
Really?! Everything by value?
Yes! Everything!

Of course, there must be some kind of difference between passing class instances and simple types such as int; otherwise it would be a huge step back performance-wise.

The thing is that, behind the scenes, when you pass a class instance to a function, what is really passed is a reference (a pointer) to that instance. The pointer, of course, can be passed by value without causing performance issues.

Actually, everything is being passed by value; it's just that when
you're "passing an object", you're actually passing a reference to that
object (and you're passing that reference by value).

Once we are in the function, we can use that reference to access the original object.

You don't need to do anything special for this; you can use the instance passed as the argument directly (as said before, this whole process happens behind the scenes).

After understanding this, you probably understand that the empty constructor will not be called automatically, and it actually just points to the original object.
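
This can be checked with a small sketch. Tracked is a hypothetical class that counts its own constructor calls; passing an instance to a method neither runs the constructor again nor copies the object.

```csharp
using System;

// Hypothetical class that counts how many times its constructor runs
class Tracked
{
    public static int ConstructorCalls;
    public int Value;

    public Tracked()
    {
        ConstructorCalls++;
    }
}

static class NoCopySketch
{
    // Writes through the reference to the one shared object
    public static void Touch(Tracked t)
    {
        t.Value = 42;
    }

    // Hands back the reference it received, unchanged
    public static Tracked Echo(Tracked t)
    {
        return t;
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        int before = Tracked.ConstructorCalls;
        var item = new Tracked(); // the only constructor call in this method
        Touch(item);

        Console.WriteLine(Tracked.ConstructorCalls - before); // prints 1
        Console.WriteLine(item.Value);                        // prints 42
        Console.WriteLine(ReferenceEquals(item, Echo(item))); // prints True
    }
}
```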


EDITED:

As for out and ref, they allow a function to change the value of an argument and have that change persist outside the scope of the function.

In a nutshell, using the ref keyword for value types will act as follows:

int i = 42;
foo(ref i);

is roughly analogous to the following C++:

int i = 42;
int* ptrI = &i;
foo(ptrI);

while omitting the ref is simply analogous to:

int i = 42;
foo(i);

Using those keywords with reference types allows you to make the passed variable refer to a different object, and have that reassignment persist outside the scope of the function (for more details, please refer to the MSDN page).

Side note:

The difference between ref and out is that out requires the called function to assign a value to the argument before it returns, and the caller does not need to initialize the variable first. ref places no such requirement on the called function; instead, the caller must assign the variable a value before the call. Thus, ref implies that the initial value of the argument is important to the function and might affect its behaviour.
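
The two rules can be seen in a minimal sketch. TryHalve and DoubleInPlace are hypothetical methods written only to illustrate the difference.

```csharp
using System;

static class RefOutSketch
{
    // out: the caller need not initialize the argument,
    // but the method must assign it before returning
    public static bool TryHalve(int input, out int half)
    {
        half = input / 2;
        return input % 2 == 0;
    }

    // ref: the caller must initialize the argument,
    // and the method may read it before writing it
    public static void DoubleInPlace(ref int value)
    {
        value = value * 2;
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        int result; // not initialized: allowed for out
        bool even = TryHalve(10, out result);
        Console.WriteLine($"{even} {result}"); // prints "True 5"

        int n = 21; // must be initialized before being passed with ref
        DoubleInPlace(ref n);
        Console.WriteLine(n); // prints 42
    }
}
```

Removing the assignment to half in TryHalve, or passing an unassigned variable to DoubleInPlace, is a compile-time error.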

Pass by Reference vs Pass by value result

It would make a difference if the original variable were read during the course of the method. This could happen because:

  • Two parameters were both provided using the same underlying variable
  • The method invoked more code that read from the original variable
  • Other threads are involved

Here's an example in C#:

using System;

class Test
{
    static void Main()
    {
        int p = 10;
        Foo(ref p, () => Console.WriteLine(p));
    }

    static void Foo(ref int x, Action action)
    {
        action();
        x = 20;
        action();
    }
}

The output of this is

10
20

... because when action() is invoked the second time, the value of p has already changed to 20. If this used pass-by-result, the value of p would only change when Foo returned.
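
The first case in the list above (two parameters backed by the same variable) can be sketched in a similar way. WriteThenRead is a hypothetical method used only for illustration.

```csharp
using System;

static class AliasSketch
{
    // Writes through x, then reads through y
    public static int WriteThenRead(ref int x, ref int y)
    {
        x = 20;
        return y;
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        int p = 10;
        // Both parameters alias the same variable p
        int seen = WriteThenRead(ref p, ref p);
        Console.WriteLine(seen); // prints 20
    }
}
```

With pass-by-reference, x and y are the same storage location, so the read through y observes the write through x. Under pass-by-result semantics, where the value is copied back only on return, the read would be expected to still see the earlier value.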

What is different between Passing by value and Passing by reference using C#

In general, read my article about parameter passing.

The basic idea is:

If the argument is passed by reference, then changes to the parameter value within the method will affect the argument as well.

The subtle part is that if the parameter is a reference type, then doing:

someParameter.SomeProperty = "New Value";

isn't changing the value of the parameter. The parameter is just a reference, and the above doesn't change what the parameter refers to, just the data within the object. Here's an example of genuinely changing the parameter's value:

someParameter = new ParameterType();

Now for examples:

Simple example: passing an int by ref or by value

class Test
{
    static void Main()
    {
        int i = 10;
        PassByRef(ref i);
        // Now i is 20
        PassByValue(i);
        // i is *still* 20
    }

    static void PassByRef(ref int x)
    {
        x = 20;
    }

    static void PassByValue(int x)
    {
        x = 50;
    }
}

More complicated example: using reference types

using System.Text;

class Test
{
    static void Main()
    {
        StringBuilder builder = new StringBuilder();
        PassByRef(ref builder);
        // builder now refers to the StringBuilder
        // constructed in PassByRef

        PassByValueChangeContents(builder);
        // builder still refers to the same StringBuilder
        // but its contents have changed

        PassByValueChangeParameter(builder);
        // builder still refers to the same StringBuilder,
        // not the new one created in PassByValueChangeParameter
    }

    static void PassByRef(ref StringBuilder x)
    {
        x = new StringBuilder("Created in PassByRef");
    }

    static void PassByValueChangeContents(StringBuilder x)
    {
        x.Append(" ... and changed in PassByValueChangeContents");
    }

    static void PassByValueChangeParameter(StringBuilder x)
    {
        // This new object won't be "seen" by the caller
        x = new StringBuilder("Created in PassByValueChangeParameter");
    }
}

Implications of pass-by-value vs pass-by-reference in C#

You've confused "pass by reference" with "reference type." Reference types are passed by value by default, unless you use the ref keyword. When you pass a reference type by value, it is the same as passing anything by value; you are passing a copy. In this case, you are passing a copy of the reference.

In this example...

private void SetName(Book book, string name)
{
    book.Name = name;
}

...you are passing a reference type by value. The variable book is populated with a copy of the caller's variable book1. However, the variable itself contains a reference to the same Book object as the caller's, so setting its properties shows up in both places.

In this example...

private void GetBookSetName(Book book, string name)
{
    book = new Book(name);
}

...you are also passing a reference type by value. However, you are overwriting the (copied) value with a reference to a new Book. It's just a copy, and has no effect on book1.

In this example...

private void GetBookSetName(ref Book book, string name)
{
    book = new Book(name);
}

...you are passing a reference type by reference. The method receives a pointer to the original reference, which it can change. Therefore when you assign it a new Book, it shows up in both places.
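
The three cases can be put together in one sketch. The Book class below is an assumption (the question's definition is not shown), and the two replacing methods are renamed ReplaceByValue and ReplaceByRef here so that they can coexist in one class.

```csharp
using System;

// Assumed shape of the question's Book class
class Book
{
    public string Name { get; set; }

    public Book(string name)
    {
        Name = name;
    }
}

static class BookSketch
{
    // Reference type by value: mutating the shared object is visible
    public static void SetName(Book book, string name)
    {
        book.Name = name;
    }

    // Reference type by value: reassigning the copy is not visible
    public static void ReplaceByValue(Book book, string name)
    {
        book = new Book(name);
    }

    // Reference type by reference: reassigning is visible
    public static void ReplaceByRef(ref Book book, string name)
    {
        book = new Book(name);
    }

    // Call this from your own Main to see the output.
    public static void Run()
    {
        var book1 = new Book("Original");
        var same = book1; // second reference to the same object

        SetName(book1, "Renamed");
        Console.WriteLine(book1.Name); // prints Renamed

        ReplaceByValue(book1, "Ignored");
        Console.WriteLine(book1.Name); // prints Renamed

        ReplaceByRef(ref book1, "Replaced");
        Console.WriteLine(book1.Name); // prints Replaced
        Console.WriteLine(same.Name);  // prints Renamed: the old object still exists
    }
}
```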


