Python:When Is a Variable Passed by Reference and When by Value

Python : When is a variable passed by reference and when by value?

Everything in Python is passed and assigned by value, in the same way that everything is passed and assigned by value in Java. Every value in Python is a reference (pointer) to an object. Objects cannot be values. Assignment always copies the value (which is a pointer); two such pointers can thus point to the same object. Objects are never copied unless you're doing something explicit to copy them.

For your case, every iteration of the loop assigns an element of the list into the variable loc. You then assign something else to the variable loc. All these values are pointers; you're assigning pointers; but you do not affect any objects in any way.

How do I pass a variable by reference?

Arguments are passed by assignment. The rationale behind this is twofold:

  1. the parameter passed in is actually a reference to an object (but the reference is passed by value)
  2. some data types are mutable, but others aren't

So:

  • If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.

  • If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.

To make it even more clear, let's have some examples.

List - a mutable type

Let's try to modify the list that was passed to a method:

def try_to_change_list_contents(the_list):
print('got', the_list)
the_list.append('four')
print('changed to', the_list)

outer_list = ['one', 'two', 'three']

print('before, outer_list =', outer_list)
try_to_change_list_contents(outer_list)
print('after, outer_list =', outer_list)

Output:

before, outer_list = ['one', 'two', 'three']
got ['one', 'two', 'three']
changed to ['one', 'two', 'three', 'four']
after, outer_list = ['one', 'two', 'three', 'four']

Since the parameter passed in is a reference to outer_list, not a copy of it, we can use the mutating list methods to change it and have the changes reflected in the outer scope.

Now let's see what happens when we try to change the reference that was passed in as a parameter:

def try_to_change_list_reference(the_list):
print('got', the_list)
the_list = ['and', 'we', 'can', 'not', 'lie']
print('set to', the_list)

outer_list = ['we', 'like', 'proper', 'English']

print('before, outer_list =', outer_list)
try_to_change_list_reference(outer_list)
print('after, outer_list =', outer_list)

Output:

before, outer_list = ['we', 'like', 'proper', 'English']
got ['we', 'like', 'proper', 'English']
set to ['and', 'we', 'can', 'not', 'lie']
after, outer_list = ['we', 'like', 'proper', 'English']

Since the the_list parameter was passed by value, assigning a new list to it had no effect that the code outside the method could see. The the_list was a copy of the outer_list reference, and we had the_list point to a new list, but there was no way to change where outer_list pointed.

String - an immutable type

It's immutable, so there's nothing we can do to change the contents of the string

Now, let's try to change the reference

def try_to_change_string_reference(the_string):
print('got', the_string)
the_string = 'In a kingdom by the sea'
print('set to', the_string)

outer_string = 'It was many and many a year ago'

print('before, outer_string =', outer_string)
try_to_change_string_reference(outer_string)
print('after, outer_string =', outer_string)

Output:

before, outer_string = It was many and many a year ago
got It was many and many a year ago
set to In a kingdom by the sea
after, outer_string = It was many and many a year ago

Again, since the the_string parameter was passed by value, assigning a new string to it had no effect that the code outside the method could see. The the_string was a copy of the outer_string reference, and we had the_string point to a new string, but there was no way to change where outer_string pointed.

I hope this clears things up a little.

EDIT: It's been noted that this doesn't answer the question that @David originally asked, "Is there something I can do to pass the variable by actual reference?". Let's work on that.

How do we get around this?

As @Andrea's answer shows, you could return the new value. This doesn't change the way things are passed in, but does let you get the information you want back out:

def return_a_whole_new_string(the_string):
new_string = something_to_do_with_the_old_string(the_string)
return new_string

# then you could call it like
my_string = return_a_whole_new_string(my_string)

If you really wanted to avoid using a return value, you could create a class to hold your value and pass it into the function or use an existing class, like a list:

def use_a_wrapper_to_simulate_pass_by_reference(stuff_to_change):
new_string = something_to_do_with_the_old_string(stuff_to_change[0])
stuff_to_change[0] = new_string

# then you could call it like
wrapper = [my_string]
use_a_wrapper_to_simulate_pass_by_reference(wrapper)

do_something_with(wrapper[0])

Although this seems a little cumbersome.

python pandas dataframe, is it pass-by-value or pass-by-reference

The short answer is, Python always does pass-by-value, but every Python variable is actually a pointer to some object, so sometimes it looks like pass-by-reference.

In Python every object is either mutable or non-mutable. e.g., lists, dicts, modules and Pandas data frames are mutable, and ints, strings and tuples are non-mutable. Mutable objects can be changed internally (e.g., add an element to a list), but non-mutable objects cannot.

As I said at the start, you can think of every Python variable as a pointer to an object. When you pass a variable to a function, the variable (pointer) within the function is always a copy of the variable (pointer) that was passed in. So if you assign something new to the internal variable, all you are doing is changing the local variable to point to a different object. This doesn't alter (mutate) the original object that the variable pointed to, nor does it make the external variable point to the new object. At this point, the external variable still points to the original object, but the internal variable points to a new object.

If you want to alter the original object (only possible with mutable data types), you have to do something that alters the object without assigning a completely new value to the local variable. This is why letgo() and letgo3() leave the external item unaltered, but letgo2() alters it.

As @ursan pointed out, if letgo() used something like this instead, then it would alter (mutate) the original object that df points to, which would change the value seen via the global a variable:

def letgo(df):
df.drop('b', axis=1, inplace=True)

a = pd.DataFrame({'a':[1,2], 'b':[3,4]})
letgo(a) # will alter a

In some cases, you can completely hollow out the original variable and refill it with new data, without actually doing a direct assignment, e.g. this will alter the original object that v points to, which will change the data seen when you use v later:

def letgo3(x):
x[:] = np.array([[3,3],[3,3]])

v = np.empty((2, 2))
letgo3(v) # will alter v

Notice that I'm not assigning something directly to x; I'm assigning something to the entire internal range of x.

If you absolutely must create a completely new object and make it visible externally (which is sometimes the case with pandas), you have two options. The 'clean' option would be just to return the new object, e.g.,

def letgo(df):
df = df.drop('b',axis=1)
return df

a = pd.DataFrame({'a':[1,2], 'b':[3,4]})
a = letgo(a)

Another option would be to reach outside your function and directly alter a global variable. This changes a to point to a new object, and any function that refers to a afterward will see that new object:

def letgo():
global a
a = a.drop('b',axis=1)

a = pd.DataFrame({'a':[1,2], 'b':[3,4]})
letgo() # will alter a!

Directly altering global variables is usually a bad idea, because anyone who reads your code will have a hard time figuring out how a got changed. (I generally use global variables for shared parameters used by many functions in a script, but I don't let them alter those global variables.)

Python Variable Scope (passing by reference or copy?)

Long story short: Python uses pass-by-value, but the things that are passed by value are references. The actual objects have 0 to infinity references pointing at them, and for purposes of mutating that object, it doesn't matter who you are and how you got a reference to the object.

Going through your example step by step:

  • L = [...] creates a list object somewhere in memory, the local variable L stores a reference to that object.
  • sorting (strictly speaking, the callable object pointed to be the global name sorting) gets called with a copy of the reference stored by L, and stores it in a local called x.
  • The method sort of the object pointed to by the reference contained in x is invoked. It gets a reference to the object (in the self parameter) as well. It somehow mutates that object (the object, not some reference to the object, which is merely more than a memory address).
  • Now, since references were copied, but not the object the references point to, all the other references we discussed still point to the same object. The one object that was modified "in-place".
  • testScope then returns another reference to that list object.
  • print uses it to request a string representation (calls the __str__ method) and outputs it. Since it's still the same object, of course it's printing the sorted list.

So whenever you pass an object anywhere, you share it with whoever recives it. Functions can (but usually won't) mutate the objects (pointed to by the references) they are passed, from calling mutating methods to assigning members. Note though that assigning a member is different from assigning a plain ol' name - which merely means mutating your local scope, not any of the caller's objects. So you can't mutate the caller's locals (this is why it's not pass-by-reference).

Further reading: A discussion on effbot.org why it's not pass-by-reference and not what most people would call pass-by-value.

Python functions call by reference

You can not change an immutable object, like str or tuple, inside a function in Python, but you can do things like:

def foo(y):
y[0] = y[0]**2

x = [5]
foo(x)
print x[0] # prints 25

That is a weird way to go about it, however, unless you need to always square certain elements in an array.

Note that in Python, you can also return more than one value, making some of the use cases for pass by reference less important:

def foo(x, y):
return x**2, y**2

a = 2
b = 3
a, b = foo(a, b) # a == 4; b == 9

When you return values like that, they are being returned as a Tuple which is in turn unpacked.

edit:
Another way to think about this is that, while you can't explicitly pass variables by reference in Python, you can modify the properties of objects that were passed in. In my example (and others) you can modify members of the list that was passed in. You would not, however, be able to reassign the passed in variable entirely. For instance, see the following two pieces of code look like they might do something similar, but end up with different results:

def clear_a(x):
x = []

def clear_b(x):
while x: x.pop()

z = [1,2,3]
clear_a(z) # z will not be changed
clear_b(z) # z will be emptied


Related Topics



Leave a reply



Submit