Understanding Python's Call-By-Object Style of Passing Function Arguments

The key difference is that in C-style languages, a variable is a box in memory in which you put stuff. In Python, a variable is a name.

Python is neither call-by-reference nor call-by-value. It's something much more sensible! (In fact, I learned Python before I learned the more common languages, so call-by-value and call-by-reference seem very strange to me.)

In Python, there are things and there are names. Lists, integers, strings, and custom objects are all things. x, y, and z are names. Writing

x = []

means "construct a new thing [] and give it the name x". Writing

x = []
foo = lambda x: x.append(None)
foo(x)

means "construct a new thing [] with name x, construct a new function (which is another thing) with name foo, and call foo on the thing with name x". Now foo just appends None to whatever it received, so this reduces to "append None to the empty list". Writing

x = 0
def foo(x):
    x += 1
foo(x)

means "construct a new thing 0 with name x, construct a new function foo, and call foo on x". Inside foo, the assignment just says "rename x to 1 plus what it used to be", but that doesn't change the thing 0.
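To make the "names and things" picture concrete, here is a small sketch that uses id() to watch which object a name is bound to. The function name increment is made up for this illustration.

```python
def increment(n):
    before = id(n)   # identity of the object the caller passed in
    n += 1           # int is immutable, so this binds the local name n to a new object
    return before, id(n), n

x = 0
before, after, local_value = increment(x)

print(x)                # 0: the caller's name x is still bound to the thing 0
print(local_value)      # 1: only the local name inside the function moved
print(before == id(x))  # True: the function received the very same object
print(before != after)  # True: after the +=, the local name referred to a different object
```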

Python functions call by reference

You cannot change an immutable object, like a str or a tuple, inside a function in Python, but you can do things like:

def foo(y):
    y[0] = y[0]**2

x = [5]
foo(x)
print(x[0]) # prints 25

That is a weird way to go about it, however, unless you need to always square certain elements in an array.

Note that in Python, you can also return more than one value, making some of the use cases for pass by reference less important:

def foo(x, y):
    return x**2, y**2

a = 2
b = 3
a, b = foo(a, b) # a == 4; b == 9

When you return values like that, they are returned as a single tuple, which is in turn unpacked by the assignment.
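A quick way to see the tuple for yourself is to skip the unpacking first. This is only a sketch; the name squares is made up here.

```python
def squares(x, y):
    return x**2, y**2   # the comma builds one tuple

result = squares(2, 3)
print(type(result).__name__)  # tuple
print(result)                 # (4, 9)

a, b = result                 # unpacking is a separate step done by the assignment
print(a, b)                   # 4 9
```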

edit:
Another way to think about this is that, while you can't explicitly pass variables by reference in Python, you can modify the properties of objects that were passed in. In my example (and others) you can modify members of the list that was passed in. You would not, however, be able to reassign the passed-in variable entirely. For instance, the following two pieces of code look like they might do something similar, but end up with different results:

def clear_a(x):
    x = []

def clear_b(x):
    while x: x.pop()

z = [1,2,3]
clear_a(z) # z will not be changed
clear_b(z) # z will be emptied
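If the goal is to empty the caller's list, slice assignment (or x.clear()) does it in one step without a loop. A minimal sketch, with function names made up for illustration:

```python
def clear_by_rebinding(x):
    x = []      # binds the local name to a brand-new list; the caller's list is untouched

def clear_in_place(x):
    x[:] = []   # replaces the contents of the existing list (x.clear() does the same)

z = [1, 2, 3]
alias = z       # a second name for the same list

clear_by_rebinding(z)
print(z)           # [1, 2, 3]

clear_in_place(z)
print(z)           # []
print(alias is z)  # True: still one list, now empty under both names
```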

Python - how to pass to a function argument type of a class object (typing)

Depending on whether you meant to pass a class (type) or an instance of a class, you’re looking for either typing.Type or simply the class.

Here’s a simple example to explain both situations:

from typing import Type, TypeVar

class Vehicle:
    def __init__(self):
        print("Creating a %s" % self.__class__.__name__)

    def move(self):
        print("This %s is moving…" % self.__class__.__name__)

TVehicle = TypeVar("TVehicle", bound=Vehicle)

class Car(Vehicle):
    def honk(self) -> None:
        print("tuuuuut")

class Bike(Vehicle):
    def ring(self) -> None:
        print("ring")

class Dog:
    def bark(self) -> None:
        print("woof!")

def move(v: Vehicle) -> None:
    v.move()

def instantiate(class_to_instantiate: Type[TVehicle]) -> TVehicle:
    return class_to_instantiate() # create an instance

move(Bike())
move(Car())

instantiate(Bike).ring()
instantiate(Car).honk()
#instantiate(Dog) # a type checker rejects this: Dog is not a subclass of Vehicle

Car and Bike inherit from Vehicle, so they both get at least the move method and the custom __init__, which reveals the name of the class that invoked it.

Now, in the first function, move, one simply wants to specify that the argument v should be an instance of a Vehicle. The function calls Vehicle’s move method, which will reveal the name of the instance’s class from which the call originated.

In the second function, instantiate, the goal is to create an instance of a class. This works through type variables, which allow you in this example to specify that there’s a relation between the function’s input argument and output argument: if I were to call instantiate(Bike), I want the return type to be an instance of the Bike class, so that I may legally call its ring method. If you were to replace the TVehicle in this function definition simply by Vehicle, your type checking program would complain, because the return type would then be an instance of the Vehicle class, for which you do not have a guarantee that the ring method exists.
Finally, the Type part that you see in the argument of instantiate simply allows you to call the function with a class, so not with an instance of that class. This is useful e.g. in cases where you want to delay instantiation of a class.
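As a rough illustration of delayed instantiation, the sketch below passes a class around and only creates instances inside the function. The helper build_fleet and its count parameter are hypothetical, and the classes are trimmed-down stand-ins that return strings instead of printing.

```python
from typing import List, Type, TypeVar

class Vehicle:
    def move(self) -> str:
        return "%s is moving" % type(self).__name__

class Bike(Vehicle):
    def ring(self) -> str:
        return "ring"

TVehicle = TypeVar("TVehicle", bound=Vehicle)

def build_fleet(vehicle_class: Type[TVehicle], count: int) -> List[TVehicle]:
    # The class itself is the argument; no instance exists until this line runs.
    return [vehicle_class() for _ in range(count)]

fleet = build_fleet(Bike, 2)
# Because the return type is List[TVehicle], a type checker knows these are Bikes
# and accepts the call to ring().
sounds = [bike.ring() for bike in fleet]
print(sounds)  # ['ring', 'ring']
```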

Note that this is an example to explain how to do it. In a more professional setting, Vehicle would likely be an abstract base class and some methods here could be given as class methods.

Side notes on your code example:

  1. Note that if you don’t intend to write code that also works in Python 2, you shouldn’t inherit from object.
  2. Classes are typically written with CapWord names, as specified in PEP8, the Python style guide. Following this style makes your code more easily understandable by other developers.

How do I pass a variable by reference?

Arguments are passed by assignment. The rationale behind this is twofold:

  1. the parameter passed in is actually a reference to an object (but the reference is passed by value)
  2. some data types are mutable, but others aren't

So:

  • If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.

  • If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.

To make it even more clear, let's have some examples.

List - a mutable type

Let's try to modify the list that was passed to a method:

def try_to_change_list_contents(the_list):
    print('got', the_list)
    the_list.append('four')
    print('changed to', the_list)

outer_list = ['one', 'two', 'three']

print('before, outer_list =', outer_list)
try_to_change_list_contents(outer_list)
print('after, outer_list =', outer_list)

Output:

before, outer_list = ['one', 'two', 'three']
got ['one', 'two', 'three']
changed to ['one', 'two', 'three', 'four']
after, outer_list = ['one', 'two', 'three', 'four']

Since the parameter passed in is a reference to outer_list, not a copy of it, we can use the mutating list methods to change it and have the changes reflected in the outer scope.

Now let's see what happens when we try to change the reference that was passed in as a parameter:

def try_to_change_list_reference(the_list):
    print('got', the_list)
    the_list = ['and', 'we', 'can', 'not', 'lie']
    print('set to', the_list)

outer_list = ['we', 'like', 'proper', 'English']

print('before, outer_list =', outer_list)
try_to_change_list_reference(outer_list)
print('after, outer_list =', outer_list)

Output:

before, outer_list = ['we', 'like', 'proper', 'English']
got ['we', 'like', 'proper', 'English']
set to ['and', 'we', 'can', 'not', 'lie']
after, outer_list = ['we', 'like', 'proper', 'English']

Since the the_list parameter was passed by value, assigning a new list to it had no effect that the code outside the method could see. The the_list was a copy of the outer_list reference, and we had the_list point to a new list, but there was no way to change where outer_list pointed.

String - an immutable type

It's immutable, so there's nothing we can do to change the contents of the string.

Now, let's try to change the reference:

def try_to_change_string_reference(the_string):
    print('got', the_string)
    the_string = 'In a kingdom by the sea'
    print('set to', the_string)

outer_string = 'It was many and many a year ago'

print('before, outer_string =', outer_string)
try_to_change_string_reference(outer_string)
print('after, outer_string =', outer_string)

Output:

before, outer_string = It was many and many a year ago
got It was many and many a year ago
set to In a kingdom by the sea
after, outer_string = It was many and many a year ago

Again, since the the_string parameter was passed by value, assigning a new string to it had no effect that the code outside the method could see. The the_string was a copy of the outer_string reference, and we had the_string point to a new string, but there was no way to change where outer_string pointed.

I hope this clears things up a little.

EDIT: It's been noted that this doesn't answer the question that @David originally asked, "Is there something I can do to pass the variable by actual reference?". Let's work on that.

How do we get around this?

As @Andrea's answer shows, you could return the new value. This doesn't change the way things are passed in, but does let you get the information you want back out:

def return_a_whole_new_string(the_string):
    new_string = something_to_do_with_the_old_string(the_string)
    return new_string

# then you could call it like
my_string = return_a_whole_new_string(my_string)

If you really wanted to avoid using a return value, you could create a class to hold your value and pass it into the function or use an existing class, like a list:

def use_a_wrapper_to_simulate_pass_by_reference(stuff_to_change):
    new_string = something_to_do_with_the_old_string(stuff_to_change[0])
    stuff_to_change[0] = new_string

# then you could call it like
wrapper = [my_string]
use_a_wrapper_to_simulate_pass_by_reference(wrapper)

do_something_with(wrapper[0])

Although this seems a little cumbersome.
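For the "create a class to hold your value" variant, something like the following would do. Box and shout are names invented for this sketch, and str.upper() stands in for whatever you actually want to do with the old string.

```python
class Box:
    """Hypothetical mutable holder for a single value."""
    def __init__(self, value):
        self.value = value

def shout(box):
    # Assigning to an attribute mutates the Box; it does not rebind the name box.
    # The caller holds the same Box object, so it sees the new string.
    box.value = box.value.upper()

message = Box('in a kingdom by the sea')
shout(message)
print(message.value)  # IN A KINGDOM BY THE SEA
```

Compared with the one-element list, the attribute name at least says what is being held.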

Are these arguments in Python being passed by value or by reference?

Python is pass by assignment. Within your BinaryTree._insertInternal, the root parameter (a local variable in the scope of that method) is initially bound to the root node that was passed in (the value passed is an object reference). The statement root = Node(None, None, value) is a new assignment: it rebinds the local name to a new object, so root no longer refers to what was passed in, and the instance's self.root is unaffected.
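The code from the question is not reproduced here, so the following is only a sketch under assumptions: a simplified Node with a single-argument constructor and made-up method names. It shows the rebinding problem and one common way around it, which is to return the (possibly new) subtree root and let the caller assign it.

```python
class Node:
    # Simplified stand-in; the question's Node takes different arguments.
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BinaryTree:
    def __init__(self):
        self.root = None

    def insert_broken(self, value):
        self._rebind_only(self.root, value)

    def _rebind_only(self, root, value):
        if root is None:
            root = Node(value)   # rebinds the local name only; self.root stays None

    def insert(self, value):
        self.root = self._insert(self.root, value)   # the caller does the assignment

    def _insert(self, root, value):
        if root is None:
            return Node(value)
        if value < root.value:
            root.left = self._insert(root.left, value)
        else:
            root.right = self._insert(root.right, value)
        return root

tree = BinaryTree()
tree.insert_broken(5)
print(tree.root)                              # None
tree.insert(5)
tree.insert(3)
print(tree.root.value, tree.root.left.value)  # 5 3
```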

Python: passing functions as arguments to initialize the methods of an object. Pythonic or not?

Passing functions to an object is fine. There's nothing wrong with that design.

If you want to turn that function into a bound method, though, you have to be a little careful. If you do something like self.func = lambda x: func(self, x), you create a reference cycle: self has a reference to self.func, and the lambda stored in self.func has a reference to self. Python's garbage collector does detect reference cycles and cleans them up eventually, but that can sometimes take a long time. I've had reference cycles in my code in the past, and those programs often used upwards of 500 MB of memory because Python would not garbage collect unneeded objects often enough.

The correct solution is to use the weakref module to create a weak reference to self, for example like this:

import weakref

class WeakMethod:
    def __init__(self, func, instance):
        self.func = func
        self.instance_ref = weakref.ref(instance)

        self.__wrapped__ = func # this makes things like `inspect.signature` work

    def __call__(self, *args, **kwargs):
        instance = self.instance_ref()
        return self.func(instance, *args, **kwargs)

    def __repr__(self):
        cls_name = type(self).__name__
        return '{}({!r}, {!r})'.format(cls_name, self.func, self.instance_ref())

class FooBar(object):
    def __init__(self, func, a):
        self.a = a
        self.func = WeakMethod(func, self)

# foo1 comes from the question and is not shown here; this stand-in is an
# assumption chosen to match the output below
def foo1(self, x):
    return self.a * x

f = FooBar(foo1, 7)
print(f.func(3)) # 21

All of the following solutions create a reference cycle and are therefore bad:

  • self.func = MethodType(func, self)
  • self.func = func.__get__(self, type(self))
  • self.func = functools.partial(func, self)
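You can watch the cycle in action with a small experiment. This is a sketch that relies on CPython's reference counting (other implementations may free objects at different times); the classes Cyclic and Acyclic and the function scaled are made up for the demonstration, and weakref.proxy is used here as a shortcut instead of a hand-written wrapper class.

```python
import functools
import gc
import weakref

def scaled(self, x):
    return self.a * x

class Cyclic:
    def __init__(self, a):
        self.a = a
        self.func = functools.partial(scaled, self)   # self -> partial -> self

class Acyclic:
    def __init__(self, a):
        self.a = a
        # The partial holds only a weak proxy, so there is no strong cycle.
        self.func = functools.partial(scaled, weakref.proxy(self))

gc.disable()   # keep the cycle collector from running on its own during the experiment
try:
    strong = Cyclic(7)
    probe = weakref.ref(strong)
    del strong
    cyclic_alive_before_gc = probe() is not None   # True on CPython: the cycle keeps it alive
    gc.collect()
    cyclic_alive_after_gc = probe() is not None    # False: the cycle collector reclaimed it

    weak = Acyclic(7)
    result = weak.func(3)                          # 21
    probe2 = weakref.ref(weak)
    del weak
    acyclic_alive = probe2() is not None           # False on CPython: freed immediately
finally:
    gc.enable()

print(cyclic_alive_before_gc, cyclic_alive_after_gc, result, acyclic_alive)
```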

Python - how does passing values work?

Python uses a system sometimes called call-by-object. Nothing is copied when you pass arguments to a function. The names of the function arguments are locally bound within the function body, to the same objects provided in the function call.

This is different from what most people think of as "call by value", because it doesn't copy the objects. But it's also different from "call by reference" because the reference is to the object --- a new name is bound, but to the same object. This means that you can mutate the passed-in object, but rebinding the name inside the function has no effect outside the function. A simple example of the difference:

>>> def func(x):
...     x[0] = 2 # Mutating the object affects the object outside the function
...
>>> myList = [1]
>>> func(myList)
>>> myList # myList has changed
[2]
>>> def func(x):
...     x = 2 # rebinding name has no effect outside the function
...
>>> myList = [1]
>>> func(myList)
>>> myList # myList is unaffected
[1]

My simple way of thinking about this is that assignment to a bare name --- that is, statements of the form name = value --- is completely different from everything else in Python. The only way to operate on names and not on values is to do name = value. (There are nitpicky exceptions to this, like mucking around with globals() and so on, but these are dangerous territory anyway.) In particular name = value is different from obj.prop = value and obj[0] = value, which look like assignment but actually operate on objects and not on names. Augmented assignment such as name += value is a mixed case: it mutates the object in place when the type supports that (a list, say), and otherwise builds a new object and rebinds the name (an int or a str).
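Augmented assignment is the case that trips people up most, so here is a small sketch of both behaviours side by side. The function names are made up for illustration.

```python
def extend_in_place(items):
    items += [4]   # list supports in-place +=, so the shared list object is mutated

def bump(count):
    count += 1     # int is immutable: a new int is created and bound to the local name

numbers = [1, 2, 3]
total = 10

extend_in_place(numbers)
bump(total)

print(numbers)  # [1, 2, 3, 4]
print(total)    # 10
```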

That said, function calls in Python have a certain amount of overhead just in themselves (for setting up the execution frame, etc.). If a function is called many times, this overhead can cause a noticeable performance impact. So splitting one function into many could still have a performance impact, since each additional function call will add some overhead.
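If you want to check whether call overhead matters for your own code, timeit gives a rough idea. The functions below are invented for this sketch, and the timings depend entirely on your machine and Python version, so no numbers are quoted.

```python
import timeit

def add_inline(a, b):
    return a + b + 1

def _helper(a, b):
    return a + b

def add_via_helper(a, b):
    # Same result as add_inline, but with one extra function call per invocation.
    return _helper(a, b) + 1

inline_time = timeit.timeit(lambda: add_inline(1, 2), number=100_000)
helper_time = timeit.timeit(lambda: add_via_helper(1, 2), number=100_000)
print("inline: %.4fs, via helper: %.4fs" % (inline_time, helper_time))
```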

How can this be called Pass By Reference?

It is neither. It is call by sharing. I've also heard the term "pass by reference value" used.

Also known as "call by object" or "call by object-sharing," call by sharing is an evaluation strategy first named by Barbara Liskov et al. for the language CLU in 1974. It is used by languages such as Python, Iota, Java (for object references), Ruby, JavaScript, Scheme, OCaml, AppleScript, and many others. However, the term "call by sharing" is not in common use; the terminology is inconsistent across different sources. For example, in the Java community, they say that Java is call-by-value, whereas in the Ruby community, they say that Ruby is call-by-reference, even though the two languages exhibit the same semantics. Call by sharing implies that values in the language are based on objects rather than primitive types, i.e. that all values are "boxed".

The semantics of call by sharing differ from call by reference in that assignments to function arguments within the function aren't visible to the caller (unlike by reference semantics), so e.g. if a variable was passed, it is not possible to simulate an assignment on that variable in the caller's scope. However, since the function has access to the same object as the caller (no copy is made), mutations to those objects, if the objects are mutable, within the function are visible to the caller, which may appear to differ from call by value semantics. Mutations of a mutable object within the function are visible to the caller because the object is not copied or cloned — it is shared.
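The classic test is a swap function: under true call by reference it would exchange the caller's variables, under call by sharing it cannot. A minimal sketch:

```python
def swap(a, b):
    a, b = b, a    # rebinds the two local names only
    return a, b

first, second = [1], [2]

swap(first, second)                  # return value ignored
print(first, second)                 # [1] [2]: the caller's names are unchanged

first, second = swap(first, second)  # the caller has to do the assignment itself
print(first, second)                 # [2] [1]
```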


