Python List by Value Not by Reference

You cannot pass anything by value in Python. If you want a copy of a list a, you have to make it explicitly, as described in the official Python FAQ:

b = a[:]

Copy a list of list by value and not reference

Python passes lists by reference: the function or the new name receives a reference to the same list object, not a copy of it.

This means that when you write "b = a", a and b refer to the same object, so changing b also changes a, and vice versa.

A way to copy a list by value:

new_list = old_list[:]
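A minimal sketch of the difference between plain assignment and a slice copy. The names alias and copied are made up for illustration:

```python
a = [1, 2, 3]

alias = a         # plain assignment: both names refer to one list object
copied = a[:]     # full slice: a new list holding the same elements

alias.append(4)   # visible through `a`, because it is the same object
copied.append(5)  # only the copy changes

print(a)            # [1, 2, 3, 4]
print(copied)       # [1, 2, 3, 5]
print(alias is a)   # True
print(copied is a)  # False
```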

If the list contains mutable objects and you want to copy them as well, use the generic copy.deepcopy():

import copy
new_list = copy.deepcopy(old_list)
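To see why the distinction matters, here is a small sketch with a nested list. The names shallow and deep are illustrative only:

```python
import copy

old_list = [[1, 2], [3, 4]]

shallow = old_list[:]           # new outer list, same inner lists
deep = copy.deepcopy(old_list)  # new outer list and new inner lists

old_list[0].append(99)          # mutate one of the inner lists

print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```

The slice copy shares its inner lists with the original, so the change shows up there too. The deep copy is unaffected.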

how to pass a list as value and not as reference?

Create a copy of the list:

def change_2(v):
    v = v[:]
    v[2] = 6
    return v

Python is already giving you the value, the object itself. You are altering that value by changing its contents.

v[:] takes a slice of all indices in the list, creating a new list object that holds the elements at those indices. An alternative way would be to use the list() constructor:

v = list(v)

Another alternative is to leave the responsibility of creating a copy up to the caller:

def change_2(v):
    v[2] = 6
    return v

x = [1,2,3]
z = change_2(x[:])

By creating a full copy first, you now have a different value.
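Both approaches can be checked side by side. In this sketch the two variants get the hypothetical names change_2_copying and change_2_in_place so they can coexist in one script:

```python
def change_2_copying(v):
    """Copy inside the function, then modify the copy."""
    v = list(v)
    v[2] = 6
    return v


def change_2_in_place(v):
    """Modify whatever list the caller hands over."""
    v[2] = 6
    return v


x = [1, 2, 3]
y = change_2_copying(x)      # the function makes the copy
z = change_2_in_place(x[:])  # the caller makes the copy

print(x)  # [1, 2, 3]
print(y)  # [1, 2, 6]
print(z)  # [1, 2, 6]
```

Either way the caller's x is left untouched. The only difference is who is responsible for the copy.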

List of same value, not same reference

It's a bit strange to need two different-by-identity but equal-by-value tuples, but I've had legitimate reasons to do similarly strange things on occasion, so I think the question deserves an answer at face value.

The reason that y gives a list of three references to the same tuple is that the tuple (None, None) is a compile-time constant, so the bytecode for it is a simple LOAD_CONST which is done inside the list comprehension, and of course loading the same constant three times just creates three references to it, not three copies.

To get around this, we need an expression whose value is (None, None) but for which the expression is not a compile-time constant. A function call to tuple does it, at least in the versions of Python I tested (3.5.2 and 3.8.1):

[tuple([None, None]) for _ in range(3)]

Frankly I'm a little surprised that x does have three different copies, and that is almost certainly an implementation-specific detail that you shouldn't rely on. But likewise, tuple.__new__ does not always create a new tuple; for example tuple( (None, None) ) returns a reference to the actual argument, not a copy of it. So it is not guaranteed that tuple([None, None]) will continue to produce non-equal-by-reference tuples in other versions of Python.
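Since identity here is an implementation detail, a safer habit is to measure it rather than assume it. The sketch below counts how many distinct tuple objects the comprehension produced. The names tuples, all_equal and distinct_objects are illustrative, and no particular count is promised:

```python
tuples = [tuple([None, None]) for _ in range(3)]

# Equality by value holds regardless of how the tuples were built.
all_equal = all(t == (None, None) for t in tuples)

# Identity is implementation-specific, so count the distinct objects
# instead of assuming a number. All three tuples are alive in the list,
# so equal ids here really do mean the same object.
distinct_objects = len({id(t) for t in tuples})

print(all_equal)
print(distinct_objects)
```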

Arguments are passed by reference or not in Python

list = [element] + list creates a new list and overwrites the original value of, um, list. It doesn't add element to the existing list, so it doesn't demonstrate pass by reference. It is equivalent to:

list2 = [element] + list
list = list2

The following demonstrates pass by reference by adding to the existing list instead of creating a new one.

def prepend(element, _list):
    _list.insert(0, element)

_try = [1,2,3]
prepend(0, _try)
print(_try)

UPDATE

It may be clearer if I add print statements that show how the variables change as the program executes. There are two versions of prepend, one that creates a new object and another that updates an existing object. The id() function returns a unique identifier for the object (in CPython, the memory address of the object).

def prepend_1(element, _list):
    print('prepend_1 creates a new list, assigns it to _list and forgets the original')
    print('_list refers to the same object as _try -->', id(_list), _list)
    # Rebinding the local name does not affect the caller's list
    _list = [element] + _list
    print('_list now refers to a different object -->', id(_list), _list)

def prepend_2(element, _list):
    print('prepend_2 updates the existing list')
    print('_list refers to the same object as _try -->', id(_list), _list)
    # Mutating the object is visible through every reference to it
    _list.insert(0, element)
    print('_list still refers to the same object as _try -->', id(_list), _list)

_try = [1,2,3]
print('_try is assigned -->', id(_try), _try)
prepend_1(0, _try)
print('_try is the same object and is not updated -->', id(_try), _try)
prepend_2(0, _try)
print('_try is the same object and is updated -->', id(_try), _try)
print(_try)

When I run it, you can see how the objects relate to the variables that reference them

_try is assigned --> 18234472 [1, 2, 3]
prepend_1 creates a new list, assigns it to _list and forgets the original
_list refers to the same object as _try --> 18234472 [1, 2, 3]
_list now refers to a different object --> 18372440 [0, 1, 2, 3]
_try is the same object and is not updated --> 18234472 [1, 2, 3]
prepend_2 updates the existing list
_list refers to the same object as _try --> 18234472 [1, 2, 3]
_list still refers to the same object as _try --> 18234472 [0, 1, 2, 3]
_try is the same object and is updated --> 18234472 [0, 1, 2, 3]
[0, 1, 2, 3]

Python: create a function to modify a list by reference not value

Python passes everything the same way, but calling it "by value" or "by reference" will not clear everything up, since Python's semantics are different than the languages for which those terms usually apply. If I was to describe it, I would say that all passing was by value, and that the value was an object reference. (This is why I didn't want to say it!)

If you want to filter out some stuff from a list, you build a new list

foo = range(100000)
new_foo = []
for item in foo:
    if item % 3 != 0:  # Things divisible by 3 don't get through
        new_foo.append(item)

or, using the list comprehension syntax

 new_foo = [item for item in foo if item % 3 != 0]

Python will not copy the objects in the list, but rather both foo and new_foo will reference the same objects. (Python never implicitly copies any objects.)
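If the goal is for the caller's existing list to change, one common idiom is slice assignment, which replaces the contents of the list object instead of rebinding a name. This is a sketch, not the only way to do it, and drop_multiples_of_three is a hypothetical helper name:

```python
def drop_multiples_of_three(items):
    """Filter `items` in place so every reference to the list sees the result."""
    # Slice assignment replaces the contents of the existing list object
    # instead of rebinding the local name to a new list.
    items[:] = [item for item in items if item % 3 != 0]


foo = list(range(10))
same_list = foo
drop_multiples_of_three(foo)

print(foo)               # [1, 2, 4, 5, 7, 8]
print(same_list is foo)  # True
```

Note that this still builds a temporary list on the right-hand side, so it changes which object ends up holding the result, not the peak memory use.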


You have suggested you have performance concerns about this operation. Using repeated del statements on the old list will result not only in code that is less idiomatic and more confusing to deal with, but also in quadratic performance, because the remaining elements must be shifted each time an item is deleted.

To address performance:

  • Get it up and running. You can't figure out what your performance is like unless you have code working. This will also tell you whether it is speed or space that you must optimize for; you mention concerns about both in your code, but oftentimes optimization involves getting one at the cost of the other.

  • Profile. You can use the stdlib tools for performance in time. There are various third-party memory profilers that can be somewhat useful but aren't quite as nice to work with.

  • Measure. Time or reprofile memory when you make a change to see if a change makes an improvement and if so what that improvement is.

  • To make your code more memory-sensitive, you will often want a paradigm shift in how you store your data, not micro-optimizations like not building a second list to do filtering. (The same is true for time, really: changing to a better algorithm will almost always give the best speedup. However, it's harder to generalize about speed optimizations.)

    Some common paradigm shifts to optimize memory consumption in Python include

    1. Using Generators. Generators are lazy iterables: they don't load a whole list into memory at once, they figure out what their next items are on the fly. To use generators, the snippets above would look like

      foo = range(100000)  # Like generators, range is lazy in Python 3 (xrange in Python 2)
      def filter_divisible_by_three(iterable):
          # Iterate over the argument, not the global foo
          for item in iterable:
              if item % 3 != 0:
                  yield item

      new_foo = filter_divisible_by_three(foo)

      or, using the generator expression syntax,

      new_foo = (item for item in foo if item % 3 != 0)

    2. Using numpy for homogeneous sequences, especially ones that are numerical-mathy. This can also speed up code that does lots of vector operations.

    3. Storing data to disk, such as in a database.
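As a rough illustration of the "measure" advice applied to the list-versus-generator choice, the sketch below uses only the standard library (sys.getsizeof and timeit). The function names and the constant N are made up for this example, and the numbers will vary by machine and Python version:

```python
import sys
import timeit

N = 100_000


def build_list():
    return [item for item in range(N) if item % 3 != 0]


def build_generator():
    return (item for item in range(N) if item % 3 != 0)


# getsizeof reports the container itself, not the integers it refers to.
list_size = sys.getsizeof(build_list())
generator_size = sys.getsizeof(build_generator())

# Consume both fully so the timing compares like with like.
list_time = timeit.timeit(lambda: sum(build_list()), number=20)
generator_time = timeit.timeit(lambda: sum(build_generator()), number=20)

print(list_size, generator_size)
print(list_time, generator_time)
```

The generator object stays small no matter how large N is, while the list grows with the number of items. Which one is faster is exactly the kind of question to settle by measuring rather than guessing.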


