Does a Slicing Operation Give Me a Deep or Shallow Copy

Does a slicing operation give me a deep or shallow copy?

You are creating a shallow copy, because nested values are not copied, merely referenced. A deep copy would create copies of the values referenced by the list too.

Demo:

>>> lst = [{}]
>>> lst_copy = lst[:]
>>> lst_copy[0]['foo'] = 'bar'
>>> lst_copy.append(42)
>>> lst
[{'foo': 'bar'}]
>>> id(lst) == id(lst_copy)
False
>>> id(lst[0]) == id(lst_copy[0])
True

Here the nested dictionary is not copied; it is merely referenced by both lists. The new element 42 is not shared.

Remember that everything in Python is an object, and names and list elements are merely references to those objects. A copy of a list creates a new outer list, but the new list merely receives references to the exact same objects.

A proper deep copy creates new copies of each and every object contained in the list, recursively:

>>> from copy import deepcopy
>>> lst_deepcopy = deepcopy(lst)
>>> id(lst_deepcopy[0]) == id(lst[0])
False
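
As a small sketch of what that independence means in practice (the names below are illustrative, not taken from the question), mutating a nested dictionary through the deep copy leaves the original untouched, while the same mutation through a slice copy does not:

```python
from copy import deepcopy

original = [{"foo": "bar"}]
shallow = original[:]          # new outer list, same inner dict
deep = deepcopy(original)      # new outer list and a new inner dict

deep[0]["foo"] = "changed in deep copy"
# original[0] is a different dict from deep[0], so it is unaffected
after_deep = original[0]["foo"]

shallow[0]["foo"] = "changed in shallow copy"
# original[0] is the very same dict as shallow[0], so it sees the change
after_shallow = original[0]["foo"]

print(after_deep)     # bar
print(after_shallow)  # changed in shallow copy
```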

Does [:] slice only make shallow copy of a list?

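It is a shallow copy, but changing b does not affect a in this case because b[1] = 5 only rebinds a slot in the new list b, and the elements are immutable numbers. If the elements were mutable objects and you changed one of them in place, a would appear updated too: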

a = [1, 2, 3]
b = a[:]

b[1] = 5
print("a:", a)
print("b:", b)
# a: [1, 2, 3]
# b: [1, 5, 3]

vs

a = [[1], [2], [3]]
b = a[:]

b[1][0] = 5
print("a:", a)
print("b:", b)
# a: [[1], [5], [3]]
# b: [[1], [5], [3]]
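
The difference is between rebinding a slot of the copied list and mutating an object that both lists reference. A minimal sketch, reusing the nested list from above with made-up replacement values:

```python
a = [[1], [2], [3]]
b = a[:]

b[1] = [5]        # rebinds slot 1 of b to a brand-new list; a is unaffected
b[2][0] = 7       # mutates the inner list that a[2] and b[2] both reference

print("a:", a)    # a: [[1], [2], [7]]
print("b:", b)    # b: [[1], [5], [7]]
```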

Array slicing seems like it's making a deep copy?

You are making shallow copies of the inner lists (i.e. the rows), which is effectively the same as a deep copy of the outer list if the inner lists are just lists of int objects.

You've essentially implemented a deep copy function for the special case of a list of lists of integers.
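
The snippet from the question is not reproduced here, so the following is only a guess at its general shape: a comprehension that slices each row. The name grid is made up for illustration.

```python
import copy

grid = [[1, 2], [3, 4]]

# Slice each row: a new outer list plus a new list per row.
by_slicing = [row[:] for row in grid]
by_deepcopy = copy.deepcopy(grid)

by_slicing[0][0] = 99        # only the copy's row changes

print(grid)                  # [[1, 2], [3, 4]]
print(by_slicing)            # [[99, 2], [3, 4]]
print(by_deepcopy == grid)   # True
```

For a list of lists of ints, both approaches give rows that are independent of the original rows.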

Using copy.deepcopy will be slower because that function has to inspect every object it visits and record its id in a memo, including the int objects. Your snippet isn't doing that, but in this particular case it doesn't matter: small int objects are cached at the interpreter level (in CPython they are essentially singletons), and in any case int objects are immutable, so they never need to be copied at all.
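
One visible consequence of that bookkeeping, as a rough sketch: deepcopy remembers objects it has already copied, so a row that appears twice in the original comes out as one shared row in the copy. Per-row slicing does no such tracking. The variable names are illustrative.

```python
import copy

row = [1, 2]
grid = [row, row]                 # the same inner list referenced twice

deep = copy.deepcopy(grid)
sliced = [r[:] for r in grid]

print(grid[0] is grid[1])         # True
print(deep[0] is deep[1])         # True: deepcopy preserved the sharing
print(sliced[0] is sliced[1])     # False: each slice made a separate list
print(deep[0] is row)             # False: still a copy, not the original
```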

Here's a link to the copy module source code if you want to see exactly what is involved in a generic deep copy.

List slicing vs copying

L.copy() and L[:] work identically - both are shallow copies. At first only L[:] existed; .copy() was added later so that generic code needing a copy could spell it in a uniform way (dict.copy(), set.copy(), ...).

Examples

>>> L = [[1, 2], [3, 4]]
>>> L1 = L[:]
>>> [a is b for a, b in zip(L, L1)]
[True, True]
>>> L1 = L.copy()
>>> [a is b for a, b in zip(L, L1)]
[True, True]
>>> import copy
>>> L1 = copy.copy(L)
>>> [a is b for a, b in zip(L, L1)]
[True, True]
>>> L1 = copy.deepcopy(L) # this one differs!
>>> [a is b for a, b in zip(L, L1)]
[False, False]

When does slicing operator create a shallow copy in Python?

del L[:] is a distinct operation from accessing L[:], which is again a distinct operation from L[:] = x.

  • del L[:] calls __delitem__ on the object with a slice object.
  • L[:] calls __getitem__ with that slice object.
  • L[:] = x calls __setitem__ with the slice object and x.

These three operations can be implemented in very different ways, depending on what the object is. For the built-in list type, __delitem__ erases the items specified in the slice, __setitem__ replaces those items with the items given, and __getitem__ returns a new (copied) list consisting of the elements specified.
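
To see which method each form triggers, here is a minimal sketch using a hypothetical TracingList subclass (it is not part of any library) that records the calls and otherwise defers to the built-in list:

```python
class TracingList(list):
    """Hypothetical list subclass that records which special method
    each slicing form triggers."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append("__getitem__")
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append("__setitem__")
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self.calls.append("__delitem__")
        super().__delitem__(key)


L = TracingList([1, 2, 3])

copied = L[:]        # __getitem__: returns a new list, L is unchanged
L[:] = [4, 5]        # __setitem__: replaces the contents of L in place
snapshot = list(L)   # iterates over L; does not go through __getitem__
del L[:]             # __delitem__: empties L in place

print(L.calls)       # ['__getitem__', '__setitem__', '__delitem__']
print(copied)        # [1, 2, 3]
print(snapshot)      # [4, 5]
print(list(L))       # []
```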

However, not all objects have to behave this way. For example, with a NumPy array, __getitem__ with a slice returns a view of the array rather than a copy - modifying the view alters the original array.
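
NumPy may not be installed everywhere, so as a stand-in sketch the standard library's memoryview shows the same idea: slicing it yields a view onto the underlying buffer rather than a copy. This is only an analogy and says nothing about how NumPy is implemented.

```python
data = bytearray(b"abc")

copied = data[0:2]               # bytearray slicing copies, like list slicing
view = memoryview(data)[0:2]     # memoryview slicing returns a view

copied[0] = ord("x")             # changes only the copy
view[1] = ord("y")               # writes through to data

print(data)                      # bytearray(b'ayc')
print(copied)                    # bytearray(b'xb')
```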

Trying to understand slices of lists

From the documentation:

All slice operations return a new list containing the requested elements. This means that the following slice returns a new (shallow) copy of the list.

So basically when you do l[0:1][0] = 13 you're assigning 13 as the value in a new list, not l. It's the same as if you did

[l[0]][0] = 13

or

g = [l[0]]
g[0] = 13

Note that this applies to assigning to an index of the sliced list, whatever the element type. Since a slice performs a shallow copy, mutating a mutable element in place through the slice does show up in the original list:

>>> l = [{'hi': 7}, {}, {}, {}]
>>> l
[{'hi': 7}, {}, {}, {}]
>>> l[0:1][0]['hi'] = 1
>>> l[0]
{'hi': 1}
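
Putting the two cases side by side, as a brief sketch with made-up values: assigning into the temporary list produced by a slice is lost, whereas assigning to the slice itself goes through the original list.

```python
l = [1, 2, 3]

l[0:1][0] = 13      # modifies the temporary list from l[0:1], then discards it
after_index = list(l)

l[0:1] = [13]       # slice assignment: calls l.__setitem__, so l changes
after_slice = list(l)

print(after_index)  # [1, 2, 3]
print(after_slice)  # [13, 2, 3]
```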

If a Python slice copies the references, why can't I use it to modify the original list?

You are right that slicing doesn't copy the items in the list. However, it does create a new list object.

Your comment suggests a misunderstanding:

# Attempting to modify the element at index 1
l[0:2][-1] = 10

This is not a modification of the element, it's a modification of the list. In other words it is really "change the list so that index 1 now points to the number 10". Since your slice created a new list, you are just changing that new list to point at some other object.

In your comment to oldrinb's answer, you said:

Why are l[0:1] and l[0:1][0] different? Shouldn't they both refer to the same object, i.e. the first item of l?

Aside from the fact that l[0:1] is a list while l[0:1][0] is a single element, there is again the same misunderstanding here. Suppose that some_list is a list and the object at index ix is obj. This:

some_list[ix] = blah

. . . is an operation on some_list. The object obj is not involved. This can be confusing because it means some_list[ix] has slightly different semantics depending on which side of the assignment it is on. If you do

blah = some_list[ix] + 2

. . .then you are indeed operating on the object inside the list (i.e., it is the same as obj + 2). But when the indexing operation is on the left of the assignment, it no longer involves the contained object at all, only the list itself.

When you assign to a list index you are modifying the list, not the object inside it. So in your example l[0] is the same as l[0:2][0], but that doesn't matter; because your indexing is an assignment target, it's modifying the list and doesn't care what object was in there already.
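
A small sketch of that distinction, using the placeholder names from above (the values are made up):

```python
obj = [1, 2]
some_list = [obj, "other"]
ix = 0

some_list[ix] = "blah"      # operates on some_list; obj itself is untouched
print(obj)                  # [1, 2]
print(some_list)            # ['blah', 'other']

some_list[1] = obj          # put obj back into the list at index 1
some_list[1].append(3)      # here the indexing is a read, so obj is mutated
print(obj)                  # [1, 2, 3]
```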

Different slicing behaviors on left/right hand side of assignment operator

Python operators are best considered as syntactic sugar for "magic" methods; for example, x + y is evaluated as x.__add__(y). In the same way that:

  • foo = bar.baz becomes foo = bar.__getattribute__('baz') (falling back to bar.__getattr__('baz') if the normal lookup fails); whereas
  • bar.baz = foo becomes bar.__setattr__('baz', foo);

the Python "slicing operator" * a[b] is evaluated as either:

  • a.__getitem__(b); or
  • a.__setitem__(b, ...);

depending on which side of the assignment it's on; the two aren't quite the same (see also How assignment works with python list slice). Written out in "longhand", therefore:

>>> x = [1, 2, 3]
>>> x.__getitem__(slice(None)) # ... = x[:]
[1, 2, 3]
>>> x.__setitem__(slice(None), (4, 5, 6)) # x[:] = ...
>>> x
[4, 5, 6]

The data model documentation explains these methods in more detail (e.g. __getitem__), and you can read the docs on slice, too.


Note that the slice is a shallow copy, not a deep one, as the following demonstrates:

>>> foo = [[], []]
>>> bar = foo[:]
>>> bar is foo
False # outer list is new object
>>> bar[0] is foo[0]
True # inner lists are same objects
>>> bar[0].append(1)
>>> foo
[[1], []]

* Well, not strictly an operator.


