Shallow Copy or Deep Copy

What is the difference between a deep copy and a shallow copy?

Shallow copies duplicate as little as possible. A shallow copy of a collection is a copy of the collection structure, not the elements. With a shallow copy, two collections now share the individual elements.

Deep copies duplicate everything. A deep copy of a collection is two collections with all of the elements in the original collection duplicated.

Why there is no difference between shallow copy and deep copy for a list of immutables

Since you cannot change the immutable objects, there is no point in creating copies of the same while copying.

Shallow Copy

As per copy's source code, shallow copy of immutable types is done like this

def _copy_immutable(x):
return x

for t in (type(None), int, long, float, bool, str, tuple,
frozenset, type, xrange, types.ClassType,
types.BuiltinFunctionType, type(Ellipsis),
types.FunctionType, weakref.ref):
d[t] = _copy_immutable

For all the immutable types, the _copy_immutable function returns the object as it is, during shallow copy.

Deep Copy

The same way, during the deepcopy of tuples, the object is returned as is, as per the _deepcopy_tuple function,

d = id(x)
try:
return memo[d]

Why is a deep copy so much slower than a shallow copy for lists of the same size?

deepcopy isn't copying the ints. There's no way it could do that anyway.

deepcopy is slow because it needs to handle the full complexity of a deep copy, even if that turns out to be unnecessary. That includes dispatching to the appropriate copier for every object it finds, even if the copier turns out to basically just be lambda x: x. That includes maintaining a memo dict and keeping track of every object copied, to handle duplicate references to the same objects, even if there are none. That includes special copy handling for data structures like lists and dicts, so it doesn't go into an infinite recursion when trying to copy a data structure with recursive references.

All of that has to be done no matter whether it pays off. All of it is expensive.

Also, deepcopy is pure-Python. That doesn't help. Comparing deepcopy to pickle.loads(pickle.dumps(whatever)), which performs a very similar job, pickle wins handily due to the C implementation. (On Python 2, replace pickle with cPickle.) pickle still loses hard to an implementation that takes advantage of the known structure of the input, though:

In [15]: x = [[0]*1000 for i in range(1000)]

In [16]: %timeit copy.deepcopy(x)
1.05 s ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [17]: %timeit pickle.loads(pickle.dumps(x))
78 ms ± 4.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [18]: %timeit [l[:] for l in x]
4.56 ms ± 108 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Java: Why String is special in Deep Copy and Shallow copy?

You start with this situation:

Sample Image

Then you clone the User object. You now have two User objects; the variables user and clone refer to those two objects. Note that both their name member variables refer to the same String object, with the content "one".

Sample Image

Then you call setName("two") on clone, which will change the name member variable of the second User object to refer to a different String object, with the content "two".

Sample Image

Note that the variable user still refers to the User object which has its name member variable referring to "one", so when you System.out.println(user.getName()); the result is one.

What is the difference between shallow copy, deepcopy and normal assignment operation?

Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

The difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the
    original.

Here's a little demonstration:

import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]

Using normal assignment operatings to copy:

d = c

print id(c) == id(d) # True - d is the same object as c
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]

Using a shallow copy:

d = copy.copy(c)

print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]

Using a deep copy:

d = copy.deepcopy(c)

print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # False - d[0] is now a new object


Related Topics



Leave a reply



Submit