Deep Copy VS Shallow Copy

What is the difference between a deep copy and a shallow copy?

Shallow copies duplicate as little as possible. A shallow copy of a collection is a copy of the collection structure, not the elements. With a shallow copy, two collections now share the individual elements.

Deep copies duplicate everything. A deep copy of a collection is two collections with all of the elements in the original collection duplicated.

Why is a deep copy so much slower than a shallow copy for lists of the same size?

deepcopy isn't copying the ints. There's no way it could do that anyway.

deepcopy is slow because it needs to handle the full complexity of a deep copy, even if that turns out to be unnecessary. That includes dispatching to the appropriate copier for every object it finds, even if the copier turns out to basically just be lambda x: x. That includes maintaining a memo dict and keeping track of every object copied, to handle duplicate references to the same objects, even if there are none. That includes special copy handling for data structures like lists and dicts, so it doesn't go into an infinite recursion when trying to copy a data structure with recursive references.

All of that has to be done no matter whether it pays off. All of it is expensive.

Also, deepcopy is pure-Python. That doesn't help. Comparing deepcopy to pickle.loads(pickle.dumps(whatever)), which performs a very similar job, pickle wins handily due to the C implementation. (On Python 2, replace pickle with cPickle.) pickle still loses hard to an implementation that takes advantage of the known structure of the input, though:

In [15]: x = [[0]*1000 for i in range(1000)]

In [16]: %timeit copy.deepcopy(x)
1.05 s ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [17]: %timeit pickle.loads(pickle.dumps(x))
78 ms ± 4.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [18]: %timeit [l[:] for l in x]
4.56 ms ± 108 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

What is the difference between shallow copy, deepcopy and normal assignment operation?

Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

The difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.

  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the
    original.

Here's a little demonstration:

import copy

a = [1, 2, 3]
b = [4, 5, 6]
c = [a, b]

Using normal assignment operatings to copy:

d = c

print id(c) == id(d) # True - d is the same object as c
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]

Using a shallow copy:

d = copy.copy(c)

print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # True - d[0] is the same object as c[0]

Using a deep copy:

d = copy.deepcopy(c)

print id(c) == id(d) # False - d is now a new object
print id(c[0]) == id(d[0]) # False - d[0] is now a new object

Why there is no difference between shallow copy and deep copy for a list of immutables

Since you cannot change the immutable objects, there is no point in creating copies of the same while copying.

Shallow Copy

As per copy's source code, shallow copy of immutable types is done like this

def _copy_immutable(x):
return x

for t in (type(None), int, long, float, bool, str, tuple,
frozenset, type, xrange, types.ClassType,
types.BuiltinFunctionType, type(Ellipsis),
types.FunctionType, weakref.ref):
d[t] = _copy_immutable

For all the immutable types, the _copy_immutable function returns the object as it is, during shallow copy.

Deep Copy

The same way, during the deepcopy of tuples, the object is returned as is, as per the _deepcopy_tuple function,

d = id(x)
try:
return memo[d]

Is pass-by-value/reference equivalent to making a deep/shallow copy, respectively?

No. Those two things are completely unrelated.

Shallow copy/deep copy is talking about object copying; whereas pass-by-value/pass-by-reference is talking about the passing of variables.

In many modern languages, like Python (which you mentioned that you're most familiar with) and Java, "objects" are not values in the language, so "objects" cannot be assigned or passed. Rather, objects are always manipulated through pointers to objects (references), which are values and can be assigned or passed.

Python and Java are pass-by-value only. When you pass a reference, it copies the pointer and you end up with two pointers to the same object. No object copying happens. In these languages, object copying is not done through assigning or passing, but rather is done by calling special methods like .clone() or by passing an object to a constructor to construct a new object. (There is in fact no general way to copy an object in Java.)

There are some languages, like C++, where objects can be values (of course, you can also have pointers to objects, which work similarly to references in other languages). C also has both pass-by-value and pass-by-reference. If you pass an object by reference, no copying happens. If you assign or pass an object by value, the object is copied. But by default this is a shallow copy.

The distinction between shallow copy and deep copy is how they deal with members of the object which are pointers to another object. Members which are not pointers are simply copied; there is no concept of "shallow" or "deep". "Shallow" or "deep" only talks about members which are pointers -- whether to copy the object pointed to or not. The default assignment operator and copy constructor in C++ simply copies each member. For members that are pointers, that means the pointer is copied and thus you have two pointers to the same object. That is a shallow copy.

You might want a "deep" copy in cases where the member that is a pointer to another object is really pointing to a "sub-object" that is really "a part of" your object (whether a pointer to another object means a sub-object or not depends on the design of your objects), so you don't want multiple of your objects pointing to the same sub-object, and that's why you want to copy the sub-object when copying the main object. To implement this in C++, you would need to implement a custom assignment operator and custom copy constructor to copy the "sub-objects" in the process of copying. In other languages, you would similarly need to customize whatever copying method is used.

When does javascript shallow copy vs. deep copy

https://www.freecodecamp.org/news/copying-stuff-in-javascript-how-to-differentiate-between-deep-and-shallow-copies-b6d8c1ef09cd/

Hi everyone. Here is the answer to this question. Thanks!



Related Topics



Leave a reply



Submit