How to Understand How #dup and #clone Operate on Objects That Reference Other Objects

Both dup and clone return different objects, but modifying them alters the original object

dup and clone make new instances of the array, but not of its contents; neither is a deep copy.

See:

array0 = ['stuff', 'things']
array1 = array0.clone
array2 = array0.dup

puts "Array-Ids"
p array0.object_id
p array1.object_id
p array2.object_id

puts "Object ids"
array0.each_with_index { |_, i|
  p array0[i].object_id
  p array1[i].object_id
  p array2[i].object_id
  p '--------'
}

The elements inside the array share the same object_id - they are the same object. The arrays have different object ids.

When you call a[0].capitalize!, you modify an object that is part of three different arrays.
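To make the effect concrete, here is a small sketch that reuses the arrays from above. It assumes string literals are not frozen (no `# frozen_string_literal: true` magic comment), and the name `independent` is only illustrative.

```ruby
# Three arrays that share the same two String objects.
array0 = ['stuff', 'things']
array1 = array0.clone
array2 = array0.dup

# Mutating the shared string in place is visible through every array.
array1[0].capitalize!
p array0 # => ["Stuff", "things"]
p array2 # => ["Stuff", "things"]

# Duplicating each element yields an array that owns its strings.
independent = array0.map(&:dup)
independent[0].upcase!
p independent # => ["STUFF", "things"]
p array0      # => ["Stuff", "things"]
```

Note that map(&:dup) only goes one level down; elements that themselves contain other objects would still share them.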

See also

  • Duplicating a Ruby array of strings
  • Deep copy of arrays in Ruby
  • How to create a deep copy of an object in Ruby?

What's the difference between Ruby's dup and clone methods?

Subclasses may override these methods to provide different semantics. In Object itself, there are two key differences.

First, clone copies the singleton class, while dup does not.

o = Object.new
def o.foo
  42
end

o.dup.foo # raises NoMethodError
o.clone.foo # returns 42

Second, clone preserves the frozen state, while dup does not.

class Foo
  attr_accessor :bar
end
o = Foo.new
o.freeze

o.dup.bar = 10 # succeeds
o.clone.bar = 10 # raises FrozenError (a RuntimeError subclass; plain RuntimeError before Ruby 2.5)

The Rubinius implementation of these methods is often my source for answers to these questions, since it is quite clear and a fairly compliant Ruby implementation.
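If you want the singleton methods but not the frozen state, clone accepts a freeze: keyword (Ruby 2.4 and later). The following is a minimal sketch of how the two differences combine; the variable name `unfrozen` is only illustrative.

```ruby
o = Object.new
def o.foo
  42
end
o.freeze

# clone keeps the singleton class; freeze: false drops the frozen state.
unfrozen = o.clone(freeze: false)
p unfrozen.frozen? # => false
p unfrozen.foo     # => 42

# For comparison, the default behaviour of both methods:
p o.clone.frozen?          # => true
p o.dup.frozen?            # => false
p o.dup.respond_to?(:foo)  # => false
```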

Provide simplest example where deep copy is needed in ruby

The example you have shown does not describe the difference between a deep and a shallow copy. Instead, consider this example:

class Klass
  attr_accessor :name
end

anna = Klass.new
anna.name = 'Anna'

anna_lisa = anna.dup
anna_lisa.name << ' Lisa'
# => "Anna Lisa"

anna.name
# => "Anna Lisa"

Generally, dup and clone are both expected to duplicate only the object you call the method on. Other referenced objects, like the name String in the above example, are not duplicated. Thus, after the duplication, both the original and the duplicated object point to the very same name string.

With a deep_dup, typically all (relevant) referenced objects are duplicated too, often to an arbitrary depth. Since this is rather hard to achieve for all possible object references, people often rely on implementations for specific objects like hashes and arrays.
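As a rough sketch of such a type-specific implementation, the hypothetical helper below recurses into hashes and arrays and falls back to dup for everything else. It is not the Rails method of the same name. It assumes Ruby 2.4 or later (where nil, numbers and symbols return themselves from dup), leaves hash keys as they are, and does not handle cyclic structures or the instance variables of custom objects.

```ruby
# Hypothetical helper: recursively duplicates nested Hashes and Arrays.
def deep_dup(obj)
  case obj
  when Hash
    # Keys are reused as-is; only the values are duplicated.
    obj.each_with_object({}) { |(key, value), copy| copy[key] = deep_dup(value) }
  when Array
    obj.map { |element| deep_dup(element) }
  else
    obj.dup
  end
end

config = { name: 'Anna', tags: ['a', 'b'] }
copy = deep_dup(config)
copy[:name] << ' Lisa'
copy[:tags] << 'c'

p config[:name] # => "Anna"
p config[:tags] # => ["a", "b"]
p copy[:name]   # => "Anna Lisa"
p copy[:tags]   # => ["a", "b", "c"]
```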

A common workaround for a rather generic deep-dup is to use Ruby's Marshal module to serialize an object graph and immediately deserialize it again.

anna_lena = Marshal.load(Marshal.dump(anna))

This creates new objects and is effectively a deep_dup. Since most objects support marshaling right away, this is a rather powerful mechanism. Note, though, that you should never unmarshal (i.e. load) user-provided data, since this can lead to a remote code execution vulnerability.
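"Most objects" is not all objects, though. The sketch below repeats the round trip and then shows one case where it fails: an object that holds a Proc cannot be dumped. The callback attribute and the variable names are made up for this illustration.

```ruby
class Klass
  attr_accessor :name, :callback
end

anna = Klass.new
anna.name = 'Anna'

# The round trip creates a new Klass and a new name String.
copy = Marshal.load(Marshal.dump(anna))
copy.name << ' Lisa'
p anna.name # => "Anna"
p copy.name # => "Anna Lisa"

# Procs (like IO objects and objects with singleton methods) cannot be dumped.
anna.callback = proc { :hello }
marshal_error = nil
begin
  Marshal.dump(anna)
rescue TypeError => e
  marshal_error = e
  puts "cannot marshal: #{e.message}"
end
```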

Is .dup really creating a shallow copy?

You forget that == is actually a method from BasicObject:

obj == other → true or false
Equality — At the Object level, == returns true only if obj and other are the same object. Typically, this method is overridden in descendant classes to provide class-specific meaning.

So if you haven't provided your own implementation of == (i.e. a MyObject#== method) then your:

p myObject1 == myObject1.dup

is pretty much the same as saying:

p myObject1.object_id == myObject1.dup.object_id

and since myObject1.dup is a shallow copy of myObject1 (i.e. they're different objects), you get false.

When they say:

instance variables of obj are copied

they're referring to the instance variables inside obj, not variables that happen to reference obj. Your myObject1 isn't an instance variable in anything; it is just a local variable. Instance variables are referenced with a leading @, as in @my_instance_variable.

If you want == to behave the way you expect it to, then you have to provide your own == implementation:

class MyObject
  def ==(other)
    # Check that the contents of `self` and `other` are the same
    # and probably that `other.is_a?(MyObject)` first.
  end
end
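A filled-in version might look like the following sketch. The name and tags attributes are assumptions made for this example; compare whichever instance variables define equality for your class.

```ruby
class MyObject
  attr_reader :name, :tags

  def initialize(name, tags)
    @name = name
    @tags = tags
  end

  # Two MyObject instances are equal when their contents are equal.
  def ==(other)
    other.is_a?(MyObject) && name == other.name && tags == other.tags
  end
end

my_object1 = MyObject.new('Anna', ['a', 'b'])
copy = my_object1.dup

p my_object1 == copy                # => true  (same contents)
p my_object1.equal?(copy)           # => false (different objects)
p my_object1.tags.equal?(copy.tags) # => true  (shallow copy: same Array)
```

If instances are also used as Hash keys, eql? and hash usually need matching overrides as well.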

When to use dup, and when to use clone in Ruby?

It is true that clone copies the frozen state of an object, while dup does not:

o = Object.new
o.freeze

o.clone.frozen?
#=> true

o.dup.frozen?
#=> false

clone will also copy the singleton methods of the object while dup does not:

o = Object.new
def o.foo
  42
end

o.clone.respond_to?(:foo)
#=> true

o.dup.respond_to?(:foo)
#=> false

This leads me to assume that clone is sometimes understood to provide a "deeper" copy than dup. Here are some quotes about the topic:

Comment on ActiveRecord::Base#initialize_dup from Rails 3:

Duped objects have no id assigned and are treated as new records. Note
that this is a "shallow" copy as it copies the object's attributes
only, not its associations. The extent of a "deep" copy is application
specific and is therefore left to the application to implement according
to its need.

An article about deep copies in Ruby:

There is another method worth mentioning, clone. The clone method does the same thing as dup with one important distinction: it's expected that objects will override this method with one that can do deep copies.

But then again, there's deep_dup in Rails 4:

Returns a deep copy of object if it's duplicable. If it's not duplicable, returns self.

and also ActiveRecord::Core#dup and #clone in Rails 4:

clone — Identical to Ruby's clone method. This is a "shallow" copy. Be warned that your attributes are not copied. [...] If you need a copy of your attributes hash, please use the #dup method.

This means that here, the word dup is used to refer to a deep clone again. As far as I can see, there is no consensus in the community, except that you should pick clone or dup when you need a specific side effect of either one.

Finally, I see dup much more often in Ruby code than clone. I have never used clone so far, and I won't until I explicitly need to.

What is the difference between a deep copy and a shallow copy?

Shallow copies duplicate as little as possible. A shallow copy of a collection is a copy of the collection structure, not the elements. With a shallow copy, two collections now share the individual elements.

Deep copies duplicate everything. A deep copy of a collection is two collections with all of the elements in the original collection duplicated.

Failing to get an actual copy of nested array of hashes

#dup produces a shallow copy of an object.

You could use marshalization:

a = [[{x: 2}, {y: 4}], [{z: 8}]]
b = Marshal.load(Marshal.dump(a))
b[0].delete_at(0)
puts a.to_s #=> [[{:x=>2}, {:y=>4}], [{:z=>8}]]
puts b.to_s #=> [[{:y=>4}], [{:z=>8}]]
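For contrast, here is roughly what happens with a plain dup on the same data. This is only a sketch to show which parts are shared: the outer array is new, but the inner arrays and hashes are not.

```ruby
a = [[{ x: 2 }, { y: 4 }], [{ z: 8 }]]
b = a.dup

p a.equal?(b)       # => false (the outer arrays are different objects)
p a[0].equal?(b[0]) # => true  (the inner arrays are shared)

# Deleting from a shared inner array affects both a and b.
b[0].delete_at(0)
p a[0].length # => 1
p b[0].length # => 1

# Appending to the outer array only affects the copy.
b << [{ w: 16 }]
p a.length # => 2
p b.length # => 3
```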

Which method to define on a Ruby class to provide dup / clone for its instances?

In my experience, overriding #initialize_copy works just fine (I have never heard of initialize_dup and initialize_clone).

The original initialize_copy (which initializes every instance variable with the values from the original object) is available through super, so I usually do:

class MyClass
  def initialize_copy(orig)
    super
    # Do custom initialization for self
  end
end
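As an illustration of what the custom initialization could be, the sketch below duplicates an internal array so that copies stop sharing it. The Playlist class and its tracks attribute are invented for this example.

```ruby
class Playlist
  attr_reader :tracks

  def initialize(tracks)
    @tracks = tracks
  end

  # Runs on the new copy for both dup and clone.
  def initialize_copy(orig)
    super
    # Give the copy its own array and its own strings.
    @tracks = orig.tracks.map(&:dup)
  end
end

original = Playlist.new(['Intro', 'Outro'])
copy = original.dup
copy.tracks << 'Bonus'
copy.tracks[0] << ' (remix)'

p original.tracks # => ["Intro", "Outro"]
p copy.tracks     # => ["Intro (remix)", "Outro", "Bonus"]
```

Because both dup and clone call initialize_copy by default, a single override covers both methods.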

What is the most efficient way to deep clone an object in JavaScript?

Native deep cloning

There's now a JS standard called "structured cloning" that works experimentally in Node 11 and later, will land in browsers, and has polyfills for existing systems.

structuredClone(value)

If needed, loading the polyfill first:

import structuredClone from '@ungap/structured-clone';

See this answer for more details.

Older answers

Fast cloning with data loss - JSON.parse/stringify

If you do not use Dates, functions, undefined, Infinity, RegExps, Maps, Sets, Blobs, FileLists, ImageDatas, sparse Arrays, Typed Arrays or other complex types within your object, a very simple one-liner to deep clone an object is:

JSON.parse(JSON.stringify(object))

const a = {
  string: 'string',
  number: 123,
  bool: false,
  nul: null,
  date: new Date(), // stringified
  undef: undefined, // lost
  inf: Infinity, // forced to null
  re: /.*/, // becomes an empty object {}
}
console.log(a);
console.log(typeof a.date); // 'object' (a Date instance)
const clone = JSON.parse(JSON.stringify(a));
console.log(clone);
console.log(typeof clone.date); // 'string' (the result of .toISOString())

