Deep Copy of Arrays in Ruby

Cloning an array with its content

You need to do a deep copy of your array.

Here is one way to do it:

Marshal.load(Marshal.dump(a))

This is needed because clone copies the array but not the elements inside it. The new array is a different object, but the elements it contains are the same instances. For a flat array you could also copy each element yourself, for example b = []; a.each { |e| b << e.dup }.
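As a quick check of the point above, here is a minimal sketch that compares a shallow clone, an element-by-element dup, and the Marshal round trip. The variable names are only illustrative, and String.new is used so the strings are mutable regardless of frozen-string settings.

```ruby
# Mutable strings, independent of any frozen_string_literal setting.
a = [String.new("foo"), String.new("bar")]

shallow = a.clone                          # new array, same element objects
per_element = []
a.each { |e| per_element << e.dup }        # new array, each element copied once
deep = Marshal.load(Marshal.dump(a))       # new array, everything copied

shallow.equal?(a)            # => false (different array object)
shallow[0].equal?(a[0])      # => true  (same string object)
per_element[0].equal?(a[0])  # => false
deep[0].equal?(a[0])         # => false

a[0] << "!"                  # mutate an element of the original

shallow[0]      # => "foo!"
per_element[0]  # => "foo"
deep[0]         # => "foo"
```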

How to create a deep copy of an object in Ruby?

Deep copy isn't built into vanilla Ruby, but you can hack it by marshalling and unmarshalling the object:

Marshal.load(Marshal.dump(@object))

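This isn't perfect, though: Marshal.dump raises a TypeError for objects it cannot serialize, such as procs and IO objects. A method that avoids serialization by recursively cloning instance variables: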

class Object
  def deep_clone
    return @deep_cloning_obj if @deep_cloning
    @deep_cloning_obj = clone
    @deep_cloning_obj.instance_variables.each do |var|
      val = @deep_cloning_obj.instance_variable_get(var)
      begin
        @deep_cloning = true
        val = val.deep_clone
      rescue TypeError
        next
      ensure
        @deep_cloning = false
      end
      @deep_cloning_obj.instance_variable_set(var, val)
    end
    deep_cloning_obj = @deep_cloning_obj
    @deep_cloning_obj = nil
    deep_cloning_obj
  end
end

Source:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43424
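To illustrate where the Marshal round trip does not work, here is a small sketch with a hash that holds a lambda. The hash and its keys are made up for the example; the point is only that Marshal.dump refuses to serialize a Proc.

```ruby
# A value that contains a Proc cannot be marshalled.
job = { name: "cleanup", callback: -> { :done } }

error = begin
  Marshal.load(Marshal.dump(job))
  nil
rescue TypeError => e
  e
end

error.class  # => TypeError

# Without the Proc, the same round trip succeeds.
plain = { name: "cleanup" }
copy = Marshal.load(Marshal.dump(plain))
copy == plain        # => true
copy.equal?(plain)   # => false
```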

Is there a simple way to duplicate a multi-dimensional array in Ruby?

Here's the "Ruby-esque" way to handle it:

temp_array = Marshal.load(Marshal.dump(your_array_to_be_cloned))

Duplicating a Ruby array of strings

Your second solution can be shortened to arr2 = arr.map do |e| e.dup end (unless you actually need the behaviour of clone, it's recommended to use dup instead).

Other than that, your two solutions are basically the standard ways to perform a deep copy. Note that the second version is only one level deep: if you use it on an array of arrays of strings, you can still mutate the strings. There isn't really a nicer way.

Edit: Here's a recursive deep_dup method that works with arbitrarily nested arrays:

class Array
  def deep_dup
    map { |x| x.deep_dup }
  end
end

class Object
  def deep_dup
    dup
  end
end

class Numeric
  # We need this because number.dup throws an exception (on Ruby versions before 2.4)
  # We also need the same definition for Symbol, TrueClass and FalseClass
  def deep_dup
    self
  end
end

You might also want to define deep_dup for other containers (like Hash), otherwise you'll still get a shallow copy for those.
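As a rough sketch of what covering Hash could look like, here is a standalone helper rather than a monkey patch. The name deep_copy is hypothetical (it is not a Ruby core method), and the sketch assumes plain nested arrays, hashes, and strings with no cycles, default procs, or custom classes.

```ruby
# Hypothetical helper: recursively copies Arrays, Hashes and Strings.
# Everything else (numbers, symbols, nil, ...) is returned as-is.
def deep_copy(obj)
  case obj
  when Array
    obj.map { |element| deep_copy(element) }
  when Hash
    obj.each_with_object({}) do |(key, value), copy|
      copy[deep_copy(key)] = deep_copy(value)
    end
  when String
    obj.dup
  else
    obj
  end
end

a = { "rows" => [[1, 2], { tags: ["x"] }] }
b = deep_copy(a)

b["rows"][0] << 3
b["rows"][1][:tags] << "y"

a["rows"][0]         # => [1, 2]
a["rows"][1][:tags]  # => ["x"]
```

A standalone method keeps the example from reopening core classes; the same case branches could be moved into Hash#deep_dup if you prefer the style used above.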

Use [].replace to make a copy of an array

This is the tricky concept of mutability in Ruby. In terms of core objects, this usually comes up with arrays and hashes. Strings are mutable as well, but this can be disabled with a flag at the top of the script. See What does the comment "frozen_string_literal: true" do?.

In this case, you can get the same effect as replace by calling dup, clone, or (with ActiveSupport loaded) deep_dup, or by round-tripping through Marshal:

['some', 'array'].dup
['some', 'array'].deep_dup
['some', 'array'].clone
Marshal.load Marshal::dump(['some', 'array'])

In terms of differences, dup and clone are the same except for some nuanced details - see What's the difference between Ruby's dup and clone methods?

The difference between these and deep_dup is that deep_dup works recursively. For example if you dup a nested array, the inner array will not be cloned:

a = [[1]]
b = a.clone
b[0][0] = 2
a # => [[2]]

The same thing happens with hashes.
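For instance, a small sketch of the hash case (the key name :list is arbitrary):

```ruby
h = { list: [1, 2] }

shallow = h.dup                       # copies the hash, shares the inner array
deep = Marshal.load(Marshal.dump(h))  # copies the inner array too

h[:list] << 3

shallow[:list]  # => [1, 2, 3]
deep[:list]     # => [1, 2]
```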

Marshal.load Marshal::dump <object> is a general approach to deep cloning objects which, unlike deep_dup, is in Ruby core. Marshal::dump returns a string, so it can also be handy for serializing objects to a file.
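A minimal sketch of the file use case, assuming a temporary file is acceptable for the demonstration. The data is invented, and the file is opened in binary mode because the dumped string is binary.

```ruby
require "tempfile"

data = { "name" => "example", "scores" => [1, 2, 3] }
restored = nil

# Tempfile.create with a block removes the file when the block ends.
Tempfile.create("marshal-demo") do |file|
  file.binmode
  file.write(Marshal.dump(data))
  file.rewind
  restored = Marshal.load(file.read)
end

restored == data        # => true
restored.equal?(data)   # => false
```

Only load marshalled data you produced yourself; Marshal.load is not safe to call on untrusted input.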

If you want to avoid unexpected errors like this, keep a mental index of which methods have side effects and call them only when you intend the mutation. By convention, an exclamation point at the end of a method name marks the mutating version of a method, but many mutating methods have no such marker, including unshift, push, concat, delete, and pop. A big part of functional programming is avoiding side effects. You can see https://www.sitepoint.com/functional-programming-techniques-with-ruby-part-i/
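One way to make accidental side effects visible is to freeze the original, as in this sketch. It assumes Ruby 2.5 or later, where mutating a frozen object raises FrozenError; note that freeze is shallow, just like dup.

```ruby
original = [3, 1, 2].freeze

# Non-mutating methods return new arrays and leave the original alone.
sorted = original.sort         # => [1, 2, 3]
with_extra = original + [4]    # => [3, 1, 2, 4]

# Mutating methods such as push fail loudly on a frozen array.
error = begin
  original.push(4)
  nil
rescue FrozenError => e
  e
end

error.class  # => FrozenError
original     # => [3, 1, 2]
```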

What's the most efficient way to deep copy an object in Ruby?

I was wondering the same thing, so I benchmarked a few different techniques against each other. I was primarily concerned with Arrays and Hashes - I didn't test any complex objects. Perhaps unsurprisingly, a custom deep-clone implementation proved to be the fastest. If you are looking for quick and easy implementation, Marshal appears to be the way to go.

I also benchmarked an XML solution with Rails 3.0.7, not shown below. It was much, much slower, ~10 seconds for only 1000 iterations (the solutions below all ran 10,000 times for the benchmark).

Two notes regarding my JSON solution. First, I used the C variant, version 1.4.3. Second, it doesn't actually work 100%, as symbols will be converted to Strings.

This was all run with ruby 1.9.2p180.

#!/usr/bin/env ruby
require 'benchmark'
require 'yaml'
require 'json/ext'
require 'msgpack'

def dc1(value)
  Marshal.load(Marshal.dump(value))
end

def dc2(value)
  YAML.load(YAML.dump(value))
end

def dc3(value)
  JSON.load(JSON.dump(value))
end

def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each { |k, v| result[k] = dc4(v) }
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each { |v| result << dc4(v) }
    result
  else
    value
  end
end

def dc5(value)
  MessagePack.unpack(value.to_msgpack)
end

value = {'a' => {:x => [1, [nil, 'b'], {'a' => 1}]}, 'b' => ['z']}

Benchmark.bm do |x|
  iterations = 10000
  x.report { iterations.times { dc1(value) } }
  x.report { iterations.times { dc2(value) } }
  x.report { iterations.times { dc3(value) } }
  x.report { iterations.times { dc4(value) } }
  x.report { iterations.times { dc5(value) } }
end

results in:

    user     system      total        real
0.230000   0.000000   0.230000 (  0.239257)  (Marshal)
3.240000   0.030000   3.270000 (  3.262255)  (YAML)
0.590000   0.010000   0.600000 (  0.601693)  (JSON)
0.060000   0.000000   0.060000 (  0.067661)  (Custom)
0.090000   0.010000   0.100000 (  0.097705)  (MessagePack)
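One thing worth keeping in mind when comparing these numbers: the custom dc4 copies only the Hash and Array containers and returns every other value unchanged, so leaf strings are shared with the original, whereas the Marshal version copies them. The sketch below repeats dc4 so that it runs on its own; the sample value is made up.

```ruby
# dc4 repeated from the benchmark so this snippet is self-contained.
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each { |k, v| result[k] = dc4(v) }
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each { |v| result << dc4(v) }
    result
  else
    value
  end
end

value = { "b" => [String.new("z")] }
custom_copy = dc4(value)
marshal_copy = Marshal.load(Marshal.dump(value))

custom_copy["b"].equal?(value["b"])         # => false (container copied)
custom_copy["b"][0].equal?(value["b"][0])   # => true  (string shared)
marshal_copy["b"][0].equal?(value["b"][0])  # => false (string copied)
```

If the leaves must be independent as well, the else branch would need to copy them, which would likely narrow the gap to Marshal.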

Ruby: How to copy the multidimensional array in new array?

dup does not create a deep copy; it copies only the outermost object. From the docs:

Produces a shallow copy of obj—the instance variables of obj are copied, but not the objects they reference. dup copies the tainted state of obj.

If you are not sure how deep your object might be nested then the easiest way to create deep copy might be to serialize and de-serialize the object:

@@current_state = Marshal.load(Marshal.dump(seating_arrangement))

How to clone array of hashes and add key value using each loop

Let's see what is happening.

arr = [{a: "one", b: "two"}, {a: "uno", b: "due"}]
arr.object_id
#=> 4557280

arr1 = arr
arr1.object_id
#=> 4557280

As you see, the variables arr and arr1 hold the same object, because the objects have the same object id [1]. Therefore, if that object is modified, arr and arr1 will still both hold that object. Let's try it.

arr[0] = {a: "cat", b: "dog"}
arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr.object_id
#=> 4557280

arr1
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr1.object_id
#=> 4557280

If we want to be able to modify arr in this way without it affecting arr1, we use the method Object#dup.

arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr1 = arr.dup
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]

arr.object_id
#=> 4557280
arr1.object_id
#=> 3693480

arr.map(&:object_id)
#=> [2631980, 4557300]
arr1.map(&:object_id)
#=> [2631980, 4557300]

As you see, arr and arr1 now hold different objects. Those objects, however, are arrays whose corresponding elements (hashes) are the same objects. Let's modify one of arr's elements.

arr[1][:a] = "owl"
arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"owl", :b=>"due"}]
arr.map(&:object_id)
#=> [2631980, 4557300]

arr still contains the same objects, but we have modified one. Let's look at arr1.

arr1
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"owl", :b=>"due"}]
arr1.map(&:object_id)
#=> [2631980, 4557300]

Should we be surprised that arr1 has changed as well?

We need to dup both arr and the elements of arr.

arr = [{a: "one", b: "two"}, {a: "uno", b: "due"}]
arr1 = arr.dup.map(&:dup)
#=> [{:a=>"one", :b=>"two"}, {:a=>"uno", :b=>"due"}]

arr.object_id
#=> 4149120
arr1.object_id
#=> 4182360

arr.map(&:object_id)
#=> [4149200, 4149140]
arr1.map(&:object_id)
#=> [4182340, 4182280]

Now arr and arr1 are different objects and they contain different (hash) objects, so any change to one will not affect the other. (Try it.)
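Trying it could look like the sketch below. It uses arr.map(&:dup) on its own, since map already returns a new array, and it compares with == rather than showing inspect output, which differs between Ruby versions.

```ruby
arr = [{ a: "one", b: "two" }, { a: "uno", b: "due" }]
arr1 = arr.map(&:dup)   # new outer array, each hash duplicated

arr[1][:a] = "owl"                # change a value inside one hash
arr[0] = { a: "cat", b: "dog" }   # replace an element of the outer array

arr1 == [{ a: "one", b: "two" }, { a: "uno", b: "due" }]  # => true
arr  == [{ a: "cat", b: "dog" }, { a: "owl", b: "due" }]  # => true
```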

Now suppose arr were as follows.

arr = [{a: "cat", b: [1,2]}]

Let's make the copy.

arr1 = arr.dup.map(&:dup)
#=> [{:a=>"cat", :b=>[1, 2]}]

Now modify arr[0][:b].

arr[0][:b] << 3
#=> [1, 2, 3]
arr
#=> [{:a=>"cat", :b=>[1, 2, 3]}]
arr1
#=> [{:a=>"cat", :b=>[1, 2, 3]}]

Drat! arr1 changed. We can again look at object ids to see why that happened.

arr.object_id
#=> 4488500
arr1.object_id
#=> 4503140

arr.map(&:object_id)
#=> [4488520]
arr1.map(&:object_id)
#=> [4503100]

arr[0][:b].object_id
#=> 4488560
arr1[0][:b].object_id
#=> 4488560

We see that arr and arr1 are different objects and their respective hashes are different objects, but the array stored under :b is the same object in both hashes. We therefore need to do something like this:

arr1[0][:b] = arr[0][:b].dup

but that's still not good enough if arr were:

arr = [{a: "cat", b: [1,[2,3]]}]

What we need is a method that will make a deep copy. A common solution for that is to use the methods Marshal::dump and Marshal::load.

arr = [{a: "cat", b: [1,2]}]
str = Marshal.dump(arr)
#=> "\x04\b[\x06{\a:\x06aI\"\bcat\x06:\x06ET:\x06b[\ai\x06i\a"
arr1 = Marshal.load(str)
#=> [{:a=>"cat", :b=>[1, 2]}]

arr[0][:b] << 3
#=> [1, 2, 3]
arr
#=> [{:a=>"cat", :b=>[1, 2, 3]}]
arr1
#=> [{:a=>"cat", :b=>[1, 2]}]

Note we could write:

arr1 = Marshal.load(Marshal.dump(arr))
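The same one-liner also handles the deeper nesting mentioned earlier, where duplicating one level at a time was not enough. A short sketch:

```ruby
arr = [{ a: "cat", b: [1, [2, 3]] }]
arr1 = Marshal.load(Marshal.dump(arr))

arr[0][:b][1] << 4   # mutate the innermost array of the original

arr[0][:b]   # => [1, [2, 3, 4]]
arr1[0][:b]  # => [1, [2, 3]]
```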

As explained in the doc, the serialization used by the Marshal methods is not necessarily the same for different Ruby versions. If, for example, dump were used to produce a string that was saved to file and later load was invoked on the contents of the file, using a different version of Ruby, the contents may not be readable. Of course that's not a problem in this application of the methods.
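If you do store dumps, you can inspect the format version they were written with: a dump starts with two bytes holding the major and minor version, which can be compared with the constants of the running Ruby. A small sketch:

```ruby
dumped = Marshal.dump([1, 2])

# The first two bytes of a dump are the format's major and minor version.
major, minor = dumped.bytes.first(2)

major == Marshal::MAJOR_VERSION  # => true
minor == Marshal::MINOR_VERSION  # => true
```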

[1] To make it easier to see differences in object ids I've only shown the last seven digits. They are in all cases preceded by the digits 4877798.


