What's the most efficient way to deep copy an object in Ruby?
I was wondering the same thing, so I benchmarked a few different techniques against each other. I was primarily concerned with Arrays and Hashes; I didn't test any complex objects. Perhaps unsurprisingly, a custom deep-clone implementation proved to be the fastest. If you are looking for a quick and easy implementation, Marshal appears to be the way to go.
I also benchmarked an XML solution with Rails 3.0.7, not shown below. It was much, much slower, ~10 seconds for only 1000 iterations (the solutions below all ran 10,000 times for the benchmark).
Two notes regarding my JSON solution. First, I used the C variant, version 1.4.3. Second, it doesn't actually work 100%, as symbols will be converted to Strings.
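The symbol caveat is easy to reproduce in isolation. The following is a minimal sketch using only the standard json library; the sample hash is made up for illustration.

```ruby
require 'json'

# A nested structure with a Symbol key and a Symbol value.
original = { 'a' => { :x => [1, :two] } }

# Round-trip through JSON text.
copy = JSON.parse(JSON.generate(original))

p copy               # Symbols come back as Strings
p copy == original   # false: the copy no longer matches the original
```

JSON.parse accepts symbolize_names: true, but that turns every object key into a Symbol and leaves values as Strings, so it cannot restore a structure that mixes String and Symbol keys.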
This was all run with ruby 1.9.2p180.
#!/usr/bin/env ruby
require 'benchmark'
require 'yaml'
require 'json/ext'
require 'msgpack'

# Marshal round trip
def dc1(value)
  Marshal.load(Marshal.dump(value))
end

# YAML round trip
# Note: with Psych 4+ (Ruby 3.1+), YAML.load is safe by default and
# rejects the Symbol in the test value; YAML.unsafe_load is needed there.
def dc2(value)
  YAML.load(YAML.dump(value))
end

# JSON round trip (Symbols come back as Strings)
def dc3(value)
  JSON.load(JSON.dump(value))
end

# Custom recursive copy of Hashes and Arrays.
# Leaf values (Strings included) are returned as-is, not copied.
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each { |k, v| result[k] = dc4(v) }
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each { |v| result << dc4(v) }
    result
  else
    value
  end
end

# MessagePack round trip
def dc5(value)
  MessagePack.unpack(value.to_msgpack)
end

value = {'a' => {:x => [1, [nil, 'b'], {'a' => 1}]}, 'b' => ['z']}

Benchmark.bm do |x|
  iterations = 10000
  x.report { iterations.times { dc1(value) } }
  x.report { iterations.times { dc2(value) } }
  x.report { iterations.times { dc3(value) } }
  x.report { iterations.times { dc4(value) } }
  x.report { iterations.times { dc5(value) } }
end
Results:

      user     system      total        real
  0.230000   0.000000   0.230000 (  0.239257)   (Marshal)
  3.240000   0.030000   3.270000 (  3.262255)   (YAML)
  0.590000   0.010000   0.600000 (  0.601693)   (JSON)
  0.060000   0.000000   0.060000 (  0.067661)   (Custom)
  0.090000   0.010000   0.100000 (  0.097705)   (MessagePack)
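One reason the custom version is so fast is that it only rebuilds the Hash and Array containers and hands back leaf values untouched. The sketch below repeats dc4 from the benchmark and compares it with the Marshal round trip on a small, made-up value; the variable names are illustrative only.

```ruby
# dc4 as defined in the benchmark script.
def dc4(value)
  if value.is_a?(Hash)
    result = value.clone
    value.each { |k, v| result[k] = dc4(v) }
    result
  elsif value.is_a?(Array)
    result = value.clone
    result.clear
    value.each { |v| result << dc4(v) }
    result
  else
    value
  end
end

value = { 'b' => [String.new('z')] }

custom  = dc4(value)
marshal = Marshal.load(Marshal.dump(value))

# The containers are new objects in both copies...
containers_distinct = !custom['b'].equal?(value['b'])

# ...but only Marshal creates a new String for the leaf.
custom_shares_leaf  = custom['b'][0].equal?(value['b'][0])
marshal_shares_leaf = marshal['b'][0].equal?(value['b'][0])

puts "containers distinct: #{containers_distinct}"
puts "custom shares leaf:  #{custom_shares_leaf}"
puts "marshal shares leaf: #{marshal_shares_leaf}"
```

So an in-place change to a String inside the dc4 copy would also show up in the original. Whether that matters depends on how the copy is used; duplicating Strings in the else branch would close the gap, presumably at some cost in speed.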
How to create a deep copy of an object in Ruby?
Deep copy isn't built into vanilla Ruby, but you can hack it by marshalling and unmarshalling the object:
Marshal.load(Marshal.dump(@object))
This isn't perfect though, and won't work for all objects. A more robust method:
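class Object
  # Recursively clones the object and everything its instance variables
  # refer to. @deep_cloning guards against infinite recursion on cycles.
  # Note: written for older Rubies, where cloning a Fixnum, Symbol, nil,
  # etc. raises TypeError. Where such objects are frozen, the instance
  # variable assignments below raise a frozen-object error instead.
  def deep_clone
    return @deep_cloning_obj if @deep_cloning
    @deep_cloning_obj = clone
    @deep_cloning_obj.instance_variables.each do |var|
      val = @deep_cloning_obj.instance_variable_get(var)
      begin
        @deep_cloning = true
        val = val.deep_clone
      rescue TypeError
        next
      ensure
        @deep_cloning = false
      end
      @deep_cloning_obj.instance_variable_set(var, val)
    end
    deep_cloning_obj = @deep_cloning_obj
    @deep_cloning_obj = nil
    deep_cloning_obj
  end
end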
Source:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43424
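As a small illustration of the "won't work for all objects" caveat of the Marshal approach: an object that holds a Proc cannot be dumped at all. The Job class below is hypothetical and exists only for this sketch.

```ruby
# Hypothetical class that stores a block in an instance variable.
class Job
  def initialize(&callback)
    @callback = callback
  end
end

job = Job.new { :done }

# Marshal.dump walks the instance variables and fails on the Proc.
error = begin
  Marshal.dump(job)
  nil
rescue TypeError => e
  e
end

puts "Marshal failed: #{error.message}" if error
```

The same kind of failure occurs for other objects Marshal cannot serialize, such as IO objects, which is one reason to write a copy method by hand for classes like this.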
Ruby: object deep copying
This is a really thin, very specific implementation of a "deep copy". What it's demonstrating is creating an independent @name instance variable in the clone so that modifying the name of one with an in-place operation won't have the side-effect of changing the clone.
Normally deep-copy operations are important for things like nested arrays or hashes, but they're also applicable to any object with attributes that refer to things of that sort.
In your case, to make an object with a more robust dup method, you should call dup on each of the attributes in question, but I think this example is a bit broken. What it does is replace the @name in the original with a copy, which may break any references you have.
A better version is:
def dup
  copy = super
  copy.make_independent!
  copy
end

def make_independent!
  instance_variables.each do |var|
    value = instance_variable_get(var)
    if value.respond_to?(:dup)
      instance_variable_set(var, value.dup)
    end
  end
end
This should have the effect of duplicating any instance variables which support the dup method. The intent is to leave alone things like numbers, booleans, and nil, which can't be duplicated; on Ruby 2.4 and later, calling dup on those simply returns the object itself.
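To see what this buys you, here is a minimal sketch that puts the two methods into a made-up Person class (the class and its attributes are assumptions for illustration) and then mutates the copy in place.

```ruby
# Hypothetical class using the dup / make_independent! pair from above.
class Person
  attr_reader :name, :tags

  def initialize(name, tags)
    @name = name
    @tags = tags
  end

  def dup
    copy = super
    copy.make_independent!
    copy
  end

  def make_independent!
    instance_variables.each do |var|
      value = instance_variable_get(var)
      instance_variable_set(var, value.dup) if value.respond_to?(:dup)
    end
  end
end

anna = Person.new(String.new('Anna'), [String.new('admin')])
lisa = anna.dup

lisa.name << ' Lisa'     # the copy has its own name String
lisa.tags << 'guest'     # the copy has its own tags Array
lisa.tags.first << '!'   # but the Array elements are still shared

puts anna.name           # Anna
puts anna.tags.inspect   # ["admin!"]
```

The last mutation shows the limit of this approach: each instance variable is duplicated one level deep, so nested structures still share their contents.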
Methods to create deep copy of objects without the help of Marshal
This solution works:

class CashRegister
  attr_accessor :bills

  def initialize
    @bills = []
  end

  def clone
    cloned = super
    cloned.bills = @bills.map { |bill| bill.clone }
    cloned
  end
end

class Bill
  attr_accessor :positions

  def initialize(nr)
    @nr = nr
    @positions = []
  end

  def clone
    cloned = super
    cloned.positions = @positions.map { |pos| pos.clone }
    cloned
  end
end

class Position
  attr_reader :price
  attr_writer :product

  # this method is given
  def product
    @product.clone
  end

  def initialize(product, price)
    @product = product
    @price = price
  end

  def clone
    cloned = super
    cloned.product = product
    cloned
  end
end
Why isn't there a deep copy method in Ruby?
I'm not sure why there's no deep copy method in Ruby, but I'll try to make an educated guess based on the information I could find (see links and quotes below the line).
Judging from this information, I could only infer that the reason Ruby does not have a deep copy method is because it's very rarely necessary and, in the few cases where it truly is necessary, there are other, relatively simple ways to accomplish the same task:
As you already know, using Marshal.dump and Marshal.load is currently the recommended way to do this. This is also the approach recommended by The Ruby Programming Language (see excerpts below).
Alternatively, there are at least 3 available implementations found in these gems: deep_cloneable, deep_clone and ruby_deep_clone; the first being the most popular.
Related Information
Here's a discussion over at comp.lang.ruby which might shed some light on this. There's another answer here with some associated discussions, but it all comes back to using Marshal.
There weren't any mentions of deep copying in Programming Ruby, but there were a few mentions in The Ruby Programming Language. Here are a few related excerpts:
[…] Another use for Marshal.dump and Marshal.load is to create deep copies of objects:

def deepcopy(o)
  Marshal.load(Marshal.dump(o))
end

[…] … the binary format used by Marshal.dump and Marshal.load is version-dependent, and newer versions of Ruby are not guaranteed to be able to read marshalled objects written by older versions of Ruby.

[…] Note that files and I/O streams, as well as Method and Binding objects, are too dynamic to be marshalled; there would be no reliable way to restore their state.

[…] Instead of making a defensive deep copy of the array, just call to_enum on it, and pass the resulting enumerator instead of the array itself. In effect, you’re creating an enumerable but immutable proxy object for your array.
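The to_enum suggestion in the last excerpt can be sketched as follows. The variable names are made up for illustration; the point is that the enumerator exposes iteration but none of Array's mutating methods.

```ruby
scores = [3, 1, 2]

# Hand out an enumerator instead of the array itself.
view = scores.to_enum

# Enumerable methods work as usual...
sorted = view.sort
total  = view.reduce(:+)

# ...but Array's mutators are not available on the enumerator.
can_append = view.respond_to?(:<<)

# The enumerator is a live view, not a snapshot: later changes to the
# underlying array are visible through it.
scores << 4
current = view.to_a

puts sorted.inspect
puts total
puts can_append
puts current.inspect
```

This protects the array's structure, not its elements: mutable objects inside the array can still be changed in place by whoever iterates over them.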
Provide simplest example where deep copy is needed in ruby
The example you have shown does not describe the difference between a deep and a shallow copy. Instead, consider this example:
class Klass
  attr_accessor :name
end

anna = Klass.new
anna.name = 'Anna'

anna_lisa = anna.dup
anna_lisa.name << ' Lisa'
# => "Anna Lisa"

anna.name
# => "Anna Lisa"
Generally, dup and clone are both expected to duplicate only the object you call the method on. No other referenced objects, like the name String in the above example, are duplicated. Thus, after the duplication, both the original and the duplicated object point to the very same name string.
With a deep_dup, typically all (relevant) referenced objects are duplicated too, often to an arbitrary depth. Since this is rather hard to achieve for all possible object references, people often rely on implementations for specific objects like hashes and arrays.
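A type-specific implementation of the kind described above might look like the following minimal sketch. The method name simple_deep_dup is made up for this example (it is not a built-in or a library method), and it only knows about Hashes, Arrays and Strings.

```ruby
# Hypothetical helper: recursively copies Hashes, Arrays and Strings.
# Anything else is returned as-is, and cyclic structures are not handled.
def simple_deep_dup(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, val), copy|
      copy[key] = simple_deep_dup(val)
    end
  when Array
    value.map { |item| simple_deep_dup(item) }
  when String
    value.dup
  else
    value
  end
end

original = { 'names' => [String.new('Anna')], meta: { nick: String.new('An') } }
copy = simple_deep_dup(original)

# In-place changes to the copy leave the original untouched.
copy['names'][0] << ' Lisa'
copy[:meta][:nick] << 'nie'

puts original.inspect
puts copy.inspect
```

Building a fresh Hash drops details such as default values or a default proc, which is an accepted simplification in this sketch.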
A common workaround for a rather generic deep-dup is to use Ruby's Marshal class to serialize an object graph and immediately deserialize it again.
anna_lena = Marshal.load(Marshal.dump(anna))
This creates new objects and is effectively a deep_dup. Since most objects support marshaling right away, this is a rather powerful mechanism. Note though that you should never unmarshal (i.e. load) user-provided data, since this can lead to a remote-code execution vulnerability.
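Putting the two behaviours side by side, here is a minimal sketch that reuses the Klass example and checks which copies share the name String. The variable names shallow and deep are chosen for this illustration.

```ruby
class Klass
  attr_accessor :name
end

anna = Klass.new
anna.name = String.new('Anna')

shallow = anna.dup                           # copies the object only
deep    = Marshal.load(Marshal.dump(anna))   # copies the whole object graph

deep.name << ' Lena'      # affects the deep copy only
name_after_deep_change = anna.name.dup

shallow.name << ' Lisa'   # shared String: the original changes too
name_after_shallow_change = anna.name.dup

puts name_after_deep_change      # Anna
puts name_after_shallow_change   # Anna Lisa
```

The Marshal round trip here is safe because the dumped data comes straight from an object in the same process, not from user input.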
Ruby on Rails deep copy/ deep clone of object and its attributes
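You should clone every trial and assign them to the cloned experiment (note that from Rails 3.1 on, dup rather than clone is the method that returns a new, unsaved copy of a record):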
@experiment_new = @experiment_old.clone
@experiment_old.trials.each do |trial|
  @experiment_new.trials << trial.clone
end