How to create an operator for deep copy/cloning of objects in Ruby?
First of all, the syntax for superators is
superator ":=" do |operand|
  # code
end
It's a block, because superator is a metaprogramming macro.
Secondly, you have something going there with Marshal, but it's a bit magic-ish. Feel free to use it as long as you understand exactly what you're doing.
Thirdly, what you are doing isn't quite doable with a superator (I believe), because self cannot be reassigned inside a method. (If someone knows otherwise, please let me know.)
Also, in your example, a must already exist and be defined before you can call the method := on it.
Your best bet is probably:
class Object
  def deep_clone
    Marshal.load(Marshal.dump(self))
  end
end
to generate a deep clone of an object.
a = (b = {}).deep_clone
b[1] = 2
p a # => {}
p b # => {1=>2}
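To make the limits of the Marshal round trip concrete, here is a minimal sketch. The helper name marshal_deep_clone is hypothetical (it is not from the answer above); it wraps the same Marshal.load(Marshal.dump(...)) call in a plain method so that Object is not reopened. It shows that nested objects are copied, and that objects which cannot be marshalled, such as a Proc, make the call raise TypeError.

```ruby
# Hypothetical helper: the same Marshal round trip as above, written as a
# plain method instead of a monkey patch on Object.
def marshal_deep_clone(obj)
  Marshal.load(Marshal.dump(obj))
end

original = { list: [1, [2, 3]], name: String.new("ruby") }
copy = marshal_deep_clone(original)

# Mutate nested objects through the copy only.
copy[:list][1] << 4
copy[:name] << "!"

p original[:list] #=> [1, [2, 3]]
p copy[:list]     #=> [1, [2, 3, 4]]

# Procs cannot be marshalled, so the round trip raises TypeError for them.
begin
  marshal_deep_clone(-> { 42 })
rescue TypeError => e
  puts "cannot deep clone: #{e.message}"
end
```

If your objects hold procs, IO handles or singleton methods, this approach is not an option and a hand-written copy is needed.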
How to create a deep copy of an object in Ruby?
Deep copy isn't built into vanilla Ruby, but you can hack it by marshalling and unmarshalling the object:
Marshal.load(Marshal.dump(@object))
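This isn't perfect though, and won't work for all objects: anything that cannot be marshalled (for example Procs, IO objects, or objects with singleton methods) raises a TypeError. A more robust method: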
class Object
  def deep_clone
    # Re-entrancy guard: if this object is already being cloned (a cycle),
    # return the copy that is in progress.
    return @deep_cloning_obj if @deep_cloning
    @deep_cloning_obj = clone
    @deep_cloning_obj.instance_variables.each do |var|
      val = @deep_cloning_obj.instance_variable_get(var)
      begin
        @deep_cloning = true
        val = val.deep_clone
      rescue TypeError
        # Values that cannot be cloned are left as they are.
        next
      ensure
        @deep_cloning = false
      end
      @deep_cloning_obj.instance_variable_set(var, val)
    end
    deep_cloning_obj = @deep_cloning_obj
    @deep_cloning_obj = nil
    deep_cloning_obj
  end
end
Source:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43424
How can I clone an Object (deep copy) in Dart?
No, as far as the open issues suggest:
https://github.com/dart-lang/sdk/issues/3367
And specifically:
... Objects have identity, and you can only pass around references to them. There is no implicit copying.
Multiple initialization of auto-vivifying hashes using a new operator in Ruby
Where you ask for a := b := c := AutoHash.new.few 3, I think (though I'm not sure I understand what you want) that you really want a, b, c = AutoHash.new.few 3.
Why does few take variable args when you only ever use the first?
I also find your creation of the return value confusing; maybe try
def few(n=0)
  Array.new(n) { AutoHash.new }
end
Beyond that, it seems like few should be a class method: a, b, c = AutoHash.few 3, which will work if you define few on the class:
def AutoHash.few(n=0)
  Array.new(n) { AutoHash.new }
end
If a, b, c = AutoHash.few 3 isn't what you're looking for, and you really want to implement your own operator, then check out Hacking parse.y, which was a talk given at RubyConf 2009. You can watch the presentation at http://rubyconf2009.confreaks.com/19-nov-2009-17-15-hacking-parsey-tatsuhiro-ujihisa.html and you can see the slides at http://www.slideshare.net/ujihisa/hacking-parsey-rubyconf-2009
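For readers without the question's code at hand, here is a self-contained sketch of the class-method approach. The AutoHash body below is an assumption: the original class is not shown in this answer, so this version simply auto-vivifies nested hashes through a Hash default block.

```ruby
# Assumed stand-in for the asker's AutoHash: missing keys are filled with a
# fresh AutoHash on first access.
class AutoHash < Hash
  def initialize
    super { |hash, key| hash[key] = AutoHash.new }
  end

  # Class-level factory, as suggested above: returns n independent hashes.
  def self.few(n = 0)
    Array.new(n) { new }
  end
end

a, b, c = AutoHash.few 3
a[:x][:y] = 1

p a[:x][:y] #=> 1
p b.empty?  #=> true
p c.empty?  #=> true
```

Array.new with a block matters here: it evaluates the block once per element, so a, b and c are three distinct objects rather than three references to one hash.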
Ruby: How to use dup/clone to not mutate an original instance variable?
This is a common newbie mistake.
Suppose
a = [1, 2, 3]
b = a.dup
#=> [1, 2, 3]
b[0] = 'cat'
#=> "cat"
b #=> ["cat", 2, 3]
a #=> [1, 2, 3]
This is exactly what you were expecting and hoping for. Now consider the following.
a = [[1, 2], [3, 4]]
b = a.dup
#=> [[1, 2], [3, 4]]
b[0] = 'cat'
b #=> ["cat", [3, 4]]
a #=> [[1, 2], [3, 4]]
Again, this is the desired result. One more:
a = [[1,2], [3,4]]
b = a.dup
#=> [[1,2], [3,4]]
b[0][0] = 'cat'
b #=> [["cat", 2], [3, 4]]
a #=> [["cat", 2], [3, 4]]
Aarrg! This is the problem that you experienced. To see what's happening here, let's look at the ids of the various objects that make up a and b. Recall that every Ruby object has a unique id, returned by Object#object_id.
a = [[1, 2], [3, 4]]
b = a.dup
a.map(&:object_id)
#=> [48959475855260, 48959475855240]
b.map(&:object_id)
#=> [48959475855260, 48959475855240]
b[0] = 'cat'
b #=> ["cat", [3, 4]]
a #=> [[1, 2], [3, 4]]
b.map(&:object_id)
#=> [48959476667580, 48959475855240]
Here we simply replace b[0], which initially was the object a[0], with a different object ('cat'), which of course has a different id. That does not affect a. (In the following I will give just the last three digits of ids. If two are the same, the entire id is the same.) Now consider the following.
a = [[1, 2], [3, 4]]
b = a.dup
a.map(&:object_id)
#=> [...620, ...600]
b.map(&:object_id)
#=> [...620, ...600]
b[0][0] = 'cat'
#=> "cat"
b #=> [["cat", 2], [3, 4]]
a #=> [["cat", 2], [3, 4]]
a.map(&:object_id)
#=> [...620, ...600]
b.map(&:object_id)
#=> [...620, ...600]
We see that the elements of a and b are the same objects as they were before executing b[0][0] = 'cat'. That assignment, however, altered the value of the object whose id is ...620, which explains why a, as well as b, was altered.
To avoid modifying a we need to do the following.
a = [[1, 2], [3, 4]]
b = a.dup.map(&:dup) # same as a.dup.map { |arr| arr.dup }
#=> [[1, 2], [3, 4]]
a.map(&:object_id)
#=> [...180, ...120]
b.map(&:object_id)
#=> [...080, ...040]
Now the elements of b are different objects than those of a, so any changes to b will not affect a:
b[0][0] = 'cat'
#=> "cat"
b #=> [["cat", 2], [3, 4]]
a #=> [[1, 2], [3, 4]]
If we had
a = [[1, [2, 3]], [[4, 5], 6]]
we would need to dup to three levels:
b = a.map { |arr0| arr0.dup.map { |arr1| arr1.dup } }
#=> [[1, [2, 3]], [[4, 5], 6]]
b[0][1][0] = 'cat'
b #=> [[1, ["cat", 3]], [[4, 5], 6]]
a #=> [[1, [2, 3]], [[4, 5], 6]]
and so on.
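Rather than writing one nested map per level, the same idea can be expressed recursively. The sketch below is an assumption of how one might generalize the approach, not part of the original answer; the name deep_dup_array is hypothetical, and it only descends into Arrays (Hashes and other containers would need their own branch).

```ruby
# Hypothetical helper: duplicate an arbitrarily nested Array structure.
def deep_dup_array(obj)
  # Arrays are rebuilt element by element, recursing into nested arrays.
  return obj.map { |element| deep_dup_array(element) } if obj.is_a?(Array)

  # Leaves are dup'ed so that mutable ones (e.g. Strings) are not shared.
  obj.dup
rescue TypeError
  # Older Rubies raise TypeError when dup'ing Integers, Symbols or nil;
  # those objects are immutable, so sharing them is harmless.
  obj
end

a = [[1, [2, 3]], [[4, 5], 6]]
b = deep_dup_array(a)
b[0][1][0] = 'cat'

p b #=> [[1, ["cat", 3]], [[4, 5], 6]]
p a #=> [[1, [2, 3]], [[4, 5], 6]]
```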
When to use dup, and when to use clone in Ruby?
It is true that clone copies the frozen state of an object, while dup does not:
o = Object.new
o.freeze
o.clone.frozen?
#=> true
o.dup.frozen?
#=> false
clone will also copy the singleton methods of the object while dup does not:
o = Object.new
def o.foo
  42
end
o.clone.respond_to?(:foo)
#=> true
o.dup.respond_to?(:foo)
#=> false
Which leads me to the assumption that clone is sometimes understood to provide a "deeper" copy than dup. Here are some quotes about the topic:
Comment on ActiveRecord::Base#initialize_dup from Rails 3:
Duped objects have no id assigned and are treated as new records. Note
that this is a "shallow" copy as it copies the object's attributes
only, not its associations. The extent of a "deep" copy is application
specific and is therefore left to the application to implement according
to its need.
An article about deep copies in Ruby:
There is another method worth mentioning, clone. The clone method does the same thing as dup with one important distinction: it's expected that objects will override this method with one that can do deep copies.
But then again, there's deep_dup in Rails 4:
Returns a deep copy of object if it's duplicable. If it's not duplicable, returns self.
and also ActiveRecord::Core#dup and #clone in Rails 4:
clone - Identical to Ruby's clone method. This is a "shallow" copy. Be warned that your attributes are not copied. [...] If you need a copy of your attributes hash, please use the #dup method.
Which means that here, the word dup is used to refer to a deep clone again. As far as I can see, there seems to be no consensus in the community, except that you should use clone or dup when you need a specific side effect of either one.
Finally, I see dup much more often in Ruby code than clone. I have never used clone so far, and I won't until I explicitly need to.
What's the difference between Ruby's dup and clone methods?
Subclasses may override these methods to provide different semantics. In Object itself, there are two key differences.
First, clone copies the singleton class, while dup does not.
o = Object.new
def o.foo
  42
end
o.dup.foo # raises NoMethodError
o.clone.foo # returns 42
Second, clone preserves the frozen state, while dup does not.
class Foo
  attr_accessor :bar
end
o = Foo.new
o.freeze
o.dup.bar = 10 # succeeds
o.clone.bar = 10 # raises FrozenError (a RuntimeError subclass; plain RuntimeError before Ruby 2.5)
The Rubinius implementation for these methods is often my source for answers to these questions, since it is quite clear, and a fairly compliant Ruby implementation.
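As an illustration of a class providing different copy semantics, here is a minimal sketch using initialize_copy, the hook that both dup and clone invoke on the new object. The Playlist class and its songs attribute are hypothetical names chosen for the example.

```ruby
# Hypothetical class: copies get their own songs array instead of sharing it.
class Playlist
  attr_reader :songs

  def initialize(songs)
    @songs = songs
  end

  # Called on the copy by both dup and clone, with the original as argument.
  def initialize_copy(source)
    super
    @songs = source.songs.dup
  end
end

original = Playlist.new(["intro"])
duped = original.dup
cloned = original.clone

duped.songs << "verse"
cloned.songs << "chorus"

p original.songs #=> ["intro"]
p duped.songs    #=> ["intro", "verse"]
p cloned.songs   #=> ["intro", "chorus"]
```

Note that this copies only one level: the strings inside the array are still shared between the original and its copies.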
Pass by reference or pass by copy - Ruby Modules
All variables in Ruby are references to objects. You cannot "pass by value" versus "pass by reference" in the same way as you have that choice in C, C++ or Perl. Ruby in fact forces pass by value, there are no options to do otherwise. However, the values that are sent are always references to objects. It's a bit like using C or C++ where all member variables are pointers, or using Perl where you must work with references at all times, even when working with simple scalars.
I think that it is this separation of variable from object data that is confusing you.
A few points:
- Variable assignment never overwrites other variables that may point to the same object. This is pretty much the definition of pass-by-value. However, this doesn't meet your expectation that object contents are also protected.
- Instance variables, and items in containers (e.g. in Arrays and Strings), are separate variables, and if you send a container you can alter its content directly, because you sent the reference to the container, and that includes the same variables for its contents. I think this is what you mean by "seems to be pass-by reference".
- Some classes, including those representing numbers, and Symbol, are immutable, i.e. there are no change-in-place methods for the number 4. But conceptually you are still passing a reference to the singular object 4 into a routine (under the hood, for efficiency Ruby will have the value 4 encoded simply in the variable's memory allocation, but that is an implementation detail - the value is also the "pointer" in this case).
- The simplest way to get close to the "pass by value" semantics you seem to be looking for with SampleModule is to clone the parameters at the start of the routine. Note this does not actually cause Ruby to change calling semantics, just that in this case from the outside of the method you get the safe assumption (whatever happens to the param inside the method stays inside the method) that you seem to want:
module SampleModule
  def self.testing(o)
    o = o.clone
    o.test
  end
end
- Technically this should be a deep clone to be generic, but that wouldn't be required to make your example work close to a pass-by-value. You could call SampleModule.testing( any_var_or_expression ) and know that whatever any_var_or_expression is in the rest of your code, the associated object will not have been changed.
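The points above can be checked with a short experiment. The three method names below are hypothetical and exist only for this sketch: one reassigns its parameter, one mutates the object it received, and one clones first, as suggested for SampleModule.

```ruby
# Reassigning the parameter only rebinds the local variable.
def reassign(list)
  list = [:replaced]
  list
end

# Mutating the received object is visible to the caller.
def mutate(list)
  list << :added
end

# Cloning first keeps the caller's object untouched (shallow copy only).
def mutate_copy(list)
  list = list.clone
  list << :added
end

data = [1, 2]

reassign(data)
p data #=> [1, 2]

mutate_copy(data)
p data #=> [1, 2]

mutate(data)
p data #=> [1, 2, :added]
```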
clone File object with offset
If moving in start-to-end direction:
#!/usr/bin/env ruby
f = File.open('/home/yuri/_/1.txt')
def f.dup
  r = File.open path
  r.seek pos
  r
end
p f.gets
f2 = f.dup
p f.gets
p f2.gets
Output:
"1\n"
"2\n"
"2\n"
If moving backwards:
#!/usr/bin/env ruby
require 'elif'
f = Elif.open('/home/yuri/_/1.txt')
def f.dup
  file = instance_variable_get(:@file)
  r = Elif.open file.path
  r.instance_variable_get(:@file).seek file.pos
  instance_variables.select{ |n| n != :@file }.each{ |n|
    r.instance_variable_set n, begin
      instance_variable_get(n).dup
    rescue TypeError => e
      # !!! not sure about the following line
      raise unless e.message == "can't dup %s" % instance_variable_get(n).class.name
      instance_variable_get(n)
    end
  }
  r
end
p f.gets
f2 = f.dup
p f.gets
p f2.gets
Output:
"4\n"
"3\n"
"3\n"
1.txt:
1
2
3
4