How to Create an Operator for Deep Copy/Cloning of Objects in Ruby

How to create an operator for deep copy/cloning of objects in Ruby?

First of all, the syntax for superators is

superator ":=" do |operand|
  # code
end

It's a block, because superator is a metaprogramming macro.

Secondly, you have something going there with Marshal, but it's a bit magic-ish. Feel free to use it as long as you understand exactly what you're doing.

Thirdly, what you are doing isn't quite doable with a superator (I believe), because self cannot be reassigned inside a method. (If someone knows otherwise, please let me know.)

Also, in your example, a must already be defined before you can call the := method on it.

Your best bet is probably:

class Object
  def deep_clone
    Marshal.load(Marshal.dump(self))
  end
end

to generate a deep clone of an object.

a = (b = {}).deep_clone
b[1] = 2
p a # => {}
p b # => {1=>2}

How to create a deep copy of an object in Ruby?

Deep copy isn't built into vanilla Ruby, but you can hack it by marshalling and unmarshalling the object:

Marshal.load(Marshal.dump(@object))
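
As a quick check of what the round trip buys you, here is a minimal sketch comparing it with a plain dup on a nested hash. The variable names and sample data are illustrative assumptions, not taken from the original question.

```ruby
# Illustrative data; any structure made of plain Ruby values behaves the same way.
original = { name: "config", tags: ["a", "b"] }

shallow = original.dup                          # copies only the outer hash
deep    = Marshal.load(Marshal.dump(original))  # serializes, then rebuilds every nested object

original[:tags] << "c"

shallow_tags = shallow[:tags]  # still the same Array object as original[:tags]
deep_tags    = deep[:tags]     # an independent Array

p shallow_tags  # => ["a", "b", "c"]
p deep_tags     # => ["a", "b"]
```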

This isn't perfect though, and won't work for all objects. A more robust method:

class Object
  def deep_clone
    # If we are already in the middle of cloning this object, reuse the copy
    # in progress so that reference cycles do not recurse forever.
    return @deep_cloning_obj if @deep_cloning
    @deep_cloning_obj = clone
    @deep_cloning_obj.instance_variables.each do |var|
      val = @deep_cloning_obj.instance_variable_get(var)
      begin
        @deep_cloning = true
        val = val.deep_clone
      rescue TypeError
        # Values that cannot be cloned are left shared with the original.
        next
      ensure
        @deep_cloning = false
      end
      @deep_cloning_obj.instance_variable_set(var, val)
    end
    deep_cloning_obj = @deep_cloning_obj
    @deep_cloning_obj = nil
    deep_cloning_obj
  end
end

Source:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/43424
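
To make "won't work for all objects" concrete, the sketch below shows one case where the Marshal round trip fails: an object that holds a Proc. The Job class is a hypothetical example invented for illustration.

```ruby
# Hypothetical class used only for illustration.
class Job
  def initialize(callback)
    @callback = callback
  end
end

job = Job.new(proc { :done })

# Marshal cannot serialize Procs, so dumping an object that holds one fails.
error = begin
  Marshal.load(Marshal.dump(job))
  nil
rescue TypeError => e
  e
end

p error.class  # => TypeError
```

Objects with singleton methods and IO objects are rejected by Marshal.dump in the same way, which is why a hand-written traversal is sometimes needed.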

How can I clone an Object (deep copy) in Dart?

No, as far as the open issue seems to suggest:

https://github.com/dart-lang/sdk/issues/3367

And specifically:

... Objects have identity, and you can only pass around references to them. There is no implicit copying.

Multiple initialization of auto-vivifying hashes using a new operator in Ruby

Where you ask for a := b := c := AutoHash.new.few 3, I think (though I'm not sure I understand what you want) that you really want a,b,c=AutoHash.new.few 3.

Why does few take variable args, when you only ever use the first?

I also find your creation of the return value confusing. Maybe try:

def few(n=0)
  Array.new(n) { AutoHash.new }
end

Beyond that, it seems like few should be a class method, so that you can write a,b,c=AutoHash.few 3. That works if you define few on the class:

def AutoHash.few(n=0)
  Array.new(n) { AutoHash.new }
end
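
Put together, the class-method version might look like the sketch below. The AutoHash definition is an assumed stand-in (a Hash subclass whose missing keys create nested AutoHashes), since the asker's actual class is not shown here.

```ruby
# Assumed stand-in for the asker's AutoHash: reading a missing key creates a nested AutoHash.
class AutoHash < Hash
  def initialize
    super() { |hash, key| hash[key] = AutoHash.new }
  end

  # Class-level factory: returns n independent AutoHash instances.
  def self.few(n = 0)
    Array.new(n) { AutoHash.new }
  end
end

a, b, c = AutoHash.few(3)
a[:x][:y] = 1   # intermediate level :x is created automatically

p a[:x][:y]    # => 1
p b.empty?     # => true
p a.equal?(b)  # => false
```

Using the block form of Array.new matters here: Array.new(n, AutoHash.new) would put the same object in every slot.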

If a,b,c=AutoHash.few 3 isn't what you're looking for, and you really want to implement your own operator, then check out Hacking parse.y, which was a talk given at RubyConf 2009. You can watch the presentation at http://rubyconf2009.confreaks.com/19-nov-2009-17-15-hacking-parsey-tatsuhiro-ujihisa.html and you can see the slides at http://www.slideshare.net/ujihisa/hacking-parsey-rubyconf-2009

Ruby: How to use dup/clone to not mutate an original instance variable?

This is a common newbie mistake.

Suppose

a = [1, 2, 3]
b = a.dup
#=> [1, 2, 3]
b[0] = 'cat'
#=> "cat"
b #=> ["cat", 2, 3]
a #=> [1, 2, 3]

This is exactly what you were expecting and hoping for. Now consider the following.

a = [[1, 2], [3, 4]]
b = a.dup
#=> [[1, 2], [3, 4]]
b[0] = 'cat'
b #=> ["cat", [3, 4]]
a #=> [[1, 2], [3, 4]]

Again, this is the desired result. One more:

a = [[1,2], [3,4]]
b = a.dup
#=> [[1,2], [3,4]]
b[0][0] = 'cat'
b #=> [["cat", 2], [3, 4]]
a #=> [["cat", 2], [3, 4]]

Argh! This is the problem that you experienced. To see what's happening here, let's look at the ids of the various objects that make up a and b. Recall that every Ruby object has a unique id, returned by Object#object_id.

a = [[1, 2], [3, 4]]
b = a.dup
a.map(&:object_id)
#=> [48959475855260, 48959475855240]
b.map(&:object_id)
#=> [48959475855260, 48959475855240]
b[0] = 'cat'
b #=> ["cat", [3, 4]]
a #=> [[1, 2], [3, 4]]
b.map(&:object_id)
#=> [48959476667580, 48959475855240]

Here we simply replace b[0], which initially was the same object as a[0], with a different object ('cat'), which of course has a different id. That does not affect a. (In the following I will give just the last three digits of the ids. If those are the same, the entire id is the same.) Now consider the following.

a = [[1, 2], [3, 4]]
b = a.dup
a.map(&:object_id)
#=> [...620, ...600]
b.map(&:object_id)
#=> [...620, ...600]
b[0][0] = 'cat'
#=> "cat"
b #=> [["cat", 2], [3, 4]]
a #=> [["cat", 2], [3, 4]]
a.map(&:object_id)
#=> [...620, ...600]
b.map(&:object_id)
#=> [...620, ...600]

We see that the elements of a and b are the same objects as they were before executing b[0][0] = 'cat'. That assignment, however, altered the value of the object whose id is ...620, which explains why a, as well as b, was altered.

To avoid modifying a we need to do the following.

a = [[1, 2], [3, 4]]
b = a.dup.map(&:dup) # same as a.dup.map { |arr| arr.dup }
#=> [[1, 2], [3, 4]]
a.map(&:object_id)
#=> [...180, ...120]
b.map(&:object_id)
#=> [...080, ...040]

Now the elements of b are different objects than those of a, so any changes to b will not affect a:

b[0][0] = 'cat'
#=> "cat"
b #=> [["cat", 2], [3, 4]]
a #=> [[1, 2], [3, 4]]

If we had

a = [[1, [2, 3]], [[4, 5], 6]]

we would need to dup to three levels:

b = a.map { |arr0| arr0.dup.map { |arr1| arr1.dup } }
#=> [[1, [2, 3]], [[4, 5], 6]]
b[0][1][0] = 'cat'
b #=> [[1, ["cat", 3]], [[4, 5], 6]]
a #=> [[1, [2, 3]], [[4, 5], 6]]

and so on.
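
Rather than writing one map per level, the same idea can be expressed recursively. This is a minimal sketch; nested_dup is a hypothetical helper name, and it only copies Arrays, so other mutable elements (Strings, Hashes) would still be shared with the original.

```ruby
# Hypothetical helper: rebuilds every nested Array; non-Array elements are returned as-is.
def nested_dup(obj)
  obj.is_a?(Array) ? obj.map { |element| nested_dup(element) } : obj
end

a = [[1, [2, 3]], [[4, 5], 6]]
b = nested_dup(a)
b[0][1][0] = 'cat'

p b  # => [[1, ["cat", 3]], [[4, 5], 6]]
p a  # => [[1, [2, 3]], [[4, 5], 6]]
```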

When to use dup, and when to use clone in Ruby?

It is true that clone copies the frozen state of an object, while dup does not:

o = Object.new
o.freeze

o.clone.frozen?
#=> true

o.dup.frozen?
#=> false

clone will also copy the singleton methods of the object while dup does not:

o = Object.new
def o.foo
  42
end

o.clone.respond_to?(:foo)
#=> true

o.dup.respond_to?(:foo)
#=> false

This leads me to suspect that clone is sometimes understood to provide a "deeper" copy than dup. Here are some quotes on the topic:

Comment on ActiveRecord::Base#initialize_dup from Rails 3:

Duped objects have no id assigned and are treated as new records. Note
that this is a "shallow" copy as it copies the object's attributes
only, not its associations. The extent of a "deep" copy is application
specific and is therefore left to the application to implement according
to its need.

An article about deep copies in Ruby:

There is another method worth mentioning, clone. The clone method does the same thing as dup with one important distinction: it's expected that objects will override this method with one that can do deep copies.

But then again, there's deep_dup in Rails 4:

Returns a deep copy of object if it's duplicable. If it's not duplicable, returns self.

and also ActiveRecord::Core#dup and #clone in Rails 4:

clone — Identical to Ruby's clone method. This is a "shallow" copy. Be warned that your attributes are not copied. [...] If you need a copy of your attributes hash, please use the #dup method.

Which means that here, the word dup is used to refer to a deep clone again. As far as I can see, there seems to be no consensus in the community, except that you should use clone and dup in the case when you need a specific side effect of either one.

Finally, I see dup much more often in Ruby code than clone. I have never used clone so far, and I won't until I explicitly need to.

What's the difference between Ruby's dup and clone methods?

Subclasses may override these methods to provide different semantics. In Object itself, there are two key differences.

First, clone copies the singleton class, while dup does not.

o = Object.new
def o.foo
  42
end

o.dup.foo # raises NoMethodError
o.clone.foo # returns 42

Second, clone preserves the frozen state, while dup does not.

class Foo
  attr_accessor :bar
end
o = Foo.new
o.freeze

o.dup.bar = 10 # succeeds
o.clone.bar = 10 # raises RuntimeError (FrozenError, a subclass, in Ruby 2.5+)

The Rubinius implementation for these methods is often my source for answers to these questions, since it is quite clear, and a fairly compliant Ruby implementation.
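
If you want the singleton methods but not the frozen state, the two behaviours can be separated. The sketch below assumes Ruby 2.4 or later, where clone accepts a freeze: keyword; the variable names are illustrative.

```ruby
o = Object.new
def o.foo
  42
end
o.freeze

# Assumes Ruby 2.4+: clone(freeze: false) keeps the singleton class but drops the frozen state.
thawed = o.clone(freeze: false)

p thawed.frozen?            # => false
p thawed.respond_to?(:foo)  # => true
p o.clone.frozen?           # => true
p o.dup.respond_to?(:foo)   # => false
```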

Pass by reference or pass by copy - Ruby Modules

All variables in Ruby are references to objects. You cannot "pass by value" versus "pass by reference" in the same way as you have that choice in C, C++ or Perl. Ruby in fact forces pass by value, there are no options to do otherwise. However, the values that are sent are always references to objects. It's a bit like using C or C++ where all member variables are pointers, or using Perl where you must work with references at all times, even when working with simple scalars.

I think that it is this separation of variable from object data that is confusing you.

A few points:

  • Variable assignment never overwrites other variables that may point to the same object. This is pretty much the definition of pass-by-value. However, it doesn't meet your expectation that object contents are also protected.

  • Instance variables and items in containers (e.g. in Arrays and Strings) are separate variables. If you send a container, you can alter its contents directly, because you sent a reference to the container, and that includes the same variables for its contents. I think this is what you mean by "seems to be pass-by reference".

  • Some classes - including those representing numbers, and Symbol - are immutable i.e. there are no change-in-place methods for the number 4. But conceptually you are still passing a reference to the singular object 4 into a routine (under the hood, for efficiency Ruby will have the value 4 encoded simply in the variable's memory allocation, but that is an implementation detail - the value is also the "pointer" in this case).

  • The simplest way to get close to the "pass by value" semantics you seem to be looking for with SampleModule is to clone the parameters at the start of the routine. Note this does not actually cause Ruby to change calling semantics, just that in this case from the outside of the method you get the safe assumption (whatever happens to the param inside the method stays inside the method) that you seem to want:


module SampleModule
  def self.testing(o)
    o = o.clone
    o.test
  end
end

  • Technically this should be a deep clone to be generic, but that wouldn't be required to make your example work close to a pass-by-value. You could call SampleModule.testing( any_var_or_expression ) and know that whatever any_var_or_expression is in the rest of your code, the associated object will not have been changed.
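
The points above can be illustrated with a short sketch. The method names (reassign, mutate, mutate_copy) are hypothetical and are not part of SampleModule.

```ruby
def reassign(text)
  text = "replaced"  # rebinds the local variable only; the caller's object is untouched
  text
end

def mutate(text)
  text << " world"   # changes the object that caller and callee both reference
end

def mutate_copy(text)
  text = text.clone  # work on a private (shallow) copy, as suggested above
  text << "!"
end

greeting = String.new("hello")

reassign(greeting)
p greeting  # => "hello"

mutate(greeting)
p greeting  # => "hello world"

result = mutate_copy(greeting)
p greeting  # => "hello world"
p result    # => "hello world!"
```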

clone File object with offset

If moving in start-to-end direction:

#!/usr/bin/env ruby
f = File.open('/home/yuri/_/1.txt')
def f.dup
  r = File.open path
  r.seek pos
  r
end
p f.gets
f2 = f.dup
p f.gets
p f2.gets

Output:

"1\n"
"2\n"
"2\n"
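
The same forward-direction idea can be tried without depending on a particular file on disk. In this sketch, reopen_at_same_offset is a hypothetical helper name, and a Tempfile stands in for 1.txt.

```ruby
require 'tempfile'

# Hypothetical helper: opens the same path again and seeks to the original's current offset.
def reopen_at_same_offset(file)
  copy = File.open(file.path)
  copy.seek(file.pos)
  copy
end

# A temporary file stands in for 1.txt.
tmp = Tempfile.new('lines')
tmp.write("1\n2\n3\n4\n")
tmp.close  # flushes the data; the file stays on disk until unlink

f = File.open(tmp.path)
first = f.gets
f2 = reopen_at_same_offset(f)
from_original = f.gets
from_copy = f2.gets

p first          # => "1\n"
p from_original  # => "2\n"
p from_copy      # => "2\n"

f.close
f2.close
tmp.unlink
```

Unlike overriding dup on a single File instance, a standalone helper leaves the object's built-in dup untouched, which may or may not be what you want.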

If moving backwards:

#!/usr/bin/env ruby
require 'elif'
f = Elif.open('/home/yuri/_/1.txt')
def f.dup
  file = instance_variable_get(:@file)
  r = Elif.open file.path
  r.instance_variable_get(:@file).seek file.pos
  instance_variables.select{ |n| n != :@file }.each{ |n|
    r.instance_variable_set n, begin
      instance_variable_get(n).dup
    rescue TypeError => e
      # !!! not sure about the following line
      raise unless e.message == "can't dup %s" % instance_variable_get(n).class.name
      instance_variable_get(n)
    end
  }
  r
end
p f.gets
f2 = f.dup
p f.gets
p f2.gets

Output:

"4\n"
"3\n"
"3\n"

1.txt:

1
2
3
4

