How do I copy a hash in Ruby?
The clone method is Ruby's standard, built-in way to make a shallow copy:
h0 = {"John" => "Adams", "Thomas" => "Jefferson"}
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}
h1 = h0.clone
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}
h1["John"] = "Smith"
# => "Smith"
h1
# => {"John"=>"Smith", "Thomas"=>"Jefferson"}
h0
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}
Note that the behavior may be overridden. From the documentation:
"This method may have class-specific behavior. If so, that behavior will be documented under the #initialize_copy method of the class."
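As a rough illustration of that class-specific behavior, the sketch below defines initialize_copy on a made-up class so that clone (and dup) also copies an internal hash. The Registry class and its entries attribute are hypothetical names invented for this example; they do not come from the answer above.

```ruby
# Hypothetical class, used only to illustrate #initialize_copy.
class Registry
  attr_reader :entries

  def initialize
    @entries = {}
  end

  # Ruby calls this on the new copy (for both #dup and #clone),
  # passing the object being copied as the argument.
  def initialize_copy(source)
    super
    @entries = source.entries.dup # give the copy its own top-level hash
  end
end

original = Registry.new
original.entries["John"] = "Adams"

copy = original.clone
copy.entries["John"] = "Smith"

original.entries["John"] # unchanged, still "Adams"
copy.entries["John"]     # "Smith"
```

Without the initialize_copy override, both objects would share the same @entries hash, because clone only copies the instance variable references.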
Duplicating a Hash in Ruby
To @pjs's point, Hash#dup will 'do the right thing' for the top level of a hash. For nested hashes, however, it still fails.
If you're open to using a gem, consider using deep_enumerable, a gem I wrote for exactly this purpose (among others).
DEFAULT_HASH = { a:{a:1, b:2}, b:{a:2, b:1} }
dupped = DEFAULT_HASH.dup
dupped[:a][:a] = 'updated'
puts "dupped: #{dupped.inspect}"
puts "DEFAULT_HASH: #{DEFAULT_HASH.inspect}"
require 'deep_enumerable'
DEFAULT_HASH = { a:{a:1, b:2}, b:{a:2, b:1} }
deep_dupped = DEFAULT_HASH.deep_dup
deep_dupped[:a][:a] = 'updated'
puts "deep_dupped: #{deep_dupped.inspect}"
puts "DEFAULT_HASH: #{DEFAULT_HASH.inspect}"
Output:
dupped: {:a=>{:a=>"updated", :b=>2}, :b=>{:a=>2, :b=>1}}
DEFAULT_HASH: {:a=>{:a=>"updated", :b=>2}, :b=>{:a=>2, :b=>1}}
deep_dupped: {:a=>{:a=>"updated", :b=>2}, :b=>{:a=>2, :b=>1}}
DEFAULT_HASH: {:a=>{:a=>1, :b=>2}, :b=>{:a=>2, :b=>1}}
Alternatively, you could try something along the lines of:
def deep_dup(h)
  Hash[h.map { |k, v|
    [k,
     if v.is_a?(Hash)
       deep_dup(v)
     else
       v.dup rescue v
     end]
  }]
end
Note, this last function is nowhere near as well tested as deep_enumerable.
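To make that caveat concrete, here is a small sketch of the same idea under a different, hypothetical name (deep_dup_hash, chosen here only to avoid clashing with a gem-provided deep_dup). It shows what the function handles and one case it does not: hashes nested inside arrays are still shared, because only Hash values are recursed into.

```ruby
# Hypothetical helper; same approach as the function above, renamed for this example.
def deep_dup_hash(h)
  Hash[h.map { |k, v| [k, v.is_a?(Hash) ? deep_dup_hash(v) : (v.dup rescue v)] }]
end

original = { a: { a: 1, b: 2 }, list: [{ n: 1 }] }
copy = deep_dup_hash(original)

# Nested hashes are copied recursively, so this does not leak back.
copy[:a][:a] = "updated"
original[:a][:a] # still 1

# The array itself is dup'ed, but the hash inside it is the same object.
copy[:list][0][:n] = 2
original[:list][0][:n] # now 2 as well
```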
Cloning a Hash in Ruby2
A Hash is a collection of keys and values, where the values are references to objects. When a hash is duplicated, a new hash is created, but only the object references are copied, so the new hash contains the very same value objects. That is why the following behaves as shown:
hash = {1 => 'Some string'} #Strings are mutable
hash2 = hash.clone
hash2[1] #=> 'Some string'
hash2[1].upcase! # modifying the shared object
hash[1] #=> 'SOME STRING' # so it appears modified in both hashes
hash2[1] = 'Other string' # pointing the second hash's key at a different object
hash[1] #=> 'SOME STRING' # the original object has not been changed
hash2[2] = 'new value' # adding an entry to the second hash only
hash[2] #=> nil
If you want to duplicate the referenced objects as well, you need to perform a deep duplication. Rails (the activesupport gem) adds this as the deep_dup method. If you are not using Rails and don't want to install the gem, you can write it like this:
class Hash
  def deep_dup
    Hash[map { |key, value|
      [key, value.respond_to?(:deep_dup) ? value.deep_dup : begin
        value.dup
      rescue
        value
      end]
    }]
  end
end
hash = {1 => 'Some string'} #Strings are mutable
hash2 = hash.deep_dup
hash2[1] #=> 'Some string'
hash2[1].upcase! # modifying referenced object
hash2[1] #=> 'SOME STRING'
hash[1] #=> 'Some string' # the other hash still points to the original object; hash2 holds a copy
You should probably write something similar for arrays. I also thought about writing it for the whole Enumerable module, but that might be slightly trickier.
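One way the array case could be covered is with a single recursive function instead of monkey-patching each class. The sketch below is only an assumption about how that might look; deep_copy is a hypothetical name, and it deliberately ignores things like default procs, frozen state, and cyclic structures.

```ruby
# Hypothetical helper: recursively copies hashes and arrays, dup'ing everything else.
def deep_copy(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(key, value), copy| copy[key] = deep_copy(value) }
  when Array
    obj.map { |element| deep_copy(element) }
  else
    begin
      obj.dup
    rescue TypeError
      obj # older Rubies raise for values such as nil, symbols, and integers
    end
  end
end

nested = { 1 => ["Some string", { inner: "value" }] }
copy = deep_copy(nested)

copy[1][0].upcase!
copy[1][1][:inner] << "!"

nested[1][0]         # still "Some string"
nested[1][1][:inner] # still "value"
```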
Copy hash without pointing to the same object
You can use 'Marshal' to deep copy.
h1 = {:key_1 => {:sub_1 => "sub_1", :sub_2 => "sub_2"}}
h2 = Marshal.load(Marshal.dump(h1))
h2[:key_1][:sub_1] = "SUB_1"
h2[:key_1].delete(:sub_2)
p h1
# => {:key_1=>{:sub_1=>"sub_1", :sub_2=>"sub_2"}}
p h2
# => {:key_1=>{:sub_1=>"SUB_1"}}
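One limitation worth knowing about: Marshal can only copy what it can serialize. As a small sketch, a hash created with a default block cannot be dumped, so the round trip raises a TypeError. The variable names below are illustrative only.

```ruby
# A hash with a default proc cannot be serialized by Marshal.
counts = Hash.new { |hash, key| hash[key] = [] }
counts[:a] << 1

error_raised = false
begin
  Marshal.load(Marshal.dump(counts))
rescue TypeError
  error_raised = true # Marshal refuses to dump the default proc
end

# A plain hash with the same contents round-trips fine.
plain = Marshal.load(Marshal.dump(counts.to_a.to_h))
```

The same applies to values such as procs and IO objects, so this approach fits best when the structure holds only plain data.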
How to clone array of hashes and add key value using each loop
Let's see what is happening.
arr = [{a: "cat", b: "dog"}, {a: "uno", b: "due"}]
arr.object_id
#=> 4557280
arr1 = arr
arr1.object_id
#=> 4557280
As you see, the variables arr and arr1 hold the same object, because they report the same object id (see note 1). Therefore, if that object is modified, arr and arr1 will still both hold that object. Let's try it.
arr[0] = {a: "cat", b: "dog"}
arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr.object_id
#=> 4557280
arr1
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr1.object_id
#=> 4557280
If we want to be able to modify arr in this way without it affecting arr1, we use the method Kernel#dup.
arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr1 = arr.dup
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"uno", :b=>"due"}]
arr.object_id
#=> 4557280
arr1.object_id
#=> 3693480
arr.map(&:object_id)
#=> [2631980, 4557300]
arr1.map(&:object_id)
#=> [2631980, 4557300]
As you see, arr and arr1 now hold different objects. Those objects, however, are arrays whose corresponding elements (hashes) are the same objects. Let's modify one of arr's elements.
arr[1][:a] = "owl"
arr
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"owl", :b=>"due"}]
arr.map(&:object_id)
#=> [2631980, 4557300]
arr still contains the same objects, but we have modified one. Let's look at arr1.
arr1
#=> [{:a=>"cat", :b=>"dog"}, {:a=>"owl", :b=>"due"}]
arr1.map(&:object_id)
#=> [2631980, 4557300]
Should we be surprised that arr1 has changed as well?
We need to dup both arr and the elements of arr.
arr = [{a: "one", b: "two"}, {a: "uno", b: "due"}]
arr1 = arr.dup.map(&:dup)
#=> [{:a=>"one", :b=>"two"}, {:a=>"uno", :b=>"due"}]
arr.object_id
#=> 4149120
arr1.object_id
#=> 4182360
arr.map(&:object_id)
#=> [4149200, 4149140]
arr1.map(&:object_id)
#=> [4182340, 4182280]
Now arr and arr1 are different objects and they contain different (hash) objects, so any change to one will not affect the other. (Try it.)
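Trying it could look like the following sketch, which repeats the copy and then changes one hash through arr only. Since map already returns a new array, arr.map(&:dup) on its own would give the same result as arr.dup.map(&:dup).

```ruby
arr = [{ a: "one", b: "two" }, { a: "uno", b: "due" }]
arr1 = arr.dup.map(&:dup)

# Change a hash through arr only.
arr[1][:a] = "owl"

arr[1][:a]  # "owl"
arr1[1][:a] # still "uno"

# The outer arrays and the hashes are all distinct objects.
same_hashes = arr.zip(arr1).any? { |h, h1| h.equal?(h1) } # false
```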
Now suppose arr were as follows.
arr = [{a: "cat", b: [1,2]}]
Let's make the copy.
arr1 = arr.dup.map(&:dup)
#=> [{:a=>"cat", :b=>[1, 2]}]
Now modify arr[0][:b].
arr[0][:b] << 3
#=> [1, 2, 3]
arr
#=> [{:a=>"cat", :b=>[1, 2, 3]}]
arr1
#=> [{:a=>"cat", :b=>[1, 2, 3]}]
Drat! arr1 changed. We can again look at object ids to see why that happened.
arr.object_id
#=> 4488500
arr1.object_id
#=> 4503140
arr.map(&:object_id)
#=> [4488520]
arr1.map(&:object_id)
#=> [4503100]
arr[0][:b].object_id
#=> 4488560
arr1[0][:b].object_id
#=> 4488560
We see that arr and arr1 are different objects and their respective hashes are different objects too, but the array stored under :b is the same object for both hashes. We therefore need to do something like this:
arr1[0][:b] = arr[0][:b].dup
but that's still not good enough if arr were:
arr = [{a: "cat", b: [1,[2,3]]}]
What we need is a method that will make a deep copy. A common solution for that is to use the methods Marshal::dump and Marshal::load.
arr = [{a: "cat", b: [1,2]}]
str = Marshal.dump(arr)
#=> "\x04\b[\x06{\a:\x06aI\"\bcat\x06:\x06ET:\x06b[\ai\x06i\a"
arr1 = Marshal.load(str)
#=> [{:a=>"cat", :b=>[1, 2]}]
arr[0][:b] << 3
#=> [1, 2, 3]
arr
#=> [{:a=>"cat", :b=>[1, 2, 3]}]
arr1
#=> [{:a=>"cat", :b=>[1, 2]}]
Note we could write:
arr1 = Marshal.load(Marshal.dump(arr))
As explained in the doc, the serialization used by the Marshal methods is not necessarily the same for different Ruby versions. If, for example, dump were used to produce a string that was saved to a file, and load was later invoked on the contents of that file using a different version of Ruby, the contents might not be readable. Of course, that's not a problem in this application of the methods.
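As a small sketch of where that version information lives: the first two bytes of a dumped string are the major and minor version of the Marshal format, which Ruby also exposes as the constants Marshal::MAJOR_VERSION and Marshal::MINOR_VERSION. In the dump shown earlier they appear as the leading "\x04\b".

```ruby
arr = [{ a: "cat", b: [1, 2] }]
str = Marshal.dump(arr)

# The dump starts with the format's major and minor version bytes.
header = str.bytes.first(2)
expected = [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]

header == expected # true when dump and load happen in the same Ruby process
```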
1. To make it easier to see differences in object ids, I've only shown the last seven digits. In all cases they are preceded by the digits 4877798.
Duplicate Hash Key unique Pair
When I look at the provided scenario, I see the following solution:
data = [{:mobile=>21, :web=>43},{:mobile=>23, :web=>543},{:mobile=>23, :web=>430},{:mobile=>34, :web=>13},{:mobile=>26, :web=>893}]
keys = [:mobile, :web]
result = keys.zip(data.map { |hash| hash.values_at(*keys) }.transpose).to_h
#=> {:mobile=>[21, 23, 23, 34, 26], :web=>[43, 543, 430, 13, 893]}
This first extracts the values of the keys from each hash, then transposes the resulting array. This changes [[21, 43], [23, 543], [23, 430], ...] into [[21, 23, 23, ...], [43, 543, 430, ...]]. This result can be zipped back to the keys and converted into a hash.
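The same pipeline, split into named intermediate steps, might look like the sketch below. The variable names rows, columns, and pairs are only for illustration, and the data is shortened to three hashes.

```ruby
data = [{ mobile: 21, web: 43 }, { mobile: 23, web: 543 }, { mobile: 23, web: 430 }]
keys = [:mobile, :web]

# Step 1: one row of values per hash, in key order.
rows = data.map { |hash| hash.values_at(*keys) }
# rows is [[21, 43], [23, 543], [23, 430]]

# Step 2: turn rows into columns, one column per key.
columns = rows.transpose
# columns is [[21, 23, 23], [43, 543, 430]]

# Step 3: pair each key with its column and build the hash.
pairs = keys.zip(columns)
result = pairs.to_h
```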
To get rid of duplicates you could add .each(&:uniq!) after the transpose call, or map the collections to sets with .map(&:to_set) (you need to require 'set') if you don't mind the values being sets instead of arrays.
result = keys.zip(data.map { |hash| hash.values_at(*keys) }.transpose.each(&:uniq!)).to_h
#=> {:mobile=>[21, 23, 34, 26], :web=>[43, 543, 430, 13, 893]}
require 'set'
result = keys.zip(data.map { |hash| hash.values_at(*keys) }.transpose.map(&:to_set)).to_h
#=> {:mobile=>#<Set: {21, 23, 34, 26}>, :web=>#<Set: {43, 543, 430, 13, 893}>}
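A note on why each is paired with uniq! above: uniq! returns nil when the array had no duplicates, so using it inside map can leave nil in the result. The sketch below contrasts that with the non-mutating uniq; the variable names are illustrative.

```ruby
columns = [[21, 23, 23], [43, 543]]

# uniq! returns nil for the second column, which has no duplicates.
with_bang = columns.map { |column| column.dup.uniq! }
# with_bang is [[21, 23], nil]

# uniq always returns an array, so it is safe inside map.
without_bang = columns.map(&:uniq)
# without_bang is [[21, 23], [43, 543]]
```

With each, the return value of uniq! is discarded and the arrays are modified in place, which is why the version in the answer works.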
References:
Array#map
Hash#values_at
Splat operator * (in hash.values_at(*keys))
Array#zip
Array#transpose
Array#to_h
Array#each
Array#uniq!
Enumerable#to_set
Understanding how hash copy is behaving under the hood
I just needed a little push to understand, and thanks to @Stefan I think I can answer my own question. Breaking it down we have:
root = {}
base = root
puts root.object_id
=> 47193371579760
puts base.object_id
=> 47193371579760
So both root and base became references to the same object.
base[:a] = {}
base[:a].object_id
=> 47193372751820
base = base[:a]
puts base.object_id
=> 47193372751820
puts root.object_id
=> 47193371579760
puts root
base[:a] is a new hash object. Assigning it to base makes base refer to this new object, while root keeps its reference to the original object, which now holds {:a=>{}}. That's why root doesn't change at the end.
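The same steps can be checked with equal?, which compares object identity. This is only a sketch of the behavior described above; the extra :b entry is added here purely to show that writes through base still land inside root.

```ruby
root = {}
base = root
base.equal?(root) # true: two names for one object

base[:a] = {}     # stores a new, empty hash inside the shared object
base = base[:a]   # rebinds the name base; root is not touched

base.equal?(root)     # false: base now names the inner hash
base.equal?(root[:a]) # true

base[:b] = 1          # writing through base mutates the inner hash
root == { a: { b: 1 } } # true
```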