Why does Hash.new({}) hide hash members?
This is expected behaviour (across all Ruby versions). If you experiment a bit further, you'll see that you always access the same hash, no matter which key you use:
>> a = Hash.new({})
=> {}
>> a[:a][:b] = 1
=> 1
>> a[:c][:d] = 2
=> 2
>> a[:d]
=> {:b=>1, :d=>2}
The way Hash.new with a default argument works is: if you do hash[key], it checks whether that key exists in the hash. If it does, it returns the value for that key. If not, it returns the default value. It does not add the key to the hash, and it returns the same default object (not a copy) every time.
To get what you want, specify a default block instead. The block is executed every time you access a key that is not in the hash. Inside the block you can create a new Hash and set the key to "point" to that hash, like so:
Hash.new { |h,k| h[k] = {} }
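As a quick check of the block form, here is a minimal sketch (the variable name a follows the session above; the keys are arbitrary) showing that each missing key now gets its own inner hash, and that the keys are actually stored:

```ruby
a = Hash.new { |hash, key| hash[key] = {} }
a[:a][:b] = 1
a[:c][:d] = 2

p a.keys                 # [:a, :c], both keys were stored
p a[:a].equal?(a[:c])    # false, each key has its own inner hash
p a[:a][:b]              # 1
p a[:c][:d]              # 2
```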
Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])
First, note that this behavior applies to any default value that is subsequently mutated (e.g. hashes and strings), not just arrays. It also applies similarly to the populated elements in Array.new(3, []).
TL;DR: Use Hash.new { |h, k| h[k] = [] } if you want the most idiomatic solution and don’t care why.
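The Array.new(3, []) case mentioned above can be illustrated with a short sketch. The variable names shared and separate are only for illustration:

```ruby
shared = Array.new(3, [])        # one array object, referenced three times
shared[0] << 'a'
p shared                         # [["a"], ["a"], ["a"]]

separate = Array.new(3) { [] }   # the block runs once per slot: three distinct arrays
separate[0] << 'a'
p separate                       # [["a"], [], []]
```

The block form plays the same role for Array.new that it does for Hash.new: it produces a fresh object each time instead of reusing one.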
What doesn’t work
Why Hash.new([]) doesn’t work
Let’s look more in-depth at why Hash.new([]) doesn’t work:
h = Hash.new([])
h[0] << 'a' #=> ["a"]
h[1] << 'b' #=> ["a", "b"]
h[1] #=> ["a", "b"]
h[0].object_id == h[1].object_id #=> true
h #=> {}
We can see that our default object is being reused and mutated (because it is passed as the one and only default value, the hash has no way of getting a fresh, new default value), but why are there no keys or values in the hash, despite h[1] still giving us a value? Here’s a hint:
h[42] #=> ["a", "b"]
The array returned by each [] call is just the default value, which we’ve been mutating all this time so now contains our new values. Since << doesn’t assign to the hash (there can never be assignment in Ruby without an = present†), we’ve never put anything into our actual hash. Instead we have to use <<= (which is to << as += is to +):
h[2] <<= 'c' #=> ["a", "b", "c"]
h #=> {2=>["a", "b", "c"]}
This is the same as:
h[2] = (h[2] << 'c')
Why Hash.new { [] } doesn’t work
Using Hash.new { [] } solves the problem of reusing and mutating the original default value (as the block given is called each time, returning a new array), but not the assignment problem:
h = Hash.new { [] }
h[0] << 'a' #=> ["a"]
h[1] <<= 'b' #=> ["b"]
h #=> {1=>["b"]}
What does work
The assignment way
If we remember to always use <<=, then Hash.new { [] } is a viable solution, but it’s a bit odd and non-idiomatic (I’ve never seen <<= used in the wild). It’s also prone to subtle bugs if << is inadvertently used.
The mutable way
The documentation for Hash.new states (emphasis my own):
If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.
So we must store the default value in the hash from within the block if we wish to use << instead of <<=:
h = Hash.new { |h, k| h[k] = [] }
h[0] << 'a' #=> ["a"]
h[1] << 'b' #=> ["b"]
h #=> {0=>["a"], 1=>["b"]}
This effectively moves the assignment from our individual calls (which would use <<=) to the block passed to Hash.new, removing the burden of unexpected behavior when using <<.
Note that there is one functional difference between this method and the others: this way assigns the default value upon reading (as the assignment always happens inside the block). For example:
h1 = Hash.new { |h, k| h[k] = [] }
h1[:x]
h1 #=> {:x=>[]}
h2 = Hash.new { [] }
h2[:x]
h2 #=> {}
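If storing a key on every read is unwanted, lookups that bypass the default block can be used instead. A minimal sketch, assuming the same storing block as h1 above (the key :x is arbitrary):

```ruby
h = Hash.new { |hash, key| hash[key] = [] }

p h.key?(:x)        # false, key? never calls the default block
p h.fetch(:x, [])   # [], fetch uses its own fallback, not the default block
p h.empty?          # true, neither read stored anything

h[:x]               # a plain [] read runs the block and stores the key
p h.keys            # [:x]
```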
The immutable way
You may be wondering why Hash.new([]) doesn’t work while Hash.new(0) works just fine. The key is that Numerics in Ruby are immutable, so we naturally never end up mutating them in-place. If we treated our default value as immutable, we could use Hash.new([]) just fine too:
h = Hash.new([].freeze)
h[0] += ['a'] #=> ["a"]
h[1] += ['b'] #=> ["b"]
h[2] #=> []
h #=> {0=>["a"], 1=>["b"]}
However, note that ([].freeze + [].freeze).frozen? == false. So, if you want to ensure that the immutability is preserved throughout, then you must take care to re-freeze the new object.
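One way to re-freeze is to freeze the result of + before assigning it. This is only a sketch of that idea, and it assumes Ruby 2.5 or later for the FrozenError class:

```ruby
h = Hash.new([].freeze)

h[0] += ['a']                   # Array#+ returns a new, unfrozen array
p h[0].frozen?                  # false

h[1] = (h[1] + ['b']).freeze    # re-freeze the result before storing it
p h[1].frozen?                  # true

begin
  h[1] << 'c'                   # mutating a frozen array raises
rescue FrozenError => e
  puts e.class                  # FrozenError
end

p h.default                     # [], the shared default was never modified
```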
Conclusion
Of all the ways, I personally prefer “the immutable way”—immutability generally makes reasoning about things much simpler. It is, after all, the only method that has no possibility of hidden or subtle unexpected behavior. However, the most common and idiomatic way is “the mutable way”.
As a final aside, this behavior of Hash default values is noted in Ruby Koans.
† This isn’t strictly true: methods like instance_variable_set bypass this, but they must exist for metaprogramming since the l-value in = cannot be dynamic.
Hash default value not being used
When you did a[:key] << 2, you slipped that empty array default value out and added 2 to it (modifying the actual array, not the reference) without letting the hash object a know that you had changed anything. You modified the object that a was using as a default, so you will see this as well:
p a[:wat] #=> [2]
p a[:anything] #=> [2]
In the second example, you made a new array and used b[:key]=, which tells b that it has a value under that key.
Try this if you want the best of both worlds:
c = Hash.new([])
c[:key] += [2]
This will access c[:key], make a new array with +, and reassign it.
How to remove a key from Hash and get the remaining hash in Ruby/Rails?
Rails has an except/except! method that returns the hash with those keys removed. If you're already using Rails, there's no sense in creating your own version of this.
class Hash
  # Returns a hash that includes everything but the given keys.
  #   hash = { a: true, b: false, c: nil}
  #   hash.except(:c) # => { a: true, b: false}
  #   hash # => { a: true, b: false, c: nil}
  #
  # This is useful for limiting a set of parameters to everything but a few known toggles:
  #   @person.update(params[:person].except(:admin))
  def except(*keys)
    dup.except!(*keys)
  end

  # Replaces the hash without the given keys.
  #   hash = { a: true, b: false, c: nil}
  #   hash.except!(:c) # => { a: true, b: false}
  #   hash # => { a: true, b: false }
  def except!(*keys)
    keys.each { |key| delete(key) }
    self
  end
end
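Outside Rails, the same non-destructive behavior can be approximated in plain Ruby. A minimal sketch, assuming ActiveSupport is not loaded; without_keys is a hypothetical helper name, not a library method (Ruby 3.0 and later also provide a built-in Hash#except):

```ruby
# Hypothetical helper: returns a copy of the hash without the given keys.
def without_keys(hash, *keys)
  hash.reject { |key, _value| keys.include?(key) }
end

original = { a: true, b: false, c: nil }
trimmed  = without_keys(original, :c)

p trimmed.keys     # [:a, :b]
p original.keys    # [:a, :b, :c], the original is left untouched

# Hash#delete, by contrast, mutates the hash and returns the removed value.
p original.delete(:c)   # nil here, because the value stored under :c was nil
p original.keys         # [:a, :b]
```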
Is this correct behaviour for a Ruby hash with a default value?
What's going on? Ruby's hiding data (1.9.3p125)
Ruby hides neither data nor its docs.
The default value you pass into the Hash constructor is returned whenever the key is not found in the hash. But this default value is never actually stored in the hash on its own.
To get what you want, you should use the Hash constructor with a block and store the default value in the hash yourself (on both levels of your nested hash):
hash = Hash.new { |hash, key| hash[key] = Hash.new { |h, k| h[k] = [] } }
hash[1][2] << 3
p hash[1][2] #=> [3]
p hash #=> {1=>{2=>[3]}}
p hash.keys #=> [1]
p hash.values #=> [{2=>[3]}]
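When the nesting depth is not fixed at two levels, a common variation passes the hash's own default_proc down to each new inner hash. This is a sketch of that idiom rather than part of the answer above. Note that every missing key then becomes a hash, so leaf values must be assigned explicitly:

```ruby
# Every missing key gets a new hash that reuses the same default block.
tree = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

tree[:a][:b][:c] = 1
p tree[:a][:b][:c]     # 1
p tree.keys            # [:a]
p tree[:a].keys        # [:b]
```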
How to optimize mapping hash that contains similar keys and values?
- Use Symbols instead of constants.
- Don't expose the mapping.
Constants in Ruby are mostly about information hiding. For example, if the key changes from consumer1 to consumer_1, as long as everything accesses the Hash with CONSUMER_1_TYPE you're ok. Why risk it?
Instead, fully hide the Hash. Now that it's hidden, constants are not necessary. Use Symbols.
If all the values are going to be the same, put them into their own methods.
def classification_attributes(product_type)
  product_type_mapping[product_type]
end

private def consumer_config
  { abc: abc, vpn: vpn, lbc: lbc }
end

private def industrial_config
  { vpn: vpn, htt: htt, bnn: bnn }
end

private def services_config
  { dhy: dhy, rtt: rtt, abc: abc }
end

private def product_type_mapping
  {
    consumer1: consumer_config,
    consumer2: consumer_config,
    consumer3: consumer_config,
    industrial1: industrial_config,
    industrial2: industrial_config,
    industrial3: industrial_config,
    services1: services_config,
    services2: services_config,
    services3: services_config
  }
end
That's about as far as I can say without more context. If there's that much redundancy you may be able to split product_type into type and subtype.
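Splitting into type and subtype might look like the sketch below. It assumes that product types always follow a "name plus number" pattern such as consumer2, which the question does not confirm. CONFIG_BY_TYPE, split_product_type, and the symbol lists are hypothetical placeholders for the real configuration values:

```ruby
# Placeholder attribute lists; the real values come from the application.
CONFIG_BY_TYPE = {
  consumer:   %i[abc vpn lbc],
  industrial: %i[vpn htt bnn],
  services:   %i[dhy rtt abc]
}.freeze

# Splits e.g. :consumer2 into [:consumer, 2]; returns nil if the format differs.
def split_product_type(product_type)
  match = /\A([a-z]+)(\d+)\z/.match(product_type.to_s)
  match && [match[1].to_sym, match[2].to_i]
end

def classification_attributes(product_type)
  type, _subtype = split_product_type(product_type)
  CONFIG_BY_TYPE[type]
end

p split_product_type(:consumer2)          # [:consumer, 2]
p classification_attributes(:services3)   # [:dhy, :rtt, :abc]
p classification_attributes(:unknown)     # nil
```

With this shape, adding a tenth subtype requires no new mapping entry, only a new type does.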
Consider moving product_type_mapping into config/application.rb, plus any other related configurations. This keeps the application configuration in one place, not scattered around in various classes.
module YourApp
  class Application < Rails::Application
    config.x.consumer_config = { abc: abc, vpn: vpn, lbc: lbc }.freeze
    config.x.industrial_config = { vpn: vpn, htt: htt, bnn: bnn }.freeze
    config.x.services_config = { dhy: dhy, rtt: rtt, abc: abc }.freeze
    config.x.product_type_mapping = {
      consumer1: config.x.consumer_config,
      consumer2: config.x.consumer_config,
      consumer3: config.x.consumer_config,
      industrial1: config.x.industrial_config,
      industrial2: config.x.industrial_config,
      industrial3: config.x.industrial_config,
      services1: config.x.services_config,
      services2: config.x.services_config,
      services3: config.x.services_config
    }.freeze
  end
end

# in your class...
def classification_attributes(product_type)
  Rails.configuration.x.product_type_mapping[product_type]
end
How to avoid NoMethodError for missing elements in nested hashes, without repeated nil checks?
Ruby 2.3.0 introduced a new method called dig on both Hash and Array that solves this problem entirely.
name = params.dig(:company, :owner, :name)
It returns nil if the key is missing at any level.
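A short sketch of how dig behaves, using a made-up params hash (the keys and values are illustrative only). It also shows one caveat: dig guards against nil, but an intermediate value that is neither nil nor diggable raises a TypeError:

```ruby
params = { company: { owner: { name: 'Ada' }, offices: [{ city: 'Oslo' }] } }

p params.dig(:company, :owner, :name)        # "Ada"
p params.dig(:company, :founder, :name)      # nil, :founder is missing
p params.dig(:company, :offices, 0, :city)   # "Oslo", dig also walks arrays

begin
  # 'Ada' is a String, which does not respond to dig.
  params.dig(:company, :owner, :name, :first)
rescue TypeError => e
  puts e.class                               # TypeError
end
```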
If you are using a version of Ruby older than 2.3, you can use the ruby_dig gem or implement it yourself:
module RubyDig
  def dig(key, *rest)
    if value = (self[key] rescue nil)
      if rest.empty?
        value
      elsif value.respond_to?(:dig)
        value.dig(*rest)
      end
    end
  end
end

if RUBY_VERSION < '2.3'
  Array.send(:include, RubyDig)
  Hash.send(:include, RubyDig)
end