Hashes of Hashes Idiom in Ruby

You can pass Hash.new a block that is run to produce a default value whenever a key that doesn't exist yet is looked up:

h = Hash.new { |h, k| h[k] = Hash.new }
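A minimal usage sketch of this one-level version may help. The variable and key names (scores, :alice and so on) are made up for illustration:

```ruby
# One level of auto-created inner hashes.
scores = Hash.new { |hash, key| hash[key] = Hash.new }

scores[:alice][:math] = 90   # scores[:alice] is created on first access
scores[:alice][:art]  = 85
scores[:bob][:math]   = 70

p scores

# The inner hashes are plain hashes, so a missing inner key is just nil.
missing = scores[:alice][:music]
p missing   # nil
```

Note that only the outer hash has a default block here, so a third level such as scores[:alice][:music][:grade] would raise NoMethodError on nil.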

Of course, this can be done recursively. There's an article explaining the details.

For the sake of completeness, here's the solution from the article for arbitrary depth hashes:

hash = Hash.new(&(p = lambda{|h, k| h[k] = Hash.new(&p)}))

The person to originally come up with this solution is Kent Sibilev.
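The one-liner is dense, so here is the same idea spread over a few lines, as a sketch. The names vivify and config are assumptions chosen for readability and are not from the article:

```ruby
# A lambda that stores a new hash under the missing key and hands that
# new hash the very same lambda as its own default block.
vivify = lambda { |hash, key| hash[key] = Hash.new(&vivify) }

config = Hash.new(&vivify)

# Every intermediate level is created on demand.
config[:db][:primary][:host] = "localhost"
config[:db][:primary][:port] = 5432

p config
```

The lambda can refer to vivify inside its own body because the local variable already exists by the time the lambda is first called.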

Hash of Hashes in Ruby

Use a vivified hash as shown in "Dynamically creating a multi-dimensional hash in Ruby"

Or use group_by.
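As a rough sketch of the group_by route, assuming the input is an array of row hashes (the rows data and the by_city name are invented for this example), one way to get a hash of hashes is:

```ruby
rows = [
  { city: "Oslo",   name: "Ann", age: 31 },
  { city: "Oslo",   name: "Bo",  age: 45 },
  { city: "Bergen", name: "Cy",  age: 28 }
]

# group_by builds the outer hash (city => array of rows);
# transform_values then turns each array into an inner hash (name => age).
by_city = rows
  .group_by { |row| row[:city] }
  .transform_values { |group| group.to_h { |row| [row[:name], row[:age]] } }

p by_city["Oslo"]["Bo"]   # 45
```

This needs Ruby 2.6 or later for Array#to_h with a block.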

Is there a better Ruby or Rails idiom for checking for the presence of values in a nested hash?

If you're on Rails (or otherwise have ActiveSupport loaded), you can always use try:

hsh.try(:[], 'first_key').try(:[], 'second_key')

FYI: if you're doing a lot of these checks, you might want to refactor your code to avoid these situations.
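Outside Rails, a similar nil-safe lookup can be sketched with Hash#dig, which is part of core Ruby since 2.3. This is an alternative to the try chain rather than part of the original answer, and the sample hashes are made up:

```ruby
hsh = { 'first_key' => { 'second_key' => 42 } }

found   = hsh.dig('first_key', 'second_key')    # => 42
missing = hsh.dig('no_such_key', 'second_key')  # => nil, no NoMethodError

# dig raises TypeError when an intermediate value cannot be dug into.
flat = { 'first_key' => 1 }
error = begin
  flat.dig('first_key', 'second_key')
rescue TypeError => e
  e.class
end

p found, missing, error
```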

Accessing elements of nested hashes in ruby

The way I usually do this these days is:

h = Hash.new { |h,k| h[k] = {} }

This will give you a hash that creates a new hash as the entry for a missing key, but returns nil for the second level of key:

h['foo']        # => {}
h['foo']['bar'] # => nil

You can nest this to add multiple layers that can be addressed this way:

h = Hash.new { |h, k| h[k] = Hash.new { |hh, kk| hh[kk] = {} } }

h['bar']                 # => {}
h['tar']['zar']          # => {}
h['scar']['far']['mar']  # => nil

You can also chain indefinitely by using the default_proc method:

h = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }

h['bar']                 # => {}
h['tar']['star']['par']  # => {}

The above code creates a hash whose default proc creates a new Hash with the same default proc. So, a hash created as a default value when a lookup for an unseen key occurs will have the same default behavior.

EDIT: More details

Ruby hashes allow you to control how default values are created when a lookup occurs for a new key. When specified, this behavior is encapsulated as a Proc object and is reachable via the default_proc and default_proc= methods. The default proc can also be specified by passing a block to Hash.new.

Let's break the recursive hash code down a little. This is not idiomatic Ruby, but it's easier to explain when broken out into multiple lines:

1. recursive_hash = Hash.new do |h, k|
2. h[k] = Hash.new(&h.default_proc)
3. end

Line 1 declares a variable recursive_hash to be a new Hash and begins a block to be recursive_hash's default_proc. The block is passed two objects: h, which is the Hash instance the key lookup is being performed on, and k, the key being looked up.

Line 2 sets the default value in the hash to a new Hash instance. The default behavior for this hash is supplied by passing a Proc created from the default_proc of the hash the lookup is occurring in; i.e., the default proc the block itself is defining.

Here's an example from an IRB session:

irb(main):011:0> recursive_hash = Hash.new do |h,k|
irb(main):012:1* h[k] = Hash.new(&h.default_proc)
irb(main):013:1> end
=> {}
irb(main):014:0> recursive_hash[:foo]
=> {}
irb(main):015:0> recursive_hash
=> {:foo=>{}}

When the hash at recursive_hash[:foo] was created, its default_proc was supplied by recursive_hash's default_proc. This has two effects:

  1. The default behavior for recursive_hash[:foo] is the same as recursive_hash.
  2. The default behavior for hashes created by recursive_hash[:foo]'s default_proc will be the same as recursive_hash.

So, continuing in IRB, we get the following:

irb(main):016:0> recursive_hash[:foo][:bar]
=> {}
irb(main):017:0> recursive_hash
=> {:foo=>{:bar=>{}}}
irb(main):018:0> recursive_hash[:foo][:bar][:zap]
=> {}
irb(main):019:0> recursive_hash
=> {:foo=>{:bar=>{:zap=>{}}}}
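As the session shows, merely reading a missing key stores a new hash under it. A short sketch of that side effect, and of two core methods that do not run the default proc (the keys :typo and :other are illustrative):

```ruby
recursive_hash = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }

recursive_hash[:foo][:bar] = 1

# A plain read with [] on a missing key stores a new empty hash.
recursive_hash[:typo]
created = recursive_hash.key?(:typo)             # => true

# key? and fetch bypass the default proc, so nothing is stored.
present_before = recursive_hash.key?(:other)     # => false
fetched        = recursive_hash.fetch(:other, nil) # => nil
present_after  = recursive_hash.key?(:other)     # => false

p recursive_hash.keys   # [:foo, :typo]
```

So when you only want to test for presence, key? or fetch is the safer choice with a hash like this.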

Ruby idiom for update or insert in a hashmap

Rather than what you did, a better way is:

hashmap = Hash.new{|h, k| h[k] = 0}

Then you only need to do:

hashmap[key] += 1
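For example, counting words could look like the sketch below. The words array is sample data. The Hash.new(0) and tally variants are alternatives I'm adding for comparison: Hash.new(0) is safe here because 0 is immutable, and tally needs Ruby 2.7 or later.

```ruby
words = %w[apple pear apple fig pear apple]

# Block form: the default is stored under the key on first access.
hashmap = Hash.new { |h, k| h[k] = 0 }
words.each { |word| hashmap[word] += 1 }

# Plain default value: fine for immutable defaults such as integers.
simple = Hash.new(0)
words.each { |word| simple[word] += 1 }

# Enumerable#tally does the counting in one call.
tallied = words.tally

p hashmap["apple"]   # 3
```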

What is `hash` in ruby?

TL;DR – it's the hash value for Ruby's top-level object, equivalent to self.hash.

Here's a little debugging help:

irb(main):001:0> hash
#=> 3220857809431415791

irb(main):002:0> defined? hash
#=> "method"

irb(main):003:0> method(:hash)
#=> #<Method: Object(Kernel)#hash>

You can now look up Object#hash [1] online:

http://ruby-doc.org/core-2.3.1/Object.html#method-i-hash

Or in IRB:

irb(main):004:0> help "Object#hash"
= Object#hash

(from ruby core)
------------------------------------------------------------------------------
obj.hash -> fixnum

------------------------------------------------------------------------------

Generates a Fixnum hash value for this object. This function must have the
property that a.eql?(b) implies a.hash == b.hash.

The hash value is used along with #eql? by the Hash class to determine if two
objects reference the same hash key. Any hash value that exceeds the capacity
of a Fixnum will be truncated before being used.

The hash value for an object may not be identical across invocations or
implementations of Ruby. If you need a stable identifier across Ruby
invocations and implementations you will need to generate one with a custom
method.


#=> nil
irb(main):005:0>

[1] Object(Kernel)#hash actually means that hash is defined in Kernel, but as stated in the documentation for Object:

Although the instance methods of Object are defined by the Kernel module, we have chosen to document them here for clarity.
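To make the documented rule (a.eql?(b) implies a.hash == b.hash) concrete, here is a small sketch with a hypothetical Point class, which is not part of the original answer. It overrides both methods so that equal points act as the same Hash key:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points are the same key when their coordinates match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Must agree with eql?: equal points produce equal hash values.
  def hash
    [x, y].hash
  end
end

lookup = { Point.new(1, 2) => "found" }

p lookup[Point.new(1, 2)]   # "found"
p lookup[Point.new(2, 1)]   # nil
```

Without the hash override, the second Point.new(1, 2) would usually get a different hash value and the lookup would return nil.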

Hashing in Ruby

One possible solution is to use select:

Returns a new hash consisting of entries for which the block returns true. If no block is given, an enumerator is returned instead.

Example:

h = {foo: 0, bar: 1, baz: 2}
h.select {|key, value| value < 2 } # => {:foo=>0, :bar=>1}

In your case:

input_hash.select { |k, v| k.is_a? String } # => {"100" => "gg", "str" => 10, "cruise" => 55, "tea" => 1}

Reference: https://ruby-doc.org/core-3.1.2/Hash.html#method-i-select
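A self-contained version of that call, as a sketch. The question's actual input isn't shown here, so the non-String keys in input_hash below are assumptions added to give select something to filter out:

```ruby
input_hash = {
  "100"    => "gg",
  100      => "integer key",   # assumed entry, not from the question
  "str"    => 10,
  :sym     => 7,               # assumed entry, not from the question
  "cruise" => 55,
  "tea"    => 1
}

# Keep only the entries whose key is a String.
strings_only = input_hash.select { |k, _v| k.is_a?(String) }

# reject is the inverse: it keeps everything the block returns false for.
others = input_hash.reject { |k, _v| k.is_a?(String) }

p strings_only.keys   # ["100", "str", "cruise", "tea"]
```

Both methods return a new hash and leave input_hash untouched.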

What are => and : in Ruby?


Lexer/Parser Tokens

The symbols you're referencing aren't methods or operators; they are lexer/parser tokens used to interpret the syntax of your source code. The hashrocket is defined as the tASSOC association token, which is used to associate things such as key/value pairs, or a rescued exception with a variable (as in rescue Foo => e).

The colon has several uses in Ruby, but Ruby 1.9 introduced the postfix colon as syntactic sugar for tASSOC when the left-hand side is a Symbol. I'm less sure about how the token is defined or parsed in complex cases (assoc is the most likely bet for this example), but for practical purposes you can simply think of a: 1 as semantically equivalent to :a => 1.

You can also use Ripper.sexp to examine your source code and see how it will be parsed by the interpreter. For example:

require 'ripper'

pp Ripper.sexp "{a: 1}"
[:program,
 [[:hash,
   [:assoclist_from_args,
    [[:assoc_new, [:@label, "a:", [1, 1]], [:@int, "1", [1, 4]]]]]]]]
#=> [:program, [[:hash, [:assoclist_from_args, [[:assoc_new, [:@label, "a:", [1, 1]], [:@int, "1", [1, 4]]]]]]]]

pp Ripper.sexp "{:a => 1}"
[:program,
 [[:hash,
   [:assoclist_from_args,
    [[:assoc_new,
      [:symbol_literal, [:symbol, [:@ident, "a", [1, 2]]]],
      [:@int, "1", [1, 7]]]]]]]]
#=> [:program, [[:hash, [:assoclist_from_args, [[:assoc_new, [:symbol_literal, [:symbol, [:@ident, "a", [1, 2]]]], [:@int, "1", [1, 7]]]]]]]]

In both cases, you can see that the S-expression builds an "assoc_new" subexpression, whether the pair was written with the colon or with the hashrocket. For further drill-down, you'd have to refer to the Ruby source tree.
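At runtime the two spellings produce identical hashes, which a quick sketch can confirm (the variable names are illustrative):

```ruby
colon_style  = { a: 1 }
rocket_style = { :a => 1 }

same     = colon_style == rocket_style      # => true
key_type = colon_style.keys.first.class     # => Symbol

# The quoted-label form also yields a Symbol key, not a String.
quoted_keys = { "a": 1 }.keys               # => [:a]

# The hashrocket is still required for keys that are not symbols,
# and both styles can be mixed in one literal.
mixed = { a: 1, "b" => 2, 3 => :c }

p same, key_type, quoted_keys, mixed.keys
```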

See Also

  • lexer.rb
  • parse.y

