Is Order of a Ruby Hash Literal Guaranteed

Is order of a Ruby hash literal guaranteed?

There are a couple of places where this could be specified, i.e. a couple of things that are considered "The Ruby Language Specification":

  • the ISO Ruby Language Specification
  • the RubySpec project
  • the YARV testsuite
  • The Ruby Programming Language book by matz and David Flanagan

The ISO spec doesn't say anything about Hash ordering: it was written in such a way that all existing Ruby implementations are automatically compliant with it without having to change, i.e. it was written to be descriptive of current Ruby implementations, not prescriptive. At the time the spec was written, those implementations included MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, MacRuby, XRuby, Ruby.NET, Cardinal, tinyrb, RubyGoLightly, SmallRuby, BlueRuby, and others. Of particular interest are MRI (which only implements 1.8) and YARV (which, at the time, only implemented 1.9). The spec can therefore only specify behavior that is common to 1.8 and 1.9, and Hash ordering is not.

The RubySpec project was abandoned by its developers out of frustration that the ruby-core developers and YARV developers never recognized it. It does, however, (implicitly) specify that Hash literals are ordered left-to-right:

new_hash(1 => 2, 4 => 8, 2 => 4).keys.should == [1, 4, 2]

That's the spec for Hash#keys. However, the other specs test that Hash#values has the same order as Hash#keys, that Hash#each_value and Hash#each_key have the same order as those, and that Hash#each_pair and Hash#each have the same order as well.
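To make that chain of specs concrete, here is a minimal sketch that checks the same relationships on a plain literal. It uses an ordinary hash literal rather than RubySpec's new_hash helper, and the local variable names are just for illustration:

```ruby
# The same literal as in the RubySpec example, written as a plain Hash literal.
h = { 1 => 2, 4 => 8, 2 => 4 }

keys   = h.keys    # left-to-right order of the literal
values = h.values  # same positions as the keys

# Collect what the iterators yield, in the order they yield it.
each_keys = []
h.each_key { |k| each_keys << k }

each_values = []
h.each_value { |v| each_values << v }

pairs = h.each_pair.to_a

puts keys.inspect    # [1, 4, 2]
puts values.inspect  # [2, 8, 4]
```

On Ruby 1.9 and later, each_keys and each_values match keys and values, and pairs is the same as keys.zip(values).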

I couldn't find anything in the YARV testsuite that specifies that ordering is preserved. In fact, I couldn't find anything at all about ordering in that testsuite. Quite the opposite: the tests go to great lengths to avoid depending on ordering!

The Flanagan/matz book kinda-sorta implicitly specifies Hash literal ordering in section 9.5.3.6 Hash iterators. First, it uses much the same formulation as the docs:

In Ruby 1.9, however, hash elements are iterated in their insertion order, […]

But then it goes on:

[…], and that is the order shown in the following examples:

And in those examples, it actually uses a literal:

h = { :a=>1, :b=>2, :c=>3 }

# The each() iterator iterates [key,value] pairs
h.each {|pair| print pair } # Prints "[:a, 1][:b, 2][:c, 3]"

# It also works with two block arguments
h.each do |key, value|
  print "#{key}:#{value} " # Prints "a:1 b:2 c:3"
end

# Iterate over keys or values or both
h.each_key {|k| print k } # Prints "abc"
h.each_value {|v| print v } # Prints "123"
h.each_pair {|k,v| print k,v } # Prints "a1b2c3". Like each

In his comment, @mu is too short mentioned that

h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2

and in another comment that

nothing else would make any sense

Unfortunately, that is not true:

module HashASETWithLogging
  def []=(key, value)
    puts "[]= was called with [#{key.inspect}] = #{value.inspect}"
    super
  end
end

class Hash
  prepend HashASETWithLogging
end

h = { a: 1, b: 2 }
# prints nothing

h = { }; h[:a] = 1; h[:b] = 2
# []= was called with [:a] = 1
# []= was called with [:b] = 2

So, depending on how you interpret that line from the book and depending on how "specification-ish" you judge that book, yes, ordering of literals is guaranteed.
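Even though the literal does not go through []=, the observable result on current implementations is the same as inserting the pairs one by one. A small sketch to check that yourself (the variable names are made up for the example):

```ruby
# Built from a literal.
literal = { a: 1, b: 2, c: 3 }

# Built by inserting the same pairs one at a time.
incremental = {}
incremental[:a] = 1
incremental[:b] = 2
incremental[:c] = 3

# Same pairs, written in a different order.
reordered = { c: 3, a: 1, b: 2 }

# Hash#== ignores order, so compare the pair arrays to see the ordering.
same_order     = literal.to_a == incremental.to_a # true
equal_as_hash  = literal == reordered             # true
same_as_shuffled = literal.to_a == reordered.to_a # false
```

Note that Hash#== does not care about order at all, which is why the comparison goes through to_a.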

Position of key/value pairs in a hash in Ruby (or any language)

Generally hashes (or dictionaries, associative arrays etc...) are considered unordered data structures.

From Wikipedia

In addition, associative arrays may also include other operations such
as determining the number of bindings or constructing an iterator to
loop over all the bindings. Usually, for such an operation, the order
in which the bindings are returned may be arbitrary.

However, since Ruby 1.9, hashes maintain the order in which their keys were inserted.

The answer is right at the top of the Ruby documentation for Hash

Hashes enumerate their values in the order that the corresponding keys
were inserted.
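One consequence that is easy to miss: assigning to a key that already exists keeps its original position, while deleting and re-inserting moves it to the end. A quick sketch:

```ruby
h = { a: 1, b: 2, c: 3 }

# Updating an existing key does not change its position.
h[:a] = 10
after_update = h.keys   # [:a, :b, :c]

# Deleting and re-inserting counts as a new insertion, so the key moves to the end.
h.delete(:a)
h[:a] = 10
after_reinsert = h.keys # [:b, :c, :a]
```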

In Ruby you can test it yourself pretty easily

key_indices = {
  1000 => 0,
  900 => 1,
  500 => 2,
  400 => 3,
  100 => 4,
  90 => 5,
  50 => 6,
  40 => 7,
  10 => 8,
  9 => 9,
  5 => 10,
  4 => 11,
  1 => 12
}

1_000_000.times do
  key_indices.each_with_index do |key_val, i|
    raise if key_val.last != i
  end
end

Does Set in Ruby always preserve insertion order?

In Ruby 1.9: yes. In Ruby 1.8: probably not.

Set uses a Hash internally; and since hashes are insertion-ordered in 1.9, you're good to go!

As mu is too short points out, this is an implementation detail and could change in the future (though that is unlikely). Thankfully, the current implementation of Set is pure Ruby, and could be adapted into an OrderedSet in the future if you like.
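A quick way to see this for yourself, with the caveat above that it is observed behavior rather than a documented guarantee:

```ruby
require 'set'

s = Set.new
s << 3 << 1 << 2 << 1   # the second 1 is a duplicate and is ignored

ordered = s.to_a        # [3, 1, 2]
```

The elements come back in the order they were first added, not sorted.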

Ruby pop an element from a hash table?

If you know the key, just use delete directly.
If the hash doesn't contain the key, you will get nil back; otherwise you will get whatever was stored there.

from the doc you linked to:

h = { "a" => 100, "b" => 200 }
h.delete("a") #=> 100
h.delete("z") #=> nil
h.delete("z") { |el| "#{el} not found" } #=> "z not found"

There is also shift which deletes and returns a key-value pair:

hsh = Hash.new

hsh['bb'] = 42
hsh['aa'] = 23
hsh['cc'] = 65

p hsh.shift

=> ["bb", 42]

As can be seen, the order of a hash is the order of insertion, not the order of the keys or values. From the docs:

Hashes enumerate their values in the order that the corresponding keys were inserted.
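shift takes the pair from the front. Hash has no built-in pop, so if you want the most recently inserted pair instead, something like the following would do. pop_pair is a made-up helper name, not part of Ruby:

```ruby
# Hypothetical helper: removes and returns the most recently inserted
# [key, value] pair, or nil if the hash is empty.
def pop_pair(hash)
  return nil if hash.empty?

  key = hash.keys.last        # last key in insertion order
  [key, hash.delete(key)]     # delete returns the stored value
end

hsh = {}
hsh['bb'] = 42
hsh['aa'] = 23
hsh['cc'] = 65

popped = pop_pair(hsh)  # ["cc", 65]
remaining = hsh.keys    # ["bb", "aa"]
```

Note that hash.keys builds a full array of keys, so this is fine for small hashes but not something to call in a tight loop on a large one.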

Ruby GraphQL: How to order arguments in an option?

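Ruby Hash instances do preserve insertion order (see Is order of a Ruby hash literal guaranteed?), but GraphQL treats the fields of an input object as unordered, so the key order of a single hash is not a reliable way to express sort priority.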


To leverage the optional multi sorting options in GraphQL input types, I usually use the following structure:

  • 1 enum to contain all filterable/sortable/searchable fields of a resource (ex: UserField)
  • 1 enum to contain the 2 sort directions (asc and desc)
  • 1 field accepting an optional list of { field: UserField!, sortDir: SortDir! } inputs.

This then enables the API consumers to simply do queries like:

allUsers(sort_by: [{field: username, sortDir: desc}, {field: id, sortDir: asc}]) {
# ...
}

And this pattern can be easily re-used for searching and filtering:

allUsers(search: [{field: username, comparator: like, value: 'bob'}]) {}
allUsers(search: [{field: age, comparator: greater_than, value: '22'}]) {} # type casting is done server-side
allUsers(search: [{field: username, comparator: equal, value: 'bob'}]) {} # equivalent of filtering

Eventually, with further deeper work you can allow complex and/or for the input:

allUsers(
  search: [
    {
      left: {field: username, comparator: like, value: 'bob'}
      operator: and
      right: {field: dateOfBirth, comparator: greater_than, value: '2001-01-01'}
    }
  ]
)

Disclaimer: the last example above is one of the many things I want to implement in my GQL API, but I haven't had the time to think it through yet; it's just a draft.
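On the Ruby side, the point of taking a list is that the sort priority comes from the array positions rather than from hash key order. Here is a rough sketch of applying such a list to in-memory records. apply_sort, the :field/:sort_dir keys and the sample data are all assumptions for illustration; this is not graphql-ruby API, and a real resolver would more likely translate the list into a database ORDER BY:

```ruby
# Hypothetical helper: sorts an array of hashes by a list of rules,
# where earlier rules take priority over later ones.
def apply_sort(records, sort_by)
  records.sort do |a, b|
    result = 0
    sort_by.each do |rule|
      # Assumes the values are mutually comparable (<=> does not return nil).
      result = a[rule[:field]] <=> b[rule[:field]]
      result = -result if rule[:sort_dir] == :desc
      break unless result.zero?   # first rule that decides wins
    end
    result
  end
end

users = [
  { username: 'bob',   id: 2 },
  { username: 'alice', id: 3 },
  { username: 'bob',   id: 1 }
]

sorted = apply_sort(users, [
  { field: :username, sort_dir: :desc },
  { field: :id,       sort_dir: :asc }
])

sorted_ids = sorted.map { |u| u[:id] }  # [1, 2, 3]
```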

Fill an array of hash with missing values

If arr is your array of hashes, you could construct the desired array in two steps.

require 'date'
date_fmt = "%B %Y"
first_month, last_month = arr.flat_map do |g|
g[:data].keys
end.map { |s| Date.strptime(s, date_fmt) }.minmax
#=> [#<Date: 2020-05-01 ((2458971j,0s,0n),+0s,2299161j)>,
# #<Date: 2020-11-01 ((2459155j,0s,0n),+0s,2299161j)>]

h = (first_month..last_month).map do |d|
d.strftime(date_fmt)
end.product([0]).to_h
#=> {"May 2020"=>0, "June 2020"=>0, "July 2020"=>0, "August 2020"=>0,
# "September 2020"=>0, "October 2020"=>0, "November 2020"=>0}
arr.map { |g| g.merge(:data => h.merge(g[:data])) }
#=> [
# {
# :name=>"Activity 1",
# :data=>{
# "May 2020"=>37, "June 2020"=>17, "July 2020"=>9,
# "August 2020"=>18, "September 2020"=>0,
# "October 2020"=>0, "November 2020"=>0
# }
# },
# {
# :name=>"Activity 2",
# :data=>{
# "May 2020"=>3, "June 2020"=>0, "July 2020"=>0,
# "August 2020"=>0, "September 2020"=>0,
# "October 2020"=>0, "November 2020"=>0
# }
# },
# {
# :name=>"Activity 3",
# :data=>{
# "May 2020"=>0, "June 2020"=>0, "July 2020"=>5,
# "August 2020"=>0, "September 2020"=>0,
# "October 2020"=>0, "November 2020"=>11
# }
# }
# ]

See Enumerable#flat_map, Date::strptime, Array#minmax, Date#strftime, Array#product and Hash#merge. See also DateTime#strptime for date formatting directives.

Note that in the calculation of first_month and last_month,

[#<Date: 2020-05-01 ((2458971j,0s,0n),+0s,2299161j)>,
#<Date: 2020-11-01 ((2459155j,0s,0n),+0s,2299161j)>].
map { |d| d.strftime(date_fmt) }
#=> ["May 2020", "November 2020"]
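For anyone who wants to run this end to end, here is a self-contained sketch. The question's arr is not shown here, so the sample below is an assumption reconstructed from the non-zero values in the output above. It also steps through the range one month at a time with Date#>> instead of mapping over every day in the range, and uses Array#to_h with a block (Ruby 2.6+):

```ruby
require 'date'

# Assumed input, reconstructed from the non-zero values in the output above.
arr = [
  { name: 'Activity 1', data: { 'May 2020' => 37, 'June 2020' => 17,
                                'July 2020' => 9, 'August 2020' => 18 } },
  { name: 'Activity 2', data: { 'May 2020' => 3 } },
  { name: 'Activity 3', data: { 'July 2020' => 5, 'November 2020' => 11 } }
]

date_fmt = '%B %Y'

# Earliest and latest month across all activities.
first_month, last_month = arr.flat_map { |g| g[:data].keys }
                             .map { |s| Date.strptime(s, date_fmt) }
                             .minmax

# Walk from the first month to the last, one month at a time.
months = []
d = first_month
while d <= last_month
  months << d.strftime(date_fmt)
  d = d >> 1
end

# Zero for every month, then overlay each activity's real values.
defaults = months.to_h { |m| [m, 0] }
filled = arr.map { |g| g.merge(data: defaults.merge(g[:data])) }
```

Because defaults is the receiver of merge, its key order (chronological) is the one that survives in the result.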

Why can't I access contents of flash when it's a hash?

Closer inspection revealed that, while the keys were indeed set, they'd been converted from symbols to strings.

To fix this I had to change line 4 of the controller from symbols to strings:

- message, detail = content[:message], content[:detail] if content.class == Hash
+ message, detail = content['message'], content['detail'] if content.class == Hash

If I understand correctly, this is a result of flashes being stored in the session and the session object being stored in cookies as JSON objects. JSON doesn't support symbolised keys.

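As an experiment I tried setting matching string and symbol keys. If you try doing both in one literal, note that 'message': is still symbol-key syntax (it produces :message, not "message"), so both entries use the same key; Ruby keeps the first key and the second value (with a warning):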

irb> content = { message: 'Symbol key first', 'message': 'String key second' }
=> warning: key :message is duplicated and overwritten on line X
=> {:message=>"String key second"}

But if you deliberately duplicate the keys in a hash passed to flash, whichever one is defined last "wins" (in my limited testing, but it makes sense given hashes are most likely iterated in insertion order):

symbol_first = {}
symbol_first[:message] = 'Symbol wins'
symbol_first['message'] = 'String wins'
flash[:alert] = symbol_first # 'String wins' is displayed

string_first = {}
string_first['message'] = 'String wins'
string_first[:message] = 'Symbol wins'
flash[:alert] = string_first # 'Symbol wins' is displayed
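The symbol-to-string conversion is easy to reproduce outside Rails with a plain JSON round trip. This only mimics the effect and is not the code Rails actually runs for flash; the sample content hash is made up:

```ruby
require 'json'

content = { message: 'Saved', detail: 'All good' }

# Serialising to JSON and back loses the Symbol keys.
round_tripped = JSON.parse(JSON.generate(content))

by_symbol = round_tripped[:message]   # nil
by_string = round_tripped['message']  # "Saved"

# If you prefer Symbol keys, convert them back explicitly (Ruby 2.5+).
symbolized = round_tripped.transform_keys(&:to_sym)
```

So either read the values with string keys, as in the fix above, or convert the keys back before using them.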

In Ruby what is the meaning of colon after identifier in a Hash?

The colon in this context is part of the literal Hash syntax for Symbol keys.

factory (i.e. the Symbol :factory) is the Hash key, :user is the value.

The alternative syntax is :factory => :user. They mean the same thing.
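A short sketch comparing the spellings (the variable names are only for illustration):

```ruby
new_style  = { factory: :user }        # colon-after-identifier syntax
old_style  = { :factory => :user }     # hash-rocket syntax
quoted     = { 'factory': :user }      # quoted form, still a Symbol key (Ruby 2.2+)
string_key = { 'factory' => :user }    # this one really is a String key

same_hash      = new_style == old_style        # true
quoted_is_sym  = quoted.keys.first.is_a?(Symbol) # true
string_differs = new_style != string_key       # true
```

The trailing-colon form always produces a Symbol key; you only get a String key with =>.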

Best way to pretty print a hash

require 'pp'
pp my_hash

Use pp if you need a built-in solution and just want reasonable line breaks.

Use awesome_print if you can install a gem. (Depending on your users, you may wish to use the index:false option to turn off displaying array indices.)
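If you want the pretty-printed text as a string (for a log line, say) rather than on stdout, pp also gives you pretty_inspect. A small sketch with a made-up my_hash; the exact formatting depends on your Ruby version and on how wide the hash is, so no output is shown:

```ruby
require 'pp'

my_hash = {
  name: 'Activity 1',
  data: { 'May 2020' => 37, 'June 2020' => 17 }
}

# Same formatting as pp, but returned as a String instead of printed.
text = my_hash.pretty_inspect

puts text
```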


