Which equality test does Ruby's Hash use when comparing keys?
Hash uses key.eql? to test keys for equality. If you need to use
instances of your own classes as keys in a Hash, it is recommended
that you define both the eql? and hash methods. The hash method must
have the property that a.eql?(b) implies a.hash == b.hash.
So...
class A
attr_reader :x
def initialize(inner)
@inner=inner
end
def x; @inner.x; end
def ==(other)
@inner.x==other.x
end
def eql?(other)
self == other
end
def hash
x.hash
end
end
How exactly does #eql? rely on #hash?
This seems to be actually the other way around. eql?
is expected to return true
for objects returning the same hash
value, but it is not defined to compare these values. You are simply expected to override both.
The eql? method returns true if obj and other refer to the same hash key. This is used by Hash to test members for equality. For any pair of objects where eql? returns true, the hash value of both objects must be equal. So any subclass that overrides eql? should also override hash appropriately.
What is the big-O complexity to test for the equality of two hashes?
Hash comparisons are, on average, O(keys) + ∑Cost(value).
If the values are simple then comparing each value is O(1). The hash comparison is O(keys) + ∑O(1) which is O(keys).
Ruby does a few extra things, but the basic algorithm is simple.
- If they're the same hash, they're equal.
- If they have a different number of keys, they're not equal.
- If any pair in hash1 is not in hash2, they're not equal.
- They're equal.
def hash_eq(hash1, hash2)
# O(1)
return true if hash1.equal?(hash2)
# O(1)
return false if hash1.size != hash2.size
# O(keys)
hash1.each {|k,v|
# O(1) O(value)
return false if hash2[k] != v
}
return true
end
Checking object equality is just comparing two pointers, so O(1).
Ruby hashes store their size in num_entries
, so comparing their size is O(1).
Iterating through the keys and values of a hash is O(keys), you're basically walking a array.
Looking up a value in a hash is O(1). Worst case is O(n), but this is exceedingly rare in modern hashes especially after it was found to be a security issue.
Finally, comparing the values have their own cost. For simple values like numbers this is O(1). Anything else has its own cost which I'll call O(val). This can be large strings, which are O(n) where n is the size of the string, or complex nested data structures. We need to do this for each value and each value carries its own cost. I'll write this as the sum of all those costs: ∑Cost(value).
Thus O(1) + O(1) + O(keys) + ∑Cost(value). The O(1)s are swamped by O(keys) and disappear, thus O(keys) + ∑Cost(value).
A simple benchmark bears this out.
require 'benchmark'
def benchmark_hash_eq(&block)
# We can't use h2 = h1.clone because that will not clone
# the values. Then Ruby can cheat and use object equality to
# compare the values.
h1 = block.call
h2 = block.call
puts "Size: #{h1.size}"
puts Benchmark.measure {
10_000.times { h1 == h2 }
}
end
puts "With number values"
benchmark_hash_eq do
Hash[(1..100).collect { |i| [i, i] }]
end
benchmark_hash_eq do
Hash[(1..1_000).collect { |i| [i, i] }]
end
benchmark_hash_eq do
Hash[(1..10_000).collect { |i| [i, i] }]
end
puts "With large, equal size strings"
benchmark_hash_eq do
Hash[(1..100).collect { |i| [i, "a" * 1000] }]
end
benchmark_hash_eq do
Hash[(1..1_000).collect { |i| [i, "a" * 1000] }]
end
benchmark_hash_eq do
Hash[(1..10_000).collect { |i| [i, "a" * 1000] }]
end
With number values
Size: 100
0.019237 0.000043 0.019280 ( 0.019306)
Size: 1000
0.195047 0.000515 0.195562 ( 0.196367)
Size: 10000
1.913112 0.003115 1.916227 ( 1.920030)
With large, equal size strings
Size: 100
0.065414 0.000225 0.065639 ( 0.065839)
Size: 1000
0.669863 0.001145 0.671008 ( 0.672279)
Size: 10000
10.041987 0.013201 10.055188 ( 10.066840)
Perfectly linear, except that last one which could be due to the hash size reaching some memory threshold.
Note that Ruby can do the comparison in parallel. This doesn't change the complexity, but might make it perform faster.
How to test order-conscious equality of hashes
Probably the easiest is to compare the corresponding arrays.
h1.to_a == h2.to_a
How can you easily test hash equality in Ruby when you only care about intersecting keys?
def compare_intersecting_keys(a, b)
(a.keys & b.keys).all? {|k| a[k] == b[k]}
end
Use like this:
compare_intersecting_keys(hash_x, hash_y) # => true
compare_intersecting_keys(hash_p, hash_q) # => false
If you want it monkey-patched:
class Hash
def compare_intersection(other)
(self.keys & other.keys).all? {|k| self[k] == other[k]}
end
end
Hash lookup fails when set key and lookup pass an equality test
Based on those links, is hash for both objects identical?
We've had issues in a multithreaded JRuby environment where keys in two separate runtimes satisfy == but fail to retrieve keys due to differing object_ids of the underlying keys.
It could be a similar class of problem - two objects A and B which satisfy A.eql? B but do not satisfy A is B
How do I compare two hashes?
You can compare hashes directly for equality:
hash1 = {'a' => 1, 'b' => 2}
hash2 = {'a' => 1, 'b' => 2}
hash3 = {'a' => 1, 'b' => 2, 'c' => 3}
hash1 == hash2 # => true
hash1 == hash3 # => false
hash1.to_a == hash2.to_a # => true
hash1.to_a == hash3.to_a # => false
You can convert the hashes to arrays, then get their difference:
hash3.to_a - hash1.to_a # => [["c", 3]]
if (hash3.size > hash1.size)
difference = hash3.to_a - hash1.to_a
else
difference = hash1.to_a - hash3.to_a
end
Hash[*difference.flatten] # => {"c"=>3}
Simplifying further:
Assigning difference via a ternary structure:
difference = (hash3.size > hash1.size) \
? hash3.to_a - hash1.to_a \
: hash1.to_a - hash3.to_a
=> [["c", 3]]
Hash[*difference.flatten]
=> {"c"=>3}
Doing it all in one operation and getting rid of the difference
variable:
Hash[*(
(hash3.size > hash1.size) \
? hash3.to_a - hash1.to_a \
: hash1.to_a - hash3.to_a
).flatten]
=> {"c"=>3}
Ruby Compare 2 hashes
Two hashes with the same key/value pairs will be equal regardless of key order:
a = {:x => 1, :y => 2}
b = {:y => 2, :x => 1}
a == b
# => true
Related Topics
Make Rails Ignore Daylight Saving Time When Displaying a Date
Ruby: Array Contained in Array, Any Order
How to Get a Selenium/Ruby Bot to Wait Before Performing an Action
Simplest Way to Send Raw Byte-Arrays Using Ruby's Tcpsocket-Class
How Can Bundler/Gemfile Be Configured to Use Different Gem Sources During Development
Adding a Background Image in Ruby on Rails 2 in CSS
Consequences of Implementing To_Int and To_Str in Ruby
Cast Between String and Classname
Ruby - Calling Setters from Within an Object
Async Requests Using Sinatra Streaming API
Dynamic Active Record Store Accessors Based Off a User Form
Why Doesn't Relative_Require Work on Ruby 1.8.6
How to Convert a Ruby String with Brackets to an Array
Does Ruby 1.9.2 Have an Is_A? Function
Skip/Disable Force_Ssl for Particular Controller in Rails
Rails 3.1 Load CSS in Particular Order