Why Is Ruby String.hash Inconsistent Across Machines

Why is Ruby String.hash inconsistent across machines?

From a Ruby dev in the Ruby forum:

It is intended. Ruby 1.9 explicitly uses a session-local random seed to
calculate a hash for strings (and some other objects).

This is because the implementation of Object#hash is different between
versions (like 1.9.1 and 1.9.2) and implementations (like JRuby,
Rubinius, IronRuby, and so on). We want people to write portable code
around Object#hash, so we did so.

You should use Digest::SHA256 or some other digest routines when you
want some hash value (message digest).
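As a minimal sketch of that advice, the snippet below contrasts String#hash with a SHA-256 digest from Ruby's standard digest library. The helper name stable_hash is made up for illustration; only Digest::SHA256.hexdigest comes from the standard library.

```ruby
require "digest"

# Hypothetical helper: the digest depends only on the bytes of the input,
# so it is the same in every process and on every machine.
def stable_hash(str)
  Digest::SHA256.hexdigest(str)
end

puts "abc".hash          # changes from one Ruby process to the next
puts stable_hash("abc")  # the same 64-character hex string on every run
```

The digest is slower than String#hash and returns a hex string rather than an Integer, so it suits persisted or cross-machine identifiers rather than in-memory hash tables.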

And follow-up from another dev:

Also, it helps to avoid some denial of service attacks, such as
registering hundreds and thousands of users with usernames that have
the same hash code.

Am I misunderstanding String#hash in Ruby?

If you run

ruby -e "puts '[deleted]'.hash"

several times, you will notice that the value is different each time. In fact, the hash value stays constant only as long as your Ruby process is alive. The reason is that String#hash is seeded with a random value: rb_str_hash (the C function that implements it) uses rb_hash_start, which mixes in a random seed that is initialized every time Ruby starts.

You could use a CRC such as Zlib.crc32 for your purposes, or one of the message digests in OpenSSL::Digest, although the latter is overkill: for detecting duplicates you probably don't need its security properties.
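A rough sketch of the CRC approach, assuming the goal is a cheap, repeatable checksum for spotting duplicate strings. The helper name checksum is hypothetical; Zlib.crc32 is part of the standard zlib library.

```ruby
require "zlib"

# Hypothetical helper: a 32-bit checksum that is identical across
# processes and machines for the same bytes.
def checksum(str)
  Zlib.crc32(str)
end

puts checksum("123456789")                 # same Integer on every run
puts checksum("foo") == checksum("foo")    # true
```

Because a CRC is only 32 bits wide, two different strings can share a checksum, so a match is best confirmed by comparing the strings themselves.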

Different hashes for the same object values in ruby

No, it should not!

Different Ruby implementations (JRuby, Rubinius, MRI 1.8.x, MRI 1.9.x, etc.) generate hashes in different ways. For example, for some objects (such as instances of your own classes or Hash instances) the runtime assigns a unique and random id when the object is created. If I am not wrong, MRI ties these hashes closely to memory addresses: http://rxr.whitequark.org/mri/source/gc.c?v=1.8.7-p370#2111

So you cannot guarantee that every run of your code will use the same random values or the same memory addresses.

Also, I suggest using ruby-doc instead of apidock for Ruby internals: http://ruby-doc.org/core-2.0/Object.html#method-i-hash

The hash value for an object may not be identical across invocations or implementations of ruby. If you need a stable identifier across ruby invocations and implementations you will need to generate one with a custom method.
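One possible shape for such a custom method, as a sketch only: the Point class and the stable_id name are invented for illustration, and the identifier is derived purely from the class name and attribute values, so it does not depend on memory addresses or per-process seeds.

```ruby
require "digest"

# Hypothetical value class, used only to illustrate the idea.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Built only from the class name and attribute values, joined with a
  # NUL separator so that (1, 23) and (12, 3) do not collide.
  def stable_id
    Digest::SHA256.hexdigest([self.class.name, x, y].join("\0"))
  end
end

puts Point.new(1, 2).stable_id == Point.new(1, 2).stable_id  # true
```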

Hope it will help you!

Inconsistent printing with Ruby

Try

print opts.map{|k,v| k + '=' + v + "\n"}.join

The explanation is simple: in Ruby 1.9, Array#to_s changed its behaviour. It now returns the same text as Array#inspect instead of joining the elements.
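To make the difference concrete, here is a small sketch for Ruby 1.9 or later. The helper opt_lines and the sample options are made up for illustration.

```ruby
# Hypothetical helper: turn an options hash into "key=value\n" strings.
def opt_lines(opts)
  opts.map { |k, v| "#{k}=#{v}\n" }
end

lines = opt_lines({ "one" => "1", "two" => "1" })

puts lines.to_s   # inspect-style text with brackets, quotes and a literal \n
print lines.join  # the plain lines, one option per line
```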

An alternative:

puts opts.map{|k,v| k + '=' + v }.join("\n")

or

puts opts.map{|k,v| "#{k}=#{v}" }.join("\n")

I would prefer:

opts.each{|k,v| puts "#{k}=#{v}" }

And another version, but with another look:

opts.each{|k,v| puts "%-10s= %s" % [k,v]}

The result is:

one       = 1
two       = 1
three     = 0

(But the keys should be no longer than the width given in %-10s.)
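If the key lengths are not known in advance, one option is to compute the column width from the longest key instead of hard-coding 10. This is only a sketch; the helper name aligned_lines is hypothetical.

```ruby
# Hypothetical helper: pad every key to the length of the longest key.
def aligned_lines(opts)
  width = opts.keys.map { |k| k.to_s.length }.max.to_i
  opts.map { |k, v| "%-#{width}s = %s" % [k, v] }
end

puts aligned_lines({ "one" => 1, "two" => 1, "three" => 0 })
```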

Purpose of avalanching

Avalanching is just a term for the "diffusion" of small changes in the input across the final result. For cryptographic hashes, where non-reversibility is crucial, having similar inputs produce very different results is desirable: it prevents an attacker from cracking a hash by successive approximation.
See more about this at http://en.wikipedia.org/wiki/Avalanche_effect

I cannot see why it uses those particular steps, but it ANDs and XORs the result with shifted copies of itself to increase the diffusion. Other values would probably perform similarly, but confirming that would need a deeper analysis.
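The effect can be observed with any well-mixed hash. The sketch below uses SHA-256 from the standard digest library rather than the function discussed in the question, and counts how many of the 256 output bits change when a single input bit is flipped ("b" and "c" differ in exactly one bit). The helper name differing_bits is made up for illustration.

```ruby
require "digest"

# Hypothetical helper: count the bits that differ between the
# SHA-256 digests of two strings.
def differing_bits(a, b)
  da = Digest::SHA256.digest(a).bytes
  db = Digest::SHA256.digest(b).bytes
  da.zip(db).sum { |x, y| (x ^ y).to_s(2).count("1") }
end

# With good avalanching, roughly half of the 256 bits should change.
puts differing_bits("message b", "message c")
```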

Why does an equality check in Ruby behave this way with an OR expression?

Because (6 || 5) returns 6, not 5 (|| returns its first truthy operand):

[2] pry(main)> (6 || 5)
# => 6

So 5 == (6 || 5) is just the same as 5 == 6 which is, of course, false.
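If the intent was to check whether a value equals either candidate (an assumption about what the asker wanted), the comparison has to be made against each candidate. A minimal sketch, with a hypothetical helper name either?:

```ruby
# `||` evaluates to its first truthy operand, so this compares 5 with 6.
puts(5 == (6 || 5))      # false

# Hypothetical helper: true if value equals any of the candidates.
def either?(value, *candidates)
  candidates.include?(value)
end

puts either?(5, 6, 5)    # true
puts(5 == 6 || 5 == 5)   # true, the same check written out by hand
```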

Is this the best way to grab common elements from a Hash of arrays?

Behold the power of inject! ;)

[[1,2,3],[1,3,5],[1,5,6]].inject(&:&)
=> [1]

As Jordan mentioned, if your version of Ruby lacks support for &-notation, just use

inject{|acc,elem| acc & elem}
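Since the question is about a Hash of arrays, the same idea can be applied to the hash's values. This is a sketch with invented sample data and a hypothetical helper name common_elements; the fallback to an empty array covers the case of an empty hash, where inject returns nil.

```ruby
# Hypothetical helper: intersect all the arrays stored as hash values.
def common_elements(hash_of_arrays)
  hash_of_arrays.values.inject(:&) || []
end

data = { a: [1, 2, 3], b: [1, 3, 5], c: [1, 5, 6] }
p common_elements(data)
```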

