Do Ruby 1.8 and 1.9 have the same hash code for a string?
The answer is simple: they do not.
~$ ruby1.8 -e 'p "hello world".hash'
444332266
~$ ruby1.9 -e 'p "hello world".hash'
-194819219
If you rely on the built-in hash method, I would recommend having a script as part of your build process that generates the necessary hash codes. Note that they are not guaranteed to be the same even from one machine to the next.
If you need consistent hashing, use something like CRC32 or SHA1:
>> require 'zlib'
>> Zlib.crc32 "hello world"
=> 222957957
>> require 'digest'
>> Digest::SHA1.hexdigest "hello world"
=> "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"
>> Digest::MD5.hexdigest "hello world"
=> "5eb63bbbe01eeed093cb22bb8f5acdc3"
They have quite different purposes: CRC32 has the advantage of returning a 32-bit number and being quite fast, while SHA1 produces a 160-bit digest and is far more resistant to collisions. (I’m assuming this is not for cryptographic purposes, but look into SHA-256 if you need it.)
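To make the size comparison concrete, here is a minimal sketch that measures the output width of each function. It uses only Zlib and Digest from the standard library; the method name digest_bits is made up for this illustration.

```ruby
require 'zlib'
require 'digest'

# Hypothetical helper: reports how many bits each digest produces.
# .digest returns the raw bytes, so bytesize * 8 is the digest width.
def digest_bits(input)
  {
    "MD5"    => Digest::MD5.digest(input).bytesize * 8,
    "SHA1"   => Digest::SHA1.digest(input).bytesize * 8,
    "SHA256" => Digest::SHA256.digest(input).bytesize * 8
  }
end

crc = Zlib.crc32("hello world")
puts "CRC32: #{crc} (fits in 32 bits: #{crc < 2**32})"

digest_bits("hello world").each do |name, bits|
  puts "#{name}: #{bits} bits"
end
```

The wider the digest, the less likely two different strings collide, at the cost of a larger value to store and compare.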
Ruby: How come the same strings have different hashcodes?
The results of these expressions are not all the same kind of data. In Ruby 1.8, indexing a string with a single integer returns the character's numeric code rather than a string. This has been changed in Ruby 1.9, but slice(0) still returns the first character of the string, '@', not 'a'.
In Ruby 1.8 (using irb):
irb(main):001:0> test = 'a'
=> "a"
irb(main):002:0> test2 = '@a'.slice(0)
=> 64
irb(main):003:0> test3 = '@a'[1]
=> 97
irb(main):004:0> test.hash
=> 100
irb(main):005:0> test2.hash
=> 129
irb(main):006:0> test3.hash
=> 195
In Ruby 1.9.1:
irb(main):001:0> test = 'a'
=> "a"
irb(main):002:0> test2 = '@a'.slice(0)
=> "@"
irb(main):003:0> test3 = '@a'[1]
=> "a"
irb(main):004:0> test.hash
=> 1365935838
irb(main):005:0> test2.hash
=> 347394336
irb(main):006:0> test3.hash
=> 1365935838
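If the goal is to get a one-character string on either version, the two-argument form str[index, length] returns a String in both 1.8 and 1.9. The sketch below wraps it in a helper; char_at is a hypothetical name, and on 1.8 the index counts bytes, so this is only safe for ASCII text there.

```ruby
# Hypothetical helper: returns the character at index as a String,
# whether str[index] alone would give an Integer (1.8) or a String (1.9+).
def char_at(str, index)
  str[index, 1]
end

test  = 'a'
test3 = char_at('@a', 1)

p char_at('@a', 0)          # "@"
p test3 == test             # true
p test3.hash == test.hash   # true within a single Ruby process
```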
String length difference between ruby 1.8 and 1.9
This is a Unicode issue. The string you are using contains characters outside the ASCII range, and the UTF-8 encoding that is frequently used encodes those as 2 (or more) bytes.
Ruby 1.8 did not handle Unicode properly, and length simply gives the number of bytes in the string, which results in fun stuff like:
"ą".length
=> 2
Ruby 1.9 has better Unicode handling. This includes length returning the actual number of characters in the string, as long as Ruby knows the encoding:
"ä".length
=> 1
One possible workaround in Ruby 1.8 is using regular expressions, which can be made Unicode aware:
"ą".scan(/./mu).size
=> 1
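On Ruby 1.9 or later, the different ways of counting can be compared side by side. This is a sketch under that assumption; char_and_byte_counts is a made-up helper name, and the "\u0105" escape (the character "ą") is not available on 1.8.

```ruby
# Hypothetical helper: counts the same string four different ways.
def char_and_byte_counts(str)
  {
    :chars      => str.length,             # characters, given a known encoding
    :bytes      => str.bytesize,           # what length reported on 1.8
    :codepoints => str.unpack("U*").size,  # decodes the string as UTF-8
    :regex      => str.scan(/./mu).size    # the regular expression workaround
  }
end

counts = char_and_byte_counts("\u0105")
counts.each { |name, value| puts "#{name}: #{value}" }
```

For this string only the byte count is 2; the other three report 1.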
Ruby make 1.8 Hash#select behave like 1.9 Hash#select
Hash[{1=>2,3=>4}.select{|k,v| v>2}]
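As far as I know, this works because Hash[] accepts both the array of [key, value] pairs that select returns on 1.8.7 and the Hash that select returns on 1.9. A minimal sketch wrapping it in a helper (select_as_hash is a hypothetical name):

```ruby
# Hypothetical helper: always returns a Hash, whatever Hash#select
# returns on the running interpreter.
def select_as_hash(hash, &block)
  Hash[hash.select(&block)]
end

filtered = select_as_hash({ 1 => 2, 3 => 4 }) { |k, v| v > 2 }
p filtered.class   # Hash
p filtered.keys    # [3]
```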
Consistent String#hash based only on the string's content
There is a lot of this kind of functionality in Ruby's digest module: http://ruby-doc.org/stdlib/libdoc/digest/rdoc/index.html
A simple example:
require 'digest/sha1'
Digest::SHA1.hexdigest("some string")
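If an Integer is wanted, as String#hash returns, rather than a hex string, one option is to interpret part of the digest as a number. The helper name content_hash and the choice to keep 64 bits are assumptions made for this sketch.

```ruby
require 'digest/sha1'

# Hypothetical helper: derives an Integer from the string's bytes only,
# so the value does not depend on the Ruby version, process, or machine.
def content_hash(str)
  # Keep the first 16 hex digits (64 bits) of the SHA1 digest.
  Digest::SHA1.hexdigest(str)[0, 16].to_i(16)
end

p content_hash("some string")
p content_hash("some string") == content_hash("some string")   # true
```

Because the digest is computed over bytes, the same text in two different encodings will produce two different values.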
Why do String hashes change?
The same string is not guaranteed to return the same hash in two different Ruby sessions; the value is only stable within the current session.
➜ tmp pry
[1] pry(main)> "foo".hash
=> -3172192351909719463
[2] pry(main)> exit
➜ tmp pry
[1] pry(main)> "foo".hash
=> 2138900251898429379
[2] pry(main)> "foo".hash
=> 2138900251898429379
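Ruby's documentation for Object#hash notes that the value may differ across invocations, and as I understand it the interpreter seeds string hashing with a random value at startup. The sketch below compares fresh processes from a single script; run_in_fresh_ruby is a hypothetical helper, and it assumes Ruby 1.9 or later for RbConfig.ruby and the array form of IO.popen.

```ruby
require 'rbconfig'
require 'zlib'

# Hypothetical helper: runs a snippet in a brand-new Ruby process and
# returns whatever that process printed.
def run_in_fresh_ruby(code)
  IO.popen([RbConfig.ruby, "-e", code]) { |io| io.read }
end

first  = run_in_fresh_ruby('print "foo".hash')
second = run_in_fresh_ruby('print "foo".hash')
puts "String#hash in two processes: #{first} vs #{second}"   # normally different

# A content-based checksum, by contrast, agrees across processes.
child_crc = run_in_fresh_ruby('require "zlib"; print Zlib.crc32("foo")').to_i
puts "CRC32 matches across processes: #{child_crc == Zlib.crc32("foo")}"
```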
Allowing for Ruby 1.9's hash syntax?
Even in Ruby < 1.9, you could use symbols for keys. For example:
# Ruby 1.8.7
settings = { :host => "localhost" }
puts settings[:host] #outputs localhost
settings.keys[0].class # => Symbol
Ruby 1.9 adds a new literal syntax for creating hashes with symbol keys. You write the key followed by a colon and Ruby treats it as a symbol, so the hash rocket is no longer needed.
# Ruby 1.9.2
settings = { host: "localhost" }
settings[:host] # => "localhost"
settings.keys[0].class # => Symbol
In both cases, if I try to access settings[:host] with settings["host"], I'm going to get nil. All Ruby 1.9 does is allow for a new way of creating hashes. To answer your question, you cannot, as far as I know, use the new {key: value} syntax if you want backwards compatibility with Ruby 1.8.
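The equivalence of the two literals can be checked directly on Ruby 1.9 or later. In this sketch the method names are only there for illustration, and the file will not parse on 1.8 because of the second literal.

```ruby
# Old syntax: works on 1.8 and 1.9.
def rocket_settings
  { :host => "localhost" }
end

# New syntax: a syntax error on 1.8.
def colon_settings
  { host: "localhost" }
end

p rocket_settings == colon_settings   # true
p colon_settings.keys                 # [:host]
p colon_settings["host"]              # nil
```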
Library to get a String to behave like in 1.9 in 1.8
I just wrote a gem for this myself:
gem install string19
String19('áßð').size == 3
String19('áßð').index('ð') == 2
and so on.
Not all methods are supported, but it is easy to add more.
Alternatives to Hash#index that work without warning in both Ruby 1.8 and 1.9
You could also invert the hash:
{ :hello => :world }.invert[:world] # ==> :hello
No monkey-patching or external dependencies, but probably less efficient for most purposes.
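Another way to avoid the deprecation warning is to check at run time which method the interpreter provides. This is a minimal sketch; key_for is a hypothetical helper name, and it assumes that a Hash without Hash#key still has Hash#index.

```ruby
# Hypothetical helper: prefers Hash#key where it exists and falls back
# to Hash#index on interpreters that only provide the older name.
def key_for(hash, value)
  if hash.respond_to?(:key)
    hash.key(value)
  else
    hash.index(value)
  end
end

p key_for({ :hello => :world }, :world)   # :hello
p key_for({ :hello => :world }, :nope)    # nil
```

Unlike invert, this does not build a second hash on every lookup. Note also that when several keys share a value, invert keeps only one of them.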