Why Does Shoveling a String into a Hash Cause This Result


A hint as to the reason is in the names of the tests.

test_default_value_is_the_same_object is there to show you that when you ask for hash[:some_value_that_doesnt_exist_yet], by default, you get back the default value you specified -- which is the same object every time. By modifying that object, you modify it for every nonexistent key. Modifying hash[:one] also modifies hash[:two].

test_default_value_with_block shows the construction of a Hash using a block, which will be used to provide a new value for each key. When you do it like that, the values for hash[:one] and hash[:two] are distinct.
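A minimal sketch of what the two tests describe, using a String as the default value. The variable names (shared, fresh) are illustrative and not taken from the tests themselves.

```ruby
# One shared default object: every missing key returns the same String.
shared = Hash.new("".dup)
shared[:one] << "uno"          # mutates the default object itself
shared[:two] << "dos"          # mutates the same object again

shared_value = shared[:three]  # a key we never touched
shared_size  = shared.size     # nothing was ever stored in the hash

# Block form: the block runs for each missing key and stores a fresh String.
fresh = Hash.new { |hash, key| hash[key] = "".dup }
fresh[:one] << "uno"
fresh[:two] << "dos"
```

Here shared[:three] comes back as "unodos" even though the hash is still empty, while fresh[:one] and fresh[:two] stay separate.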

Why does hash a not return the same result for a[:b] vs a['b']?

No, it shouldn't. :b is a Symbol and 'b' is a String, so they are different keys. Unless a is a hash with indifferent access, a[:b] will most likely return a different result than a['b'].

BTW, although Ruby is dynamically typed, it is also quite strongly typed: implicit type conversion occurs rather rarely.
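A short sketch of both points, assuming a plain Hash with a single Symbol key (the contents are made up for illustration):

```ruby
a = { b: 1 }            # the key is the Symbol :b

symbol_lookup = a[:b]   # finds the entry
string_lookup = a['b']  # a different key, so the lookup misses and returns nil

# Symbols and Strings are never implicitly converted into one another.
keys_equal = (:b == 'b')

# The same strictness applies to Integer and String.
mixed_sum = begin
  1 + '1'
rescue TypeError
  :type_error
end
```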

Why would one use String keys for a hash over symbols

Because the source of the keys, the query string, is made up of strings, so when you parse that string for keys it is most convenient to index the hash by strings.

In Ruby versions before 2.2, every Symbol that is created in the runtime is allocated and never released (from 2.2 on, dynamically created symbols can be garbage collected). On those older versions there is a theoretical (but unlikely) DoS attack available by sending hundreds of thousands of requests with unique query string parameters. If these were symbolized, each request would slowly grow the runtime memory pool.

Strings, on the other hand, may be garbage collected. Thousands of unique strings handled across various requests will eventually go away, with no long-term impact.

Edit: Note that with Sinatra, symbols are also available for accessing the params hash. However, this is done by creating a hash that is indexed by strings, and converting the symbols in your code to strings when you look a key up. Unless you do something like the following:

params.each{ |key,_| key.to_sym }

...you are not at risk of any symbol pseudo-DOS attack.
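A sketch of how symbol lookups can sit on top of a string-keyed hash without ever turning request data into symbols. This illustrates the idea described above and is not Sinatra's actual source; raw and params are plain hashes built by hand here.

```ruby
# String-keyed data, as it would arrive from a parsed query string.
raw = { 'name' => 'alice', 'page' => '2' }

# The default block only runs for missing keys. A Symbol written in your own
# code is converted to a String for the lookup; no request key becomes a Symbol.
params = Hash.new { |hash, key| hash[key.to_s] if key.is_a?(Symbol) }
params.update(raw)

by_symbol = params[:name]    # falls through to params['name']
by_string = params['page']   # direct hit
missing   = params[:nope]    # no such key under either form
```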

Why are << and += building a different string?

s1 << s2 appends the string s2 to s1, whereas s1 += s2, which expands to s1 = s1 + s2, creates a new object which becomes the new value of the variable s1.

Consider the following.

s1 = "ab"
s1.object_id #=> 70117580969460
s2 = "cd"

s1 << s2 #=> "abcd"
s1 #=> "abcd"
s1.object_id #=> 70117580969460

Compare that with:

s1 = "ab"
s1.object_id #=> 70117576935280
s2 = "cd"

s1 += s2 #=> s1 = s1 + s2 => "abcd"
s1 #=> "abcd"
s1.object_id #=> 70117576870900
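The difference matters most when a second variable refers to the same string. A small sketch, where other is a hypothetical second reference:

```ruby
s1 = 'ab'.dup
other = s1               # a second name for the same String object

s1 << 'cd'               # mutates the shared object
after_shovel = other.dup # snapshot of what other sees at this point

s1 += 'ef'               # builds a new String and rebinds s1 only
```

After the << call, other also reads "abcd". After the += call, s1 is "abcdef" while other is still "abcd", because only s1 was pointed at the new object.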

Understanding how hash copy is behaving under the hood

I just needed a little push to understand, and thanks to @Stefan I think I can answer my own question. Breaking it down we have:

root = {}
base = root
puts root.object_id
=> 47193371579760
puts base.object_id
=> 47193371579760

So both root and base are references to the same object.

base[:a] = {}
base[:a].object_id
=> 47193372751820
base = base[:a]
puts base.object_id
=> 47193372751820
puts root.object_id
=> 47193371579760
puts root
=> {:a=>{}}

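base[:a] = {} stores a new hash object inside the hash that root and base both reference, which is why root now holds {:a=>{}}. base = base[:a] then rebinds only the variable base to that inner hash, while root keeps its reference to the outer one. That's why root doesn't change when base is reassigned.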
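The same two steps, repeated in a loop, build a nested structure. This is only a sketch, and the keys :a, :b and :c are arbitrary:

```ruby
root = {}
base = root                 # base starts as a second name for root's hash

[:a, :b, :c].each do |key|
  base[key] = {}            # mutates whichever hash base currently names
  base = base[key]          # rebinds base only; root still names the outer hash
end

innermost = root[:a][:b][:c]
```

Every assignment through base[key] is visible from root, because it mutates an object reachable from root. Rebinding base never affects root.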

Ruby hash default value behavior

The other answers seem to indicate that the difference in behavior is due to Integers being immutable and Arrays being mutable. But that is misleading. The difference is not that the creator of Ruby decided to make one immutable and the other mutable. The difference is that you, the programmer, decided to mutate one but not the other.

The question is not whether Arrays are mutable, the question is whether you mutate them.

You can get both the behaviors you see above, just by using Arrays. Observe:

One default Array with mutation

hsh = Hash.new([])

hsh[:one] << 'one'
hsh[:two] << 'two'

hsh[:nonexistent]
# => ['one', 'two']
# Because we mutated the default value, nonexistent keys return the changed value

hsh
# => {}
# But we never mutated the hash itself, therefore it is still empty!

One default Array without mutation

hsh = Hash.new([])

hsh[:one] += ['one']
hsh[:two] += ['two']
# This is syntactic sugar for hsh[:two] = hsh[:two] + ['two']

hsh[:nonexistent]
# => []
# We didn't mutate the default value, it is still an empty array

hsh
# => { :one => ['one'], :two => ['two'] }
# This time, we *did* mutate the hash.

A new, different Array every time with mutation

hsh = Hash.new { [] }
# This time, instead of a default *value*, we use a default *block*

hsh[:one] << 'one'
hsh[:two] << 'two'

hsh[:nonexistent]
# => []
# We *did* mutate the default value, but it was a fresh one every time.

hsh
# => {}
# But we never mutated the hash itself, therefore it is still empty!

hsh = Hash.new {|hsh, key| hsh[key] = [] }
# This time, instead of a default *value*, we use a default *block*
# And the block not only *returns* the default value, it also *assigns* it

hsh[:one] << 'one'
hsh[:two] << 'two'

hsh[:nonexistent]
# => []
# We *did* mutate the default value, but it was a fresh one every time.

hsh
# => { :one => ['one'], :two => ['two'], :nonexistent => [] }
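A typical use of the assigning block form is grouping. The word list below is made up for illustration:

```ruby
words = %w[apple avocado banana blueberry cherry]

# Each missing key gets its own Array, stored in the hash on first access.
by_letter = Hash.new { |hash, key| hash[key] = [] }
words.each { |word| by_letter[word[0]] << word }
```

With Hash.new([]) instead, every word would land in the one shared default Array and by_letter itself would stay empty.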

Ruby freeze method

  • freeze - prevents modification to the Hash (returns the frozen object)
  • [] - accesses a value from the hash
  • stat.class.name.underscore.to_sym - I assume this returns a lowercase, snake case version of the given object's class name (underscore is not in the standard library, so I'm not completely sure; ActiveSupport adds a String#underscore that behaves this way)
  • call - invokes the lambda associated with the stat.class.name.underscore.to_sym key.

For instance, passing ['foo', 'bar'] as the argument to track_for would invoke the send(stat[0], stat[1]) lambda.
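The code being described is not reproduced here, so the following is a hypothetical reconstruction of the pattern only. The names handlers and snake_case and the lambda bodies are assumptions; snake_case stands in for underscore so the sketch runs without extra libraries.

```ruby
# Stand-in for String#underscore; handles simple CamelCase names only.
snake_case = ->(name) { name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase }

# Hypothetical dispatch table, frozen so entries cannot be added or replaced.
handlers = {
  array:  ->(stat) { "#{stat[0]} -> #{stat[1]}" },
  string: ->(stat) { "event: #{stat}" }
}.freeze

track_for = lambda do |stat|
  key = snake_case.call(stat.class.name).to_sym   # e.g. Array => :array
  handlers[key].call(stat)
end

array_result  = track_for.call(%w[foo bar])
string_result = track_for.call('signup')

# Writing to the frozen hash raises instead of silently changing the table.
frozen_error = begin
  handlers[:integer] = ->(stat) { stat }
  nil
rescue FrozenError => e
  e.class
end
```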

Letter count in a sentence using Ruby

"I am a good boy".scan(/\w/).inject(Hash.new(0)){|h, c| h[c] += 1; h}
# => {"I"=>1, "a"=>2, "m"=>1, "g"=>1, "o"=>3, "d"=>1, "b"=>1, "y"=>1}
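On Ruby 2.7 or later, Enumerable#tally does the same counting. A sketch with an each_with_object variant alongside for older versions:

```ruby
sentence = 'I am a good boy'

# Enumerable#tally (Ruby 2.7+) counts occurrences of each element.
counts = sentence.scan(/\w/).tally

# Equivalent without tally; each_with_object returns the hash for us,
# so the block does not need a trailing "; h" as inject does.
counts_manual = sentence.scan(/\w/).each_with_object(Hash.new(0)) { |c, h| h[c] += 1 }
```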

Why do some Ruby methods like String#replace mutate copies of variables?

The issue here is not called recursion, and Ruby variables are not recursive (for any normal meaning of the word - i.e. they don't reference themselves, and you don't need recursive routines in order to work with them). Recursion in computer programming is when code calls itself, directly or indirectly, such as a function that contains a call to itself.

In Ruby, all variables point to objects. This is without exception: although there are some internal tricks to make things fast, even writing a=5 creates a variable called a and "points" it to the Integer object 5 (a Fixnum in Ruby versions before 2.4). Careful language design means you almost don't notice this happening. Most importantly, numbers cannot change (you cannot change a 5 into a 6, they are always different objects), so you can think that somehow a "contains" a 5 and get away with it even though technically a points to 5.

With Strings though, the objects can be changed. A step-by-step explanation of your example code might read like this:

a = 'red'

Creates a new String object with the contents "red", and points variable a at it.

b = a

Points variable b to same object as a.

b.replace('blue')

Calls the replace method on the object pointed to by b (and also pointed to by a). The method alters the contents of the String to "blue".

b = 'green'

Creates a new String object with the contents "green", and points variable b at it. The variables a and b now point to different objects.

print a 

The String object pointed to by a has contents "blue". So it is all working correctly, according to the language spec.
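The whole walkthrough as one runnable sketch. The .dup on the first line is an addition that guarantees a mutable String even where string literals are frozen; same_object is an extra variable used only to record the identity check.

```ruby
a = 'red'.dup                 # a points at a new String
b = a                         # b points at the same object
same_object = a.equal?(b)     # two names, one object

b.replace('blue')             # changes that object's contents in place
b = 'green'                   # rebinds b to a brand new object
```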

When will I ever use this?

All the time. In Ruby you use variables to point, temporarily, to objects, in order to call methods on them. The objects are the things you want to work with, the variables are the names in your code you use to reference them. The fact that they are separate can trip you up from time to time (especially in Ruby with Strings, where many other languages do not have this behaviour).

and does this mean I can't pass the value of "a" into another variable without any changes recursing back to "a"?

If you want to copy a String, there are a few ways to do it. E.g.

b = a.clone

or

b = "#{a}"

However, in practice you rarely just want to make direct copies of strings. You will want to do something else that is related to the goal of your code. Usually in Ruby, there will be a method that does the manipulation that you need and returns a new String, so you would do something like this:

b = a.something

In other cases, you actually will want changes to be made to the original object. It all depends on what the purpose of your code is. In-place changes to String objects can be useful, so Ruby supports them.
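A brief sketch of both options, using dup for the explicit copy and upcase as an example of a method that returns a new String:

```ruby
a = 'red'.dup

copy = a.dup            # independent copy with the same contents
copy.replace('blue')    # changes only the copy

shout = a.upcase        # returns a new String and leaves a alone
```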

Furthermore it seems sometimes a method will recurse into "a" and sometimes it will cause "b" to become a new object_id.

This is never the case. No method changes an object's identity. Most methods return a new object, and some change the receiver's contents instead. Those mutating methods are the ones you need to be more aware of in Ruby, because the same object may be in use elsewhere. The same is true in other OO languages; JavaScript objects, for example, behave in exactly the same way.
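One way to check this yourself is to compare object_id before and after a call. A sketch, with made-up string contents:

```ruby
a = 'red'.dup
id_before = a.object_id

a << 'dish'              # mutates the contents
a.upcase!                # mutates the contents again
id_after = a.object_id   # still the same object

b = a.downcase           # non-bang method: the result is a different object
```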


