Ruby Koans: Why Convert List of Symbols to Strings

This has to do with how symbols work. Only one instance of any given symbol ever exists. Behind the scenes, a symbol is just a number referred to by a name (written with a leading colon). Thus, when you compare two symbols for equality, you are comparing object identity, not the characters of the name that refers to the symbol.

The simple test :test == "test" is false. So, if you gather all of the symbols defined so far into an array, you need to convert them to strings before comparing them against a string. You can't do it the other way around (converting the string you want to find into a symbol first), because that conversion would itself create the single instance of that symbol and "pollute" your list with the very symbol whose existence you are testing.

This is a bit of an odd one, because you have to test for the presence of a symbol without accidentally creating that symbol during the test. You usually don't see code like that.
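A minimal sketch of the check described above, assuming it runs as a plain Ruby script. The looked-up name is a made-up example, and it is deliberately kept as a string so that the lookup itself never creates a symbol:

```ruby
# Snapshot every symbol the interpreter currently knows about, as strings.
known_names = Symbol.all_symbols.map(&:to_s)

# A symbol never equals a string, even when the characters match.
symbol_equals_string = (:test == "test")

# Hypothetical name used only for illustration. It stays a string, so
# looking it up does not add anything to the symbol table.
candidate = "surely_nobody_defined_this_name_yet"

found_candidate = known_names.include?(candidate)
found_test      = known_names.include?("test")

puts symbol_equals_string
puts found_candidate
puts found_test
```

Comparing string against string keeps the test side-effect free; calling candidate.to_sym before the lookup would make the check succeed for any name at all.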

Ruby: Why symbols change to strings when using puts instead of print?

When you call puts, what really gets called is the rb_io_puts C function, which basically works like this:

  • If there is no argument, output a newline.
  • For each argument, check if it's of type string (T_STRING in Ruby C lingo) and, if so, call rb_io_write with it. Also, if the string was of length zero or didn't end in a newline, add a \n.
  • If the argument is an array, recursively call io_puts_ary on it.
  • In any other case, call rb_obj_as_string on the argument, which is basically the low-level equivalent of to_s, and write the result.

So when you puts [:a, :b, :c], you'll hit the array case and io_puts_ary will take over. Long story short, it does something similar to what I described above: it calls rb_obj_as_string on each element and outputs the result followed by a newline.
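The difference is easy to observe from Ruby itself. The sketch below captures the output in StringIO objects instead of writing to the terminal, on the assumption that StringIO follows the same puts and print rules as $stdout:

```ruby
require 'stringio'

symbols = [:a, :b, :c]

# puts flattens the array and writes each element's to_s on its own line.
puts_out = StringIO.new
puts_out.puts(symbols)

# print writes the to_s of the array itself, which looks like inspect output.
print_out = StringIO.new
print_out.print(symbols)

puts_result  = puts_out.string
print_result = print_out.string

p puts_result   # "a\nb\nc\n"
p print_result  # "[:a, :b, :c]"
```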

Different behavior of strings and symbols?

You can think of symbols as self-referential interned strings - that is, only one copy of a given symbol will ever exist. The same is true of some other objects, such as Fixnum instances (small integers), booleans, and nil. They are not garbage collected (from Ruby 2.2 on, dynamically created symbols can be), are not duplicable, and are not mutable.

Strings, on the other hand, are garbage collected, are duplicable, and are mutable. Every time you declare a string, a new object is allocated.
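A small sketch of that contrast. The variable names are illustrative, and the strings are built with String.new so that the result does not depend on any frozen-string-literal setting:

```ruby
# Two mentions of the same symbol refer to one object.
first_symbol  = :status
second_symbol = :status
same_symbol_object = first_symbol.equal?(second_symbol)

# Two strings with the same content are separate objects.
first_string  = String.new("status")
second_string = String.new("status")
same_string_object   = first_string.equal?(second_string)
equal_string_content = (first_string == second_string)

# Strings can be changed in place; symbols are frozen.
first_string << "!"
symbol_frozen = first_symbol.frozen?

puts same_symbol_object     # true
puts same_string_object     # false
puts equal_string_content   # true
puts first_string           # status!
puts second_string          # status
puts symbol_frozen          # true
```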

How to refactor this Ruby code aiming to change an hash's symbols into strings?

data = Hash[options.map{ |k,v| [k.to_s,v] }]

For a hash large enough to be interesting, there isn't a significant difference between the answers:

require 'benchmark'

# Build a large hash with symbol keys (:aaaa through :zzzz).
options = Hash[('aaaa'..'zzzz').map { |i| [i.to_sym, i] }]

# Time three ways of producing the same hash with string keys.
Benchmark.bm(100) do |x|
  x.report("map")    { Hash[options.map { |k, v| [k.to_s, v] }] }
  x.report("zip")    { Hash[options.keys.map(&:to_s).zip(options.values)] }
  x.report("inject") { options.inject({}) { |h, (k, v)| h[k.to_s] = v; h } }
end

            user     system      total        real
map     3.490000   0.090000   3.580000 (  4.049015)
zip     3.780000   0.020000   3.800000 (  3.925876)
inject  3.710000   0.110000   3.820000 (  4.289286)
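On newer Rubies there is also a built-in alternative to the three approaches timed above: Hash#transform_keys, available from Ruby 2.5. It is not part of the benchmark, so no timing claim is made for it; the sketch below, with a made-up options hash, only shows that it produces the same shape of result:

```ruby
# Illustrative input; the keys and values are made up for this example.
options = { :host => "localhost", :port => 8080 }

# Returns a new hash whose keys are the result of calling to_s on each key.
data = options.transform_keys(&:to_s)

p data     # {"host"=>"localhost", "port"=>8080}
p options  # the original hash still has symbol keys
```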

Ruby Send Method, iterating as symbols instead of strings

Obviously, the two versions are not equivalent. The first one will call a method whose name is based on the content of the variable k. In the second version, the variable k is never used, it will simply call the method k over and over and over again.

In other words, the first version will call a different method on each iteration of the loop, while the second one will call the same method on every iteration.

You can, of course, use symbols in exactly the same way you use strings here:

def initialize(attributes = {})
  attributes.each do |k, v|
    # Build the setter name (e.g. :name=) from the key and pass it the value.
    self.send(:"#{k}=", v)
  end
end
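To see the interpolated-symbol form at work, here is a self-contained sketch. The Person class and its attributes are hypothetical and exist only for this illustration:

```ruby
# Hypothetical class used only to demonstrate the technique.
class Person
  attr_accessor :name, :age

  def initialize(attributes = {})
    attributes.each do |k, v|
      # :"#{k}=" builds a different setter name on each iteration,
      # whether the key arrives as a symbol or as a string.
      send(:"#{k}=", v)
    end
  end
end

person = Person.new(:name => "Alice", "age" => 30)

puts person.name  # Alice
puts person.age   # 30
```

Writing send(:k=, v) instead would look for a setter literally named k= on every pass, which is the mistake described above.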

Checking if symbol is present in the array with include?

Let us go through a slightly modified version of your test code as it is seen by irb and as a stand alone script:

def test_method;end
symbols = Symbol.all_symbols # This is already a "fixed" array, no need for map
puts symbols.include?(:test_method)
puts symbols.include?('test_method_nonexistent'.to_sym)
puts symbols.include?(:test_method_nonexistent)
eval 'puts symbols.include?(:really_not_there)'

When you try this in irb, each line will be parsed and evaluated before the next line. When you hit the second line, symbols will contain :test_method because def test_method;end has already been evaluated. But, the :test_method_nonexistent symbol hasn't been seen anywhere when we hit line 2 so lines 4 and 5 will say "false". Line 6 will, of course, give us another false because :really_not_there doesn't exist until after eval returns. So irb says this:

true
false
false
false

If you run this as a Ruby script, things happen in a slightly different order. First Ruby parses the script into an internal format that the Ruby VM understands, and then it goes back to the first line and starts executing the script. While the script is being parsed, the :test_method symbol will exist after the first line is parsed and :test_method_nonexistent will exist after the fifth line has been parsed; so, before the script runs, two of the symbols we're interested in are already known. When the parser hits line six, Ruby just sees an eval and a string; it doesn't yet know that the eval will cause a symbol to come into existence.

Now we have two of our symbols (:test_method and :test_method_nonexistent) and a simple string that, when fed to eval, will create a symbol (:really_not_there). Then we go back to the beginning and the VM starts running code. When we run line 2 and cache our symbols array, both :test_method and :test_method_nonexistent will exist and appear in the symbols array because the parser created them. So lines 3 through 5:

puts symbols.include?(:test_method)
puts symbols.include?('test_method_nonexistent'.to_sym)
puts symbols.include?(:test_method_nonexistent)

will print "true". Then we hit line 6:

eval 'puts symbols.include?(:really_not_there)'

and "false" is printed because :really_not_there is created by the eval at run-time rather than during parsing. The result is that Ruby says:

true
true
true
false

If we add this at the end:

symbols = Symbol.all_symbols
puts symbols.include?('really_not_there'.to_sym)

Then we'll get another "true" out of both irb and the stand-alone script because eval will have created :really_not_there and we will have grabbed a fresh copy of the symbol list.

Ruby Koans #75 test_constants_become_symbols, correct answer?

NOTE: the following answer only applies to environments like irb, where Ruby code is being executed line by line. When executing code in a file, Ruby scans the entire file for symbols before executing anything, so the following details are not accurate. I've not deleted this answer because it exposes an interesting edge case, but see @GlichMr's answer for a better explanation of the problem.

You can safely do the following, because Symbol.all_symbols returns a copy of the array of symbols, not a reference.

assert_equal true, all_symbols.include?(:RubyConstant)

I think that is the intended answer to the koan, and it's why all_symbols is defined rather than calling Symbol.all_symbols directly. For some evidence, see the following:

>> X = 1
=> 1
>> all_symbols = Symbol.all_symbols; nil
=> nil
>> Y = 2
=> 2
>> all_symbols.include?(:X)
=> true
>> all_symbols.include?(:Y)
=> false

Using String#to_sym would make it possible to make these calls against Symbol.all_symbols directly, but is not necessary for solving this koan.
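The claim that Symbol.all_symbols returns a copy rather than a live reference can be checked with a short sketch. The symbol name is generated at run time from the process ID and a random number (an arbitrary choice for this example) so that the parser cannot create it ahead of time:

```ruby
# Take a snapshot before the new symbol exists.
snapshot = Symbol.all_symbols

# Build a name the parser has never seen; interpolation yields a string only.
fresh_name = "koan_demo_#{Process.pid}_#{rand(1_000_000_000)}"

# Creating the symbol now adds it to the interpreter's symbol table.
created = fresh_name.to_sym

in_old_snapshot = snapshot.include?(created)
in_new_list     = Symbol.all_symbols.include?(created)

puts in_old_snapshot  # false: the earlier array was a copy
puts in_new_list      # true: a fresh call sees the new symbol
```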

Why does HAML code usually use Ruby symbols (:http_equiv, :content) instead of plain strings (http_equiv, content)?

The structure you are referring to here: {:http_equiv=>"Content-Type", :content=>"text/html; charset=utf-8"} is a Hash. Here are some very good answers to the question "Why does Ruby use symbols as keys in Hashes?"

Why use symbols as hash keys in Ruby?

Why is it not a good idea to dynamically create a lot of symbols in ruby (for versions before 2.2)?

Symbols are like strings but they are immutable - they can't be modified.

They are only put into memory once, making them very efficient to use for things like keys in hashes but they stay in memory until the program exits. This makes them a memory hog if you misuse them.

If you dynamically create lots of symbols, you are allocating a lot of memory that can't be freed until your program ends. You should only dynamically create symbols (using string.to_sym) if you know you will:

  1. need to repeatedly access the symbol
  2. not need to modify them

As I said earlier, they are useful for things like hashes - where you care more about the identity of the variable than its value. Symbols, when correctly used, are a readable and efficient way to pass around identity.

Regarding your comment, here is what I mean by the immutability of symbols.

Strings are like arrays; they can be modified in place:

12:17:44 ~$ irb
irb(main):001:0> string = "Hello World!"
=> "Hello World!"
irb(main):002:0> string[5] = 'z'
=> "z"
irb(main):003:0> string
=> "HellozWorld!"
irb(main):004:0>

Symbols are more like numbers; they can't be edited in place:

irb(main):011:0> symbol = :Hello_World
=> :Hello_World
irb(main):012:0> symbol[5] = 'z'
NoMethodError: undefined method `[]=' for :Hello_World:Symbol
from (irb):12
from :0

Ruby - Mongoid - cannot query date range, error: keys must be strings or symbols

I spoke to the developer of the gem, and he says it is supported in Rails 3.1+.
We also made this work by using a different syntax (similar to JavaScript):

scope :between, ->(from, to) {
  where(:created_at => { '$gte' => Time.parse(from.to_s) })
    .and(:created_at => { '$lte' => Time.parse((to + 1.day).to_s) - 1.second })
}

And it turns out this is supported in Rails 3.0.


