What's the Difference Between a String and a Symbol in Ruby

What's the difference between a string and a symbol in Ruby?

The main difference is that multiple symbols representing a single value are identical whereas this is not true with strings. For example:

irb(main):007:0> :test.object_id
=> 83618
irb(main):008:0> :test.object_id
=> 83618
irb(main):009:0> :test.object_id
=> 83618

Those are three references to the symbol :test, which are all the same object.

irb(main):010:0> "test".object_id
=> -605770378
irb(main):011:0> "test".object_id
=> -605779298
irb(main):012:0> "test".object_id
=> -605784948

Those are three references to the string "test", but are all different objects.

This means that using symbols can potentially save a good bit of memory depending on the application. It is also faster to compare symbols for equality since they are the same object, comparing identical strings is much slower since the string values need to be compared instead of just the object ids.

As far as when to use which, I usually use strings for almost everything except things like hash keys where I really want a unique identifier, not a string.

What is the difference between strings and symbols in Ruby?

indexes First_name, :sortable => true

here rails treat First_name as a constant variable not the column.

so you can use

indexes :First_name, :sortable => true

or

indexes "First_name", :sortable => true

or

change the column First_name to first_name and then you can do this

indexes first_name, :sortable => true

string vs symbols rails

When using a Hash in Ruby, any object such a string or a symbol can be used as the key. However, :email and "email" are different objects so to look up an object you must use the same object that was used to store the value.

What might be confusing you is that Rails uses a HashWithIndifferentAccess in some places. This is a customised hash where keys :foo and "foo" are considered to be the same. This allows you to use e.g. params[:user] and params["user"] interchangeably, however that is not the general case.

As Stefan spotted, the reason why strings were being used for the keys in the parsed JSON object was just down to a small typo in the option: symbilize_names where it should be symbolize_names.

When to use symbols instead of strings in Ruby?

TL;DR

A simple rule of thumb is to use symbols every time you need internal identifiers. For Ruby < 2.2 only use symbols when they aren't generated dynamically, to avoid memory leaks.

Full answer

The only reason not to use them for identifiers that are generated dynamically is because of memory concerns.

This question is very common because many programming languages don't have symbols, only strings, and thus strings are also used as identifiers in your code. You should be worrying about what symbols are meant to be, not only when you should use symbols. Symbols are meant to be identifiers. If you follow this philosophy, chances are that you will do things right.

There are several differences between the implementation of symbols and strings. The most important thing about symbols is that they are immutable. This means that they will never have their value changed. Because of this, symbols are instantiated faster than strings and some operations like comparing two symbols is also faster.

The fact that a symbol is immutable allows Ruby to use the same object every time you reference the symbol, saving memory. So every time the interpreter reads :my_key it can take it from memory instead of instantiate it again. This is less expensive than initializing a new string every time.

You can get a list all symbols that are already instantiated with the command Symbol.all_symbols:

symbols_count = Symbol.all_symbols.count # all_symbols is an array with all 
# instantiated symbols.
a = :one
puts a.object_id
# prints 167778

a = :two
puts a.object_id
# prints 167858

a = :one
puts a.object_id
# prints 167778 again - the same object_id from the first time!

puts Symbol.all_symbols.count - symbols_count
# prints 2, the two objects we created.

For Ruby versions before 2.2, once a symbol is instantiated, this memory will never be free again. The only way to free the memory is restarting the application. So symbols are also a major cause of memory leaks when used incorrectly. The simplest way to generate a memory leak is using the method to_sym on user input data, since this data will always change, a new portion of the memory will be used forever in the software instance. Ruby 2.2 introduced the symbol garbage collector, which frees symbols generated dynamically, so the memory leaks generated by creating symbols dynamically it is not a concern any longer.

Answering your question:

Is it true I have to use a symbol instead of a string if there is at least two the same strings in my application or script?

If what you are looking for is an identifier to be used internally at your code, you should be using symbols. If you are printing output, you should go with strings, even if it appears more than once, even allocating two different objects in memory.

Here's the reasoning:

  1. Printing the symbols will be slower than printing strings because they are cast to strings.
  2. Having lots of different symbols will increase the overall memory usage of your application since they are never deallocated. And you are never using all strings from your code at the same time.

Use case by @AlanDert

@AlanDert: if I use many times something like %input{type: :checkbox} in haml code, what should I use as checkbox?

Me: Yes.

@AlanDert: But to print out a symbol on html page, it should be converted to string, shouldn't it? what's the point of using it then?

What is the type of an input? An identifier of the type of input you want to use or something you want to show to the user?

It is true that it will become HTML code at some point, but at the moment you are writing that line of your code, it is mean to be an identifier - it identifies what kind of input field you need. Thus, it is used over and over again in your code, and have always the same "string" of characters as the identifier and won't generate a memory leak.

That said, why don't we evaluate the data to see if strings are faster?

This is a simple benchmark I created for this:

require 'benchmark'
require 'haml'

str = Benchmark.measure do
10_000.times do
Haml::Engine.new('%input{type: "checkbox"}').render
end
end.total

sym = Benchmark.measure do
10_000.times do
Haml::Engine.new('%input{type: :checkbox}').render
end
end.total

puts "String: " + str.to_s
puts "Symbol: " + sym.to_s

Three outputs:

# first time
String: 5.14
Symbol: 5.07
#second
String: 5.29
Symbol: 5.050000000000001
#third
String: 4.7700000000000005
Symbol: 4.68

So using smbols is actually a bit faster than using strings. Why is that? It depends on the way HAML is implemented. I would need to hack a bit on HAML code to see, but if you keep using symbols in the concept of an identifier, your application will be faster and reliable. When questions strike, benchmark it and get your answers.

In Ruby, how to choose whether a symbol or string to be used in a given scenario?

a = :foo
b = :foo

a and b refer to the same object in memory (same identity)

a.object_id # => 898908
b.object_id # => 898908

Strings behave differently

a = 'foo'
b = 'foo'

a.object_id # => 70127643805220
b.object_id # => 70127643805200

So, you use strings to store data and perform manipulations on data (replace characters or whatnot) and you use symbols to name things (keys in a hash or something). Also see this answer for more use cases for symbol.

Ruby: what's the difference between initializing a string using quotes vs colon?

"bobo" is a string whereas :bobo is a symbol.

:bobo.class # => Symbol
"bobo".class # => String

String#==

If obj is not a String, returns false. Otherwise, returns true if str <=> obj returns zero.

So according to the documentation "bobo" == :bobo # => false and "bobo" == "bobo" # => true. - This is expected.

puts :bobo 
# >> bobo
:bobo.to_s # => "bobo"

This is because puts applying to_s on the Symbol object :bobo. See Symbol#to_s

What is the difference between a symbol and a variable in Ruby?

A symbol is an "internalized" string, it's more like a constant than anything. Typical example:

account_details = {
:name => 'Bob',
:age => 20
}

Here the symbols :name and :age are keys for a hash. They are not to be confused with variables. account_details is a variable.

A variable in Ruby is a handle to an object of some sort, and that object may be a symbol.

Normally you employ symbols when using strings would result in a lot of repetition. Keep in mind that strings are generally distinct objects where a distinct symbol always refers to the same object, making them more efficient if used frequently.

Compare:

"string".object_id == "string".object_id
# => false

:string.object_id == :string.object_id
# => true

Even though those two strings are identical, they're independent string objects. When used as keys for hashes, arguments to methods, and other common cases, these objects will quickly clutter up your memory with massive amounts of duplication unless you go out of your way to use the same string instance. Symbols do this for you automatically.

What is the use of symbols?

Ok, so the misunderstanding probably stems from this:

A symbol is not a variable, it is a value. like 9 is a value that is a number.

A symbol is a value that is kinda of roughly a string... it's just not a string that you can change... and because you can't change it, we can use a shortcut -> all symbols with the same name/value are stored in the same memory-spot (to save space).

You store the symbol into a variable, or use the value somewhere - eg as the key of a hash.... this last is probably one of the most common uses of a symbol.

you make a hash that contains key-value pairs eg:

thing_attrs = {:name => "My thing", :colour => "blue", :size => 6}
thing_attrs[:colour] # 'blue'

In this has - the symbols are the keys you can use any object as a key, but symbols are good to use as they use english words, and are thus easy to understand what you're storing/fetching... much better than, say numbers. Imagine you had:

thing_attrs = {0 => "My thing", 1 => "blue", 2 => 6}
thing_attrs[1] # => "blue"

It would be annoying and hard to remember that attribute 1 is the colour... it's much nicer to give names that you can read when you're reading the code. Thus we have two options: a string, or a symbol.

There would be very little difference between the two. A string is definitely usable eg:

thing_attrs = {"name" => "My thing", "colour" => "blue", "size" => 6}
thing_attrs["colour"] # 'blue'

except that as we know... symbols use less memory. Not a lot less, but enough less that in a large program, over time, you will notice it.
So it has become a ruby-standard to use symbols instead.

Why should I use a string and not a symbol when referencing object attributes?

Your instincts are right, IMHO.

Symbols are more appropriate than strings to represent the elements of an enumerated type because they are immutable. While it's true that they aren't garbage collected, unlike strings, there is always only one instance of any given symbol, so the impact is minimal for most state transition applications. And, while the performance difference is minimal as well for most applications, symbol comparison is much quicker than string comparison.

See also Enums in Ruby

Dynamic dispatch in Ruby: strings vs symbols

  1. There is no difference between them, in fact, when passing a string it is converted to a symbol.

  2. No need to convert it since that same conversion (e.g. 'bar'.to_sym) will be done if a string is provided.

From the docs:

Invokes the method identified by symbol, passing it any arguments
specified. Unlike send, #public_send calls public methods only. When
the method is identified by a string, the string is converted to a
symbol.



Related Topics



Leave a reply



Submit