Are Strings Mutable in Ruby

Are strings in Ruby mutable?

Yes, strings in Ruby, unlike in Python, are mutable.

s += "hello" does not append "hello" to s in place: it creates an entirely new string object and rebinds s to it. To append to a string in place, use <<, as in:

s = "hello"
s << " world"
s # => "hello world"
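
The difference matters most when two variables refer to the same string. A minimal sketch (the variable names are illustrative, and the unary + is only there so the example also runs in files that freeze string literals):

```ruby
a = +"hello"      # unary + guarantees an unfrozen string
b = a             # b refers to the same object as a

a << " world"     # mutates that shared object in place
puts b            # prints "hello world": b sees the change

a += "!"          # builds a new string and rebinds a only
puts a            # prints "hello world!"
puts b            # prints "hello world": b still refers to the old object
puts a.equal?(b)  # prints "false"
```

So << is visible through every reference to the string, while += only affects the variable on its left-hand side.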

How do I create a mutable string in Ruby?

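Put a unary + before the string literal. It returns a mutable (unfrozen) string even when literals are frozen, for example under # frozen_string_literal: true: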

string = +'hello'

string << ' world'

puts(string)

hello world
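
As a rough sketch of what the unary + does: it hands back the receiver itself when that is already mutable, and an unfrozen duplicate otherwise. The variable names below are made up for the illustration:

```ruby
frozen = "hello".freeze
thawed = +frozen                 # frozen receiver: an unfrozen duplicate is returned
puts thawed.frozen?              # prints "false"
puts thawed.equal?(frozen)       # prints "false"

mutable = String.new("hello")    # String.new always returns an unfrozen string
same = (+mutable).equal?(mutable)
puts same                        # prints "true": an unfrozen receiver is returned as-is

thawed << " world"               # safe: only the duplicate changes
puts frozen                      # prints "hello"
```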

Why did Matz choose to make Strings mutable by default in Ruby?

This is in line with Ruby's design, as you note. Immutable strings are more efficient than mutable strings - less copying, as strings are re-used - but make work harder for the programmer. It is intuitive to see strings as mutable - you can concatenate them together. To deal with this, Java silently translates concatenation (via +) of two strings into the use of a StringBuffer object (StringBuilder in later versions), and I'm sure there are other such hacks. Ruby chooses instead to make strings mutable by default at the expense of performance.

Ruby also has a number of destructive methods such as String#upcase! that rely on strings being mutable.
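
To illustrate the destructive/non-destructive pairing, here is a small sketch comparing String#upcase with String#upcase! (variable names are illustrative):

```ruby
s = String.new("hello")

copy = s.upcase        # non-destructive: returns a new string
puts s                 # prints "hello"
puts copy              # prints "HELLO"

result = s.upcase!     # destructive: modifies s in place and returns it
puts s                 # prints "HELLO"
puts result.equal?(s)  # prints "true"

again = s.upcase!      # nothing left to change, so nil is returned
p again                # prints nil
```

Note the nil in the last step: bang methods such as upcase! return nil when they make no change, so chaining calls on their result is risky.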

Another possible reason is that Ruby is inspired by Perl, and Perl happens to use mutable strings.

Ruby also has Symbols and frozen Strings, both of which are immutable. As an added bonus, symbols are guaranteed to be unique per possible string value.
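
A short sketch of what immutability means for a frozen string, assuming Ruby 2.5 or later (where FrozenError exists; older versions raise a RuntimeError instead):

```ruby
greeting = "hello".freeze

begin
  greeting << " world"     # any in-place modification is rejected
rescue FrozenError => e    # assumes Ruby 2.5+
  error_class = e.class
end
puts error_class           # prints "FrozenError"

copy = greeting.dup        # dup returns an unfrozen copy
copy << " world"
puts copy                  # prints "hello world"
puts greeting              # prints "hello"
```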

Are strings mutable in Ruby?

Yes, << mutates the same object, and + creates a new one. Demonstration:

irb(main):011:0> str = "hello"
=> "hello"
irb(main):012:0> str.object_id
=> 22269036
irb(main):013:0> str << " world"
=> "hello world"
irb(main):014:0> str.object_id
=> 22269036
irb(main):015:0> str = str + " world"
=> "hello world world"
irb(main):016:0> str.object_id
=> 21462360

What's the difference between String.new and a string literal in Ruby?

== checks for equal content.

equal? checks for equal identity.

a = "hello"
b = "hello"

a == b # => true
a.equal?(b) # => false

In Ruby, string literals are not immutable, and thus creating a string and using a literal are indeed the same. In both cases Ruby creates a new string instance each time the expression is evaluated.

Both of these are thus the same

10.times { String.new }
# is the same as
10.times { "" }

Let's verify this

10.times { puts "".object_id }

Prints 10 different numbers

70227981403600
70227981403520
70227981403460
...

Why? Strings are mutable by default, and thus Ruby has to create a copy each time a string literal is reached in the source code, even though those literals are rarely modified in practice.

Thus a Ruby program typically creates an excessive number of short-lived string objects, which puts a huge strain on garbage collection. It is not unusual for a Rails app to create 500,000 short-lived strings just to serve one request, and this is one of the main performance bottlenecks in scaling Rails to millions or even hundreds of millions of users.

To address that, Ruby 2.3 introduced frozen string literals, a mode in which every string literal in a file is immutable. Since this is not backwards compatible, it is opt-in via a pragma (a magic comment at the top of the file):

# frozen_string_literal: true

Let's verify this too

# frozen_string_literal: true
10.times { puts "".object_id }

Prints the same number 10 times

69898321746880
69898321746880
69898321746880
...
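
For a single literal, a similar effect is available without the pragma by calling .freeze directly on it. The sketch below assumes CRuby, which as far as I know reuses one frozen object for such a literal; greeting is a hypothetical helper method:

```ruby
def greeting
  "hello".freeze               # frozen literal; CRuby is assumed to reuse one object here
end

first  = greeting
second = greeting

puts first.frozen?             # prints "true"
puts first.equal?(second)      # prints "true" on CRuby: no new string per call

message = first + " world"     # + still works: it builds a new, unfrozen string
puts message                   # prints "hello world"
puts message.frozen?           # prints "false"
```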

Fun fact: setting a string key in a hash also creates a copy of the string. When the key is not already frozen, Ruby stores a frozen duplicate as the key:

str = "key"
hash = {}
hash[str] = true
puts str.object_id
puts hash.keys.first.object_id

Prints two different numbers

70243164028580
70243132639660
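
The copy protects the hash from later changes to the original string. A minimal sketch, assuming CRuby's behavior of storing already-frozen string keys as-is (variable names are illustrative):

```ruby
str  = String.new("key")
hash = {}
hash[str] = true

stored = hash.keys.first
puts stored.equal?(str)    # prints "false": the hash holds its own copy
puts stored.frozen?        # prints "true": the copy cannot be modified

str << "_changed"          # mutating the original does not disturb the hash
puts hash.key?("key")      # prints "true"

frozen_key = "other".freeze
hash[frozen_key] = true
reused = hash.keys.last.equal?(frozen_key)
puts reused                # prints "true" on CRuby: a frozen key needs no copy
```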

How can I describe mutable strings when strings are immutable by default?

I had missed it. The recommended way is to call the +@ method (unary plus) on the string literal. With frozen string literals enabled:

(+"foo").frozen? # => false
(-"foo").frozen? # => true
"foo".frozen? # => true
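
The counterpart -@ (unary minus) also works on strings built at run time. A hedged sketch, assuming Ruby 2.5 or later, where -@ is documented as returning a frozen, possibly pre-existing copy of the string:

```ruby
a = -("fo" + "o")    # built at run time, then frozen
b = -("fo" + "o")

puts a.frozen?       # prints "true"
puts a == b          # prints "true"
puts a.equal?(b)     # may well print "true" too, if both refer to one shared copy

c = +a               # unary plus goes the other way: an unfrozen duplicate
c << "!"
puts c               # prints "foo!"
puts a               # prints "foo"
```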

Ruby: How does concatenation affect the String in memory?

How come concatenating to a string does not change its object_id?

Because it's still the same string it was before.

My understanding was that Strings are immutable

No, they are not immutable. In Ruby, strings are mutable.

because Strings are essentially Arrays of Characters,

They are not. In Ruby, strings are mostly a factory for iterators (each_line, each_char, each_codepoint, each_byte). String implements a subset of the Array protocol, but that does not mean that it is an array.
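
A small sketch of that iterator-factory view (the sample word is arbitrary; "\u00e9" is used so that characters and bytes differ in UTF-8):

```ruby
word = "h\u00e9llo"              # five characters, one of them two bytes long

puts word.each_char.count        # prints "5"
puts word.each_codepoint.count   # prints "5"
puts word.each_byte.count        # prints "6"
puts "a\nb".each_line.count      # prints "2"

puts word.is_a?(Array)           # prints "false"
puts word.respond_to?(:each)     # prints "false": there is no single "element" type
puts word[1, 3]                  # indexing works, but returns another String
```

Each view (characters, codepoints, bytes, lines) is a separate enumerator over the same string, which is why String has no plain each method.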

and Arrays cannot be changed in memory since they are contiguous.

Wrong, arrays are mutable in Ruby.

Yet, as demonstrated below: Instantiating a String then adding characters does not change its object_id. How does concatenation affect the String in memory?

The Ruby Language Specification does not prescribe any particular in-memory representation of strings. Any representation is fine, as long as it supports the semantics specified in the Ruby Language Specification.

Here are a couple of examples from some Ruby implementations:

  • Rubinius:
    • kernel/common/string.rb
    • kernel/bootstrap/string.rb
    • vm/builtin/string.cpp
  • Topaz:
    • topaz/objects/stringobject.py
  • Cardinal:
    • src/classes/String.pir
  • IronRuby:
    • Ruby/Builtins/MutableString.cs
  • JRuby:
    • core/src/main/java/org/jruby/RubyString.java

What's the difference between a string and a symbol in Ruby?

The main difference is that multiple symbols representing a single value are identical whereas this is not true with strings. For example:

irb(main):007:0> :test.object_id
=> 83618
irb(main):008:0> :test.object_id
=> 83618
irb(main):009:0> :test.object_id
=> 83618

Those are three references to the symbol :test, which are all the same object.

irb(main):010:0> "test".object_id
=> -605770378
irb(main):011:0> "test".object_id
=> -605779298
irb(main):012:0> "test".object_id
=> -605784948

Those are three references to the string "test", but are all different objects.

This means that using symbols can potentially save a good bit of memory, depending on the application. It is also faster to compare symbols for equality, since equal symbols are the same object. Comparing identical strings is much slower, since the string values need to be compared instead of just the object ids.
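
The same point can be checked with equal?, which tests identity rather than content. A short sketch (variable names are illustrative; the unary + keeps the strings mutable in files that freeze literals):

```ruby
sym_a = :test
sym_b = ("te" + "st").to_sym   # built at run time, still the same symbol
puts sym_a.equal?(sym_b)       # prints "true"

str_a = +"test"
str_b = +"test"
puts str_a == str_b            # prints "true": same content
puts str_a.equal?(str_b)       # prints "false": two separate objects
```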

As far as when to use which, I usually use strings for almost everything except things like hash keys where I really want a unique identifier, not a string.


