Ruby Differences Between += and << to Concatenate a String

Ruby differences between += and << to concatenate a string

The shovel operator << performs much better than += when dealing with long strings because the shovel operator is allowed to modify the original string, whereas += has to copy all the text from the first string into a new string every time it runs.

There is no += operator defined on the String class, because += is a combined operator. In short x += "asdf" is exactly equivalent to x = x + "asdf", so you should reference the + operator on the string class, not look for a += operator.
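A minimal sketch of the two points above, using object_id to check whether the variable still refers to the same object. The helper method names are hypothetical and exist only for illustration.

```ruby
# Hypothetical helpers for illustration; they are not part of Ruby's String API.

# Appends with +=, which is shorthand for str = str + suffix.
# Returns true if the variable still refers to the original object.
def same_object_after_plus_equals?(str, suffix)
  original_id = str.object_id
  str += suffix              # builds a new String and rebinds the local variable
  str.object_id == original_id
end

# Appends with <<, which mutates the receiver.
def same_object_after_shovel?(str, suffix)
  original_id = str.object_id
  str << suffix              # extends the existing String in place
  str.object_id == original_id
end

puts same_object_after_plus_equals?("abc".dup, "def")  # prints false
puts same_object_after_shovel?("abc".dup, "def")       # prints true

# String defines +, but there is no separate += method to look up.
puts String.method_defined?(:+)      # prints true
puts String.method_defined?(:"+=")   # prints false
```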

String concatenation in Ruby

You can do that in several ways:

  1. As you have shown, with <<, but that is not the usual way
  2. With string interpolation

    source = "#{ROOT_DIR}/#{project}/App.config"
  3. With +

    source = "#{ROOT_DIR}/" + project + "/App.config"

The second method seems to be more efficient in terms of memory and speed from what I've seen (not measured, though). All three methods will raise an uninitialized constant error (NameError) when ROOT_DIR has not been defined.

When dealing with pathnames, you may want to use File.join to avoid mistakes with the path separator.

In the end, it is a matter of taste.
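As a small sketch of the File.join suggestion: the method name config_path and the directory values are placeholders, not taken from the original question.

```ruby
# Hypothetical helper; root_dir and project are placeholder values.
def config_path(root_dir, project)
  # File.join inserts the separator and avoids doubling it
  # when a component already ends with one.
  File.join(root_dir, project, "App.config")
end

puts config_path("/srv/apps", "billing")    # prints /srv/apps/billing/App.config
puts config_path("/srv/apps/", "billing")   # prints /srv/apps/billing/App.config
```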

What is the difference between #concat and += on Arrays?

+= creates a new array object, whereas concat mutates the original object:

a = [1,2]
a.object_id # => 19388760
a += [1]
a.object_id # => 18971360


b = [1,2]
b.object_id # => 18937180
b.concat [1]
b.object_id # => 18937180

Note that the object_id of a changed, while the object_id of b did not.

String concatenation vs. interpolation in Ruby

Whenever TIMTOWTDI (there is more than one way to do it), you should look for the pros and cons. Using "string interpolation" (the second) instead of "string concatenation" (the first):

Pros:

  • Is less typing
  • Automatically calls to_s for you
  • More idiomatic within the Ruby community
  • Faster at runtime

Cons:

  • Automatically calls to_s for you (maybe you thought you had a string, and the to_s representation is not what you wanted, and hides the fact that it wasn't a string)
  • Requires you to use " to delimit your string instead of ' (perhaps you have a habit of using ', or you previously typed a string using that and only later needed to use string interpolation)
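The to_s point cuts both ways, which a short sketch may make clearer. The helper names are hypothetical; the only assumption about Ruby itself is that String#+ raises TypeError for a non-String argument.

```ruby
# Hypothetical helpers for illustration.

def describe_with_interpolation(count)
  "count: #{count}"          # to_s is called on count implicitly
end

def describe_with_concatenation(count)
  "count: " + count          # raises TypeError unless count is a String
rescue TypeError => e
  "TypeError: #{e.message}"
end

puts describe_with_interpolation(3)      # prints count: 3
puts describe_with_interpolation(nil)    # nil.to_s is "", so nothing follows the colon
puts describe_with_concatenation("3")    # prints count: 3
puts describe_with_concatenation(3)      # prints a line starting with TypeError:
```

Interpolation quietly accepts the nil, while concatenation surfaces the type mismatch immediately.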

Use \ instead of + or << to concatenate those strings

In Ruby, literal strings are allocated as objects in memory when they are encountered. If you concatenate two string literals, as in

str = "foo" + "bar"

you will actually allocate three String objects: "foo", "bar" and the result of the concatenation (which is then referred to by str).

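Something similar happens with <<, although here only two String objects are allocated, because the second literal is appended to the first in place: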

"foo" << "bar"

In many cases, this is only a mild inefficiency which you should not worry too much about.

However, be aware that if you use + in a loop and the aggregate String grows large, you will allocate an ever larger String object at every iteration. You can avoid that by collecting the string parts in an Array and calling join when you are done. Alternatively, foo << 'bar' modifies foo in place, which is acceptable in a loop.
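The two loop-friendly approaches mentioned above could be sketched as follows. Both method names are hypothetical, and no timing claims are made here.

```ruby
# Hypothetical helpers; both build the same string from a list of parts.

# Collect the parts in an Array and join once at the end.
def build_with_join(parts)
  collected = []
  parts.each { |part| collected << part }   # Array append, no String copying
  collected.join                             # one final String is built here
end

# Append to a single mutable String in place.
def build_with_shovel(parts)
  result = "".dup                            # explicitly unfrozen empty String
  parts.each { |part| result << part }
  result
end

parts = %w[foo bar baz]
puts build_with_join(parts)     # prints foobarbaz
puts build_with_shovel(parts)   # prints foobarbaz
```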

By using \ you do not concatenate intermediate objects, but instead effectively present the parser with one String literal (because \ simply continues the string on the next line).

The "foo" + "bar" + "baz" idiom is frequently used in Java, where strings are immutable, and literals are concatenated at compile time. (Every string literal in Java is only stored once and then reused throughout the code; Ruby does not do that.) Also, Java has no better way to continue strings over multiple lines, which is why Java programmers do it that way.

Can you append to two strings in the same line in ruby?

Well, in your example any of the following would work:

Assignment (similar to your first line):

string1 = string2 += " substring2"

Concatenate one string, and both are updated because they are the same object:

string1 << " substring2"

OR

string2 << " substring2"

All those solutions though rely on the fact that string1 and string2 are identical in your example. If string1 and string2 are actually different strings and you want to append to both of them, you could try this:

[string1, string2].each{|str| str << " substring2"}

Trying to understand the shovel operator with strings

When you set hi = original_string your hi variable is just a new variable pointed at the same object. If you look at hi.object_id and original_string.object_id you will find they are the same. If you want a clone of an object that you can manipulate without impacting the original_string, you'll need to say something like hi = original_string.clone or hi = original_string.dup.
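A short sketch of the difference between a second name for the same object and a copy made with dup. The helper names and sample strings are made up for illustration; hi and original_string follow the names used above.

```ruby
# Hypothetical helpers: each returns [original_string, hi] after appending to hi.

def append_to_alias(original_string, suffix)
  hi = original_string         # second name for the same object
  hi << suffix
  [original_string, hi]
end

def append_to_copy(original_string, suffix)
  hi = original_string.dup     # independent copy
  hi << suffix
  [original_string, hi]
end

p append_to_alias("hello".dup, " world")   # => ["hello world", "hello world"]
p append_to_copy("hello".dup, " world")    # => ["hello", "hello world"]
```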

Ruby operator confusion with shovel (<<) and +=, concatenating arrays

One difference is that because << works in place it is somewhat faster than +=. The following code

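require 'benchmark'

a = ''
b = ''

# << appends to the same String object on every iteration
puts Benchmark.measure {
  100000.times { a << 'test' }
}

# += builds a new String object on every iteration
puts Benchmark.measure {
  100000.times { b += 'test' }
}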

yields

  0.000000   0.000000   0.000000 (  0.004653)
  0.060000   0.060000   0.120000 (  0.108534)

Update

I originally misunderstood the question. Here's what's going on. Ruby variables only store references to objects, not the objects themselves. Here's simplified code that does the same thing as yours, and has the same issue. I've told it to print temp and words_array on each iteration of the loops.

def helper(word)
  words_array = []

  word.length.times do |i|
    temp = ''
    (i...word.length).each do |j|
      temp << word[j]
      puts "temp:\t#{temp}"
      words_array << temp unless words_array.include?(temp)
      puts "words:\t#{words_array}"
    end
  end

  words_array
end

p helper("cat")

Here is what it prints:

temp:   c
words:  ["c"]
temp:   ca
words:  ["ca"]
temp:   cat
words:  ["cat"]
temp:   a
words:  ["cat", "a"]
temp:   at
words:  ["cat", "at"]
temp:   t
words:  ["cat", "at", "t"]
["cat", "at", "t"]

As you can see, on each iteration of the inner loop after the first, Ruby appears to replace the last element of words_array. That is because words_array holds a reference to the same string object that temp references, and << modifies that object in place rather than creating a new object.

On each iteration of the outer loop temp is set to a new object, and that new object is appended to words_array, so it doesn't replace the previous elements.

The += construct assigns a new object to temp on each iteration of the inner loop, which is why it behaves as expected.
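One possible fix, sketched under the assumption that the goal is to collect every substring: keep << for building temp, but store a copy in the array so later appends cannot change elements that are already there. The method name substrings is made up for this example.

```ruby
# A sketch of one possible fix, not the only one: store a dup of temp.
def substrings(word)
  words_array = []

  word.length.times do |i|
    temp = "".dup                      # explicitly unfrozen empty String
    (i...word.length).each do |j|
      temp << word[j]                  # still mutates temp in place
      # temp.dup is a separate object, so the next << leaves it untouched
      words_array << temp.dup unless words_array.include?(temp)
    end
  end

  words_array
end

p substrings("cat")   # => ["c", "ca", "cat", "a", "at", "t"]
```

Replacing temp << word[j] with temp += word[j] would have the same effect, since each iteration would then bind temp to a new object.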

Difference between += for Integers/Strings and For Arrays?

You see, names (variable names, like a and b) don't hold any values themselves. They simply point to a value. When you make an assignment

a = 5

then a now points to value 5, regardless of what it pointed to previously. This is important.

a = 'abcd'
b = a

Here both a and b point to the same string. But, when you do this

a += 'e'

It's actually translated to

a = a + 'e'
# a = 'abcd' + 'e'

So, name a is now bound to a new value, while b keeps pointing to "abcd".

a = [1,2,3,4]
b = a
a << 5

There's no assignment here; the << method modifies the existing array without replacing it. Because there's no replacement, both a and b still point to the same array, and a change made through one name is visible through the other.
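The two cases can be put side by side in a small sketch. The method names are hypothetical; equal? is used to check whether a and b are the same object.

```ruby
# Hypothetical helpers: each returns [a, b, whether a and b are the same object].

def shared_after_shovel
  a = [1, 2, 3, 4]
  b = a
  a << 5            # mutates the one array that both names refer to
  [a, b, a.equal?(b)]
end

def shared_after_plus_equals
  a = [1, 2, 3, 4]
  b = a
  a += [5]          # a = a + [5]: a is rebound to a new array, b is not
  [a, b, a.equal?(b)]
end

p shared_after_shovel        # => [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], true]
p shared_after_plus_equals   # => [[1, 2, 3, 4, 5], [1, 2, 3, 4], false]
```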

Ruby: Can I write multi-line string with no concatenation?

There are pieces to this answer that helped me get what I needed (easy multi-line concatenation WITHOUT extra whitespace), but since none of the actual answers had it, I'm compiling them here:

str = 'this is a multi-line string'\
  ' using implicit concatenation'\
  ' to prevent spare \n\'s'

=> "this is a multi-line string using implicit concatenation to prevent spare \\n's"

As a bonus, here's a version using funny HEREDOC syntax:

p <<END_SQL.gsub(/\s+/, " ").strip
SELECT * FROM users
ORDER BY users.id DESC
END_SQL
# >> "SELECT * FROM users ORDER BY users.id DESC"

The latter would mostly be for situations that required more flexibility in the processing. I personally don't like it, it puts the processing in a weird place w.r.t. the string (i.e., in front of it, but using instance methods that usually come afterward), but it's there. Note that if you are indenting the last END_SQL identifier (which is common, since this is probably inside a function or module), you will need to use the hyphenated syntax (that is, p <<-END_SQL instead of p <<END_SQL). Otherwise, the indenting whitespace causes the identifier to be interpreted as a continuation of the string.

This doesn't save much typing, but it looks nicer than using + signs, to me.

Also (I say in an edit, several years later), if you're using Ruby 2.3+, the operator <<~ is also available, which removes extra indentation from the final string. You should be able to remove the .gsub invocation, in that case (although it might depend on both the starting indentation and your final needs).
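A brief sketch of the <<~ form, assuming Ruby 2.3 or later. The method name users_query is made up for this example. The common leading indentation is stripped, but the line breaks stay, so whether .gsub is still needed depends on whether you want a single line.

```ruby
# Hypothetical helper; requires Ruby 2.3+ for the squiggly heredoc.
def users_query
  <<~END_SQL
    SELECT * FROM users
    ORDER BY users.id DESC
  END_SQL
end

p users_query
# => "SELECT * FROM users\nORDER BY users.id DESC\n"

p users_query.gsub(/\s+/, " ").strip
# => "SELECT * FROM users ORDER BY users.id DESC"
```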

EDIT: Adding one more:

p %{
SELECT * FROM users
ORDER BY users.id DESC
}.gsub(/\s+/, " ").strip
# >> "SELECT * FROM users ORDER BY users.id DESC"

