Why Is the << Operation on an Array in Ruby Not Atomic

Use of ||= and += in JRuby (jruby-lint warnings)

In JRuby, compound expressions like ||= are not executed atomically. When you write:

foo ||= 'bar'

What is actually executed internally is something like:

1. unless foo
2. foo = 'bar'
3. end

Because line 1 and line 2 are evaluated separately, it is possible in a multi-threaded app that a different thread changes the state between the two, something like:

thread 1: foo ||= 'bar'
thread 2: foo ||= 'baz'

Which executes like:

# foo has not been set yet
1. thread 1: unless foo
2. thread 2: unless foo
3. thread 1: foo = 'bar'
4. thread 2: foo = 'baz'
# ...

Note that foo will end up getting re-assigned to 'baz' by the second thread even though it already has a value. Using += is equally problematic because this:

thread 1: x += 1
thread 2: x += 1

Will be executed like this:

# x starting value of 0
1. thread 1: tempvar = x + 1 # 1
2. thread 2: tempvar = x + 1 # 1
3. thread 1: x = tempvar # 1
4. thread 2: x = tempvar # 1

So x should be 2 after the two operations but is in fact only incremented once.

If you are running a single-threaded app/script in JRuby, none of this is an issue. If you are running with multiple threads, these operations are not safe to use on variables that are accessed by more than one thread.

In environments where thread safety matters you can fix this by wrapping the operation in a mutex or using a primitive which ensures thread safety for the specific operation.
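As a minimal sketch of the mutex approach, the class below guards both a check-and-assign (||=) and an increment (+=) with a single lock. The class name SharedState and its method names are hypothetical, chosen only for illustration:

```ruby
# Hypothetical holder for state shared between threads.
class SharedState
  attr_reader :counter

  def initialize
    @lock = Mutex.new
    @counter = 0
    @foo = nil
  end

  # The check and the assignment run under the lock,
  # so only the first caller's value is kept.
  def foo_or_set(value)
    @lock.synchronize { @foo ||= value }
  end

  # The read, add and write run under the lock,
  # so no increment is lost.
  def increment
    @lock.synchronize { @counter += 1 }
  end
end

state = SharedState.new
threads = 10.times.map do
  Thread.new { 1_000.times { state.increment } }
end
threads.each(&:join)

puts state.counter
puts state.foo_or_set('bar')
puts state.foo_or_set('baz')
```

With the lock in place the counter should end at 10000, and the second foo_or_set call should return 'bar' rather than overwrite it. The trade-off is that every access now serializes on one lock.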

The atomic gem can also be used with JRuby to ensure atomicity of check-and-update operations. I'm not sure whether it supports arrays, though.

More information about managing concurrency in JRuby.

Hope this helps!

Is assignment an atomic operation in ruby MRI?

This depends a bit on the Ruby implementation you are using. As for MRI Ruby (the "default" Ruby), this is a safe (atomic) operation due to its Global Interpreter Lock, which guards some operations such as assignments from being interrupted by context switches.

JRuby also guarantees that some operations are thread-safe, including assignment to instance variables.

In any case, please take into account that any such concurrent access can be serialized in a seemingly random way. That is, you can't guarantee which thread assigns first and which one last unless you use explicit locks such as a Mutex.

Pushing to an array not working as expected

It seems there is some strange behavior in row objects, which seem to be some kind of singleton, and that's why the dup method won't solve it.

Jumping into the source code, it seems that the to_a method duplicates the inner row elements, which is why it works. So the answer is to call to_a on the row object or, if you want, you can also transform it into a Hash to preserve the metadata.

while row = sth.fetch do
  tasks.push(row.to_a)
end

But I recommend the more Ruby-like way:

sth.fetch do |row|
  tasks << row.to_a
end

What does ||= (or-equals) mean in Ruby?

This question has been discussed so often on the Ruby mailing-lists and Ruby blogs that there are now even threads on the Ruby mailing-list whose only purpose is to collect links to all the other threads on the Ruby mailing-list that discuss this issue.

Here's one: The definitive list of ||= (OR Equal) threads and pages

If you really want to know what is going on, take a look at Section 11.4.2.3 "Abbreviated assignments" of the Ruby Language Draft Specification.

As a first approximation,

a ||= b

is equivalent to

a || a = b

and not equivalent to

a = a || b

However, that is only a first approximation, especially if a is undefined. The semantics also differ depending on whether it is a simple variable assignment, a method assignment or an indexing assignment:

a    ||= b
a.c ||= b
a[c] ||= b

are all treated differently.
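One way to see the difference between the two expansions for the indexing case is a Hash with a truthy default value. This is only an illustrative sketch; the variable names are made up:

```ruby
# h1 uses ||=, which behaves like: h1[:x] || (h1[:x] = 5)
h1 = Hash.new(0)
h1[:x] ||= 5
# h1[:x] returns the default 0, which is truthy, so []= is never called.

# h2 uses the explicit form, which always calls []=
h2 = Hash.new(0)
h2[:x] = h2[:x] || 5
# The default 0 is read and then stored under :x.

p h1.key?(:x)  # false
p h2.key?(:x)  # true
```

So a[c] ||= b skips the assignment entirely when a[c] is already truthy, whereas a[c] = a[c] || b always performs it.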

save an active records array

a.each(&:save)

This will call B#save on each item in the array.

Why isn't there a String#shift()?

Strings don't act as an enumerable object as of 1.9, because it's considered too confusing to decide what it'd be a list of:

  • A list of characters / codepoints?
  • A list of bytes?
  • A list of lines?
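Instead, String exposes one iterator per interpretation, and a shift-like operation can be approximated with slice!. A small sketch (the sample strings are arbitrary):

```ruby
s = "h\u00e9\nyo"  # "hé", a newline, then "yo"

chars = s.each_char.to_a  # one entry per character
bytes = s.each_byte.to_a  # one entry per byte (é takes two bytes in UTF-8)
lines = s.each_line.to_a  # one entry per line

p chars.length  # 5
p bytes.length  # 6
p lines.length  # 2

# A shift-like operation on characters: remove and return the first one.
text = "abc"
first = text.slice!(0)
p first  # "a"
p text   # "bc"
```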

ruby atomic operations in multithreaded environment

You can look in the C code (array.c): if a method calls back into Ruby (rb_funcall), then it's not thread safe, I believe. Otherwise it should be...

You could easily override #pop et al and make them have their own synchronization.
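A rough sketch of that idea, assuming a hypothetical Array subclass named SynchronizedArray that wraps only << and pop in a Mutex (any other mutating method you rely on would need the same treatment):

```ruby
# Hypothetical Array subclass; only << and pop are synchronized here.
class SynchronizedArray < Array
  def initialize(*args)
    super
    @lock = Mutex.new
  end

  # Append under the lock; returns self, like Array#<<.
  def <<(item)
    @lock.synchronize { super(item) }
  end

  # Remove the last element(s) under the lock.
  def pop(*args)
    @lock.synchronize { super(*args) }
  end
end

list = SynchronizedArray.new
threads = 8.times.map do |t|
  Thread.new { 1_000.times { |i| list << [t, i] } }
end
threads.each(&:join)

puts list.size
```

Note that this only makes each individual call atomic. A sequence such as "check empty?, then pop" still needs its own lock around the whole sequence.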

What is the right way to iterate through an array in Ruby?

This will iterate through all the elements:

array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }

# Output:

1
2
3
4
5
6

This will iterate through all the elements giving you the value and the index:

array = ["A", "B", "C"]
array.each_with_index {|val, index| puts "#{val} => #{index}" }

# Output:

A => 0
B => 1
C => 2

I'm not quite sure from your question which one you are looking for.

String concatenation in Ruby

You can do that in several ways:

  1. As you have shown, with <<, but that is not the usual way
  2. With string interpolation

    source = "#{ROOT_DIR}/#{project}/App.config"
  3. With +

    source = "#{ROOT_DIR}/" + project + "/App.config"

The second method seems to be more efficient in terms of memory/speed from what I've seen (not measured, though). All three methods will raise an uninitialized constant error (NameError) when ROOT_DIR is not defined.

When dealing with pathnames, you may want to use File.join to avoid getting the path separator wrong.
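For example, a minimal sketch with File.join, using a hypothetical local root_dir in place of the ROOT_DIR constant and a made-up project name:

```ruby
root_dir = "/srv/projects/"  # hypothetical value; note the trailing slash
project  = "demo"            # hypothetical project name

# File.join inserts separators and collapses the duplicate one.
source = File.join(root_dir, project, "App.config")
puts source  # /srv/projects/demo/App.config
```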

In the end, it is a matter of taste.


