Thread Safety: Class Variables in Ruby

Instance variables are not thread safe (and class variables are even less thread safe)

Examples 2 and 3, both with instance variables, are equivalent, and they are NOT thread safe, as @VincentXie stated. However, here is a better example to demonstrate why they are not:

class Foo
  # self is the class here, so @bar lives on Foo itself and is shared by all threads.
  def self.bar(message)
    @bar ||= message
  end
end

t1 = Thread.new do
  puts "bar is #{Foo.bar('thread1')}"
end

t2 = Thread.new do
  puts "bar is #{Foo.bar('thread2')}"
end

sleep 2

t1.join
t2.join

=> bar is thread1
=> bar is thread1

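Both threads print the same value because the instance variable is shared among all of the threads, as @VincentXie stated in his comment. Whichever thread calls Foo.bar first sets @bar, and the other thread then sees that value.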

PS: Instance variables are sometimes referred to as "class instance variables", depending on the context in which they are used:

When self is a class, they are instance variables of classes (class
instance variables). When self is an object, they are instance
variables of objects (instance variables). - WindorC's answer to a question about this
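
The distinction in the quote can be checked directly. The following is a minimal sketch; the Widget class and its label methods are hypothetical names used only for illustration.

```ruby
class Widget
  # self is the class in these two methods, so @label is a class instance variable.
  def self.label=(value)
    @label = value
  end

  def self.label
    @label
  end

  # self is an instance here, so this @label is a separate, per-object variable.
  def label
    @label
  end
end

Widget.label = 'shared'

p Widget.label                          # value stored on the class object
p Widget.new.label                      # nil, the instance has its own @label
p Thread.new { Widget.label }.value     # every thread sees the class-level value
```

Because there is only one Widget class object per process, its instance variables behave like shared state, which is why they need the same care as class variables.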

Ruby on Rails Thread Safe Class Variable for Webservice access

Using a class variable in this manner is not thread safe. You could use a thread local (i.e. Thread.current[:some_key]), but in general this feels like a design smell to me.

Ruby why are class instance variables threadsafe

Since the tenancy attribute's scope is a request, I would suggest you keep it in the scope of the current thread. Since a request is handled on a single thread, and a thread handles a single request at a time - as long as you always set the tenancy at the beginning of the request you will be fine (for extra security, you might want to un-assign the tenant at the end of the request).

To do this you can use thread local attributes:

class Tenant < ActiveRecord::Base

  def self.current_tenant=(tenant)
    Thread.current[:current_tenant] = tenant
  end

  def self.current_tenant
    Thread.current[:current_tenant]
  end

  # Call this at the end of the request so the value cannot leak into the next one.
  def self.clear_current_tenant
    Thread.current[:current_tenant] = nil
  end
end

Since this is using a thread store, you are totally thread safe - each thread is responsible for its own data.
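
The isolation can be observed without Rails. This is only a sketch: CurrentTenant is a hypothetical plain-Ruby stand-in for the Tenant model above, and the with helper is an assumption added to show the "set at the start, clear at the end" pattern.

```ruby
# CurrentTenant is a hypothetical stand-in for the Tenant class methods above.
module CurrentTenant
  KEY = :current_tenant

  def self.set(tenant)
    Thread.current[KEY] = tenant
  end

  def self.get
    Thread.current[KEY]
  end

  def self.clear
    Thread.current[KEY] = nil
  end

  # Sets the tenant for the duration of the block and always clears it afterwards.
  def self.with(tenant)
    set(tenant)
    yield
  ensure
    clear
  end
end

# Each thread sets its own tenant, pauses, and then reads the value back.
def tenants_seen_by_threads(names)
  names.map do |name|
    Thread.new do
      CurrentTenant.with(name) do
        sleep 0.01          # give the other threads a chance to run
        CurrentTenant.get   # still this thread's own value
      end
    end
  end.map(&:value)
end

p tenants_seen_by_threads(%w[acme globex])   # => ["acme", "globex"]
```

One caveat: in Ruby, Thread.current[:key] is scoped to the current fiber, so code that switches fibers within a request would see separate values.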

Rails class method is thread safe?

This is an instance method, not to be confused with a class method. The answers method is on an instance of User, as opposed to being on the User class itself. This method caches the answers on the instance of a User, but as long as this User instance is instantiated with each web request (such as with User.find() or User.find_by()), you’re fine because the instance does not live between threads. It’s common practice to look records up every web request in the controller, so you’re likely doing that.

If this method was on the User class directly (such as User.answers), then you’d need to evaluate whether it’s safe for that cached value to be maintained across threads and web requests.

To recap, your only concerns for thread safety are class methods that store state, class variables (variables that use two at signs, such as @@answers), and instance methods where the instance lives on past a single web request.

If you ever find yourself needing to use a class-level variable safely, you can use Thread.current, which is essentially a per-thread Hash (like {}) that you can store values in, for example Thread.current[:foo] = 1. ActiveSupport uses this when setting Time.zone.

Alternatively, you may at times need a single array that is safely shared across threads, in which case you’d need to look into Mutex. A Mutex is a lock that you hold while reading or writing the array, so that only one thread touches it at a time. The Sidekiq gem uses a Mutex to manage workers, for example. You lock the Mutex so that no one else can change the array, then you write to the array, and then you unlock the Mutex. It’s important to note that if any other thread wants to acquire the Mutex while it’s locked, it has to wait for it to be unlocked (the thread just pauses while the other thread writes), so it’s important to hold the lock for as short a time as possible.
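
A minimal sketch of that pattern, assuming a hypothetical collect_squares helper: the computation happens outside the lock, and the Mutex is held only for the append to the shared array.

```ruby
def collect_squares(inputs)
  results = []
  lock = Mutex.new

  threads = inputs.map do |n|
    Thread.new do
      square = n * n                           # do the work outside the lock
      lock.synchronize { results << square }   # hold the lock only for the write
    end
  end

  threads.each(&:join)
  results.sort   # threads finish in no particular order, so sort for a stable result
end

p collect_squares([1, 2, 3, 4])   # => [1, 4, 9, 16]
```

Mutex#synchronize releases the lock even if the block raises, which is why it is usually preferred over calling lock and unlock by hand.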

Are Ruby class methods thread-safe?

The local variables, such as your hash, are local to the particular invocation of the surrounding method. If two threads end up calling perform at the same time, then each call will get its own execution context and those won't overlap unless there are shared resources involved: instance variables (@hash), class variables (@@hash), globals ($hash), ... can cause concurrency problems. There's nothing to worry about thread-wise with something simple like your perform.

However, if perform was creating threads and you ended up with closures inside perform, then you could end up with several threads referencing the same local variables captured through the closures. So you do have to be careful about scope issues when you create threads but you don't have to worry about it when dealing with simple methods that only work with local variables.
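
The difference between a captured local and a block-local one can be sketched as follows. Both method names are hypothetical and exist only for this illustration.

```ruby
# total is declared in the method, so every thread's block captures the SAME variable.
# It therefore needs a lock, exactly like an instance variable would.
def shared_local_total
  total = 0
  lock = Mutex.new
  threads = Array.new(4) do
    Thread.new do
      1000.times { lock.synchronize { total += 1 } }
    end
  end
  threads.each(&:join)
  total
end

# total is declared inside the block, so each thread gets its own variable
# and no locking is needed.
def private_local_totals
  threads = Array.new(4) do
    Thread.new do
      total = 0
      1000.times { total += 1 }
      total
    end
  end
  threads.map(&:value)
end

p shared_local_total     # => 4000
p private_local_totals   # => [1000, 1000, 1000, 1000]
```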

Ruby thread-safe class variables

I see two options. The first is to make @strings an instance variable.

But then you may be loading them more than once on a single request, so instead, you can turn @strings into a hash that maps each locale to its string set.

module HasStrings
  def self.included(klass)
    klass.extend ClassMethods
  end

  module ClassMethods
    # Caches one string set per locale, falling back to 'en' when no file exists.
    def strings
      @strings ||= {}
      string_locale = File.exist?(locale_filename(I18n.locale.to_s)) ? I18n.locale.to_s : 'en'
      @strings[string_locale] ||= File.open(locale_filename(string_locale)) { |f| YAML.load(f) }
    end

    # Rails.root replaces the RAILS_ROOT constant, which was removed in Rails 3.
    def locale_filename(locale)
      "#{Rails.root}/config/locales/#{self.name.pluralize.downcase}/#{locale}.yml"
    end
  end
end

how to know what is NOT thread-safe in ruby?

None of the core data structures are thread safe. The only one I know of that ships with Ruby is the queue implementation in the standard library (require 'thread'; q = Queue.new).
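
As a rough sketch of how Queue is typically used, the hypothetical drain_with_workers helper below lets several threads pop from one queue. Queue does its own locking, so each item is handed to exactly one worker.

```ruby
require 'thread'   # not needed on current Rubies, where Queue is built in

def drain_with_workers(items, worker_count)
  queue = Queue.new
  items.each { |item| queue << item }
  worker_count.times { queue << :done }   # one stop marker per worker

  workers = Array.new(worker_count) do
    Thread.new do
      taken = []                          # block-local, so private to this thread
      while (item = queue.pop) != :done
        taken << item
      end
      taken
    end
  end

  workers.flat_map(&:value)
end

p drain_with_workers((1..10).to_a, 3).sort   # => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Which worker receives which item varies from run to run, so the result is sorted before printing.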

MRI's GIL does not save us from thread safety issues. It only makes sure that two threads cannot run Ruby code at the same time, i.e. on two different CPUs at the exact same time. Threads can still be paused and resumed at any point in your code. If you write code like @n = 0; 3.times { Thread.start { 100.times { @n += 1 } } }, i.e. mutating a shared variable from multiple threads, the value of the shared variable afterwards is not deterministic. The GIL is more or less a simulation of a single-core system; it does not change the fundamental issues of writing correct concurrent programs.

Even if MRI had been single-threaded like Node.js you would still have to think about concurrency. The example with the incremented variable would work fine, but you can still get race conditions where things happen in non-deterministic order and one callback clobbers the result of another. Single threaded asynchronous systems are easier to reason about, but they are not free from concurrency issues. Just think of an application with multiple users: if two users hit edit on a Stack Overflow post at more or less the same time, spend some time editing the post and then hit save, whose changes will be seen by a third user later when they read that same post?

In Ruby, as in most other concurrent runtimes, anything that is more than one operation is not thread safe. @n += 1 is not thread safe, because it is multiple operations. @n = 1 is thread safe because it is one operation (it's lots of operations under the hood, and I would probably get into trouble if I tried to describe why it's "thread safe" in detail, but in the end you will not get inconsistent results from assignments). @n ||= 1 is not thread safe, and neither is any other shorthand that combines an operation with an assignment. One mistake I've made many times is writing return unless @started; @started = true, which is not thread safe at all.
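
One way to make a check-then-set like that safe is to put the check and the assignment inside the same lock. The sketch below is a start-once variant of the idea; OneShotService and start_count are hypothetical names, not code from the answer.

```ruby
class OneShotService
  attr_reader :start_count

  def initialize
    @started = false
    @start_count = 0
    @lock = Mutex.new
  end

  # Returns true only for the single caller that actually performed the start.
  def start
    @lock.synchronize do
      return false if @started   # check and set happen under the same lock
      @started = true
      @start_count += 1
    end
    true
  end
end

service = OneShotService.new
results = Array.new(8) { Thread.new { service.start } }.map(&:value)
p results.count(true)    # => 1
p service.start_count    # => 1
```

Returning from inside synchronize is fine, because the Mutex is released when the block is left for any reason.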

I don't know of any authoritative list of thread safe and non-thread safe statements for Ruby, but there is a simple rule of thumb: if an expression only does one (side-effect free) operation it is probably thread safe. For example: a + b is ok, a = b is also ok, and a.foo(b) is ok, if the method foo is side-effect free (since just about anything in Ruby is a method call, even assignment in many cases, this goes for the other examples too). Side-effects in this context means things that change state. def foo(x); @x = x; end is not side-effect free.

One of the hardest things about writing thread safe code in Ruby is that all core data structures, including array, hash and string, are mutable. It's very easy to accidentally leak a piece of your state, and when that piece is mutable things can get really screwed up. Consider the following code:

class Thing
  attr_reader :stuff

  def initialize(initial_stuff)
    @stuff = initial_stuff
    @state_lock = Mutex.new
  end

  def add(item)
    @state_lock.synchronize do
      @stuff << item
    end
  end
end

An instance of this class can be shared between threads and they can safely add things to it, but there's a concurrency bug (it's not the only one): the internal state of the object leaks through the stuff accessor. Besides being problematic from the encapsulation perspective, it also opens up a can of concurrency worms. Maybe someone takes that array and passes it on to somewhere else, and that code in turn thinks it now owns that array and can do whatever it wants with it.
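
One possible way to plug the leak, shown here only as a sketch, is to copy the array on the way in and hand out frozen copies on the way out. SafeThing is a hypothetical name, and the copies are shallow, so mutable elements inside the array would still be shared.

```ruby
class SafeThing
  def initialize(initial_stuff)
    @stuff = initial_stuff.dup   # do not keep a reference to the caller's array
    @state_lock = Mutex.new
  end

  def add(item)
    @state_lock.synchronize { @stuff << item }
  end

  # Hands out a frozen copy, so callers can never touch the internal array.
  def stuff
    @state_lock.synchronize { @stuff.dup.freeze }
  end
end

original = [1]
thing = SafeThing.new(original)
thing.add(2)
original << 99           # the caller keeps mutating its own array

p thing.stuff            # => [1, 2]
p thing.stuff.frozen?    # => true
```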

Another classic Ruby example is this:

STANDARD_OPTIONS = {:color => 'red', :count => 10}

def find_stuff
  @some_service.load_things('stuff', STANDARD_OPTIONS)
end

find_stuff works fine the first time it's used, but returns something else the second time. Why? The load_things method happens to think it owns the options hash passed to it, and does color = options.delete(:color). Now the STANDARD_OPTIONS constant doesn't have the same value anymore. Constants are only constant in what they reference, they do not guarantee the constancy of the data structures they refer to. Just think what would happen if this code was run concurrently.
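
A common defence, sketched here under assumptions, is to freeze the constant and have the callee work on its own copy. The load_things body below is a hypothetical stand-in for the service method described above, and SAFE_STANDARD_OPTIONS is an illustrative name.

```ruby
SAFE_STANDARD_OPTIONS = { color: 'red', count: 10 }.freeze

# Hypothetical stand-in for the service method that "thinks it owns" its options.
def load_things(name, options)
  options = options.dup            # dup returns an unfrozen, private copy
  color = options.delete(:color)   # mutates only the copy
  "#{name}: #{color}, #{options[:count]}"
end

def find_stuff
  load_things('stuff', SAFE_STANDARD_OPTIONS)
end

p find_stuff                          # => "stuff: red, 10"
p find_stuff                          # same result on the second call
p SAFE_STANDARD_OPTIONS[:color]       # => "red"
```

freeze is shallow: the hash itself can no longer be changed, but the strings stored in it remain mutable unless they are frozen too.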

If you avoid shared mutable state (e.g. instance variables in objects accessed by multiple threads, data structures like hashes and arrays accessed by multiple threads) thread safety isn't so hard. Try to minimize the parts of your application that are accessed concurrently, and focus your efforts there. IIRC, in a Rails application, a new controller object is created for every request, so it is only going to get used by a single thread, and the same goes for any model objects you create from that controller. However, Rails also encourages the use of global variables (User.find(...) uses the global variable User, you may think of it as only a class, and it is a class, but it is also a namespace for global variables), some of these are safe because they are read only, but sometimes you save things in these global variables because it is convenient. Be very careful when you use anything that is globally accessible.

It's been possible to run Rails in threaded environments for quite a while now, so without being a Rails expert I would still go so far as to say that you don't have to worry about thread safety when it comes to Rails itself. You can still create Rails applications that aren't thread safe by doing some of the things I mention above. When it comes to other gems, assume that they are not thread safe unless they say that they are, and if they say that they are, assume that they are not and look through their code (but just because you see that they do things like @n ||= 1 does not mean that they are not thread safe, that's a perfectly legitimate thing to do in the right context -- you should instead look for things like mutable state in global variables, how it handles mutable objects passed to its methods, and especially how it handles options hashes).

Finally, being thread unsafe is a transitive property. Anything that uses something that is not thread safe is itself not thread safe.

Are local variables in global Ruby variable thread-safe?

If popFromGlobalArray() functions correctly in a multithreaded environment and is guaranteed not to return the same object more than once, and the implementation of the proxy class does not share state between instances, the rest of the function should be fine. You aren't operating on the same data on different threads, so they can't conflict.

If you're worried about the variables themselves, you needn't be. Locals are defined per method invocation, and different threads will be running different invocations of the method. They don't share locals.

Obviously the specifics can make this less true, but this is how it generally works.


