Pure-Ruby concurrent Hash
Okay, now that you specified the actually meaning of 'threadsafe', here are two potential implementations. The following code will run forever in MRI and JRuby. The lockless implementation follows an eventual consistency model where each thread uses it's own view of the hash if the master is in flux. There is a little trickery required to make sure storing all the information in the thread doesn't leak memory, but that is handled and tested ― process size does not grow running this code. Both implementations would need more work to be 'complete', meaning delete, update, etc. would need some thinking, but either of the two concepts below will meet your requirements.
It's very important for people reading this thread to realize the whole issue is exclusive to JRuby ― in MRI the built-in Hash is sufficient.
module Cash
def Cash.new(*args, &block)
env = ENV['CASH_IMPL']
impl = env ? Cash.const_get(env) : LocklessImpl
klass = defined?(JRUBY_VERSION) ? impl : ::Hash
klass.new(*args)
end
class LocklessImpl
def initialize
@hash = {}
end
def thread_hash
thread = Thread.current
thread[:cash] ||= {}
hash = thread[:cash][thread_key]
if hash
hash
else
hash = thread[:cash][thread_key] = {}
ObjectSpace.define_finalizer(self){ thread[:cash].delete(thread_key) }
hash
end
end
def thread_key
[Thread.current.object_id, object_id]
end
def []=(key, val)
time = Time.now.to_f
tuple = [time, val]
@hash[key] = tuple
thread_hash[key] = tuple
val
end
def [](key)
# check the master value
#
val = @hash[key]
# someone else is either writing the key or it has never been set. we
# need to invalidate our own copy in either case
#
if val.nil?
thread_val = thread_hash.delete(key)
return(thread_val ? thread_val.last : nil)
end
# check our own thread local value
#
thread_val = thread_hash[key]
# in this case someone else has written a value that we have never seen so
# simply return it
#
if thread_val.nil?
return(val.last)
end
# in this case there is a master *and* a thread local value, if the master
# is newer juke our own cached copy
#
if val.first > thread_val.first
thread_hash.delete(key)
return val.last
else
return thread_val.last
end
end
end
class LockingImpl < ::Hash
require 'sync'
def initialize(*args, &block)
super
ensure
extend Sync_m
end
def sync(*args, &block)
sync_synchronize(*args, &block)
end
def [](key)
sync(:SH){ super }
end
def []=(key, val)
sync(:EX){ super }
end
end
end
if $0 == __FILE__
iteration = 0
loop do
n = 42
hash = Cash.new
threads =
Array.new(10) {
Thread.new do
Thread.current.abort_on_exception = true
n.times do |key|
hash[key] = key
raise "#{ key }=nil" if hash[key].nil?
end
end
}
threads.map{|thread| thread.join}
puts "THREADSAFE: #{ iteration += 1 }"
end
end
Ruby: In-memory Hash-based and thread-safe buffer?
In general, the way that you provide access to a global value in a thread-safe manner is to use the built-in Mutex class:
$buffer = {}
$bufflock = Mutex.new
threads = (0..2).map do |i|
Thread.new do
puts "Starting Thread #{i}"
3.times do
puts "Thread #{i} got: #{$buffer[:foo].inspect}"
$bufflock.synchronize{ $buffer[:foo] = ($buffer[:foo] || 1) * (i+1) }
sleep rand
end
puts "Ending Thread #{i}"
end
end
threads.each{ |t| t.join } # Wait for all threads to complete
#=> Starting Thread 0
#=> Thread 0 got: nil
#=> Starting Thread 1
#=> Thread 1 got: 1
#=> Starting Thread 2
#=> Thread 2 got: 2
#=> Thread 1 got: 6
#=> Thread 1 got: 12
#=> Ending Thread 1
#=> Thread 0 got: 24
#=> Thread 2 got: 24
#=> Thread 0 got: 72
#=> Thread 2 got: 72
#=> Ending Thread 0
#=> Ending Thread 2
Code inside a Mutex#synchronize
block is atomic per thread; one thread cannot go into $bufflock
until the previous thread is done with the block.
See also: Pure-Ruby concurrent Hash
Fast Thread-Safe Ruby Hash with strong read bias
I would suggest a wrapper which protects the Hash
with a read-write lock. I couldn't find a pre-built Ruby read-write lock implementation (of course JRuby users can use java.util.concurrent.ReentrantReadWriteLock), so I built one. You can see it at:
https://github.com/alexdowad/showcase/blob/master/ruby-threads/read_write_lock.rb
Me and two other people have tested it on MRI 1.9.2, MRI 1.9.3, and JRuby. It seems to be working correctly (though I still want to do more thorough testing). It has a built-in test script; if you have a multi-core machine, please download, try running it, and let me know the results! As far as performance goes, it trounces Mutex in situations with a read bias. Even in situations with 80-90% writes, it still seems a bit faster than using a Mutex.
I am also planning to do a Ruby port of Java's ConcurrentHashMap.
Necessity of the locks while working with concurrent hash map
Let's look at the operations inside public void put(String some)
.
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter
is a volatile variable. So it only provides memory visibility but not atomicity. Sincecounter++
is a compound operation, you need a lock to achieve atomicity.map.put(key, value)
is atomic since it is aConcurrentHashMap
.sum.getAndAdd(tmp)
is atomic since it is aAtomicInteger
.
As you can see, except counter++
every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++
is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
Is `hash` on a number consistent within a single ruby process?
Yes, it is guaranteed to be the same within one Ruby process. Hash
relies on this fact to guarantee unique keys. Otherwise long-lived processes could end up with duplicate hash keys when an object gets a new hash
value.
Regarding implementation and memory locations, the implementation of hash
is not guaranteed by Ruby. The only requirement is:
This function must have the property that
a.eql?(b)
impliesa.hash ==
.
b.hash
In YARV, for example, many objects do not exist in memory at all (e.g. Fixnums) and are simply identified by a special object ID. x.object_id == x * 2 + 1
for any fixnum.
Document Server: Handling Concurrent Saves
My suggestion would be something like your first one. When the first user (Bob) opens the document, he acquires a lock so that other users can only read the current document. If the user saves the document while he is using it, he keeps the lock. Only when he exits the document, it is unlocked and other people can edit it.
If the second user (Kate) opens the document while Bob has the lock on it, Kate will get a message saying the document is uneditable but she can read it until it the lock has been released.
So what happens when Bob acquires the lock, maybe saves the document once or twice but then exits the application leaving the lock hanging?
As you said yourself, requiring the client with the lock to send pings at a certain frequency is probably the best option. If you don't get a ping from the client for a set amount of time, this effectively means his client is not responding anymore. If this is a web application you can use javascript for the pings. The document that was last saved releases its lock and Kate can now acquire it.
A ping can contain the name of the document that the client has a lock on, and the server can calculate when the last ping for that document was received.
Related Topics
Checking If a Variable Is Not Nil and Not Zero in Ruby
How to Capitalize the First Letter in a String in Ruby
Disable a Group of Tests in Rspec
What Should Be Removed from Public Source Control in Ruby on Rails
Copying Gems from Previous Version of Ruby in Rbenv
How to Describe an Enumeration Column in a Rails 3 Migration
Implementation of "Remember Me" in a Rails Application
Check If Two Arrays Have the Same Contents (In Any Order)
Deploying Sinatra App (With Config.Ru) on Heroku Cedar Stack
Find File Path of a Given Class
Rails: Convert Utc Datetime to Another Time Zone
All Ruby Tests Raising: Undefined Method 'Authenticate' for Nil:Nilclass
Ruby Convert Array into Function Arguments
Protecting the Content of Public/ in a Rails App
Count the Number of Lines in a File Without Reading Entire File into Memory
Force Bundle Install to Use Https:// Instead of Git:// for Github-Based Gems