Ruby Multiple Background Threads

How to run a background thread in Ruby?

Read the documentation for Thread.new (which is the same as Thread.start here).

Thread.start(commands) calls the commands method first, on the main thread, and only then passes its return value to the new thread (which does nothing with it). It blocks because no thread has been started yet when gets is called. You want:

Thread.start { commands }

Here's a similar demo script that works just like you would expect:

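# Reads lines from stdin until the user types "exit" (case-insensitive).
def commands
  while gets.strip !~ /^exit$/i
    puts "Invalid command"
  end
  abort "Exiting the program"
end

# The block runs on a new thread, so the main loop below keeps going.
Thread.start { commands }

loop do
  puts "Type exit:"
  sleep 2
end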
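To see the difference between the two call forms without needing a terminal, here is a minimal sketch. The `where_am_i` helper and the `seen` array are made up for illustration; they only record which thread the method body actually ran on.

```ruby
# Hypothetical helper: records the thread it runs on, then returns a value.
def where_am_i(seen)
  seen << Thread.current
  :finished
end

seen = []
caller_thread = Thread.current

# Argument form: where_am_i runs right here, on the calling thread,
# and only its return value (:finished) is handed to the new thread.
Thread.start(where_am_i(seen)) { |value| value }.join

# Block form: where_am_i runs inside the new thread.
Thread.start { where_am_i(seen) }.join

puts "argument form ran on the calling thread: #{seen[0] == caller_thread}"
puts "block form ran on the calling thread:    #{seen[1] == caller_thread}"
```

The first call is why the original code blocked: the method had already run to completion before any thread existed.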

Multithreading vs Background jobs in Rails

Sounds like you need a thread pool for performing the operation, and a database thread to commit the results.

You can build one of these really simply:

require 'thread'

db_queue = Queue.new

Thread.new do
  while (item = db_queue.pop)
    # ... Deal with item in queue
  end
end

# Example of supplying a job

db_queue.push(api_response)

# When finished
db_queue.push(nil)
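The snippet above only shows the database side. A minimal sketch of the whole arrangement, a small worker pool feeding a single writer thread, might look like the following. The pool size of 4, the `jobs` queue, the doubling "work", and the `committed` array (standing in for real database writes) are all assumptions for illustration.

```ruby
require 'thread' # harmless on modern Rubies, where Queue is built in

jobs      = Queue.new  # work waiting for the pool
db_queue  = Queue.new  # results waiting to be committed
committed = []         # stand-in for rows written to the database

# Single writer thread: the only place that touches `committed`.
writer = Thread.new do
  while (item = db_queue.pop)
    committed << item
  end
end

# Pool of workers: each pulls jobs until it receives a nil sentinel.
workers = Array.new(4) do
  Thread.new do
    while (job = jobs.pop)
      db_queue.push(job * 2) # stand-in for the real operation
    end
  end
end

(1..10).each { |n| jobs.push(n) }
workers.size.times { jobs.push(nil) } # one sentinel per worker
workers.each(&:join)

db_queue.push(nil) # all results are queued; tell the writer to stop
writer.join

puts committed.sort.inspect
```

Keeping all commits on one thread means the database connection never has to be shared between workers.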

Because of the Global Interpreter Lock in the standard Ruby runtime (MRI), only one thread executes Ruby code at a time, so threads are only really useful for juggling many lightly loaded, mostly I/O-bound tasks. If you need something more heavy-duty, JRuby might be what you're looking for.
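As a rough illustration of the "lightly loaded" case: the lock is released while a thread is blocked, so waits overlap even on the standard runtime. In this sketch `sleep` stands in for a blocking network or database call, and the figures (4 threads, 0.2 seconds) are arbitrary.

```ruby
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Each thread spends its time blocked, not executing Ruby code.
threads = Array.new(4) { Thread.new { sleep 0.2 } }
threads.each(&:join)

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts format('4 waits of 0.2 s took %.2f s in total', elapsed)
```

Run one after another, the four waits would take about 0.8 seconds; run in threads, the total should be close to a single wait. CPU-bound blocks would not see this kind of gain on MRI.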

Rails best practice: background process/thread?

Threading for background processes in Ruby would be a terrible mistake, especially since you're using a multi-process server. Using unicorn with, say, 4 worker processes would mean that you'd be polling from each of them, which is not what you want. Ruby doesn't really have real threads: it has green threads in 1.8 and a global interpreter lock in 1.9, IIRC. Many gems and libraries are also obnoxiously thread-unsafe.

Using memcache is still your best option and, if you have it set up correctly, you should only see it adding a millisecond or two to the request time. Another option which would give you the benefit of persisting these alerts while incurring minimal additional overhead would be to store these alerts in redis. This would better protect you against things like memcache crashing or server reboots.

For the background jobs you should use a similar approach to what you have now, but there are several off-the-shelf handlers for this, like resque, delayed_job, and a few others. If you absolutely have to use SQS as the backend queue, you might be able to find some code to help you; otherwise you could write it yourself. This still requires the other daemon to be restarted whenever there is a code change. In practice this isn't a huge concern, as best practices dictate using a deployment system like capistrano, where a rule can easily be added to bounce the daemon on deploy. I use monit to watch the daemon process, so restarting it is as easy as telling monit to restart it.

In general, Ruby is not like Java/Objective-C when it comes to threads. It follows the more Unix-like model of process-based isolation, but the community has come up with best practices and ways to make this less painful than in other languages. Ruby does require a bit more attention to setting up its stack, as it is not as simple as enabling mod_php and copying some files around, but once the choices and architecture are understood, it is easier to reason about how your application works. The process model, in my opinion, is much better for web apps, as it isolates code and state from the effects of other running operations. The isolation also makes the app easier to work with in a distributed system.

Ruby threading/forking with API (Sinatra)

I believe the better way to do this is to use background jobs. While your web worker executes a long-running task, it is unavailable for new requests. With background jobs, the jobs do the work while your web worker handles new requests.

You can have a look at the most popular background job gems for Ruby as a starting point: resque, delayed_job, sidekiq.

UPD: The implementation depends on the chosen gem, but the general scheme looks like this:

# Controller
post '/items' do
  # Processing data
  MyAwesomeJob.enqueue # here you put your job into queue
  status 200 # or whatever
end

In MyAwesomeJob you implement your long-running task.

Next, about Mongoid and background jobs. You should never use complex objects as job arguments. I don't know what kind of task you are implementing, but there is a general answer: use simple objects.

For example, instead of passing your User as an argument, pass user_id and then find the user inside your job. If you do it like that, you can use any DB without problems.
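One reason simple arguments work everywhere is that job libraries have to serialize arguments to store them in the queue (resque and sidekiq, for instance, use JSON). Here is a minimal sketch of that round trip using only the standard library. The `WelcomeJob` class, its `perform` method, and the `USERS` hash (standing in for a database) are hypothetical and do not follow any particular gem's interface.

```ruby
require 'json'

# Hypothetical in-memory stand-in for a database table of users.
USERS = { 42 => { name: 'Alice' } }.freeze

# Hypothetical job class; each real gem defines its own interface.
class WelcomeJob
  def self.perform(user_id)
    user = USERS.fetch(user_id) # look the record up inside the job
    "Welcome, #{user[:name]}!"
  end
end

# Enqueue side: only the class name and the id go into the payload.
payload = JSON.generate('class' => 'WelcomeJob', 'args' => [42])

# Worker side: rebuild the call from the payload and run it.
job = JSON.parse(payload)
result = Object.const_get(job['class']).perform(*job['args'])
puts result
```

An integer id survives the trip through JSON unchanged, whereas a full model object would have to be flattened and rebuilt, and could be stale by the time the job runs.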

Multi-threading ruby workers on heroku

You might try sidekiq, which works similarly to resque but uses threads to process tasks concurrently. You can also use resque and sidekiq together.

Does the main thread 'always' run in a ruby web server, like Sinatra?

The following code works fine for me. Tested on a local OS X machine, I was able to get 1500+ real threads running with thin and Ruby 1.9.2. On the Heroku cedar stack, I can get about 230 threads running before thread creation fails with an error.

In both cases all the threads seem to finish when they are supposed to - 2 minutes after launching them. '/' is rendered in about 60 ms on Heroku, and then the 20 threads run for 2 minutes each.

If you refresh / a few times, then wait a few minutes, you can see the threads finishing. The reason I tested for 2 minutes is that Heroku has a 30-second limit on responses, cutting you off if you take longer than that. But this does not seem to affect background threads.

require 'sinatra'

# Global counters shared by every request and every thread
$threadsLaunched = 0
$threadsDone = 0

get '/' do
  puts "#{Thread.list.size} threads"
  for i in 1..20 do
    $threadsLaunched = $threadsLaunched + 1
    puts "Creating thread #{i}"
    # Pass i as an argument so each thread keeps its own copy
    Thread.new(i) do |j|
      sleep 120
      puts "Thread #{j} done"
      $threadsDone = $threadsDone + 1
    end
  end
  puts "#{Thread.list.size} threads"
  erb :home
end

(home.erb)

<div id="content">
  <h1> Threads launched <%= $threadsLaunched.to_s %> </h1>
  <h1> Threads running <%= Thread.list.count.to_s %> </h1>
  <h1> Threads done <%= $threadsDone.to_s %> </h1>
</div> <!--  id="content" -->
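The handler above updates its global counters from many threads without any synchronization. A minimal sketch of the same experiment outside Sinatra, with a Mutex guarding the shared counter, could look like this. The short 0.05-second sleep replaces the 2-minute one so the script finishes quickly, and the local variable names are assumptions.

```ruby
lock     = Mutex.new
launched = 0
done     = 0

threads = Array.new(20) do
  launched += 1 # only the launching thread touches this counter
  Thread.new do
    sleep 0.05                     # stands in for the 120-second sleep
    lock.synchronize { done += 1 } # guard the counter shared by all threads
  end
end

puts "#{Thread.list.size} threads alive"
threads.each(&:join)
puts "launched=#{launched} done=#{done}"
```

With the lock in place the final count does not depend on how the runtime happens to schedule the threads.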