Knowing When a Resque Worker Has Completed a Job

How to alert user when a resque job finishes

If you intend to notify a user asynchronously, you have three options:

  1. Set a flag in the database, and then change your UI the next time they log in. (Like popping in an alert view or something.)
  2. Send them an email at the conclusion of the job's run. Just use the standard ActionMailer stuff for this.
  3. If you're using a pubsub framework like Faye or Juggernaut, push the user a notification that the job is done. If they're signed in they should see it immediately.

Those are pretty much the only things I can think of. I would personally go with a combination of 1 and 2; send them an email on completion, and notify them (and provide a link in the notification) the next time they log in.
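A minimal sketch of combining options 1 and 2 inside the job itself. ReportJob, its users store and its mailer are hypothetical stand-ins so the example runs without Rails or Redis; in a real app they would be an ActiveRecord model and an ActionMailer class.

```ruby
# Hypothetical job: the class name, queue name and collaborators are
# illustrative assumptions, not part of Resque.
class ReportJob
  @queue = :reports

  class << self
    # Injected so the sketch is runnable on its own:
    #   users  - a Hash of id => user record (stand-in for a model lookup)
    #   mailer - any callable taking a user (stand-in for ActionMailer)
    attr_accessor :users, :mailer
  end

  def self.perform(user_id)
    user = users.fetch(user_id)
    build_report(user)             # the long-running work
    user[:report_ready] = true     # option 1: flag the UI checks at next login
    mailer.call(user)              # option 2: email once the job has finished
  end

  def self.build_report(_user)
    # placeholder for the real work
  end
end
```

Both notifications happen only after the work has returned, so a job that raises partway through never tells the user it finished.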

How to check Resque worker status to determine whether it's dead or stale

The only way to determine whether a worker is actually working is to check on the worker's host machine. After a restart on Heroku, that machine no longer exists, so if the worker didn't unregister itself, Resque will believe it is still working. The decentralized nature of Resque workers means that you can't easily check their actual status. When each worker starts, it registers itself with Redis. When that worker picks up a job and starts working, it again registers its status with Redis. When you iterate like so:

Resque.workers.each { |w| w.working? }

you are pulling a list of workers from Redis and checking the last registered state of those workers from Redis. It doesn't actually query the worker itself.

The hostnames in the resque-web display match the names you see in the Heroku log output, so that's one (not very good) way to see what's actually running. I was hoping one could automate this by using the dyno IDs obtained from the Platform API, but they don't match the hostnames.

Make sure that you are gracefully handling Resque::TermException as specified in this document. You could also look into some of the heartbeat solutions others have come up with to work around this problem. I've had issues where even using TERM_CHILD and proper signal handling leaves stale workers floating around. My solution has been to wait until no jobs are being processed, unregister all workers, then restart with heroku ps:restart worker.
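The "wait until idle, then unregister" step could be sketched roughly like this. The helper name unregister_if_idle is made up for illustration; it only relies on each worker responding to working? and unregister_worker, which Resque::Worker does.

```ruby
# Sketch: clear worker registrations only while nothing is running.
# `workers` is anything shaped like Resque.workers, i.e. a list of
# objects responding to #working? and #unregister_worker.
# Returns true if the registrations were cleared, false otherwise.
def unregister_if_idle(workers)
  # Bail out if any worker is still recorded as processing a job.
  return false if workers.any?(&:working?)

  workers.each(&:unregister_worker)
  true
end
```

In a console you would call unregister_if_idle(Resque.workers) and run heroku ps:restart worker only when it returns true. There is still a small window between the check and the unregister in which a worker can pick up a job, so treat this as a maintenance helper rather than something airtight.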

How to find out if a set of Resque jobs have finished?

I'm not going to paste any code here because it looks like you're pretty familiar with how jobs should be written but think about the following idea:

You should have a Resque job which gets an array of ids to process.
This Resque job creates a separate Resque job for each of the ids, so they will be processed in parallel.
After creating the Resque jobs for the specific ids, it creates another Resque job which gets the list of ids and periodically checks on them to see if they're finished. When they're done, this job sends the user an email.

With this paradigm you get the best of all worlds:

  1. Each ID has its own job.
  2. Parallel execution.
  3. The status check is done in the background, so the user's request won't time out.
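The fan-out step described above might look something like the following. BatchJob and ItemJob are hypothetical names, CheckerJob is only declared as a placeholder here, and the enqueuer argument exists purely so the sketch can be exercised without Redis; when Resque runs the job it passes only ids, so the default of Resque is used.

```ruby
# Placeholder for the job that processes a single id (hypothetical name).
class ItemJob
  @queue = :long

  def self.perform(id)
    # process one id here
  end
end

# Placeholder for the job that polls until every id is finished.
class CheckerJob
  @queue = :long
end

# Parent job: one ItemJob per id, then a single CheckerJob for the batch.
class BatchJob
  @queue = :long

  # `enqueuer` defaults to Resque and is only overridden in tests.
  def self.perform(ids, enqueuer = Resque)
    ids.each { |id| enqueuer.enqueue(ItemJob, id) }
    enqueuer.enqueue(CheckerJob, ids)
  end
end
```

You would start the whole thing with Resque.enqueue(BatchJob, ids). The parallelism comes from running several workers on the queue, not from the parent job itself.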

UPDATE:

Pseudocode for the checker job:

class CheckerJob
  @queue = :long

  def self.perform(ids)
    pending = ids.dup
    until pending.empty?
      # Drop finished ids so each one is counted only once.
      pending.reject! { |id| job_finished?(id) }
      sleep 10 unless pending.empty?
    end
    send_email_to_user
  end
end

Now all that's left is to implement the job_finished?(id) and send_email_to_user methods.
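One way job_finished?(id) could be backed, assuming each per-id job records its id in a Redis set when it completes. The module name JobTracking and the key name are hypothetical; Resque.redis is the Redis connection Resque already uses, and sadd / sismember are the standard Redis set commands.

```ruby
# Hypothetical helper: tracks finished ids in a Redis set.
module JobTracking
  KEY = "finished_ids" # illustrative key name, pick your own

  # Call this as the last line of each per-id job's perform.
  def self.mark_finished(id, redis: Resque.redis)
    redis.sadd(KEY, id.to_s)
  end

  # What job_finished?(id) in the checker job could delegate to.
  def self.finished?(id, redis: Resque.redis)
    redis.sismember(KEY, id.to_s) ? true : false
  end
end
```

Inside CheckerJob, job_finished?(id) then becomes a one-line class method calling JobTracking.finished?(id). If several batches can run at once, you would probably want a key per batch so ids from different batches don't mix.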

Find out if a resque job is still running and kill it if it's stuck

The Resque GitHub repository has a hidden gem: a god script that does exactly this. It watches your workers and kills stale ones.

https://github.com/resque/resque/blob/master/examples/god/stale.god

# This will ride alongside god and kill any rogue stale worker
# processes. Their sacrifice is for the greater good.

WORKER_TIMEOUT = 60 * 10 # 10 minutes

Thread.new do
  loop do
    begin
      `ps -e -o pid,command | grep [r]esque`.split("\n").each do |line|
        parts = line.split(' ')
        next if parts[-2] != "at"
        started = parts[-1].to_i
        elapsed = Time.now - Time.at(started)

        if elapsed >= WORKER_TIMEOUT
          ::Process.kill('USR1', parts[0].to_i)
        end
      end
    rescue
      # don't die because of stupid exceptions
      nil
    end

    sleep 30
  end
end
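To make the parsing step easier to follow (and to test), here it is pulled out into a pure function. This is a sketch, not part of the god script: the name stale_pid is made up, and it assumes the worker's process title ends in "at <epoch seconds>", which is what the script above relies on.

```ruby
# Returns the PID to signal if the ps line describes a worker that has
# been on the same job for at least `timeout` seconds, otherwise nil.
# Assumes a line shaped like "<pid> resque-x.y.z: Forked <child> at <epoch>".
def stale_pid(ps_line, now: Time.now, timeout: 60 * 10)
  parts = ps_line.split(" ")
  return nil unless parts[-2] == "at"

  # Ignore lines whose last field is not a plain integer timestamp.
  started = Integer(parts[-1], exception: false)
  return nil unless started

  (now - Time.at(started)) >= timeout ? parts[0].to_i : nil
end
```

The returned PID is the one from the first column of ps, which is the process that receives USR1 in the script. In Resque, USR1 tells a worker to kill its child immediately without exiting itself, so the stuck job dies and the worker carries on.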

How to show the resque worker status to the end user?

Have you looked into the resque-status gem? The gem will give you a hash that you can query for the status of the job. Next, you'll need to figure out the best way to notify the user.
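As a rough idea of what querying that hash could look like, the helper below turns a status object into a message for the user. The function name and the wording of the messages are made up; it assumes the object responds to completed?, failed?, working?, pct_complete and message, which is how I understand the resque-status hash to behave, so check against the gem's README before relying on it.

```ruby
# Sketch: map a resque-status style object to user-facing text.
# `status` is assumed to be what the gem returns for a job id,
# or nil when the id is unknown or the status has expired.
def status_message(status)
  return "Job not found" if status.nil?

  if status.completed?
    "Done"
  elsif status.failed?
    "Failed: #{status.message}"
  elsif status.working?
    "Working (#{status.pct_complete}%)"
  else
    "Queued"
  end
end
```

A controller action could return this string as JSON and let the page poll for it, which is a cheaper middle ground than a full pub/sub setup.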

Personally, I think the most straightforward method would be to just send an email when the job is complete. If you want to notify the user in their web browser, you'll probably need to implement some sort of pub/sub system that pushes a notification to the browser. This is reasonably complicated, so just sending an email is probably your best option.


