How to Stop God from Leaving Stale Resque Worker Processes

How to stop God from leaving stale Resque worker processes?

You need to tell god to use the pid file generated by Resque, and tell Resque where to write that pid file:

w.env = {'PIDFILE' => '/path/to/resque.pid'}
w.pid_file = '/path/to/resque.pid'

env tells Resque to write the pid file, and pid_file tells god to use it.

Also, as svenfuchs noted, it should be enough to set only the proper env:

w.env = { 'PIDFILE' => "/home/travis/.god/pids/#{w.name}.pid" }

where /home/travis/.god/pids is the default pids directory

How do I clear stuck/stale Resque workers?

None of these solutions worked for me; I would still see this in resque-web:

0 out of 10 Workers Working

Finally, this worked for me to clear all the workers:

Resque.workers.each {|w| w.unregister_worker}
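The one-liner above unregisters every worker, including ones that are still alive. A more selective variant is sketched below, assuming worker IDs have the form "hostname:pid:queues" (which is how Resque formats them, as far as I know). The helper names pid_alive? and stale_worker_ids are hypothetical, and the check only makes sense for workers registered from the local machine.

```ruby
require 'socket'

# True if a process with this pid currently exists on the local machine.
# Signal 0 performs the existence check without delivering a signal.
def pid_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# Hypothetical helper: given worker ID strings ("hostname:pid:queues"),
# return those registered from this host whose pid is no longer running.
def stale_worker_ids(worker_ids, hostname = Socket.gethostname)
  worker_ids.select do |id|
    host, pid, _queues = id.split(':', 3)
    host == hostname && !pid_alive?(pid.to_i)
  end
end

# With Resque loaded, the result could be applied like this (not run here):
#   stale = stale_worker_ids(Resque.workers.map(&:to_s))
#   Resque.workers.select { |w| stale.include?(w.to_s) }.each(&:unregister_worker)
```

Workers registered from other hosts are left alone, since their pids cannot be checked from here.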

God stop resque workers rake

The solution is to send a SIGQUIT to the process. For example, you can run:

god signal resque SIGQUIT
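Resque documents QUIT as the graceful signal: the worker waits for its child to finish the current job and then exits. The sketch below imitates that behaviour with a stand-in process rather than a real Resque worker. run_fake_worker and quit_gracefully are hypothetical names, and the example assumes a platform where fork is available.

```ruby
# Stand-in for a worker: traps QUIT, finishes its current "job", then exits.
def run_fake_worker(writer)
  quit_requested = false
  trap('QUIT') { quit_requested = true }
  writer.puts 'ready'                 # signal handler is installed by now
  sleep 0.05 until quit_requested     # pretend to work until asked to stop
  writer.puts 'finished current job'
  writer.close
  exit! 0
end

# Forks the fake worker, sends it QUIT, and returns
# [lines the worker printed, exit status].
def quit_gracefully
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.sync = true
    run_fake_worker(writer)
  end
  writer.close
  ready = reader.gets                 # wait until the trap is in place
  Process.kill('QUIT', pid)           # what "god signal ... SIGQUIT" delivers
  rest = reader.read
  _, status = Process.wait2(pid)
  [[ready, *rest.lines].map(&:chomp), status.exitstatus]
end
```

The point of the sketch is that the process exits cleanly after completing its work instead of being killed mid-job.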

God-Resque Process in Limbo

I ended up putting redis on the same box as the workers and they have been functioning properly since.

How do I write a Resque condition that says if a process is running for longer than n seconds, kill it?

As it turns out, there is an example of how to do this in some sample resque files. It's not exactly what I was looking for since it doesn't add an on.condition(:foo), but it is a viable solution:

# This will ride alongside god and kill any rogue stale worker
# processes. Their sacrifice is for the greater good.

WORKER_TIMEOUT = 60 * 10 # 10 minutes

Thread.new do
  loop do
    begin
      `ps -e -o pid,command | grep [r]esque`.split("\n").each do |line|
        parts = line.split(' ')
        next if parts[-2] != "at"
        started = parts[-1].to_i
        elapsed = Time.now - Time.at(started)

        if elapsed >= WORKER_TIMEOUT
          ::Process.kill('USR1', parts[0].to_i)
        end
      end
    rescue
      # don't die because of stupid exceptions
      nil
    end

    # Sleep so we don't run too frequently
    sleep 30
  end
end
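The parsing step in that loop can be pulled into a pure function, which makes it easier to check without running ps or killing anything. This is only a sketch: timed_out_pids is a hypothetical name, and the sample process titles in the usage comment are assumed to end in "at <epoch seconds>", which is the format the loop above expects.

```ruby
# Given the text output of `ps -e -o pid,command`, return the pids of
# processes whose title ends in "at <epoch>" and whose start time is at
# least `timeout` seconds before `now`.
def timed_out_pids(ps_output, now: Time.now, timeout: 600)
  ps_output.each_line.each_with_object([]) do |line, pids|
    parts = line.split(' ')
    next unless parts.length >= 3 && parts[-2] == 'at'

    started = Integer(parts[-1], exception: false)
    pid     = Integer(parts[0], exception: false)
    next unless started && pid        # skip lines that do not parse cleanly

    pids << pid if now - Time.at(started) >= timeout
  end
end

# Usage with made-up process titles (format assumed, not taken from Resque):
#   ps = "101 resque-1.25: Forked 202 at 1699999000\n" \
#        "303 resque-1.25: Forked 404 at 1699999900\n"
#   timed_out_pids(ps, now: Time.at(1_700_000_000), timeout: 600)
```

Each returned pid could then be passed to ::Process.kill('USR1', pid), as in the original loop.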

How to gracefully stop execution of a Resque job without a failed status?

To stop execution without reporting a failed status:

raise Resque::Job::DontPerform

in a before_perform hook.

This isn't well documented, but you can find a reference in the code.

Update: Resque hooks documentation
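As an illustration of the hook approach, here is a minimal sketch of a job class. ArchiveJob, its queue name, and the ALREADY_ARCHIVED set are hypothetical; the set stands in for whatever lookup a real application would use. It assumes Resque has already been loaded by the application.

```ruby
require 'set'

# Assumes the application has already loaded Resque (require 'resque').
class ArchiveJob
  @queue = :archive                 # hypothetical queue name

  # Hypothetical in-memory stand-in for a real "already done?" lookup.
  ALREADY_ARCHIVED = Set.new

  # Resque runs class methods whose names start with "before_perform"
  # before calling perform, passing them the job's arguments.
  def self.before_perform_skip_archived(record_id)
    # Raising DontPerform aborts the job without recording a failure.
    raise Resque::Job::DontPerform if ALREADY_ARCHIVED.include?(record_id)
  end

  def self.perform(record_id)
    ALREADY_ARCHIVED << record_id   # placeholder for the real work
  end
end
```

Enqueuing the same record twice would then run perform only the first time; the second job is skipped quietly rather than showing up as failed.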

Rubygem God: Time limit configuration for process

You could have your job create a pid file and monitor that file with god's FileMtime condition. When the job finishes, it recreates the pid file; if the file is older than x, god restarts the process.

source: https://github.com/mojombo/god/blob/856d321fb135a0b453046e99c266231681bd5ffe/lib/god/conditions/file_mtime.rb
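The job side of this idea can be sketched as follows. touch_heartbeat and heartbeat_stale? are hypothetical helper names; the second one mirrors the kind of age comparison a file-mtime check performs, so it can be used to reason about the threshold you would give god.

```ruby
require 'fileutils'

# Hypothetical job-side helper: create the marker file if needed and
# refresh its modification time. Call it whenever the job finishes.
def touch_heartbeat(path)
  FileUtils.mkdir_p(File.dirname(path))
  FileUtils.touch(path)
end

# True when the marker file was last modified more than max_age seconds
# before `now`, i.e. the job has not completed recently enough.
def heartbeat_stale?(path, max_age, now = Time.now)
  (now - File.mtime(path)) > max_age
end
```

The value of x in the answer corresponds to max_age here: pick it comfortably larger than the longest run you expect from a healthy job.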

Edit: Added github source


