Sidekiq: Ensure all jobs on the queue are unique
My suggestion is to search for previously scheduled jobs that match some criteria and delete them before scheduling a new one. This has been useful for me when I want a single scheduled job for a particular object and/or one of its methods.
Some example methods in this context:
##
# find job(s) specific to a particular class and method
#
def self.find_jobs_for_object_by_method(klass, method)
  jobs = Sidekiq::ScheduledSet.new
  jobs.select do |job|
    job.klass == 'Sidekiq::Extensions::DelayedClass' &&
      ((job_klass, job_method, _args) = YAML.load(job.args[0])) &&
      job_klass == klass &&
      job_method == method
  end
end
##
# delete job(s) specific to a particular class, method, and record
# will only remove delayed jobs on an object for that method
#
def self.delete_jobs_for_object_by_method(klass, method, id)
  jobs = Sidekiq::ScheduledSet.new
  jobs.select do |job|
    job.klass == 'Sidekiq::Extensions::DelayedClass' &&
      ((job_klass, job_method, args) = YAML.load(job.args[0])) &&
      job_klass == klass &&
      job_method == method &&
      args[0] == id
  end.map(&:delete)
end
##
# delete job(s) specific to a particular class and record
# will remove any delayed jobs on that object
#
def self.delete_jobs_for_object(klass, id)
  jobs = Sidekiq::ScheduledSet.new
  jobs.select do |job|
    job.klass == 'Sidekiq::Extensions::DelayedClass' &&
      ((job_klass, _job_method, args) = YAML.load(job.args[0])) &&
      job_klass == klass &&
      args[0] == id
  end.map(&:delete)
end
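The three methods above share the same matching logic, so it can be pulled into one helper. The following is only a sketch under assumptions: `FakeJob`, `Report`, `decode_delayed_payload`, and `delete_matching_jobs` are hypothetical names, and `FakeJob` stands in for the entries of `Sidekiq::ScheduledSet` so the example runs without Redis. It also assumes the payload is a YAML dump of `[target, method_name, args]`, as the code above implies. On newer Ruby versions `YAML.load` refuses to load class tags, so the sketch uses `YAML.unsafe_load` when it is available.

```ruby
require 'yaml'

# Hypothetical stand-in for a scheduled-set entry (klass, args, delete).
FakeJob = Struct.new(:klass, :args, :deleted) do
  def delete
    self.deleted = true
  end
end

# Hypothetical target class used only for this demonstration.
class Report; end

DELAYED_CLASS = 'Sidekiq::Extensions::DelayedClass'

# Returns [target, method_name, method_args] for delayed-class jobs, else nil.
def decode_delayed_payload(job)
  return nil unless job.klass == DELAYED_CLASS

  yaml = job.args[0]
  # Psych 4 makes YAML.load safe by default, which rejects class tags.
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
end

# Deletes jobs for klass, optionally narrowed by method and record id.
# Returns the jobs that were deleted.
def delete_matching_jobs(jobs, klass, method: nil, id: nil)
  matches = jobs.select do |job|
    job_klass, job_method, args = decode_delayed_payload(job)
    next false unless job_klass == klass
    next false if method && job_method != method
    next false if id && args[0] != id

    true
  end
  matches.each(&:delete)
end

jobs = [
  FakeJob.new(DELAYED_CLASS, [YAML.dump([Report, :rebuild, [1]])]),
  FakeJob.new(DELAYED_CLASS, [YAML.dump([Report, :rebuild, [2]])]),
  FakeJob.new('OtherWorker', [123])
]

delete_matching_jobs(jobs, Report, method: :rebuild, id: 1)
p jobs.map(&:deleted) # => [true, nil, nil]
```

With a real queue you would pass `Sidekiq::ScheduledSet.new` as `jobs`, call the helper, and then schedule the replacement job.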
Sidekiq strange behaviour of unique jobs
Try running it without the sidekiq-unique-jobs gem. It's only been protecting you against duplicates for 30 minutes anyway. That gem sets its hash keys in Redis to auto-expire after 30 minutes (configurable). Sidekiq itself sets its jobs to auto-expire in Redis after 24 hours.
I obviously can't see your app, but I'd bet you don't want to process the same file very often at all. I would control this at the application layer instead and track my own hash key, doing something similar to what the unique-jobs gem does:
hash = Digest::MD5.hexdigest(Sidekiq.dump_json(md5_arguments))
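As a rough illustration of tracking your own key at the application layer, here is a minimal sketch. It makes several assumptions: `job_fingerprint`, `ProcessedFiles`, and the `VideoWorker` name are hypothetical, `JSON.generate` stands in for `Sidekiq.dump_json` so the example runs without Sidekiq, and the in-memory set stands in for a persistent store such as a database table that you control and that never expires on its own.

```ruby
require 'digest'
require 'json'
require 'set'

# Builds a stable fingerprint from the worker name and its arguments.
# JSON.generate is a stand-in for Sidekiq.dump_json in this sketch.
def job_fingerprint(worker_name, args)
  Digest::MD5.hexdigest(JSON.generate([worker_name, args]))
end

# Hypothetical in-memory store; a real app would persist this somewhere
# it controls, so entries are not removed behind its back.
class ProcessedFiles
  def initialize
    @seen = Set.new
  end

  # Returns true the first time a fingerprint is claimed, false afterwards.
  def claim(fingerprint)
    !@seen.add?(fingerprint).nil?
  end
end

store = ProcessedFiles.new
fingerprint = job_fingerprint('VideoWorker', ['/videos/a.mp4'])

puts store.claim(fingerprint) # true: safe to enqueue
puts store.claim(fingerprint) # false: already claimed, skip
```

The design choice here is that the application decides when a key may be reused, rather than relying on a TTL set by a gem.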
It's also possible that the sidekiq-unique-jobs middleware is getting in the way of Sidekiq knowing whether a job completed properly. I'd bet there aren't many people testing this with long-running jobs in a configuration like yours.
If you continue to see this behavior without the additional middleware, give Resque a try. I've never seen this kind of behavior with that gem, and failed jobs have a helpful retry option in the admin GUI.
The main benefit of Sidekiq is that it is multi-threaded. Even so, a concurrency of 25 with large video processes might be pushing it a bit. In my experience, forking is more stable and portable, with fewer worries about your application's thread safety (YMMV).
Whatever you do, make sure that you are aware of the auto-expiry TTL settings that these systems put on their data in Redis. The size and nature of your jobs mean that they could easily back up for 24 hours. These automatic deletions happen at the database layer. There are no callbacks to the application layer to warn you that a job has been deleted automatically. In the Sidekiq code, for example, they introduced auto-expire behavior "to avoid any possible leaking" (reference). This isn't very encouraging if you really need these jobs to execute.
Avoiding duplicate jobs when using Sidekiq's `unique_for` and `Sidekiq::Limiter.concurrent` in the same worker
One idea:
When the user wants to change Author A, I would enqueue a scheduled, unique UpdateAuthorJob for Author A which updates their info 10 minutes from now. That way, the user can make lots of changes to the author and the system will wait for that 10-minute cooldown period before performing the actual update work, ensuring that you get all the updates as one group.
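The cooldown idea can be modeled in plain Ruby to show how repeated requests collapse into one run. This sketch does not use Sidekiq's API at all: `CooldownScheduler`, `request`, and `due` are hypothetical names, and times are passed in as plain seconds so the behavior is easy to follow. In a real setup the scheduled unique job would play the role of the pending entry.

```ruby
# Hypothetical in-memory model of "one pending update per author".
class CooldownScheduler
  def initialize(cooldown_seconds)
    @cooldown = cooldown_seconds
    @pending = {} # key => time (in seconds) at which the update should run
  end

  # Schedules a run for key unless one is already pending.
  # Returns true if a new run was scheduled, false if it was coalesced.
  def request(key, now)
    return false if @pending.key?(key)

    @pending[key] = now + @cooldown
    true
  end

  # Returns the keys whose cooldown has elapsed and clears them.
  def due(now)
    ready = @pending.select { |_key, run_at| run_at <= now }.keys
    ready.each { |key| @pending.delete(key) }
    ready
  end
end

scheduler = CooldownScheduler.new(600) # 10-minute cooldown

p scheduler.request('author-A', 0)   # => true  (first change schedules the job)
p scheduler.request('author-A', 120) # => false (later changes are coalesced)
p scheduler.due(599)                 # => []
p scheduler.due(600)                 # => ["author-A"]
```

Because the job reads the author's current state when it finally runs, every change made during the cooldown is picked up by that single run.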