When to use linux kernel add_timer vs queue_delayed_work
As I stated in my question, queue_delayed_work() just uses add_timer() internally. So the two are used in the same way.
When to use kernel threads vs workqueues in the linux kernel
As you said, it depends on the task at hand:
Work queues defer work into a kernel thread - your work will always run in process
context. They are schedulable and can therefore sleep.
Normally, there is no debate between work queues and softirqs/tasklets: if the deferred work needs to sleep, work queues are used; otherwise softirqs or tasklets are used. Tasklets are also more suitable for interrupt handling (they are given certain assurances, such as: a tasklet is never run later than on the next tick, it is always serialized with regard to itself, etc.).
Kernel timers are good when you know exactly when you want something to happen, and do not want to interrupt/block a process in the meantime. They run outside process context, and they are also asynchronous with regard to other code, so they're the source of race conditions if you're not careful.
Hope this helps.
CPU Handling with Delayed Work
As you can see, queue_delayed_work() sets the cpu argument to WORK_CPU_UNBOUND. This value is defined to be larger than any valid CPU number supported by the kernel (it equals NR_CPUS). It is passed to __queue_delayed_work() which, if delay is nonzero, uses a timer (via add_timer()) to fire the callback delayed_work_timer_fn() after the specified time (this callback is set up when the delayed work is initialized). All this callback does is call __queue_work(), still passing WORK_CPU_UNBOUND as the cpu argument. So the whole "magic" happens there.
This function checks whether the cpu argument is set to WORK_CPU_UNBOUND and, if so, chooses the current processor as cpu:
if (req_cpu == WORK_CPU_UNBOUND)
    cpu = raw_smp_processor_id();
So the work will be executed on the processor that handles the timer interrupt set up earlier. I didn't study the timer code, but IIRC from the LDD3 book, timer interrupts are handled by the CPU they were registered on (unless that CPU is disabled in the meantime, of course, in which case the timer IRQ is moved to another CPU). That book is old, though, so this may no longer be true.
There is another hint in the kernel code that supports what I wrote: the comment on the queue_work() function says: "We queue the work to the CPU on which it was submitted, but if the CPU dies it can be processed by another CPU". This function also uses WORK_CPU_UNBOUND as the cpu argument.
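The CPU selection described above can be modelled in plain userspace C. This is only a sketch of the decision, not kernel code: MODEL_NR_CPUS, MODEL_WORK_CPU_UNBOUND and model_pick_cpu() are hypothetical stand-ins, and the current_cpu parameter plays the role of raw_smp_processor_id() at the moment the timer callback runs.

```c
#include <assert.h>

/* Hypothetical stand-ins for NR_CPUS and WORK_CPU_UNBOUND. */
#define MODEL_NR_CPUS 4
#define MODEL_WORK_CPU_UNBOUND MODEL_NR_CPUS /* one past the last valid CPU id */

/*
 * Model of the choice made when work is queued: an explicit CPU request is
 * honoured, while the "unbound" sentinel resolves to the CPU running this code.
 */
int model_pick_cpu(int req_cpu, int current_cpu)
{
    assert(current_cpu >= 0 && current_cpu < MODEL_NR_CPUS);
    if (req_cpu == MODEL_WORK_CPU_UNBOUND)
        return current_cpu;
    return req_cpu;
}
```

In this model, a delayed work item whose timer fires on CPU 2 resolves to CPU 2, which matches the observation that the work ends up on whichever processor handled the timer.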
Timer migration details
As stated before, if a processor goes down, it can no longer handle IRQs, so it won't be able to handle the timers it has registered. Because of that, the kernel migrates all pending timers to other CPUs when a CPU is going offline. This task is done by the migrate_timers() function, which is run by timer_cpu_notify(), which in turn is a callback registered as a cpu_notifier.
migrate_timers() is run when the CPU state changes to CPU_DEAD or CPU_DEAD_FROZEN. This state is set inside the _cpu_down() function by calling:
cpu_notify_nofail(CPU_DEAD | mod, hcpu);
It is called after __cpu_die(cpu), which ensures that the CPU being disabled is no longer running, so we can be sure this code runs on some other CPU. migrate_timers() will reassign all timers to the CPU it is running on.
So where is the decision made about which CPU should take over the timers? One could say that it is made by the scheduler:
If you call cpu_down() on a different CPU than the one you want to disable, then that calling CPU is the one that will take over.
If you call cpu_down() on the CPU that is going to be disabled, it will schedule itself out in __cpu_die(), and the rest of the code will then be rescheduled on some other CPU.
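The migration step can be illustrated with a small userspace model. Everything here is hypothetical (struct model_timer_base, model_add_timer(), model_migrate_timers() and the fixed array sizes are not kernel structures or functions); it only shows the idea that pending timers of the dead CPU are appended to the list of whichever CPU runs the migration code.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4
#define MODEL_MAX_TIMERS 8

/* Hypothetical per-CPU list of pending timer ids (not a kernel structure). */
struct model_timer_base {
    int pending[MODEL_MAX_TIMERS];
    int count;
};

struct model_timer_base model_bases[MODEL_NR_CPUS];

/* Queue a timer on a CPU; returns 0 on success, -1 if that CPU's list is full. */
int model_add_timer(int cpu, int timer_id)
{
    struct model_timer_base *base = &model_bases[cpu];

    if (base->count == MODEL_MAX_TIMERS)
        return -1;
    base->pending[base->count++] = timer_id;
    return 0;
}

/*
 * Move every pending timer from dead_cpu to running_cpu, i.e. the CPU that
 * executes the migration. Returns the number of timers moved, or -1 if the
 * model cannot perform the move.
 */
int model_migrate_timers(int dead_cpu, int running_cpu)
{
    struct model_timer_base *old_base = &model_bases[dead_cpu];
    struct model_timer_base *new_base = &model_bases[running_cpu];
    int moved = old_base->count;
    int i;

    if (dead_cpu == running_cpu ||
        new_base->count + moved > MODEL_MAX_TIMERS)
        return -1; /* the model refuses instead of dropping timers */

    for (i = 0; i < moved; i++)
        new_base->pending[new_base->count++] = old_base->pending[i];
    old_base->count = 0;
    return moved;
}
```

The running_cpu argument is not chosen by the migration code itself. In the model, as in the description above, it is simply the CPU on which the caller happens to be executing.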
Is do_timer() supposed to be called on only one core in SMP systems?
Let me answer my own question after googling and reading code.
do_timer() is supposed to be called on the CPU whose ID is kept in the tick_do_timer_cpu variable.
kernel/time/tick-common.c
/*
 * tick_do_timer_cpu is a timer core internal variable which holds the CPU NR
 * which is responsible for calling do_timer(), i.e. the timekeeping stuff. This
 * variable has two functions:
 *
 * 1) Prevent a thundering herd issue of a gazillion of CPUs trying to grab the
 *    timekeeping lock all at once. Only the CPU which is assigned to do the
 *    update is handling it.
 *
 * 2) Hand off the duty in the NOHZ idle case by setting the value to
 *    TICK_DO_TIMER_NONE, i.e. a non existing CPU. So the next cpu which looks
 *    at it will take over and keep the time keeping alive. The handover
 *    procedure also covers cpu hotplug.
 */
tick_do_timer_cpu is checked against the current CPU ID in tick_periodic() or in tick_sched_do_timer(). If the current CPU matches, do_timer() is called; otherwise it is not.
static void tick_periodic(int cpu)
{
    if (tick_do_timer_cpu == cpu) {
        write_seqlock(&jiffies_lock);
        /* Keep track of the next tick event */
        tick_next_period = ktime_add(tick_next_period, tick_period);
        do_timer(1);
        write_sequnlock(&jiffies_lock);
        update_wall_time();
    }
    update_process_times(user_mode(get_irq_regs()));
    profile_tick(CPU_PROFILING);
}
This way jiffies management is done on one core in SMP systems.
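A minimal userspace model of this arrangement may help. It is a sketch under assumptions, not the kernel's code: model_tick(), model_jiffies, model_local_ticks and the four-CPU array are hypothetical names, and the takeover is folded into the tick path only to illustrate the handover that the comment above describes.

```c
#include <assert.h>

#define MODEL_TICK_DO_TIMER_NONE (-1)

int model_tick_do_timer_cpu = 0;    /* CPU currently in charge of timekeeping */
unsigned long model_jiffies = 0;    /* global tick counter */
unsigned long model_local_ticks[4]; /* per-CPU accounting, done on every CPU */

/* Called once per tick on each CPU of the model (valid cpu values: 0..3). */
void model_tick(int cpu)
{
    assert(cpu >= 0 && cpu < 4);

    /* Handover: the first CPU to see "none" takes over the duty. */
    if (model_tick_do_timer_cpu == MODEL_TICK_DO_TIMER_NONE)
        model_tick_do_timer_cpu = cpu;

    /* Only the designated CPU advances the global counter. */
    if (model_tick_do_timer_cpu == cpu)
        model_jiffies++;

    /* Every CPU still does its own per-CPU work on each tick. */
    model_local_ticks[cpu]++;
}
```

If all four CPUs of the model tick once, model_jiffies advances by one rather than four, while each entry of model_local_ticks advances independently.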
How to modify kernel timer_list timeout
The kernel has no mechanism for detecting changes to variables. Instead, you should perform the corresponding actions before or after your code changes the variable.
When you add a sysctl entry, you also set a handler for it (ctl_table->proc_handler). This handler defines the actions that are executed when the entry is read or written. The standard proc_do* functions only get or set the value of the variable, so you should define your own handler. Something like this:
int my_handler(struct ctl_table *table, int write,
               void __user *buffer, size_t *lenp, loff_t *ppos)
{
    // Call the standard helper first.
    int res = proc_dointvec(table, write, buffer, lenp, ppos);
    if (write && !res) {
        // Additional actions on a successful write.
    }
    return res;
}
The timer's timeout can be modified using the mod_timer() function.
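The overall pattern (delegate parsing to a helper, then re-arm the timer only after a successful write) can be tried out in userspace. The sketch below is a model, not kernel code: model_parse_int() stands in for the standard helper, model_mod_timer() for mod_timer(), model_jiffies for jiffies, and all of these names, the tick-based timeout and the accepted value range are assumptions made for illustration. In a real handler the re-arm step would presumably be a call of the form mod_timer(&timer, jiffies + new_timeout).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel timer: only the fields the model needs. */
struct model_timer {
    unsigned long expires;
    int pending;
};

unsigned long model_jiffies = 1000; /* stand-in for the current tick count */
int model_timeout = 50;             /* the tunable, in ticks */
struct model_timer model_my_timer;

/* Stand-in for mod_timer(): (re)arm the timer with a new absolute expiry. */
void model_mod_timer(struct model_timer *t, unsigned long expires)
{
    t->expires = expires;
    t->pending = 1;
}

/* Stand-in for the standard helper: parse a decimal int, 0 on success. */
int model_parse_int(const char *buf, int *out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno != 0 || end == buf || *end != '\0' || val < 0 || val > 1000000)
        return -EINVAL;
    *out = (int)val; /* written only when parsing succeeded */
    return 0;
}

/* Model of the custom handler: call the helper, then react to a good write. */
int model_handler(int write, const char *buf)
{
    int res = 0;

    if (write)
        res = model_parse_int(buf, &model_timeout);
    if (write && !res)
        model_mod_timer(&model_my_timer,
                        model_jiffies + (unsigned long)model_timeout);
    return res;
}
```

A failed write leaves both the stored timeout and the timer's expiry untouched, which is the reason for checking the helper's result before re-arming.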