Kworker threads getting blocked by SCHED_RR userspace threads
The root cause of this behaviour is tty_flip_buffer_push().
In kernel/drivers/tty/tty_buffer.c:518, tty_flip_buffer_push() schedules an asynchronous work item, which a kworker thread executes shortly afterwards.
However, if realtime threads are running and keep the system busy, the kworker thread is unlikely to run soon. It gets a chance to execute only once the RT threads relinquish the CPU or RT throttling is triggered.
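To get a feel for how much CPU time RT throttling leaves for non-realtime work such as kworker threads, the sketch below computes the realtime share from the two standard sysctl values, /proc/sys/kernel/sched_rt_runtime_us and /proc/sys/kernel/sched_rt_period_us. The helper names rt_cpu_share() and read_long() are made up for this illustration.

```c
#include <stdio.h>

/*
 * Fraction of each period that realtime tasks may consume.
 * A negative runtime means throttling is disabled, so RT tasks
 * may use the whole period. Returns -1.0 for an invalid period.
 */
double rt_cpu_share(long runtime_us, long period_us)
{
    if (period_us <= 0)
        return -1.0;
    if (runtime_us < 0)
        return 1.0;
    return (double)runtime_us / (double)period_us;
}

/* Read one integer from a text file such as a procfs sysctl entry.
 * Returns 0 on success, -1 on failure. */
int read_long(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

With the usual defaults (runtime 950000, period 1000000) the realtime share is 0.95, so a kworker starved by SCHED_RR threads only gets the remaining 5% of each one-second period.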
Older kernels support the low_latency flag within the TTY subsystem.
Prior to Linux kernel v3.15, tty_flip_buffer_push() honored the low_latency flag of the tty port.
If the low_latency flag was set by the UART driver as follows (typically in its .startup() function),
t->uport.state->port.tty->low_latency = 1;
then tty_flip_buffer_push() performs a synchronous copy in the context of the current function call itself. It therefore inherits the priority of the current task, i.e. there is no chance of a priority inversion caused by asynchronously scheduling a work item.
Note: If the serial driver sets the low_latency flag, it must avoid calling tty_flip_buffer_push() within an ISR (interrupt context). With the low_latency flag set, tty_flip_buffer_push() does NOT use a separate workqueue, but calls the functions directly. So if it is called within an interrupt context, the ISR will take longer to execute, which increases the latency of other parts of the kernel/system. Also, under certain conditions (depending on how much data is available in the serial buffer), tty_flip_buffer_push() may attempt to sleep (acquire a mutex). Sleeping within an ISR in the Linux kernel causes a kernel bug.
With the workqueue implementation within the Linux kernel having migrated to CMWQ, it is no longer possible to deterministically obtain independent execution contexts (i.e. separate threads) for individual workqueues.
All workqueues in the system are backed by kworker/* threads.
NOTE: THIS SECTION IS OBSOLETE!!
Leaving the following intact as a reference for older versions of the Linux kernel.
Customisations for low-latency/real-time UART/TTY:
1. Create and use a private workqueue for the TTY layer.
Create a new workqueue in tty_init().
A workqueue created with create_workqueue() will have one worker thread for each CPU on the system.
struct workqueue_struct *create_workqueue(const char *name);
Using create_singlethread_workqueue() instead creates a workqueue with a single kworker process.
struct workqueue_struct *create_singlethread_workqueue(const char *name);
2. Use the private workqueue.
Queue the flip buffer work on the above private workqueue instead of the kernel's global workqueue.
int queue_work(struct workqueue_struct *queue, struct work_struct *work);
Replace schedule_work() with queue_work() in functions called by tty_flip_buffer_push().
3. Tweak the execution priority of the private workqueue.
Upon boot, the kworker thread used by the TTY layer workqueue can be identified by the name string used while creating it. Use chrt to set an appropriately higher RT priority on this thread, as required by the system design.
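Step 3 can also be done programmatically instead of with chrt. The following is a minimal userspace sketch, assuming the PID of the private workqueue's thread is already known (for example from ps) and that the caller has the privileges needed to change RT priorities. The function names clamp_rt_priority() and set_rt_priority() are hypothetical; sched_get_priority_min(), sched_get_priority_max() and sched_setscheduler() are the standard calls.

```c
#include <sched.h>
#include <sys/types.h>

/* Clamp a requested priority into the valid range for the policy.
 * Returns -1 if the policy is not supported. */
int clamp_rt_priority(int policy, int prio)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo < 0 || hi < 0)
        return -1;
    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}

/* Rough equivalent of "chrt -f -p <prio> <pid>" when policy is SCHED_FIFO.
 * Returns 0 on success, -1 on failure (errno is set by sched_setscheduler). */
int set_rt_priority(pid_t pid, int policy, int prio)
{
    struct sched_param sp;

    sp.sched_priority = clamp_rt_priority(policy, prio);
    if (sp.sched_priority < 0)
        return -1;
    return sched_setscheduler(pid, policy, &sp);
}
```

Calling set_rt_priority(pid, SCHED_FIFO, 50) without sufficient privileges is expected to fail with EPERM, so the return value should always be checked.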
Can I have realtime scheduling within my process (but without affecting others)?
Threads created with PTHREAD_SCOPE_PROCESS will share the same kernel thread ( http://lists.freebsd.org/pipermail/freebsd-threads/2006-August/003674.html )
However, SCHED_RR must be run under a root-privileged process.
Round-Robin; threads whose contention scope is system
(PTHREAD_SCOPE_SYSTEM) are in real-time (RT) scheduling class if the
calling process has an effective user id of 0. These threads, if not
preempted by a higher priority thread, and if they do not yield or
block, will execute for a time period determined by the system.
SCHED_RR for threads that have a contention scope of process
(PTHREAD_SCOPE_PROCESS) or whose calling process does not have an
effective user id of 0 is based on the TS scheduling class.
However, based on your linked problem, I think you are facing a deeper issue. Have you tried setting your kernel to be more "preemptive"? Preemption should allow the kernel to forcibly schedule your process out, allowing some kernel parts to run more responsively. This shouldn't affect IRQs though; maybe something disabled your IRQs?
Another thing I am thinking about is that maybe you are not fetching your SPI data fast enough, so the kernel buffer for your data becomes full, hence the data loss. Try increasing those buffers as well.
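For reference, requesting SCHED_RR for a single thread looks roughly like the sketch below. It only prepares the thread attributes; the helper name init_rr_attr() is made up for this example. It uses PTHREAD_SCOPE_SYSTEM, since Linux with glibc does not support PTHREAD_SCOPE_PROCESS, and PTHREAD_EXPLICIT_SCHED so the new thread does not simply inherit the creator's policy.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Prepare attributes for a SCHED_RR thread with the given priority.
 * Returns 0 on success or an errno-style code on failure; on failure
 * the attribute object has already been destroyed.
 */
int init_rr_attr(pthread_attr_t *attr, int prio)
{
    struct sched_param sp;
    int rc;

    sp.sched_priority = prio;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Use the attributes below instead of inheriting from the creator. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setscope(attr, PTHREAD_SCOPE_SYSTEM);
    /* Set the policy before the priority so the priority can be
     * validated against the SCHED_RR range. */
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_RR);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &sp);

    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

Passing the resulting attribute object to pthread_create() is where the privilege check happens: without root or an equivalent capability or rlimit, the call is expected to fail with EPERM.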
Thread (not processes) scheduling based on priority
There's no problem, this is expected behavior.
First, if you have more than one core, then the priorities won't matter if there are fewer ready-to-run threads than cores -- each thread will get its own core.
Second, your high-priority thread sleeps, which gives the lower-priority thread time to run.
Third, your threads interact through the lock that protects standard output. The higher-priority thread can be waiting for that lock, allowing lower-priority threads to run.
Please don't try to use priorities this way. It adds massive complexity, hurts performance, and rarely accomplishes anything useful.
Kernel preemption on same priority tasks
General scheduling order is
- The kernel invokes the function schedule() either directly in kernel context or when the TIF_NEED_RESCHED flag is set and the kernel is returning from interrupt context.
- This function invokes pick_next_task() to receive the task which will preempt the currently running one.
- pick_next_task() invokes every scheduler class' pick_next_task() in order of descending priority until one of them returns a task. Note that priority here means the class' priority (e.g. soft real-time or normal), not the process' one.
- The CFS's approach (the scheduler for normal processes) is to give each process an equal amount of virtual run-time. Virtual run-time is a process' real run-time weighted by its priority (the process' priority). So the CFS class returns the task with the least virtual run-time.
For the scheduler, it does not matter what a process is doing or what messages it sends or receives. So, in the general case, if your processes have equal priorities, the process with less run-time will preempt the other process on the next schedule() invocation.
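The virtual run-time idea can be illustrated with a small userspace model. This is a simplified sketch, not the kernel's code: the struct and function names are invented, a linear scan stands in for the kernel's ordered tree, and it assumes the commonly cited load weights of 1024 for nice 0 and 335 for nice 5.

```c
#include <stddef.h>
#include <stdint.h>

#define NICE_0_WEIGHT 1024u  /* assumed load weight of a nice-0 task */

struct task {
    uint64_t vruntime_ns;  /* accumulated virtual run-time */
    unsigned weight;       /* load weight derived from nice; must be > 0 */
};

/* Real run-time scaled by weight: heavier (higher priority) tasks
 * accumulate virtual run-time more slowly. */
uint64_t vruntime_delta(uint64_t exec_ns, unsigned weight)
{
    return exec_ns * NICE_0_WEIGHT / weight;
}

/* Charge a task for exec_ns nanoseconds of real CPU time. */
void account(struct task *t, uint64_t exec_ns)
{
    t->vruntime_ns += vruntime_delta(exec_ns, t->weight);
}

/* Return the index of the task with the smallest virtual run-time;
 * on a tie the earlier task wins. n must be at least 1. */
size_t pick_next(const struct task *tasks, size_t n)
{
    size_t best = 0;
    size_t i;

    for (i = 1; i < n; i++) {
        if (tasks[i].vruntime_ns < tasks[best].vruntime_ns)
            best = i;
    }
    return best;
}
```

With two tasks of equal weight, whichever has run for less real time has the smaller virtual run-time and is picked next, which matches the behaviour described above for equal-priority processes.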