How to Prevent a Linux User Space Pthread Yielding in Critical Code

Can I prevent a Linux user space pthread yielding in critical code?

You can use the sched_setscheduler() system call to temporarily set the thread's scheduling policy to SCHED_FIFO, then set it back again. From the sched_setscheduler() man page:

A SCHED_FIFO process runs until either it is blocked by an I/O request, it is preempted by a higher priority process, or it calls sched_yield(2).

(In this context, "process" actually means "thread").
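As a rough illustration, the thread could bump itself to SCHED_FIFO around the critical code and then restore its previous policy. This is only a sketch: it uses pthread_setschedparam(), the per-thread counterpart of sched_setscheduler(), the function name do_critical_work() is invented for the example, and switching to a real-time policy normally requires root or the CAP_SYS_NICE capability. Error checking is omitted.

    #include <pthread.h>
    #include <sched.h>

    /* Sketch: temporarily run the calling thread as SCHED_FIFO so that
     * ordinary (SCHED_OTHER) threads cannot preempt it, then restore the
     * original policy.  Requires CAP_SYS_NICE / root. */
    void do_critical_work(void)
    {
        pthread_t self = pthread_self();
        int old_policy;
        struct sched_param old_param, fifo_param;

        pthread_getschedparam(self, &old_policy, &old_param);

        fifo_param.sched_priority = sched_get_priority_min(SCHED_FIFO);
        pthread_setschedparam(self, SCHED_FIFO, &fifo_param);

        /* ... critical code that should not be preempted by normal threads ... */

        pthread_setschedparam(self, old_policy, &old_param);   /* restore */
    }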

However, this is quite a suspicious requirement. What is the problem you are hoping to solve? If you are just trying to protect your linked list of completion handlers from concurrent access, then an ordinary mutex is the way to go. Have the completion thread lock the mutex, remove the list item, unlock the mutex, then call the completion handler.
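For illustration, here is a sketch of that pattern; the list node and handler types are invented for the example. The point is to hold the mutex only while manipulating the list and to call the handler outside it.

    #include <pthread.h>
    #include <stddef.h>

    /* Hypothetical completion entry -- the types are made up for this sketch. */
    struct completion {
        struct completion *next;
        void (*handler)(void *arg);
        void *arg;
    };

    static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct completion *pending;        /* head of the list */

    void run_one_completion(void)
    {
        pthread_mutex_lock(&list_lock);
        struct completion *c = pending;       /* take the first entry ...     */
        if (c)
            pending = c->next;                /* ... off the list             */
        pthread_mutex_unlock(&list_lock);     /* unlock before the callback   */

        if (c)
            c->handler(c->arg);               /* run the handler outside the lock */
    }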

How to allow certain threads to have priority in locking a mutex use PTHREADS

As I understand it, the only way you can truly guarantee this would be to write a lock that works like that yourself. However @xryl669's answer that suggests using thread priority and priority inheritance is certainly worthy of consideration if it works for your use case.

To implement it yourself, you will need condition variables and counts of the number of waiting low / high priority threads.

In terms of the concepts and APIs you'll need, it is relatively similar to implementing a read/write lock. The semantics you need are obviously quite different, but if you understand how the r/w lock works, you'll understand how to implement what you want.

You can see an implementation of a read write lock here:

http://ptgmedia.pearsoncmg.com/images/0201633922/sourcecode/rwlock.c

In the lower priority threads, you'd need to wait for high priority threads to finish, in the same way readers wait for writers to finish.

(The book the above code is taken from is also a great POSIX threads book, by the way: http://www.informit.com/store/product.aspx?isbn=0201633922 )
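A minimal sketch of that idea (names invented for the example): keep a count of waiting high-priority threads, and make low-priority threads stay on the condition variable until no high-priority waiter is left.

    #include <pthread.h>

    /* Sketch of a lock that prefers "high priority" callers: a mutex, a
     * condition variable and a waiter count, much like the guts of a
     * read/write lock.  Untested, just to show the shape. */
    typedef struct {
        pthread_mutex_t lock;          /* protects the fields below           */
        pthread_cond_t  cond;          /* all waiters block here              */
        int             busy;          /* 1 while the prio-lock is held       */
        int             high_waiters;  /* number of waiting high-prio threads */
    } prio_lock_t;

    #define PRIO_LOCK_INITIALIZER \
        { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

    void prio_lock_high(prio_lock_t *p)
    {
        pthread_mutex_lock(&p->lock);
        p->high_waiters++;
        while (p->busy)
            pthread_cond_wait(&p->cond, &p->lock);
        p->high_waiters--;
        p->busy = 1;
        pthread_mutex_unlock(&p->lock);
    }

    void prio_lock_low(prio_lock_t *p)
    {
        pthread_mutex_lock(&p->lock);
        /* Low-priority callers also yield to any waiting high-priority thread. */
        while (p->busy || p->high_waiters)
            pthread_cond_wait(&p->cond, &p->lock);
        p->busy = 1;
        pthread_mutex_unlock(&p->lock);
    }

    void prio_unlock(prio_lock_t *p)
    {
        pthread_mutex_lock(&p->lock);
        p->busy = 0;
        pthread_cond_broadcast(&p->cond);   /* wake everyone; they re-check the condition */
        pthread_mutex_unlock(&p->lock);
    }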

Multi-threaded C program much slower in OS X than Linux

MacOSX and Linux implement pthreads differently, which is what causes this slow behavior. Specifically, MacOSX does not use spinlocks (they are an optional POSIX feature, not part of ISO C). This can lead to very, very slow performance with examples like this one.

Why is there no std:: equivalent to pthread_spinlock_t like there is for pthread_mutex_t & std::mutex?

there doesn't seem to be a std:: equivalent to pthread_spinlock_t.

Spinlocks are often considered the wrong tool in user space because there is no way to disable thread preemption while the spinlock is held (unlike in the kernel). A thread can therefore acquire a spinlock and then get preempted, causing all other threads trying to acquire the spinlock to spin unnecessarily; if those threads have higher priority, that may even cause a deadlock (threads waiting for I/O may get a priority boost on wake-up). This reasoning also applies to all lockless data structures, unless the data structure is truly wait-free (there aren't many practically useful ones, apart from boost::spsc_queue).

In the kernel, a thread that has locked a spinlock cannot be preempted or interrupted before it releases the spinlock, and that is why spinlocks are appropriate there (when RCU cannot be used).

On Linux, one can prevent preemption (not sure if completely, but there have been recent kernel changes towards such a desirable effect) by using isolated CPU cores and FIFO real-time threads pinned to those isolated cores. But that requires a deliberate kernel/machine configuration and an application designed to take advantage of that configuration. Nevertheless, people do use such a setup for business-critical applications along with lockless (but not wait-free) data structures in user-space.
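A hedged sketch of that kind of setup: assuming the machine was booted with some cores isolated (for example via the isolcpus= kernel parameter, with core 3 used here purely as an example), a thread can be pinned to one of them and switched to SCHED_FIFO. Error handling and privilege checks are omitted; the real-time priority value is arbitrary.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    /* Sketch: pin the calling thread to an isolated core (core 3 is just an
     * example and must match the isolcpus= boot parameter) and make it
     * SCHED_FIFO.  Needs CAP_SYS_NICE / root. */
    void pin_to_isolated_core(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(3, &set);                                    /* hypothetical isolated core */
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

        struct sched_param sp;
        sp.sched_priority = 10;                              /* arbitrary RT priority */
        pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
    }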


On Linux, there is the adaptive mutex type PTHREAD_MUTEX_ADAPTIVE_NP, which spins for a limited number of iterations before blocking in the kernel (similar to InitializeCriticalSectionAndSpinCount). However, that mutex cannot be used through the std::mutex interface, because there is no way to customise the (non-portable) pthread_mutexattr_t before the pthread_mutex_t is initialised.

Nor can one enable process sharing, robustness, error checking, or priority-inversion prevention through the std::mutex interface. In practice, people write their own wrappers around pthread_mutex_t that allow setting the desired mutex attributes, along with a corresponding wrapper for condition variables. Standard lock holders like std::unique_lock and std::lock_guard can then be reused.
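As an illustration of such a wrapper (a sketch, not a drop-in library): a class that owns a pthread_mutex_t initialised with a caller-chosen attribute and that satisfies the Lockable requirements, so std::lock_guard and std::unique_lock work with it. The class name is invented, and PTHREAD_MUTEX_ADAPTIVE_NP (a glibc-specific type) is used purely as an example attribute.

    #include <pthread.h>
    #include <mutex>          // std::lock_guard

    // Sketch: a pthread_mutex_t wrapper whose attributes the caller controls.
    class AttrMutex {
    public:
        explicit AttrMutex(int type = PTHREAD_MUTEX_ADAPTIVE_NP) {
            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_settype(&attr, type);   // e.g. adaptive, errorcheck, ...
            pthread_mutex_init(&mtx_, &attr);
            pthread_mutexattr_destroy(&attr);
        }
        ~AttrMutex() { pthread_mutex_destroy(&mtx_); }
        AttrMutex(const AttrMutex&) = delete;
        AttrMutex& operator=(const AttrMutex&) = delete;

        void lock()     { pthread_mutex_lock(&mtx_); }
        void unlock()   { pthread_mutex_unlock(&mtx_); }
        bool try_lock() { return pthread_mutex_trylock(&mtx_) == 0; }

        pthread_mutex_t* native_handle() { return &mtx_; }  // for pthread_cond_wait etc.

    private:
        pthread_mutex_t mtx_;
    };

    // Usage: the standard lock holders can be reused.
    //   AttrMutex m;
    //   { std::lock_guard<AttrMutex> g(m); /* critical section */ }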

IMO, there could have been provisions to set the desired mutex and condition variable attributes in the std:: APIs, such as a protected constructor for derived classes that would initialise the native_handle, but there aren't any. The native_handle looks like a good hook for platform-specific behaviour; however, there would need to be a constructor that lets a derived class initialise it appropriately. Once the mutex or condition variable has been initialised, the native_handle is pretty much useless, unless the idea was only to be able to pass it to (C language) APIs that expect a pointer or reference to an initialised pthread_mutex_t.


Another example is the Boost and C++ standard committees' refusal to accept semaphores, on the basis that they are too much rope to hang oneself with, and that a mutex (essentially a binary semaphore) and a condition variable are more fundamental and more flexible synchronisation primitives, out of which a semaphore can be built.
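Indeed, a counting semaphore can be assembled from those two primitives in a few lines. A sketch using the standard C++ types (the class name is invented for the example):

    #include <mutex>
    #include <condition_variable>

    // Sketch: a counting semaphore built from a mutex and a condition variable.
    class Semaphore {
    public:
        explicit Semaphore(unsigned initial = 0) : count_(initial) {}

        void post() {                          // a.k.a. release / signal
            std::lock_guard<std::mutex> g(m_);
            ++count_;
            cv_.notify_one();
        }

        void wait() {                          // a.k.a. acquire
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return count_ > 0; });
            --count_;
        }

    private:
        std::mutex m_;
        std::condition_variable cv_;
        unsigned count_;
    };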

From the point of view of the C++ standard, these are probably the right decisions, because educating users to use spinlocks and semaphores correctly, with all the nuances, is a difficult task, whereas advanced users can whip out a wrapper for pthread_spinlock_t with little effort (a sketch follows below).
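Such a wrapper might look roughly like this (a sketch; the class name is invented, and pthread_spin_* is an optional POSIX feature, so availability varies):

    #include <pthread.h>
    #include <mutex>          // std::lock_guard

    // Sketch: a minimal pthread_spinlock_t wrapper meeting the Lockable requirements.
    class SpinLock {
    public:
        SpinLock()  { pthread_spin_init(&s_, PTHREAD_PROCESS_PRIVATE); }
        ~SpinLock() { pthread_spin_destroy(&s_); }
        SpinLock(const SpinLock&) = delete;
        SpinLock& operator=(const SpinLock&) = delete;

        void lock()     { pthread_spin_lock(&s_); }
        void unlock()   { pthread_spin_unlock(&s_); }
        bool try_lock() { return pthread_spin_trylock(&s_) == 0; }

    private:
        pthread_spinlock_t s_;
    };

    // std::lock_guard<SpinLock> then works just as it does with std::mutex.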

When is pthread_spin_lock the right thing to use (over e.g. a pthread mutex)?

The short answer is that a spinlock can be better when you plan to hold the lock for an extremely short interval (for example to do nothing but increment a counter), and contention is expected to be rare, but the operation is occurring often enough to be a potential performance bottleneck. The advantages of a spinlock over a mutex are:

  1. On unlock, there is no need to check if other threads may be waiting for the lock and waking them up. Unlocking is simply a single atomic write instruction.
  2. Failure to immediately obtain the lock does not put your thread to sleep, so it may be able to obtain the lock with much lower latency as soon as it does become available.
  3. There is no risk of cache pollution from entering kernelspace to sleep or wake other threads.

Point 1 will always stand, but points 2 and 3 are of somewhat diminished usefulness if you consider that good mutex implementations will probably spin a decent number of times before asking the kernel for help waiting (see the sketch below).
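To make point 1 concrete, this is roughly what the simplest possible spinlock looks like: unlocking really is just one atomic store, and a failed acquisition just loops instead of sleeping. A sketch, not production code; the class name is invented.

    #include <atomic>

    // Sketch: the simplest busy-waiting spinlock.
    class TinySpinLock {
    public:
        void lock() {
            // Spin until we flip the flag from clear to set.
            while (flag_.test_and_set(std::memory_order_acquire)) {
                /* busy-wait: never sleeps, so wake-up latency is minimal */
            }
        }
        void unlock() {
            flag_.clear(std::memory_order_release);   // a single atomic write, no wake-ups
        }
    private:
        std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
    };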

Now, the long answer:

What you need to ask yourself before using spinlocks is whether these potential advantages outweigh one rare but very real disadvantage: what happens when the thread that holds the lock gets interrupted by the scheduler before it can release the lock. This is of course rare, but it can happen even if the lock is held only for a single variable-increment operation or something else equally trivial. In this case, any other threads attempting to obtain the lock will keep spinning until the thread that holds the lock gets scheduled and has a chance to release it. That may never happen if the threads trying to obtain the lock have higher priorities than the thread that holds it. That may be an extreme case, but even without different priorities in play, there can be very long delays before the lock owner gets scheduled again, and worst of all, once this situation begins, it can quickly escalate as many threads, all hoping to get the lock, begin spinning on it, tying up more processor time, and further delaying the scheduling of the thread that could release the lock.

As such, I would be careful with spinlocks... :-)


