Linux - Threads and Process Scheduling Priorities

Linux - threads and process scheduling priorities

Linux no longer schedules processes at all.

Within the kernel, threads are scheduled. The concept of a process is now an artificial construct seen mostly by things outside the kernel. Obviously, the kernel has to know how threads are tied together, but not for scheduling purposes.

Basically, the kernel maintains a whole lot of threads and each thread has a thread group leader, which is what's seen on the outside as the process. A thread has a thread ID and a thread group ID - it's a lot like the relationship between a PID and a PPID (process ID and parent process ID).

When you create a regular thread, the kernel gives it a brand new thread ID but sets its thread group ID to the thread group ID of the thread that created it. That way, it looks like a thread within a process to the outside world.

When you fork, the kernel gives the new task a brand new thread ID and sets its thread group ID to the same value as that thread ID. That way, it looks like a process to the outside world.

Most non-kernel utilities that report on processes are really just reporting on threads where the thread ID is the same as the thread group ID.
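
If you want to see that relationship for yourself, here's a small sketch of my own (not from the answer above): it prints getpid(), which reports the thread group ID, and the raw thread ID in both the main thread and a second thread. In the main thread the two values match; in the second thread the TID differs but the TGID stays the same.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Return the raw kernel thread ID; syscall(SYS_gettid) is used instead
     * of gettid() so this also builds on glibc versions older than 2.30. */
    static pid_t my_tid(void)
    {
        return (pid_t)syscall(SYS_gettid);
    }

    static void *worker(void *arg)
    {
        (void)arg;
        /* Same TGID (getpid()), different TID. */
        printf("worker: pid (TGID) = %d, tid = %d\n", (int)getpid(), (int)my_tid());
        return NULL;
    }

    int main(void)
    {
        /* In the thread group leader, TID == TGID. */
        printf("main  : pid (TGID) = %d, tid = %d\n", (int)getpid(), (int)my_tid());

        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);
        return 0;
    }

Build it with gcc -pthread; the actual numbers will obviously differ from run to run.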

There are subtleties with other methods which are probably too complicated to go into here. What I've written above is (hopefully) a good medium level treatise.

Now, for your specific question, it would be neither case since P1 only has one thread (there is no P1T2).

Within the kernel, the threads are P1T1, P2T1 and P2T2 and, assuming they have the same scheduling properties and behave the same (a), that's how they'll be scheduled.


See also:

  • Linux - Threads and Process;
  • If I have a process, and I clone it, is the PID the same?; and
  • Will a CPU process have at least one thread?

for more information.


(a): Obviously that changes if the threads start blocking on I/O (kernel won't schedule them until I/O is available) or releasing their time quanta early (kernel will probably up their priority as a reward for playing nicely) but then they're not behaving the same.

Linux process and threads scheduling

You can actually test this with a simple program, but here's what various man pages say:

sched_setaffinity:

A child created via fork(2) inherits its parent's CPU affinity mask.
The affinity mask is preserved across an execve(2).

pthread_create:

The new thread inherits copies of the calling thread's capability sets (see capabilities(7)) and CPU affinity mask (see sched_setaffinity(2)).

sched_setscheduler:

Child processes inherit the scheduling policy and parameters across a fork(2). The scheduling policy and parameters are preserved across execve(2).
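
As a rough sketch of the kind of test program mentioned above (the details are my own, not from the man pages): raise the nice value in the parent, fork, and check that the child reports the same nice value and scheduling policy.

    #include <sched.h>
    #include <stdio.h>
    #include <sys/resource.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        /* Raising the nice value (lowering priority) needs no privileges. */
        setpriority(PRIO_PROCESS, 0, 10);

        pid_t pid = fork();
        if (pid == 0) {
            /* The child should report the same nice value and policy. */
            printf("child : nice = %d, policy = %d\n",
                   getpriority(PRIO_PROCESS, 0), sched_getscheduler(0));
            _exit(0);
        }

        printf("parent: nice = %d, policy = %d\n",
               getpriority(PRIO_PROCESS, 0), sched_getscheduler(0));
        waitpid(pid, NULL, 0);
        return 0;
    }

Only a raised nice value (lowered priority) is attempted, so no special privileges are needed; parent and child should print identical values.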

How are nice priorities and scheduler policies related to process (thread?) IDs in linux?

Thread IDs come from the same namespace as PIDs. This means that each thread is individually addressable by its TID - some system calls do apply to the entire process (for example, kill) but others apply only to a single thread.

The scheduler system calls are generally in the latter class, because this allows you to give different threads within a process different scheduler attributes, which is often useful.
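
For example, here's a minimal sketch of my own (the specifics are assumptions, not from the answer): a worker thread switches itself to SCHED_BATCH, which needs no privileges, while the main thread keeps its original policy. On Linux, a pid of 0 in the scheduler calls means "the calling thread".

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        (void)arg;
        struct sched_param sp = { .sched_priority = 0 };

        /* pid 0 means "the calling thread" on Linux, so only this thread
         * is switched to SCHED_BATCH; SCHED_BATCH needs no privileges. */
        if (sched_setscheduler(0, SCHED_BATCH, &sp) == -1)
            perror("sched_setscheduler");
        printf("worker policy: %d (SCHED_BATCH = %d)\n",
               sched_getscheduler(0), SCHED_BATCH);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);

        /* The main thread keeps its original policy (normally SCHED_OTHER). */
        printf("main   policy: %d (SCHED_OTHER = %d)\n",
               sched_getscheduler(0), SCHED_OTHER);
        return 0;
    }

Build with gcc -pthread; the worker should report SCHED_BATCH while the main thread stays on its original policy.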

Why would Linux allow thread to set scheduling policy and priority?

Think about it the other way around: How would you ever set scheduling policies and priorities if the OS didn't provide you means to do it? Any tool for the user / administrator to do these things needs such an API.

Of course, you need privileges for many operations, like setting realtime scheduling policies and higher priorities. As always, root (uid 0) can do anything, but there's much more fine-grained control through capabilities (a process that has CAP_SYS_NICE is allowed to do anything scheduling-related) and resource limits that allow access up to a given priority. For details, read sched(7), the section "Privileges and resource limits".

If you attempt to change anything you don't have the privilege for, sched_setscheduler() will just return -1 and set errno to EPERM.
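
A short sketch of that failure mode (assuming an unprivileged caller without CAP_SYS_NICE and a zero RLIMIT_RTPRIO): asking for SCHED_FIFO should fail with EPERM.

    #include <errno.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 50 };

        /* Realtime policies need CAP_SYS_NICE or a nonzero RLIMIT_RTPRIO. */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
            printf("sched_setscheduler failed: %s\n", strerror(errno));
        else
            printf("realtime policy set (running privileged?)\n");
        return 0;
    }

Run it as root (or with CAP_SYS_NICE) and the call should succeed instead.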

How is fairness of thread scheduling ensured across processes?


Does it also take into account which process a particular thread belongs to? Otherwise, it seems too easy for a process to hog all the CPU by creating more threads.

Wrong question. Consider two jobs that are trying to solve the exact same problem by doing the same work and are perfectly identical except for one thing -- one uses dozens of threads, the other uses dozens of processes. Why should the one that uses dozens of processes get more CPU time than the one that uses dozens of threads?

Your notion of fairness is not really a sensible one.

Instead, scheduling is designed more around trying to get as much work done as possible per unit time. The assumption is that everything the computer is doing is useful, and that even competing tasks benefit when the tasks competing with them finish as quickly as possible.

This is actually all you need the vast majority of the time. But occasionally you have special situations where this doesn't work. One is ultra-high-priority tasks like keeping video or audio flowing or keeping a user interface responsive. Another is ultra-low-priority tasks where there's an enormous amount of work you want done and you don't want the system to be slow for a long time while you're working on it. Priorities are used for this, and generally the system allows higher-priority threads to interrupt lower-priority ones to keep responsiveness.
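
For the ultra-low-priority case, Linux even has a dedicated policy, SCHED_IDLE, which an unprivileged task can switch itself to so that it only runs when the CPU would otherwise be idle. A minimal sketch of my own (see sched(7) for the details):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* SCHED_IDLE requires a static priority of 0. */
        struct sched_param sp = { .sched_priority = 0 };

        if (sched_setscheduler(0, SCHED_IDLE, &sp) == -1) {
            perror("sched_setscheduler");
            return 1;
        }
        printf("now SCHED_IDLE; run the bulk work here\n");
        return 0;
    }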

Whether the cpu scheduling is based on processes or threads in linux?

It's partially based on the quantum, the basic unit of time a thread is allowed to execute for. There is also a priority level, so multiple threads compete for time on the CPU: a thread waits in line with other threads of the same priority level, runs until it is out of quantum, and is then sent to the back of the line. That's not an exact answer, but it's a high-level summary.

Also, I'm more familiar with Windows, but I think the principles are the same. A process is not executable code so much as a unit of storage, so scheduling is done by thread. I've read that Linux has a more complicated scheduling algorithm than Windows (possibly more overhead as a trade-off), but it is entirely possible, I speculate, that threads of the same process compete with each other for CPU time. The difference is that no full context switch is necessary, because threads of the same process share the same address space.

This would explain the diminishing returns from using more threads than there are physical cores (or hardware threads, on Intel). The threads of a process have little chance of ever running at the same time; instead they compete, so with 4000 threads the time any single one of them gets to run is reduced to roughly 1/4000 of the total. However, if you used those 4000 threads to work on a single shared problem, loading the current state from shared storage, you could get a performance gain from the larger share of CPU time, since the probability of any of the 4000 threads running is higher.


