Linux: Processes and Threads in a Multi-Core CPU

I don't know how the various Linux schedulers handle this, but inter-thread communication gets more expensive when the threads are running on different cores.

So the scheduler may decide to run threads of a process on the same CPU if there are other processes needing CPU time.

E.g. with a dual-core CPU, if there are two processes with two threads each, and all four threads use all the CPU time they get, it is better to run the two threads of the first process on the first core and the two threads of the other process on the second core.

Is having multiple cores in a CPU for running multiple threads/processes at once, or for instruction-level parallelism?

ILP is purely within each physical core separately.

cross-site duplicate: How does a single thread run on multiple cores?

(It doesn't: each core has multiple execution units and a wide front-end. Read my linked answer for details that I'm not going to duplicate. See also Modern Microprocessors: A 90-Minute Guide!)

Also, real CPU cores can pretend to be multiple "logical cores" (i.e. each has its own register context and can call __schedule() independently). Generically, this is SMT; the most widely known brand-name implementation of that concept is Intel's HyperThreading. Letting multiple software threads share a core truly simultaneously (not via software context-switching) gives the core two instruction streams to find parallelism between as well as within, generally increasing overall throughput (at the cost of single-thread performance to some degree, depending on how busy a single thread of your workload could keep a core).
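To see how logical cores map onto physical cores on a given Linux machine, something like the following sketch may help. It assumes the sysfs file topology/thread_siblings_list is present under /sys/devices/system/cpu/cpuN/ (it is on typical Linux systems); the function names parse_cpu_list and physical_cores are made up for this illustration.

```python
import glob

def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def physical_cores():
    """Group logical CPUs that share one physical core (SMT siblings).

    Returns a set of frozensets, one per physical core. The set is empty
    if the sysfs topology files are not available.
    """
    cores = set()
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                cores.add(frozenset(parse_cpu_list(f.read())))
        except (OSError, ValueError):
            continue  # file unreadable or in an unexpected format
    return cores

if __name__ == "__main__":
    cores = physical_cores()
    logical = sum(len(c) for c in cores)
    print(f"{len(cores)} physical cores, {logical} logical CPUs")
```

With SMT enabled, each group normally holds two logical CPUs; with SMT off or absent, each group holds one.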


In some contexts, "CPU" is a synonym for a single core, e.g. perf stat output saying "7.5 CPUs utilized".

But more often, a CPU refers to the whole physical package, e.g. my CPU is a quad-core, an i7-6700k. Server motherboards are often dual-socket, allowing you to plug in two separate multi-core CPUs.

Perhaps that's what created some terminology confusion?

linux - run process on several cores

The Linux kernel scheduler is scheduling tasks. See this answer to a nearly identical question which explains what tasks are.

A task runs, at any given moment, on a single core. The scheduler may move tasks from one core to another, but it rarely does so, since it takes time to warm up a core and its L1 cache.

A multi-threaded process usually has several tasks (one per thread), which can usually run on several cores.
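The one-task-per-thread point can be observed from user space, since Linux lists each task of a process under /proc/<pid>/task. The sketch below is only an illustration; the helper names task_ids and tids_with_workers are invented here, and it assumes a Linux system with /proc mounted.

```python
import os
import threading

def task_ids():
    """Return the kernel task IDs (TIDs) that belong to this process."""
    return sorted(int(name) for name in os.listdir("/proc/self/task"))

def tids_with_workers(n):
    """Start n idle threads and return the TIDs seen while they are alive."""
    stop = threading.Event()
    started = threading.Barrier(n + 1)  # n workers plus the caller

    def worker():
        started.wait()  # signal that this thread is running
        stop.wait()     # then stay alive until told to exit

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    started.wait()      # every worker exists as a kernel task now
    tids = task_ids()
    stop.set()
    for t in threads:
        t.join()
    return tids

if __name__ == "__main__":
    print("tasks before:", task_ids())
    print("tasks with 3 extra threads:", tids_with_workers(3))
```

The main thread's TID equals the process ID, and each extra thread shows up as one more entry.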

You should probably avoid having a very large number of threads per process. I would recommend at most a dozen threads, especially if several of them are runnable (but details vary with hardware and system).

Read also about processor affinity.
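As a small illustration of processor affinity, Python exposes the Linux sched_getaffinity/sched_setaffinity system calls through the os module. The helper name restrict_to_first_cpus is made up for this sketch, and the code is Linux-only.

```python
import os

def restrict_to_first_cpus(count):
    """Restrict the caller to the `count` lowest-numbered CPUs it may use.

    Returns the resulting affinity set. Passing 0 as the pid means
    "the calling thread"; threads created afterwards inherit the mask.
    """
    allowed = sorted(os.sched_getaffinity(0))  # CPUs we may currently run on
    chosen = set(allowed[:max(1, count)])      # always keep at least one CPU
    os.sched_setaffinity(0, chosen)
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    print("before:", sorted(os.sched_getaffinity(0)))
    print("after: ", sorted(restrict_to_first_cpus(1)))
```

From a shell, taskset does the same kind of thing for an entire command.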

How do Operating Systems schedule multiple threads on multiple CPU cores simultaneously?

On a symmetric multiprocessor (SMP) architecture all CPUs have equal access to all memory. A thread's object code and data are accessible to all cores and processors, so it is easy to "move" a thread / process from core to core. The kernel simply needs to implement a scheduling scheme to ensure that everything that needs to run gets run as best as possible. A thread / process that was interrupted on one core can be resumed on another with very little penalty.

Exactly what that scheduling scheme is varies. There could be a single scheduler task running on a single core that controls what's running on all the other cores. Alternatively there could be a mini-scheduler per core that looks after scheduling on just that core, co-operating with its peers to spread threads around. This is, I think (corrections welcome), what Linux does.
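To make the "mini-scheduler per core" idea concrete, here is a deliberately simplified toy model: one run queue per core, plus a balancing step that moves tasks from the busiest queue to the idlest one. It is not how the Linux scheduler is implemented; the class ToyScheduler and its methods are invented purely for illustration.

```python
from collections import deque

class ToyScheduler:
    """Toy model: one run queue per core plus a simple balancing step."""

    def __init__(self, n_cores):
        self.queues = [deque() for _ in range(n_cores)]

    def enqueue(self, task, core=0):
        """Add a task to the run queue of the given core."""
        self.queues[core].append(task)

    def balance(self):
        """Move tasks from the busiest to the idlest queue until their
        lengths differ by at most one."""
        while True:
            busiest = max(self.queues, key=len)
            idlest = min(self.queues, key=len)
            if len(busiest) - len(idlest) <= 1:
                return
            idlest.append(busiest.pop())

    def tick(self):
        """Each core runs the task at the head of its own queue, then
        puts it at the back (round-robin). Idle cores report None."""
        running = []
        for q in self.queues:
            if q:
                task = q.popleft()
                running.append(task)
                q.append(task)
            else:
                running.append(None)
        return running

if __name__ == "__main__":
    sched = ToyScheduler(n_cores=2)
    for name in "ABCD":
        sched.enqueue(name, core=0)  # everything starts on core 0
    sched.balance()                  # spread the load across both cores
    for _ in range(3):
        print(sched.tick())
```

A real scheduler also has to weigh cache warmth, priorities and CPU topology when deciding whether a move is worthwhile, which this sketch ignores.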

linux taskset: Does a thread of a multi-thread process always run on a particular core?

No, the filter is applied to the whole process, and threads can move between the cores in the restricted list. If you want threads not to move, then you need to set the affinity of each thread separately (e.g. using pthread_setaffinity_np). Note that you can check the affinity of the threads of a given process with the great hwloc tool (hwloc-ps -t).

Note that some libraries/frameworks have ways to do that more easily. This is the case for OpenMP programs where you can use environment variables like OMP_PLACES to set the affinity of each thread.
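For a rough idea of what per-thread pinning looks like, here is a Python sketch. It uses os.sched_setaffinity with pid 0, which on Linux applies to the calling thread only, as a stand-in for what pthread_setaffinity_np does in C. The function names pin_current_thread and run_pinned_workers are hypothetical.

```python
import os
import threading

def pin_current_thread(cpu):
    """Pin only the calling thread to one CPU and return its new affinity."""
    # On Linux, pid 0 targets the calling thread, not the whole process.
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)

def run_pinned_workers(cpus):
    """Start one thread per CPU in `cpus`; each thread pins itself and
    records the affinity it ended up with."""
    results = {}

    def worker(cpu):
        results[cpu] = pin_current_thread(cpu)

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    for cpu, affinity in run_pinned_workers(allowed[:2]).items():
        print(f"worker for CPU {cpu} is pinned to {sorted(affinity)}")
    print("main thread still allowed on:", sorted(os.sched_getaffinity(0)))
```

Each worker ends up restricted to a single CPU, while the main thread keeps the mask it started with.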


