Linux: Processes and Threads in a Multi-core CPU
I don't know how the various Linux schedulers handle this, but inter-thread communication gets more expensive when threads are running on different cores.
So the scheduler may decide to run the threads of one process on the same core if other processes also need CPU time.
E.g. with a dual-core CPU, if there are two processes with two threads each and all of them use all the CPU time they get, it is better to run the two threads of the first process on the first core and the two threads of the other process on the second core.
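The dual-core example can be checked with a toy cost model. This is only a sketch, not how any real scheduler decides: the thread names are hypothetical, and the "cost" is simply the number of same-process thread pairs that end up on different cores.

```python
from itertools import combinations

# Hypothetical workload: two processes (A and B) with two threads each.
THREADS = ["A1", "A2", "B1", "B2"]

def cross_core_pairs(core0):
    """Count same-process thread pairs that are split across the two cores."""
    placement = {t: (0 if t in core0 else 1) for t in THREADS}
    return sum(
        1
        for a, b in combinations(THREADS, 2)
        if a[0] == b[0] and placement[a] != placement[b]
    )

def best_placements():
    """Return (cost, core0, core1) for each balanced placement, cheapest first."""
    results = []
    for core0 in combinations(THREADS, 2):
        if "A1" not in core0:
            continue  # skip mirror images of placements already listed
        core1 = tuple(t for t in THREADS if t not in core0)
        results.append((cross_core_pairs(core0), core0, core1))
    return sorted(results)

if __name__ == "__main__":
    for cost, core0, core1 in best_placements():
        print(f"core0={core0} core1={core1} cross-core pairs={cost}")
```

Under this model, keeping each process on its own core is the only balanced placement with no cross-core pairs.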
Is having multiple cores in a CPU for running multiple threads/processes at once, or for instruction-level parallelism?
ILP is purely within each physical core separately.
cross-site duplicate: How does a single thread run on multiple cores?
(It doesn't - each core has multiple execution units and a wide front-end. Read my linked answer for details that I'm not going to duplicate. See also Modern Microprocessors: A 90-Minute Guide!)
Also, real CPU cores can pretend to be multiple "logical cores" (i.e. each has its own register context and can call __schedule() independently). Generically, this is SMT; the most widely known brand-name implementation of that concept is Intel's HyperThreading. Letting multiple software threads share a CPU truly simultaneously (not via software context-switching) gives the core two instruction-streams to find parallelism between as well as within, generally increasing overall throughput (at the cost of single-thread performance to some degree, depending on how busy a single thread of your workload could keep a core).
In some contexts, "CPU" is a synonym for a single core, e.g. perf stat output saying "7.5 CPUs utilized".
But more often, a CPU refers to the whole physical package, e.g. my CPU is a quad-core, an i7-6700k. Server motherboards are often dual-socket, allowing you to plug in two separate multi-core CPUs.
Perhaps that's what created some terminology confusion?
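For reference, the "CPUs utilized" figure that perf stat prints is CPU time divided by elapsed wall-clock time, so it counts logical cores kept busy, not physical packages. A minimal sketch of the same arithmetic follows; the function names are my own, and measure() only sees the current Python process.

```python
import time

def cpus_utilized(cpu_seconds, wall_seconds):
    """Average number of logical CPUs kept busy: CPU time / elapsed time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds

def measure(fn):
    """Run fn() and return the ratio for this process (summed over its threads)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpus_utilized(cpu, wall)

if __name__ == "__main__":
    # 15 s of CPU time spent during 2 s of wall-clock time.
    print(cpus_utilized(15.0, 2.0))
    print(measure(lambda: sum(i * i for i in range(200_000))))
```

A value above 1 is only possible when several threads or processes run on different logical cores at the same time.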
linux - run process on several cores
The Linux kernel scheduler schedules tasks. See this answer to a nearly identical question which explains what tasks are.
A task runs on a single core at any given moment. The scheduler may move tasks from one core to another (but rarely does so, since it takes time to warm up a core and its L1 cache).
A multi-threaded process usually has several tasks (one per thread), which can usually run on several cores.
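The one-task-per-thread relationship can be observed from user space: on Linux, each task of a process shows up as a directory under /proc/<pid>/task. The sketch below is Linux-only, and the helper names are my own.

```python
import os
import threading

def task_ids(pid="self"):
    """Return the kernel task IDs (TIDs) of a process, read from /proc (Linux only)."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))

def demo(n_workers=3):
    """Count tasks before and while n_workers extra threads are alive."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(n_workers)]
    before = len(task_ids())
    for w in workers:
        w.start()
    during = len(task_ids())  # each started thread is a separate task
    stop.set()
    for w in workers:
        w.join()
    return before, during

if __name__ == "__main__":
    before, during = demo()
    print(f"tasks before: {before}, with workers running: {during}")
```

Each of those task IDs is something the scheduler can place on a core independently of the others.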
You should probably avoid having a large number of threads per process. I would recommend at most a dozen threads, especially if several of them are runnable (but details vary with hardware and system).
Read also about processor affinity
How do Operating Systems schedule multiple threads on multiple CPU cores simultaneously?
On a symmetric multiprocessor architecture all CPUs have equal access to all memory. A thread's object code and data are accessible to all cores and processors, so it is easy to "move" a thread / process from core to core. The kernel simply needs to implement a scheduling scheme to ensure that everything that needs to run gets run as best as possible. A thread / process that was interrupted on one core can be resumed on another with very little penalty.
Exactly what that scheduling scheme is varies. There could be a single scheduler task running on a single core that controls what's running on all the other cores. Alternatively there could be a mini-scheduler per core that looks after scheduling on just that core, co-operating with its peers to spread threads around. This is, I think (corrections welcome), what Linux does.
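To make the "mini-scheduler per core" idea concrete, here is a toy model: each core owns a run queue, picks work only from that queue, and a separate balancing step evens the queues out. This is an illustration under simplifying assumptions (round-robin picking, a naive balancing rule, made-up task names), not the algorithm the Linux kernel uses.

```python
from collections import deque

class Core:
    def __init__(self, core_id):
        self.core_id = core_id
        self.runqueue = deque()  # runnable tasks owned by this core

def balance(cores):
    """Move tasks from the busiest to the idlest queue until sizes differ by at most 1."""
    moves = 0
    while True:
        busiest = max(cores, key=lambda c: len(c.runqueue))
        idlest = min(cores, key=lambda c: len(c.runqueue))
        if len(busiest.runqueue) - len(idlest.runqueue) <= 1:
            return moves
        idlest.runqueue.append(busiest.runqueue.pop())
        moves += 1

def tick(cores):
    """Each core independently runs the task at the head of its own queue."""
    running = {}
    for core in cores:
        if core.runqueue:
            task = core.runqueue.popleft()
            running[core.core_id] = task
            core.runqueue.append(task)  # round robin: back of the same queue
    return running

if __name__ == "__main__":
    cores = [Core(0), Core(1)]
    cores[0].runqueue.extend(["t1", "t2", "t3", "t4"])  # all work starts on core 0
    print("migrations:", balance(cores))
    print("running:", tick(cores))
```

Tasks stay on their core between ticks and move only during balancing, which mirrors the idea that migrations are possible but kept infrequent.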
linux taskset: Does a thread of a multi-thread process always run on a particular core?
No, the filter is applied to the whole process and threads can move between (the restricted list of) cores. If you want threads not to move, then you need to set the affinity of each thread separately (e.g. using pthread_setaffinity_np). Note that you can check the affinity of the threads of a given process with the great hwloc tool (hwloc-ps -t).
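As a rough illustration of per-thread pinning: Python does not expose pthread_setaffinity_np, so this Linux-only sketch instead passes each thread's native ID to os.sched_setaffinity, which on Linux affects only that thread. The helper names are my own, and the CPUs are taken from whatever set the process is currently allowed to use.

```python
import os
import threading

def pin_current_thread(cpu):
    """Restrict the calling thread (not the whole process) to one logical CPU."""
    tid = threading.get_native_id()
    os.sched_setaffinity(tid, {cpu})  # a thread ID targets just that thread
    return os.sched_getaffinity(tid)

def run_pinned_workers(max_workers=2):
    """Start one worker per allowed CPU (up to max_workers) and pin each one."""
    allowed = sorted(os.sched_getaffinity(0))
    results = {}

    def worker(cpu):
        results[cpu] = pin_current_thread(cpu)

    threads = [
        threading.Thread(target=worker, args=(cpu,))
        for cpu in allowed[:max_workers]
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    for cpu, mask in run_pinned_workers().items():
        print(f"worker pinned to CPU {cpu}, affinity mask now {sorted(mask)}")
```

The main thread keeps its original mask, which is the difference from applying taskset to the whole process.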
Note that some libraries/frameworks have ways to do that more easily. This is the case for OpenMP programs, where you can use environment variables like OMP_PLACES to set the affinity of each thread.