linux - run process on several cores
The Linux kernel scheduler schedules tasks. See this answer to a nearly identical question, which explains what tasks are.
A task runs (at any given moment) on a single core. The scheduler may move tasks from one core to another, but rarely does so, since it takes time to warm up a core and its L1 cache.
A multi-threaded process usually has several tasks (one per thread), which can usually run on several cores.
You should probably avoid having a large number of threads per process. I would recommend at most a dozen threads, especially if several of them are runnable (but details vary with hardware and system).
Read also about processor affinity.
Is having multiple cores in a CPU for running multiple threads/processes at once, or for instruction-level parallelism?
ILP is purely within each physical core separately.
cross-site duplicate: How does a single thread run on multiple cores?
(It doesn't; each core has multiple execution units and a wide front-end. Read my linked answer for details that I'm not going to duplicate. See also Modern Microprocessors: A 90-Minute Guide!)
Also, real CPU cores can pretend to be multiple "logical cores" (i.e. each has its own register context and can call __schedule() independently). Generically, this is SMT; the most widely known brand-name implementation of that concept is Intel's HyperThreading. Letting multiple software threads share a CPU truly simultaneously (not via software context-switching) gives the core two instruction streams to find parallelism between as well as within, generally increasing overall throughput (at the cost of single-thread performance to some degree, depending on how busy a single thread of your workload could keep a core).
In some contexts, "CPU" is a synonym for a single core, e.g. perf stat output saying "7.5 CPUs utilized".
But more often, a CPU refers to the whole physical package, e.g. my CPU is a quad-core, an i7-6700k. Server motherboards are often dual-socket, allowing you to plug in two separate multi-core CPUs.
Perhaps that's what created some terminology confusion?
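To see how these meanings of "CPU" map onto a particular machine, the following minimal sketch may help. It assumes a Linux system with GNU coreutils (nproc) and, optionally, util-linux (lscpu); the function names are hypothetical and introduced here only for illustration.

```shell
#!/usr/bin/env bash
# Logical CPUs this process may run on (respects affinity masks and cpusets).
logical_cpus() { nproc; }

# Logical CPUs installed in the machine, whether or not we may use them.
installed_cpus() { nproc --all; }

# Sockets, cores per socket, and threads per core, when lscpu is installed.
# lscpu field names vary somewhat by architecture, hence the loose pattern.
cpu_layout() {
    command -v lscpu >/dev/null 2>&1 || return 0
    lscpu | grep -E '^(Socket|Core|Thread)'
}
```

On a quad-core desktop CPU with SMT enabled, one would expect logical_cpus to report eight, while cpu_layout would show one socket, four cores per socket, and two threads per core.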
Bash: Running the same program over multiple cores
You could use
for f in *.fa; do
myProgram (options) "./$f" "./$f.tmp" &
done
wait
which would start all of your jobs in parallel, then wait until they all complete before moving on. If you have more jobs than cores, this starts all of them and lets your OS scheduler worry about swapping processes in and out.
One modification is to start 10 jobs at a time
count=0
for f in *.fa; do
myProgram (options) "./$f" "./$f.tmp" &
(( count ++ ))
if (( count == 10 )); then
wait
count=0
fi
done
but this is inferior to using parallel because you can't start new jobs as old ones finish, and you also can't detect whether an older job finished before you managed to start 10 jobs. Without the -n option (bash 4.3 and later), wait lets you wait on a single particular process or on all background processes, but doesn't tell you when any one of an arbitrary set of background processes completes.
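Where GNU parallel is not available, a bounded job pool can be approximated in bash itself. The following is a minimal sketch, assuming bash 4.3 or later so that `wait -n` returns as soon as any one background job exits. The names `run_limited` and `process_one` are hypothetical; `process_one` stands in for the real myProgram call.

```shell
#!/usr/bin/env bash
# Sketch of a bounded job pool. Requires bash 4.3+ for `wait -n`.

# Hypothetical stand-in for the real work, e.g.
#   myProgram (options) "./$1" "./$1.tmp"
process_one() {
    printf 'processing %s\n' "$1"
}

# run_limited MAX FILE...
# Runs process_one on every FILE, with at most MAX jobs in flight.
run_limited() {
    local max_jobs="$1"
    local running=0
    local f
    shift
    for f in "$@"; do
        if [ "$running" -ge "$max_jobs" ]; then
            wait -n                      # block until any one job exits
            running=$(( running - 1 ))
        fi
        process_one "$f" &
        running=$(( running + 1 ))
    done
    wait                                 # collect the remaining jobs
}
```

A call such as `run_limited 10 *.fa` would then keep up to ten jobs running and start a new one each time an old one finishes, rather than waiting for a whole batch of ten.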
How to run processes piped with bash on multiple cores?
Suppose dostuff is running on one CPU. It writes data into a pipe, and that data will be in cache on that CPU. Because filterstuff is reading from that pipe, the scheduler decides to run it on the same CPU, so that its input data is already in cache.
If your kernel is built with CONFIG_SCHED_DEBUG=y,
# echo NO_SYNC_WAKEUPS > /sys/kernel/debug/sched_features
should disable this class of heuristics. (See /usr/src/linux/kernel/sched_features.h and /proc/sys/kernel/sched_* for other scheduler tunables.)
If that helps, and the problem still happens with a newer kernel, and it's really faster to run on separate CPUs than one CPU, please report the problem to the Linux Kernel Mailing List so that they can adjust their heuristics.
Run processes use two CPU in different terminals
You can run two or more commands, even in the same terminal, with "taskset".
From the man pages (http://linuxcommand.org/man_pages/taskset1.html):
taskset is used to set or retrieve the CPU affinity of a running process given its PID or to launch a new COMMAND with a given CPU affinity. CPU affinity is a scheduler property that "bonds" a process to a given set of CPUs on the system. The Linux scheduler will honor the given CPU affinity and the process will not run on any other CPUs. Note that the Linux scheduler also supports natural CPU affinity: the scheduler attempts to keep processes on the same CPU as long as practical for performance reasons. Therefore, forcing a specific CPU affinity is useful only in certain applications.
@eddiem already shared the link (http://xmodulo.com/run-program-process-specific-cpu-cores-linux.html) on how to install taskset; that link also explains how to run it.
In short:
$ taskset 0x1 tar -xzvf test.tar.gz
The mask 0x1 has only bit 0 set, so this runs the tar command on CPU 0.
If you want to run several commands or scripts in the same terminal on different CPUs, I think you could just send them to the background by appending "&" at the end, e.g.
$ taskset 0x1 tar -xzvf test.tar.gz &
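The mask argument is a hexadecimal bitmask in which bit n selects CPU n, so 0x2 means CPU 1 and 0x3 means CPUs 0 and 1. As a small sketch, the helper below builds such a mask from a list of CPU numbers. `cpu_mask` is a hypothetical name introduced here, and other.tar.gz in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# cpu_mask CPU...
# Prints the hexadecimal affinity mask with one bit set per listed CPU.
cpu_mask() {
    local mask=0
    local cpu
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

# Example: run two commands at once, each on its own CPU, then wait for both.
#   taskset "$(cpu_mask 0)" tar -xzvf test.tar.gz &
#   taskset "$(cpu_mask 1)" tar -xzvf other.tar.gz &
#   wait
```

taskset also accepts a plain CPU list through its -c option (for example `taskset -c 0,2 COMMAND`), which avoids computing a mask by hand.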
How to program so that different processes run on different CPU cores?
If you wish to pin threads/processes to specific CPUs, then you have to use the sched_setaffinity(2) system call or the pthread_setaffinity_np(3) library call. Each core in Linux has its own virtual CPU ID.
These calls allow you to set the allowed CPU mask.
Otherwise it is up to the discretion of the scheduler to run your threads wherever it sees fit.
Neither will guarantee that your process runs in parallel though. This is something only the scheduler can decide unless you run realtime.
Here is some sample code:
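#define _GNU_SOURCE   /* needed for the CPU_* macros and pthread_setaffinity_np */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to `cpu`. Returns 0 on success, non-zero on failure. */
int run_on_cpu(int cpu) {
    cpu_set_t allcpus;
    CPU_ZERO(&allcpus);
    /* Ask which CPUs this process is currently allowed to run on. */
    if (sched_getaffinity(0, sizeof(cpu_set_t), &allcpus) != 0)
        return -1;

    int num_cpus = CPU_COUNT(&allcpus);
    fprintf(stderr, "%d cpus available for scheduling\nAvailable CPUs: ", num_cpus);
    size_t i;
    for (i = 0; i < CPU_SETSIZE; i++) {
        if (CPU_ISSET(i, &allcpus))
            fprintf(stderr, "%zu, ", i);
    }
    fprintf(stderr, "\n");

    /* Only pin if the requested CPU is in range and in the allowed set. */
    if (cpu >= 0 && cpu < CPU_SETSIZE && CPU_ISSET(cpu, &allcpus)) {
        cpu_set_t cpu_set;
        CPU_ZERO(&cpu_set);
        CPU_SET(cpu, &cpu_set);
        return pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpu_set);
    }
    return -1;
}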