Setting CPU Affinity for Linux Kernel, Not Process

Kernel work on behalf of processes will always happen on the CPU that makes the request. You can steer interrupts, though. Look at /proc/interrupts to identify the interrupts you want to move (say everything matching eth0) and set the affinity by echoing a hexadecimal mask to /proc/irq/XXX/smp_affinity.

Set cpu affinity on a loadable linux kernel module

CPU affinity is pretty meaningless for a kernel module itself; as far as I can see, you need to traverse the CPUs one by one to initialize PM.

like so:

for_each_cpu(cpu, mask)
        ...

(defined in include/linux/cpumask.h +152)

setting cpu affinity of a process from the start on linux

taskset can be used both to set the affinity of a running process or to launch a process with a certain affinity, see

  • How to launch your application in a specific CPU in Linux (CPU affinity)?.
  • man page for taskset

Synopsis

taskset [options] mask command [arg]...
taskset [options] -p [mask] pid

The command below launches the Google Chrome browser on the first two CPUs (CPU 0 and CPU 1). The mask is 0x00000003 and the command is “google-chrome”.

taskset 0x00000003 google-chrome

Setting CPU Affinity and blocking CPU usage for background task

Setting affinity for a process means that the process can run only on particular CPUs. However, those CPUs are still used by the kernel scheduler, which can schedule other processes onto them.
One option is the isolcpus kernel command-line option, e.g. isolcpus=0-3. CPUs 0-3 will then not be used by the general scheduler, and setting a process's affinity to CPUs 0-3 means those CPUs will execute only your process and (almost) nothing else.
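As a sketch, isolcpus is passed on the kernel command line at boot, for example via the bootloader configuration (file paths, the CPU range, and the program name below are illustrative):

```sh
# /etc/default/grub (illustrative): reserve CPUs 0-3 from the scheduler
GRUB_CMDLINE_LINUX="... isolcpus=0-3"
# then: update-grub && reboot
# afterwards, place your process on the isolated CPUs:
#   taskset -c 0-3 ./your_program
```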

Processor affinity settings for Linux kernel modules?

I think you'll probably have to modify the kernel, but the change isn't too rough. Just export sched_setaffinity in sched.c to modules:

  long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
  {
  ...
  }
 +EXPORT_SYMBOL_GPL(sched_setaffinity); // exported, now callable from your code

How to set the affinity of a process from a Linux kernel mode?

sched_setaffinity is not exported to modules.

If you modify /usr/src/linux/kernel/sched.c, you can cause sched_setaffinity to be exported to modules.

 long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
{
...
}
+EXPORT_SYMBOL_GPL(sched_setaffinity);

cpu affinity, allowing only one process to run on a specific cpu

I've discussed this in a series of comments to the original question, but I think it's "the answer to the underlying problem" rather than a specific answer to your specific question, so here we go:

I've heard this question a couple of times, and it was always being
asked out of a misunderstanding of how simultaneous multiprocessing
works. First of all: Why do you need your process on core #0? Do you
have a reason?

Typically, the Linux kernel is quite efficient at scheduling tasks to
processors in a manner that minimizes the negative effects that either
process migration or a single-core-bottleneck would bring. In fact,
it's very rare to see someone actually gain performance by manually
setting affinities, and that usually only happens when the userland
process is closely communicating with a kernel module which for some
hardware or implementation reason is inherently single-threaded and
can't easily be migrated.

Your question itself shows a minor misunderstanding: Affinity means
that the kernel knows that it should schedule that process on the
given core. Hence, other processes will automatically be migrated away
from that core and only be running there if your desired task leaves a
lot of that core unused. To change the "priority" your process has in
CPU allocation, just change its nice value.

The reason is performance measurement by isolating the running process. Regarding the second comment, that means we have to rely on the OS
scheduler, because it will not run a background process on a core that
is currently 100% utilized while there are idle cores.


What you're measuring is not the performance of the process in
isolation, but of the process bound to a single CPU! For a
single-threaded process that's OK, but imagine that your process might
have multiple threads -- these would normally all run on different
cores, and overall performance would be much higher. Generally, try
just minimizing non-process workloads on your machine (i.e. run without
a window manager/session manager, stop all non-essential
services) and use a very small nice value -- that measurement might be
relatively precise.

Also, the time command lets you know how much time a process
spends in total (including waiting), occupying the CPU as a userland
process, and occupying the CPU in system calls -- I think that might
fit your needs well enough :)

Is CPU affinity enforced across system calls?

Syscalls are really just your process code switching from user to kernel mode. The task that is being run does not change at all, it just temporarily enters kernel mode to execute the syscall and then returns back to user mode.

A task can be preempted by the scheduler and moved to a different CPU, and this can happen in the middle of normal user mode code or even in the middle of a syscall.

By setting the task affinity to a single CPU using sched_setaffinity(), you remove this possibility, since even if the task gets preempted, the scheduler has no choice but to keep it running on the same CPU (it may of course change the currently running task, but when your task resumes it will still be on the same CPU).

So to answer your question:

does that same system call enforce kernel-space code in that process context will execute on the same core as well?

Yes, it does.


Now, to address @Barmar's comment: even in the case of syscalls that can "sleep", the task cannot change CPU if the affinity does not allow it.

What happens when a syscall sleeps is simply that the syscall code tells the scheduler: "hey, I'm waiting for something, just run another task while I wait and wake me up later". When the syscall resumes, it checks if the requested resource is available (it could even tell the kernel exactly when it wants to be woken up), and if not it either waits again or returns to user code saying "sorry, I got nothing, try again". The resource could of course be made available by some interrupt that causes an interrupt handler to run on a different CPU, but that's a different story, and it doesn't really matter. To put it simply: interrupt code does not run in process context at all. As far as the task executing the syscall is concerned, the resource is just magically there when execution resumes.


