How to Ensure That a Process Runs in a Specific Physical CPU Core and Thread

How to control which core a process runs on?

As others have mentioned, processor affinity is Operating System specific. If you want to do this outside the confines of the operating system, you're in for a lot of fun, and by that I mean pain.

That said, others have mentioned SetProcessAffinityMask for Win32. Nobody has mentioned the Linux kernel way to set processor affinity, so I shall: you need to use the sched_setaffinity(2) system call.
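As a rough illustration of that system call, here is a minimal sketch using Python's os.sched_setaffinity and os.sched_getaffinity, which wrap it on Linux. The helper name pin_to_cpus is made up for this example, and intersecting the request with the currently allowed set is my own choice, not something the system call requires.

```python
import os

def pin_to_cpus(cpus, pid=0):
    """Restrict `pid` (0 = the caller) to the given logical CPUs.

    Hypothetical helper for illustration; Linux only.
    Returns the affinity set reported by the kernel afterwards.
    """
    allowed = os.sched_getaffinity(pid)   # CPUs the task may use right now
    wanted = set(cpus) & allowed          # ignore CPUs we are not allowed on
    if not wanted:
        raise ValueError(
            f"none of {sorted(cpus)} is in the allowed set {sorted(allowed)}"
        )
    os.sched_setaffinity(pid, wanted)     # wraps sched_setaffinity(2)
    return os.sched_getaffinity(pid)

if __name__ == "__main__":
    first = min(os.sched_getaffinity(0))
    print("now restricted to:", sorted(pin_to_cpus({first})))
```

Child processes and threads created afterwards inherit the restricted mask, which is essentially how taskset launches a command on chosen cores.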

The command-line wrapper for this system call is taskset(1). e.g.

taskset -c 2,3 perf stat awk 'BEGIN{for(i=0;i<100000000;i++){}}'

restricts that perf stat run of a busy loop to core 2 or core 3. It can still migrate between cores, but only between those two.

How can I see which CPU core a thread is running in?

The answer below is no longer accurate as of 2014

Tasks don't sleep in any particular core. And the scheduler won't know ahead of time which core it will run a thread on because that will depend on future usage of those cores.

To get the information you want, look in /proc/<pid>/task/<tid>/stat. The third field will be an 'R' if the thread is running. Field 39 (processor) is the core the thread is currently running on, or the core it last ran on (or was migrated to) if it's not currently running. In the sample output below, that is the sixth field from the end.

31466 (bc) S 31348 31466 31348 34819 31466 4202496 2557 0 0 0 5006 16 0 0 20 0 1 0 10196934 121827328 1091 18446744073709551615 4194304 4271839 140737264235072 140737264232056 217976807456 0 0 0 137912326 18446744071581662243 0 0 17 3 0 0 0 0 0

Not currently running. Last ran on core 3.

31466 (bc) R 31348 31466 31348 34819 31466 4202496 2557 0 0 0 3818 12 0 0 20 0 1 0 10196934 121827328 1091 18446744073709551615 4194304 4271839 140737264235072 140737264231824 4235516 0 0 0 2 0 0 0 17 2 0 0 0 0 0

Currently running on core 2.

To see what the rest of the fields mean, have a look at the proc(5) man page or the Linux kernel source, specifically the do_task_stat function in fs/proc/array.c or Documentation/filesystems/proc.txt.

Note that all of this information may be obsolete by the time you get it. It was true at some point between when you made the open call on the file in proc and when that call returned.
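The field lookup described above can be sketched in Python roughly as follows. The function names are invented for this example, and the field numbers (3 for the state, 39 for the processor) are taken from proc(5); treat it as a sketch rather than a robust tool.

```python
def parse_stat_line(line):
    """Return (state, processor) from one line of a /proc stat file."""
    # Field 2 (the command name) sits in parentheses and may itself contain
    # spaces or ')' characters, so split on the LAST closing parenthesis.
    _, _, tail = line.rpartition(")")
    fields = tail.split()            # fields[0] is field 3 (state)
    state = fields[0]
    processor = int(fields[39 - 3])  # field 39: CPU last executed on
    return state, processor

def thread_cpu(pid, tid):
    """Read state and last-used CPU for one thread (Linux only)."""
    with open(f"/proc/{pid}/task/{tid}/stat") as f:
        return parse_stat_line(f.read())

if __name__ == "__main__":
    import os, threading
    print(thread_cpu(os.getpid(), threading.get_native_id()))
```

Counting from the front rather than from the end keeps the lookup stable if a newer kernel appends more fields to the line.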

When would 2 threads be executed on 1 physical CPU core on a multi-core CPU machine?

Two cases:

1) The other physical cores are busy doing other stuff, so only one core gets used by this process. The two threads run in alternation on that core.

2) The physical core supports executing more than one thread concurrently using hyperthreading or something similar. The other physical cores are busy doing other stuff, so the best the scheduler can do is run both threads in a single physical core.
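For the second case it helps to know which logical CPUs share a physical core. On Linux, sysfs exposes this per CPU in topology/thread_siblings_list; the sketch below parses that list format. The function names are hypothetical, and the sysfs path is an assumption about the machine: it may be absent on some systems or in some containers.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,4' or '0-3,8' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")   # 'a-b' is an inclusive range
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus

def hyperthread_siblings(cpu):
    """Logical CPUs that share a physical core with `cpu` (Linux sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())

if __name__ == "__main__":
    print("CPU 0 shares a core with:", hyperthread_siblings(0))
```

If two threads are restricted to CPUs that appear in the same sibling list, they are sharing one physical core.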

Determine on which physical processor my code is currently running

Threads will often switch from processor to processor, so it's kind of meaningless, but you can use GetCurrentProcessorNumber.

As others have said, you can use GetProcessAffinityMask or GetThreadIdealProcessorEx, but those only tell you something useful if you've already set an affinity mask or ideal processor for the thread.
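A minimal sketch of asking "which processor am I on right now", written in Python with ctypes. On Windows it calls GetCurrentProcessorNumber from kernel32; on Linux it falls back to the C library's sched_getcpu, which is my addition and not part of the answer above. The wrapper name current_cpu is made up for this example.

```python
import ctypes
import sys

def current_cpu():
    """Return the logical CPU the calling thread is running on right now."""
    if sys.platform == "win32":
        # Win32 call named in the text above.
        return ctypes.windll.kernel32.GetCurrentProcessorNumber()
    if sys.platform.startswith("linux"):
        # Assumption: the C library exports sched_getcpu (glibc does).
        libc = ctypes.CDLL(None, use_errno=True)
        cpu = libc.sched_getcpu()
        if cpu < 0:
            raise OSError(ctypes.get_errno(), "sched_getcpu failed")
        return cpu
    raise NotImplementedError(f"no sketch for platform {sys.platform!r}")

if __name__ == "__main__":
    print("currently on CPU", current_cpu())
```

The value can be stale by the time it is returned, since the thread may be moved to another processor immediately afterwards.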

Identify which processor (core) is used by a specific thread

Unless you use thread affinity, threads are not assigned to specific cores. With every time slice, the thread can be executed on a different core. This means that even if there were a function to get the core of a thread, by the time you got the return value, there's a good chance that the thread would already be executing on another core.

If you are using thread-affinity, you could take a look at the Windows thread-affinity functions (http://msdn.microsoft.com/en-us/library/ms684847%28v=VS.85%29.aspx).
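For comparison, here is a sketch of per-thread affinity on Linux rather than Windows, using Python's os and threading modules. It relies on the Linux behaviour that sched_setaffinity applied to a thread ID affects only that thread. The helper run_pinned is hypothetical; on Windows the corresponding call would be SetThreadAffinityMask.

```python
import os
import threading

def run_pinned(cpu, work):
    """Run work() in a new thread restricted to logical CPU `cpu`.

    Hypothetical helper, Linux only. Returns (affinity seen by the
    thread, result of work()).
    """
    out = {}

    def body():
        tid = threading.get_native_id()     # kernel thread ID (Python 3.8+)
        os.sched_setaffinity(tid, {cpu})    # affects only this thread
        out["affinity"] = os.sched_getaffinity(tid)
        out["value"] = work()

    t = threading.Thread(target=body)
    t.start()
    t.join()
    return out["affinity"], out["value"]

if __name__ == "__main__":
    cpu = min(os.sched_getaffinity(0))
    print(run_pinned(cpu, lambda: sum(range(1000))))
```

Once a thread is pinned to a single core like this, asking which core it runs on has a stable answer.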

How do SMP cores, processes, and threads work together exactly?

Cores (or CPUs) are the physical elements of your computer that execute code. Usually, each core has all the elements needed to perform computations: register files, interrupt lines, etc.

Most operating systems represent applications as processes. This means that the application has its own address space (== view of memory), where the OS makes sure that this view and its content are isolated from other applications.

A process consists of one or more threads, which carry out the real work of an application by executing machine code on a CPU. The operating system determines which thread executes on which CPU (using clever heuristics to improve load balance, energy consumption, etc.). If your application consists of only a single thread, then your whole multi-CPU system won't help you much, as it will still use only one CPU for your application. (However, overall performance may still improve, as the OS will run other applications on the other CPUs so they don't compete with the first one.)

Now to your specific questions:

1) The OS usually allows you to at least give hints about which core you want certain threads to execute on. What OpenMP does is generate code that spawns a certain number of threads and distributes the shared computational work from your program's loops across them. It can use the OS's hint mechanism (see: thread affinity) to do so.
However, OpenMP applications will still run concurrently with others, and thus the OS is free to interrupt one of the threads and schedule other (potentially unrelated) work on a CPU.
In reality, there are many different scheduling schemes you might want to apply depending on your situation, but this is highly specific, and most of the time you should be able to trust your OS to do the right thing for you.

2) Even if you are running a single-threaded application on a multi-core CPU, you will notice other CPUs doing work as well. This comes a) from the OS doing its job in the meantime and b) from the fact that your application is never running alone: each running system consists of a whole bunch of concurrently executing tasks. Look at Windows' task manager (or ps/top on Linux) to see what is running.