Deceive the JVM about the number of available cores (on linux)
The following Java program prints the number of processors as seen by the Java VM:
public class AvailableProcessors {
    public static void main(String... args) {
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}
If I execute this program on my home computer, it prints 4, which is the actual number of cores (including hyper-threading). Now let's trick the Java VM into believing there are only two processors:
$ echo '0-1' > /tmp/online
$ mount --bind /tmp/online /sys/devices/system/cpu/online
If I run the above program again, it prints 2 instead of 4.
This trick affects all processes on your system. However, it's possible to restrict the effect to certain processes only. Each process on Linux can have its own namespace of mount points; see, for example, the section "Per-process namespaces" in the man page of mount(2). You can, for example, use lxc to start new processes with their own mount namespace.
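The file written in the trick above uses the kernel's CPU-list notation (for example 0-1, or 0-3,8-11). As a minimal sketch, assuming a Linux system where /sys/devices/system/cpu/online is readable, the following program counts the CPUs named in that file and prints the JVM's own view next to it. The class and method names are made up for illustration. On newer JDKs the two numbers may legitimately differ, because the JVM no longer relies on this file alone.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OnlineCpus {
    // Counts the CPUs named by a CPU-list string such as "0-1" or "0-3,8-11".
    static int countCpus(String cpuList) {
        int count = 0;
        for (String part : cpuList.trim().split(",")) {
            String range = part.trim();
            if (range.isEmpty()) {
                continue;
            }
            String[] bounds = range.split("-");
            int first = Integer.parseInt(bounds[0]);
            // A single number such as "0" is a range of length one.
            int last = bounds.length > 1 ? Integer.parseInt(bounds[1]) : first;
            count += last - first + 1;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path online = Paths.get("/sys/devices/system/cpu/online"); // Linux only
        if (Files.isReadable(online)) {
            String cpuList = new String(Files.readAllBytes(online), StandardCharsets.US_ASCII).trim();
            System.out.println("online file: " + cpuList + " -> " + countCpus(cpuList) + " CPUs");
        }
        System.out.println("JVM sees: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Running it before and after the bind mount should show the first number change; whether the second one follows depends on the JDK version.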
Java threads and number of cores
Processes vs Threads
In days of old, each process had precisely one thread of execution, so processes were scheduled onto cores directly (and in those days, there was almost always only one core to schedule onto). However, in operating systems that support threading (which is almost all modern OSes), it is threads, not processes, that are scheduled. So for the rest of this discussion we will talk exclusively about threads, and you should understand that each running process has one or more threads of execution.
Parallelism vs Concurrency
When two threads are running in parallel, they are both running at the same time. For example, if we have two threads, A and B, then their parallel execution would look like this:
CPU 1: A ------------------------->
CPU 2: B ------------------------->
When two threads are running concurrently, their execution overlaps. Overlapping can happen in one of two ways: either the threads are executing at the same time (i.e. in parallel, as above), or their executions are being interleaved on the processor, like so:
CPU 1: A -----------> B ----------> A -----------> B ---------->
So, for our purposes, parallelism can be thought of as a special case of concurrency.*
Scheduling
But we are able to create a thread pool (let's say 30 threads) larger than the number of cores that we possess (let's say 4) and have the threads run concurrently. How is this possible if we only have 4 cores?
In this case, they can run concurrently because the CPU scheduler is giving each one of those 30 threads some share of CPU time. Some threads will be running in parallel (if you have 4 cores, then 4 threads will be running in parallel at any one time), but all 30 threads will be running concurrently. The reason you can then go play games or browse the web is that these new threads are added to the thread pool/queue and also given a share of CPU time.
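A small sketch can make this concrete. It starts a fixed pool of 30 threads and has every task wait until all 30 have started; that can only succeed if all of them are alive at once, whatever the number of cores. The class name, the helper method and the timeouts are chosen for illustration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class OversubscribedPool {
    // Runs one task per pool thread and returns how many tasks saw all others running.
    static int runTasks(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch allStarted = new CountDownLatch(threads);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                allStarted.countDown();
                try {
                    // Blocks until every task has started. With fewer cores than
                    // threads this still completes, because the scheduler gives
                    // each thread a share of CPU time.
                    if (allStarted.await(10, TimeUnit.SECONDS)) {
                        finished.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " logical cores, tasks finished: " + runTasks(30));
    }
}
```

At most as many threads as there are cores execute in parallel at any instant; the rest are waiting for their next time slice.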
Logical vs Physical Cores
According to my current understanding, a core can only perform 1 process at a time
This is not quite true. Due to very clever hardware design and pipelining that would be much too long to go into here (plus I don't understand it), it is possible for one physical core to actually be executing two completely different threads of execution at the same time. Chew over that sentence a bit if you need to -- it still blows my mind.
This amazing feat is called simultaneous multi-threading (or popularly Hyper-Threading, although that is a proprietary name for a specific instance of such technology). Thus, we have physical cores, which are the actual hardware CPU cores, and logical cores, which are the cores the operating system reports to software as available for use. Logical cores are essentially an abstraction. In typical modern Intel CPUs, each physical core acts as two logical cores.
can someone explain how this works and also recommend some good reading on this?
I would recommend Operating System Concepts if you really want to understand how processes, threads, and scheduling all work together.
* The precise meanings of the terms parallel and concurrent are hotly debated, even here on our very own Stack Overflow. What one means by these terms depends a lot on the application domain.
Java VisualVM CPU usage and processor affinity
According to VisualVM source code, CPU usage is indeed calculated as total CPU time divided by number of processors:
long processCpuTime = tracksProcessCpuTime ?
model.getProcessCpuTime() / processorsCount : -1;
where processorsCount is obtained from OperatingSystemMXBean:
OperatingSystemMXBean osbean = mxbeans.getOperatingSystemMXBean();
if (osbean != null) processorsCount = osbean.getAvailableProcessors();
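To illustrate the effect of dividing by the processor count, here is a minimal sketch (not VisualVM's code) that samples the process CPU time around a short busy loop and normalizes it the same way. It assumes a HotSpot-style JVM where the platform bean implements com.sun.management.OperatingSystemMXBean; the class and method names are invented for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class NormalizedCpuUsage {
    // CPU time used during an interval, divided by the processor count,
    // expressed as a percentage of the elapsed wall-clock time.
    static double usagePercent(long cpuNanos, long wallNanos, int processors) {
        if (wallNanos <= 0 || processors <= 0) {
            return 0.0;
        }
        return 100.0 * cpuNanos / processors / wallNanos;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (!(os instanceof com.sun.management.OperatingSystemMXBean)) {
            System.out.println("Process CPU time is not available on this JVM");
            return;
        }
        com.sun.management.OperatingSystemMXBean sunOs =
                (com.sun.management.OperatingSystemMXBean) os;
        int processors = os.getAvailableProcessors();

        long cpuStart = sunOs.getProcessCpuTime(); // nanoseconds
        long wallStart = System.nanoTime();
        long spins = 0;
        while (System.nanoTime() - wallStart < 500_000_000L) {
            spins++; // keep one thread busy for about half a second
        }
        long cpuUsed = sunOs.getProcessCpuTime() - cpuStart;
        long wallUsed = System.nanoTime() - wallStart;

        System.out.printf("processors=%d usage=%.1f%%%n",
                processors, usagePercent(cpuUsed, wallUsed, processors));
    }
}
```

With one busy thread, the reported figure shrinks as the processor count grows, so a processor count that ignores affinity makes a pinned process look less busy than it is.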
There was a long-standing JVM bug, JDK-6515172: process affinity was not taken into account, i.e. getAvailableProcessors always returned the total number of CPUs regardless of tasksets. This was specific to Linux and BSD; it worked normally on Solaris and Windows.
About a month ago this bug was finally resolved. The fix, however, is only for JDK 9.
Look at this question for possible workarounds. They are somewhat ugly though.
Is it possible to check in Java if the CPU is hyper threading?
For Windows, if the number of logical cores is higher than the number of cores, you have hyper-threading enabled. Read more about it here.
You can use wmic to find this information:
C:\WINDOWS\system32>wmic CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List
NumberOfCores=4
NumberOfLogicalProcessors=8
Hence, my system has hyper-threading. The number of logical processors is double the number of cores.
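The comparison can also be done in code. The following sketch parses Key=Value text in the shape shown above and checks whether there are more logical processors than cores. It works on a sample string rather than invoking wmic, and the class and method names are hypothetical; values are summed in case the tool prints one block per CPU package, which is an assumption about multi-socket output.

```java
import java.util.HashMap;
import java.util.Map;

class WmicListOutput {
    // Parses "Key=Value" lines; repeated keys are summed.
    static Map<String, Integer> parse(String output) {
        Map<String, Integer> values = new HashMap<>();
        for (String line : output.split("[\\r\\n]+")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                String key = line.substring(0, eq).trim();
                int value = Integer.parseInt(line.substring(eq + 1).trim());
                values.merge(key, value, Integer::sum);
            }
        }
        return values;
    }

    // True when the logical processor count exceeds the physical core count.
    static boolean hyperThreadingEnabled(Map<String, Integer> values) {
        int cores = values.getOrDefault("NumberOfCores", 0);
        int logical = values.getOrDefault("NumberOfLogicalProcessors", 0);
        return cores > 0 && logical > cores;
    }

    public static void main(String[] args) {
        String sample = "NumberOfCores=4\r\nNumberOfLogicalProcessors=8\r\n";
        Map<String, Integer> values = parse(sample);
        System.out.println(values + " hyper-threading: " + hyperThreadingEnabled(values));
    }
}
```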
But you may not even need to know. Runtime.getRuntime().availableProcessors() already returns the number of logical processors.
A full example on getting the physical core count (Windows only):
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class PhysicalCores
{
    public static void main(String[] arguments) throws IOException, InterruptedException
    {
        int physicalNumberOfCores = getPhysicalNumberOfCores();
        System.out.println(physicalNumberOfCores);
    }

    // Runs wmic and parses the core count from its output (Windows only).
    private static int getPhysicalNumberOfCores() throws IOException, InterruptedException
    {
        ProcessBuilder processBuilder = new ProcessBuilder("wmic", "CPU", "Get", "NumberOfCores");
        processBuilder.redirectErrorStream(true);
        Process process = processBuilder.start();
        String processOutput = getProcessOutput(process);
        String[] lines = processOutput.split(System.lineSeparator());
        // The value is expected on the third line; trim any column padding before parsing.
        return Integer.parseInt(lines[2].trim());
    }

    // Collects everything the process prints, one line at a time.
    private static String getProcessOutput(Process process) throws IOException, InterruptedException
    {
        StringBuilder processOutput = new StringBuilder();
        try (BufferedReader processOutputReader = new BufferedReader(
                new InputStreamReader(process.getInputStream())))
        {
            String readLine;
            while ((readLine = processOutputReader.readLine()) != null)
            {
                processOutput.append(readLine);
                processOutput.append(System.lineSeparator());
            }
            process.waitFor();
        }
        return processOutput.toString().trim();
    }
}
Java (prior to JDK8 update 131) applications running in docker container CPU / Memory issues?
Linux container support first appeared in JDK 10 and was then backported to 8u191; see JDK-8146115.
Earlier versions of the JVM obtained the number of available CPUs as follows.
- Prior to 8u121, the HotSpot JVM relied on the sysconf(_SC_NPROCESSORS_ONLN) libc call. In turn, glibc read the system file /sys/devices/system/cpu/online. Therefore, in order to fake the number of available CPUs, one could replace this file using a bind mount:
echo 0-3 > /tmp/online
docker run --cpus 4 -v /tmp/online:/sys/devices/system/cpu/online ...
To set only one CPU, write echo 0 instead of echo 0-3.
- Since 8u121, the JVM has been taskset aware. Instead of sysconf, it started calling sched_getaffinity to find the CPU affinity mask for the process.
This broke the bind mount trick. Unfortunately, you can't fake sched_getaffinity the same way as sysconf. However, it is possible to replace the libc implementation of sched_getaffinity using LD_PRELOAD.
I wrote a small shared library, proccount, that replaces both sysconf and sched_getaffinity. So, this library can be used to set the right number of available CPUs in all JDK versions before 8u191.
How it works
1. First, it reads cpu.cfs_quota_us and cpu.cfs_period_us to find if the container is launched with the --cpus option. If both are above zero, the number of CPUs is estimated as cpu.cfs_quota_us / cpu.cfs_period_us.
2. Otherwise, it reads cpu.shares and estimates the number of available CPUs as cpu.shares / 1024. Such CPU calculation is similar to how it actually works in a modern container-aware JDK.
3. The library defines (overrides) the sysconf and sched_getaffinity functions to return the number of processors obtained in (1) or (2).
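The arithmetic in steps (1) and (2) can be sketched in a few lines. This is only an illustration in Java of the estimate described above, not the library's actual C code. Rounding a fractional quota up and never returning less than 1 are assumptions added here, and the class and method names are hypothetical.

```java
class CgroupCpuEstimate {
    // Estimates the CPU count from cgroup values already read from
    // cpu.cfs_quota_us, cpu.cfs_period_us and cpu.shares.
    static int estimateCpus(long quotaUs, long periodUs, long shares) {
        if (quotaUs > 0 && periodUs > 0) {
            // Step (1): quota / period, rounded up (assumption) so that
            // for example 1.5 CPUs is reported as 2.
            return (int) Math.max(1, (quotaUs + periodUs - 1) / periodUs);
        }
        // Step (2): shares / 1024, with a floor of 1 (assumption).
        return (int) Math.max(1, shares / 1024);
    }

    public static void main(String[] args) {
        // Sample values: a 400 ms quota per 100 ms period corresponds to --cpus 4.
        System.out.println(estimateCpus(400_000, 100_000, 1024));
        // No quota set (-1), so the estimate falls back to shares.
        System.out.println(estimateCpus(-1, 100_000, 2048));
    }
}
```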
How to compile
gcc -O2 -fPIC -shared -olibproccount.so proccount.c -ldl
How to use
LD_PRELOAD=/path/to/libproccount.so java <args>