Deceive the JVM About the Number of Available Cores (on Linux)


The following Java program prints the number of processors as seen by the Java VM:

public class AvailableProcessors {
    public static void main(String... args) {
        System.out.println(Runtime.getRuntime().availableProcessors());
    }
}

If I execute this program on my home computer, it prints 4, which is the actual number of logical cores (that is, counting Hyper-Threading). Now let's trick the Java VM into believing there are only two processors:

$ echo '0-1' > /tmp/online
$ mount --bind /tmp/online /sys/devices/system/cpu/online

If I run the above program again, it prints 2 instead of 4.

This trick affects all processes on your system. However, it's possible to restrict the effect to certain processes only. Each process on Linux can have its own namespace of mount points; see, for example, the section Per-process namespaces in the man page of mount(2). You can, for example, use lxc to start new processes with their own mount namespace.
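To make the format of the online file concrete, here is a minimal sketch that turns a CPU list such as 0-1 into a count. The class and method names (OnlineCpus, countCpus) are made up for illustration, and this is not how the JVM or glibc parses the file internally; it only shows why writing 0-1 corresponds to two processors.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class OnlineCpus {
    /** Counts the CPUs named by a Linux CPU list such as "0-1" or "0-3,8-11". */
    static int countCpus(String cpuList) {
        int count = 0;
        for (String part : cpuList.trim().split(",")) {
            if (part.isEmpty()) {
                continue;
            }
            String[] bounds = part.split("-");
            int first = Integer.parseInt(bounds[0].trim());
            // A single index like "0" is treated as a range of length one.
            int last = bounds.length > 1 ? Integer.parseInt(bounds[1].trim()) : first;
            count += last - first + 1;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path online = Paths.get("/sys/devices/system/cpu/online");
        if (Files.isReadable(online)) {
            // On Linux, show what the (possibly bind-mounted) file currently says.
            String content = new String(Files.readAllBytes(online), StandardCharsets.US_ASCII).trim();
            System.out.println(content + " -> " + countCpus(content));
        } else {
            // Elsewhere, fall back to the value used in the example above.
            System.out.println("0-1 -> " + countCpus("0-1"));
        }
    }
}
```

Running it before and after the bind mount should show the file contents change from the real range to 0-1.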

Java threads and number of cores

Processes vs Threads

In days of old, each process had precisely one thread of execution, so processes were scheduled onto cores directly (and in those days there was almost always only one core to schedule onto). However, in operating systems that support threading (which is almost every modern OS), it is threads, not processes, that are scheduled. So for the rest of this discussion we will talk exclusively about threads, and you should understand that each running process has one or more threads of execution.

Parallelism vs Concurrency

When two threads are running in parallel, they are both running at the same time. For example, if we have two threads, A and B, then their parallel execution would look like this:

CPU 1: A ------------------------->

CPU 2: B ------------------------->

When two threads are running concurrently, their execution overlaps. Overlapping can happen in one of two ways: either the threads are executing at the same time (i.e. in parallel, as above), or their executions are being interleaved on the processor, like so:

CPU 1: A -----------> B ----------> A -----------> B ---------->

So, for our purposes, parallelism can be thought of as a special case of concurrency.*

Scheduling

But we are able to create a thread pool (let's say of 30 threads) that is larger than the number of cores we possess (let's say 4) and have the threads run concurrently. How is this possible if we only have 4 cores?

In this case, they can run concurrently because the CPU scheduler is giving each one of those 30 threads some share of CPU time. Some threads will be running in parallel (if you have 4 cores, then at most 4 threads will be running in parallel at any one time), but all 30 threads will be running concurrently. The reason you can then go play games or browse the web is that the threads of those programs are added to the scheduler's queue as well and are also given a share of CPU time.
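A small sketch of the 30-threads-on-4-cores situation, using only java.util.concurrent. The class and method names (ThirtyThreads, allRunConcurrently) are invented for this example. Each task signals that it has started and then waits, so the check succeeds only if all tasks are in progress at the same time, regardless of how many cores the machine has.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, for illustration only.
class ThirtyThreads {
    /** Starts threadCount tasks and reports whether all of them were in progress at once. */
    static boolean allRunConcurrently(int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        CountDownLatch started = new CountDownLatch(threadCount);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < threadCount; i++) {
                pool.execute(() -> {
                    started.countDown();      // this task has begun executing
                    try {
                        release.await();      // stay unfinished until everyone is let go
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Reaches zero only if every task has started while none has finished.
            return started.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();              // let all tasks complete
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("logical cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("30 tasks in progress at once: " + allRunConcurrently(30));
    }
}
```

The waiting tasks are blocked rather than busy, so this shows concurrency (overlapping lifetimes), not that 30 threads execute instructions in parallel.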

Logical vs Physical Cores

According to my current understanding, a core can only perform 1 process at a time

This is not quite true. Due to very clever hardware design and pipelining that would be much too long to go into here (plus I don't understand it), it is possible for one physical core to actually be executing two completely different threads of execution at the same time. Chew over that sentence a bit if you need to -- it still blows my mind.

This amazing feat is called simultaneous multi-threading (or popularly Hyper-Threading, although that is a proprietary name for a specific instance of such technology). Thus, we have physical cores, which are the actual hardware CPU cores, and logical cores, which are the cores the operating system reports to software as available for use. Logical cores are essentially an abstraction. In typical modern Intel CPUs, each physical core acts as two logical cores.

can someone explain how this works and also recommend some good reading on this?

I would recommend Operating System Concepts if you really want to understand how processes, threads, and scheduling all work together.

* The precise meanings of the terms parallel and concurrent are hotly debated, even here on Stack Overflow. What one means by these terms depends a lot on the application domain.

Java VisualVM CPU usage and processor affinity

According to VisualVM source code, CPU usage is indeed calculated as total CPU time divided by number of processors:

    long processCpuTime = tracksProcessCpuTime ?
            model.getProcessCpuTime() / processorsCount : -1;

where processorsCount is obtained from OperatingSystemMXBean:

    OperatingSystemMXBean osbean = mxbeans.getOperatingSystemMXBean();
    if (osbean != null) processorsCount = osbean.getAvailableProcessors();
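The effect of dividing by the processor count can be seen with a little arithmetic. The sketch below is not VisualVM code; CpuUsageSketch and usagePercent are hypothetical names, and the formula is just the division described above, expressed as a percentage of wall-clock time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, for illustration only; not taken from VisualVM.
class CpuUsageSketch {
    /** CPU time consumed, as a percentage of (elapsed wall time * number of processors). */
    static double usagePercent(long cpuTimeDeltaNanos, long elapsedNanos, int processors) {
        if (elapsedNanos <= 0 || processors <= 0) {
            return 0.0;
        }
        return 100.0 * cpuTimeDeltaNanos / ((double) elapsedNanos * processors);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean osBean = ManagementFactory.getOperatingSystemMXBean();
        int processors = osBean.getAvailableProcessors();
        // Sample figures: one thread that was busy for a full second of wall time.
        long oneSecond = 1_000_000_000L;
        System.out.println("processors reported: " + processors);
        System.out.println("one busy thread shows as "
                + usagePercent(oneSecond, oneSecond, processors) + " %");
    }
}
```

With 4 reported processors, a single fully busy thread shows as 25 %, which is why a process pinned to one CPU looks underused if the processor count ignores the affinity mask.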

There was a long-standing JVM bug, JDK-6515172: process affinity was not taken into account, i.e. getAvailableProcessors always returned the total number of CPUs regardless of taskset. This was specific to Linux and BSD; it worked correctly on Solaris and Windows.

About a month ago this bug was finally resolved. The fix, however, is only for JDK 9.

Look at this question for possible workarounds. They are somewhat ugly though.

Is it possible to check in Java if the CPU is hyper threading?

For Windows, if the number of logical processors is higher than the number of physical cores, you have hyper-threading enabled. Read more about it here.

You can use wmic to find this information:

C:\WINDOWS\system32>wmic CPU Get NumberOfCores,NumberOfLogicalProcessors /Format:List


NumberOfCores=4
NumberOfLogicalProcessors=8

Hence, my system has hyper-threading: the number of logical processors is double the number of cores.

But you may not even need to know. Runtime.getRuntime().availableProcessors() already returns the number of logical processors.

A full example on getting the physical cores count (Windows only):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class PhysicalCores
{
    public static void main(String[] arguments) throws IOException, InterruptedException
    {
        int physicalNumberOfCores = getPhysicalNumberOfCores();
        System.out.println(physicalNumberOfCores);
    }

    private static int getPhysicalNumberOfCores() throws IOException, InterruptedException
    {
        // Ask wmic for the physical core count (Windows only).
        ProcessBuilder processBuilder = new ProcessBuilder("wmic", "CPU", "Get", "NumberOfCores");
        processBuilder.redirectErrorStream(true);
        Process process = processBuilder.start();
        String processOutput = getProcessOutput(process);
        String[] lines = processOutput.split(System.lineSeparator());
        // lines[0] is the "NumberOfCores" header; the value is expected at index 2.
        return Integer.parseInt(lines[2]);
    }

    private static String getProcessOutput(Process process) throws IOException, InterruptedException
    {
        StringBuilder processOutput = new StringBuilder();

        // Collect everything the process prints, one line at a time.
        try (BufferedReader processOutputReader = new BufferedReader(
                new InputStreamReader(process.getInputStream())))
        {
            String readLine;

            while ((readLine = processOutputReader.readLine()) != null)
            {
                processOutput.append(readLine);
                processOutput.append(System.lineSeparator());
            }

            process.waitFor();
        }

        return processOutput.toString().trim();
    }
}

Java (prior to JDK8 update 131) applications running in docker container CPU / Memory issues?

Linux container support first appeared in JDK 10 and was then backported to 8u191; see JDK-8146115.

Earlier versions of the JVM obtained the number of available CPUs as follows.

  • Prior to 8u121, the HotSpot JVM relied on the sysconf(_SC_NPROCESSORS_ONLN) libc call. In turn, glibc reads the system file /sys/devices/system/cpu/online. Therefore, in order to fake the number of available CPUs, one could replace this file using a bind mount:

    echo 0-3 > /tmp/online
    docker run --cpus 4 -v /tmp/online:/sys/devices/system/cpu/online ...

    To set only one CPU, write echo 0 instead of echo 0-3.

  • Since 8u121 the JVM became taskset aware. Instead of sysconf, it started calling sched_getaffinity to find the CPU affinity mask for the process.

    This broke the bind mount trick. Unfortunately, you can't fake sched_getaffinity the same way as sysconf. However, it is possible to replace the libc implementation of sched_getaffinity using LD_PRELOAD.

I wrote a small shared library proccount that replaces both sysconf and sched_getaffinity. So, this library can be used to set the right number of available CPUs in all JDK versions before 8u191.

How it works

  1. First, it reads cpu.cfs_quota_us and cpu.cfs_period_us to find if the container is launched with --cpus option. If both are above zero, the number of CPUs is estimated as

    cpu.cfs_quota_us / cpu.cfs_period_us
  2. Otherwise it reads cpu.shares and estimates the number of available CPUs as

    cpu.shares / 1024

    This calculation is similar to how a modern container-aware JDK actually works.

  3. The library defines (overrides) sysconf and sched_getaffinity functions to return the number of processors obtained in (1) or (2).
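The estimate from steps (1) and (2) can be sketched in a few lines. The library itself is written in C; the Java below only mirrors the arithmetic, with hypothetical names (CgroupCpuEstimate, estimateCpus). Rounding up and never returning less than 1 are assumptions of this sketch, since the description above does not say how fractions are handled.

```java
// Hypothetical sketch of the arithmetic only; not the proccount source.
class CgroupCpuEstimate {
    /**
     * quotaUs, periodUs and shares stand for the values read from
     * cpu.cfs_quota_us, cpu.cfs_period_us and cpu.shares.
     */
    static int estimateCpus(long quotaUs, long periodUs, long shares) {
        long cpus;
        if (quotaUs > 0 && periodUs > 0) {
            // Step 1: quota / period (rounded up: an assumption of this sketch).
            cpus = (quotaUs + periodUs - 1) / periodUs;
        } else {
            // Step 2: shares / 1024 (rounded up: an assumption of this sketch).
            cpus = (shares + 1023) / 1024;
        }
        // Never report fewer than one CPU (also an assumption).
        return (int) Math.max(1, cpus);
    }

    public static void main(String[] args) {
        // Sample values: a quota of 400000 us over a 100000 us period.
        System.out.println(estimateCpus(400_000, 100_000, 1024));
        // No quota set (-1), so the shares value decides.
        System.out.println(estimateCpus(-1, 100_000, 2048));
    }
}
```

The two sample calls yield 4 and 2 respectively.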

How to compile

gcc -O2 -fPIC -shared -olibproccount.so proccount.c -ldl

How to use

LD_PRELOAD=/path/to/libproccount.so java <args>

