Finding Number of Cores in Java

int cores = Runtime.getRuntime().availableProcessors();

If cores is less than one, either your processor is about to die, or your JVM has a serious bug in it, or the universe is about to blow up.

How to scale threads according to CPU cores?

You can determine the number of processes available to the Java Virtual Machine by using the static Runtime method, availableProcessors. Once you have determined the number of processors available, create that number of threads and split up your work accordingly.

Update: To further clarify, a Thread is just an Object in Java, so you can create it just like you would create any other object. So, let's say that you call the above method and find that it returns 2 processors. Awesome. Now, you can create a loop that generates a new Thread, and splits the work off for that thread, and fires off the thread. Here's some pseudocode to demonstrate what I mean:

int processors = Runtime.getRuntime().availableProcessors();
for(int i=0; i < processors; i++) {
  Thread yourThread = new AThreadYouCreated();
  // You may need to pass in parameters depending on what work you are doing and how you setup your thread.
  yourThread.start();
}

For more information on creating your own thread, head to this tutorial. Also, you may want to look at Thread Pooling for the creation of the threads.

Is there a way in Java to find out how many CPUs (or cores) are installed?

You can use

Runtime.getRuntime().availableProcessors()

But its more of a best guess and even mentioned by the API

This value may change during a particular invocation of the virtual
machine. Applications that are sensitive to the number of available
processors should therefore occasionally poll this property and adjust
their resource usage appropriately.

Spark: get number of cluster cores programmatically

There are ways to get both the number of executors and the number of cores in a cluster from Spark. Here is a bit of Scala utility code that I've used in the past. You should easily be able to adapt it to Java. There are two key ideas:

The number of workers is the number of executors minus one or sc.getExecutorStorageStatus.length - 1.
The number of cores per worker can be obtained by executing java.lang.Runtime.getRuntime.availableProcessors on a worker.

The rest of the code is boilerplate for adding convenience methods to SparkContext using Scala implicits. I wrote the code for 1.x years ago, which is why it is not using SparkSession.

One final point: it is often a good idea to coalesce to a multiple of your cores as this can improve performance in the case of skewed data. In practice, I use anywhere between 1.5x and 4x, depending on the size of data and whether the job is running on a shared cluster or not.

import org.apache.spark.SparkContext

import scala.language.implicitConversions

class RichSparkContext(val sc: SparkContext) {

  def executorCount: Int =
    sc.getExecutorStorageStatus.length - 1 // one is the driver

  def coresPerExecutor: Int =
    RichSparkContext.coresPerExecutor(sc)

  def coreCount: Int =
    executorCount * coresPerExecutor

  def coreCount(coresPerExecutor: Int): Int =
    executorCount * coresPerExecutor

}

object RichSparkContext {

  trait Enrichment {
    implicit def enrichMetadata(sc: SparkContext): RichSparkContext =
      new RichSparkContext(sc)
  }

  object implicits extends Enrichment

  private var _coresPerExecutor: Int = 0

  def coresPerExecutor(sc: SparkContext): Int =
    synchronized {
      if (_coresPerExecutor == 0)
        sc.range(0, 1).map(_ => java.lang.Runtime.getRuntime.availableProcessors).collect.head
      else _coresPerExecutor
    }

}

Update

Recently, getExecutorStorageStatus has been removed. We have switched to using SparkEnv's blockManager.master.getStorageStatus.length - 1 (the minus one is for the driver again). The normal way to get to it, via env of SparkContext is not accessible outside of the org.apache.spark package. Therefore, we use an encapsulation violation pattern:

package org.apache.spark

object EncapsulationViolator {
  def sparkEnv(sc: SparkContext): SparkEnv = sc.env
}

Java threads and number of cores

Processes vs Threads

In days of old, each process had precisely one thread of execution, so processes were scheduled onto cores directly (and in these old days, there was almost only one core to schedule onto). However, in operating systems that support threading (which is almost all moderns OS's), it is threads, not processes that are scheduled. So for the rest of this discussion we will talk exclusively about threads, and you should understand that each running process has one or more threads of execution.

Parallelism vs Concurrency

When two threads are running in parallel, they are both running at the same time. For example, if we have two threads, A and B, then their parallel execution would look like this:

CPU 1: A ------------------------->

CPU 2: B ------------------------->

When two threads are running concurrently, their execution overlaps. Overlapping can happen in one of two ways: either the threads are executing at the same time (i.e. in parallel, as above), or their executions are being interleaved on the processor, like so:

CPU 1: A -----------> B ----------> A -----------> B ---------->

So, for our purposes, parallelism can be thought of as a special case of concurrency*

Scheduling

But we are able to produce a thread pool(lets say 30) with a larger number than the number of cores that we posses(lets say 4) and have them run concurrently. How is this possible if we are only have 4 cores?

In this case, they can run concurrently because the CPU scheduler is giving each one of those 30 threads some share of CPU time. Some threads will be running in parallel (if you have 4 cores, then 4 threads will be running in parallel at any one time), but all 30 threads will be running concurrently. The reason you can then go play games or browse the web is that these new threads are added to the thread pool/queue and also given a share of CPU time.

Logical vs Physical Cores

According to my current understanding, a core can only perform 1 process at a time

This is not quite true. Due to very clever hardware design and pipelining that would be much too long to go into here (plus I don't understand it), it is possible for one physical core to actually be executing two completely different threads of execution at the same time. Chew over that sentence a bit if you need to -- it still blows my mind.

This amazing feat is called simultaneous multi-threading (or popularly Hyper-Threading, although that is a proprietary name for a specific instance of such technology). Thus, we have physical cores, which are the actual hardware CPU cores, and logical cores, which is the number of cores the operating system tells software is available for use. Logical cores are essentially an abstraction. In typical modern Intel CPUs, each physical core acts as two logical cores.

can someone explain how this works and also recommend some good reading on this?

I would recommend Operating System Concepts if you really want to understand how processes, threads, and scheduling all work together.

The precise meanings of the terms parallel and concurrent are hotly debated, even here in our very own stack overflow. What one means by these terms depends a lot on the application domain.

Does the decision of how many cores to use stay in the hands of the JVM?

Actually, it is typically the Operating System that decides how many cores that a Java application gets to use. The scheduling of native threads to cores is handled by the operating system¹. The JVM has little (if any) say thread scheduling.

But yes, an Java application won't necessarily get access to all of the cores. It will depend on what other applications, services, etc on system are doing. Indeed, the OS may provide ways for an administrator to externally limit the number of cores that may be used by a given (Java or not) application, or give one application priority over another.

... or there is no correlation between the number of thread and used core's

There is a correlation, but not one that is particularly useful. A JVM won't (cannot) use more cores than there are native threads in existence (including the JVM's internal and GC threads).

It is also worth noting that the OS (typically) doesn't assign a core to a native thread that is not currently runnable.

Basil Bourque notes in his answer that Project Loom will bring significant improvements to threading in Java, if and when it is incorporated into the standard releases. However, Loom won't alter the fact that the number of physical cores assigned an application JVM at any given time is controlled / limited by the OS.

^{1 - Or the operating system's hypervisor. Things can get a bit complicated in a cloud computing environment, for instance.}

Finding Number of Cores in Java