How to Demonstrate Java Multithreading Visibility Problems

How to demonstrate Java multithreading visibility problems?

By modifying the example here (removing operations), I have come up with an example that consistently fails in my environment (the thread never stops running).

// Java environment:
// java version "1.6.0_0"
// OpenJDK Runtime Environment (IcedTea6 1.6.1) (6b16-1.6.1-3ubuntu3)
// OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
public class Test2 extends Thread {
    // Not volatile: the JIT may hoist the read out of the loop, so the
    // child thread may never see the update made by main().
    boolean keepRunning = true;

    public static void main(String[] args) throws InterruptedException {
        Test2 t = new Test2();
        t.start();
        Thread.sleep(1000);
        t.keepRunning = false;
        System.out.println(System.currentTimeMillis() + ": keepRunning is false");
    }

    public void run() {
        while (keepRunning) {
        }
    }
}

Note that this type of problem is quite dependent on the compiler, runtime, and system. In particular, the compiler may decide to emit instructions that re-read the variable from memory even though it is not volatile (in which case the code would appear to work); the VM and JIT may optimize the memory reads away and use only registers; and even the processor may reorder instructions. Reordering would not affect this case, but in other multithreaded code it can affect the state perceived by other threads when more than one variable is modified.
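For comparison, here is a sketch of the same program with the flag declared volatile (the class name Test2Volatile is ours, not from the original); with this single change the loop reliably observes the update and stops:

public class Test2Volatile extends Thread {
    // volatile establishes a happens-before relationship between the
    // write in main() and every subsequent read in run().
    volatile boolean keepRunning = true;

    public static void main(String[] args) throws InterruptedException {
        Test2Volatile t = new Test2Volatile();
        t.start();
        Thread.sleep(1000);
        t.keepRunning = false;
        System.out.println(System.currentTimeMillis() + ": keepRunning is false");
    }

    public void run() {
        while (keepRunning) {
        }
    }
}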

Need help understanding memory visibility issues while multithreading in Java

How often does the value in the cpu's cache get flushed/synced with main memory?

Undefined. Cache flushing happens when the visibility guarantees specified in the JLS say that it needs to happen.

Can the value be not synced with the main memory is that also a possibility?

Yes.

Why would this happen?

Generally speaking, caches get flushed for a reason. The happens-before relationships indicate the places where a cache flush may be necessary.
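To make that concrete, here is a minimal sketch (ours, not from the original answer) of the most common happens-before edge: a write to a volatile field happens-before every subsequent read of that field, which also publishes the ordinary writes made before it.

public class HappensBefore {
    private int data = 0;
    private volatile boolean ready = false;

    // Writer thread: the write to data is ordered before the volatile
    // write to ready.
    void writer() {
        data = 42;
        ready = true;
    }

    // Reader thread: if the volatile read of ready sees true, the JLS
    // guarantees the earlier write to data is visible too.
    void reader() {
        if (ready) {
            System.out.println(data); // guaranteed to print 42
        }
    }
}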

Does this memory visibility problem also occur when there is only one CPU core and one CPU cache, or does it always happen?

If there is only one core, then cache flushing is not an issue.[1]

I am having some trouble understanding the memory visibility problem though I understand race conditions and deadlocks. Is this something architecture specific?

Yes and no. Memory visibility issues may manifest differently depending on the hardware architecture, among other things, but the way to write your code so that it has well-defined behavior is architecture independent.

If you really need a deep understanding of the memory visibility problem, you need to understand the Java Memory Model. It is described in layman's terms in Goetz et al., Java Concurrency in Practice, Chapter 16, and specified in the JLS.


I want to know why the Thread.yield() call will not yield to the main thread, so that the main thread then flushes to main memory

  1. The Thread.yield() may yield to another runnable thread. However, by the time that yield() is called, it is quite likely that the main thread is no longer runnable. (Or it may still be running.)

  2. The yield() does not create a happens-before relationship between any statements in the main and child threads. Absent that happens-before relationship, the runtime is not obliged to ensure that the result of the assignment by the main thread is visible to the child thread.

  3. While Thread.yield() might perform a cache flush[2], it would be a flush of the child thread's caches, not the parent thread's caches.

Hence, the child thread's loop may continue indefinitely.
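Here is a minimal sketch of the pattern the question appears to describe (the original code is not shown, so the class name and structure are assumptions):

public class YieldLoop extends Thread {
    boolean keepRunning = true; // deliberately not volatile

    public void run() {
        while (keepRunning) {
            // yield() may trigger a context switch, but it creates no
            // happens-before relation, so this loop may never see the
            // main thread's write to keepRunning.
            Thread.yield();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        YieldLoop t = new YieldLoop();
        t.start();
        Thread.sleep(1000);
        t.keepRunning = false; // may never become visible to t
    }
}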


[1] - Actually, that may be an over-simplification. For example, in a system with one core and multiple hyperthreads with their own caches, cache flushing would be needed.

[2] - For example, if the yield() does result in a context switch, then the context switch typically includes a cache flush as part of the thread-state saving performed by the OS. However, yield() won't necessarily result in a context switch. And besides, this aspect is not specified by the JLS.

Simulating a field-visibility problem in Java

Your example doesn't work because System.out.println() uses a shared resource (System.out), so it will synchronize with other uses of the same resource.

Therefore you will never* see a result where one thread uses the old value of the other. (*In theory it is possible for the reader to read x between x++ and the corresponding System.out.println().)

Here is an example where an old value is used:

public class ThreadVisibility implements Runnable {

    private boolean stop = false;

    @Override
    public void run() {
        while (!stop);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadVisibility test = new ThreadVisibility();
        Thread t = new Thread(test);
        t.setDaemon(true);
        System.out.println("Starting Thread");
        t.start();
        Thread.sleep(1000);
        System.out.println("Stopping Thread");
        test.stop = true;
        t.join(1000);
        System.out.println("Thread State: " + t.getState());
    }
}

If you run this code, it will display that the thread is still running at the end. Without the t.setDaemon(true), the VM would wait for the Thread to finish, which would never happen.

If you comment out the Thread.sleep, then the new Thread may terminate (at least it did in my tests), but it is not guaranteed to.

The right fix for this problem is to declare stop volatile, or to add an equivalent memory barrier.
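As a sketch, the volatile fix is a one-line change in ThreadVisibility:

    // Making the field volatile guarantees that the write in main()
    // becomes visible to the spinning thread:
    private volatile boolean stop = false;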

What is the underlying cause of thread visibility issues in Java?

As explained in the question you pointed to, a variable that is not volatile may be optimized in any way by the compiler, the JIT, or the processor, so that no read from shared memory takes place. The variable may then be kept in a CPU register per thread, and you end up with two separate variables (one per thread) holding different values.

This explains why the volatile keyword is useful. This very same piece of code may or may not work as expected without this keyword.

The "where is the cache?" question is difficult to answer, because I think there is no single good answer: the stale copy may live at any layer.

Edit: the link posted by @polygnome in the comments is great. You should definitely read it.

Visibility effects of synchronization in Java

No, there is no transitive relationship.

The idea behind the JMM is to define rules that the JVM must respect. Provided the JVM follows these rules, it is free to reorder and execute code as it likes.

In your example, the 2nd read and the 3rd read are not related: no memory barrier is introduced by the use of synchronized or volatile, for example. Thus, the JVM is allowed to execute it as follows:

public Helper getHelper() {
    final Helper toReturn = helper; // "3rd" read, reading null
    if (helper == null) {           // First read of helper
        synchronized (this) {
            if (helper == null) {   // Second read of helper
                helper = new Helper(42);
            }
        }
    }

    return toReturn; // Returning null
}

Your call would then return a null value, yet a singleton would have been created. Subsequent calls, however, may still get a null value.

As suggested, declaring the field volatile would introduce the needed memory barriers. Another common solution is to capture the read value in a local variable and return that:

public Helper getHelper() {
    Helper singleton = helper;
    if (singleton == null) {
        synchronized (this) {
            singleton = helper;
            if (singleton == null) {
                singleton = new Helper(42);
                helper = singleton;
            }
        }
    }

    return singleton;
}

As you rely on a local variable, there is nothing to reorder: everything happens within the same thread. Note that for the Helper instance itself to be safely published, the helper field should still be declared volatile; the local variable only removes the redundant re-read.

Java thread visibility - best practice for visibility without explicit synchronization

Don't be afraid of synchronization. If the synchronized block is small and fast, which is true in your case, lock/unlock won't cause any problems.

You could use a "lock-free" concurrent data structure, like ConcurrentLinkedQueue. (It is not going to be a lot faster in your case.)
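As a sketch of that approach (the original code is not shown, so HandlerRegistry and the Runnable handler type are assumptions), a ConcurrentLinkedQueue publishes handlers safely between threads:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class HandlerRegistry {
    // ConcurrentLinkedQueue is lock-free; per the java.util.concurrent
    // contract, adding an element happens-before its later retrieval
    // by another thread, so registered handlers are safely visible.
    private final Queue<Runnable> handlers = new ConcurrentLinkedQueue<>();

    public void register(Runnable handler) {
        handlers.add(handler);
    }

    public void dispatch() {
        for (Runnable handler : handlers) {
            handler.run();
        }
    }
}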

EDIT: it looks like you are worried about the visibility of the argument passed to handle(), not the handlers themselves.

Understanding Java volatile visibility

Visibility on modern CPUs is guaranteed by the cache coherence protocol (e.g. MESI) anyway, so how can volatile help here?

That doesn't help you. You aren't writing code for a modern CPU; you are writing code for a Java virtual machine, which is allowed to have a virtual CPU whose virtual caches are not coherent.

Some articles say a volatile variable uses memory directly instead of the CPU cache, which guarantees visibility between threads. That doesn't sound like a correct explanation.

That is correct. But understand that this is with respect to the virtual machine you are coding for. Its memory may well be implemented in your physical CPU's caches. That may allow your machine to use the caches and still have the memory visibility required by the Java specification.

Using volatile may ensure that writes go directly to the virtual machine's memory instead of the virtual machine's virtual CPU cache. The virtual machine's CPU cache does not need to provide visibility between threads because the Java specification doesn't require it to.

You cannot assume that characteristics of your particular physical hardware necessarily provide benefits that Java code can use directly. Instead, the JVM trades off those benefits to improve performance. But that means your Java code doesn't get those benefits.

Again, you are not writing code for your physical CPU; you are writing code for the virtual CPU that your JVM provides. The fact that your CPU has coherent caches allows the JVM to perform all kinds of optimizations that boost your code's performance, but the JVM is not required to pass those coherent caches through to your code, and real JVMs do not. Doing so would mean giving up a significant number of extremely valuable optimizations.
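As a hedged sketch of what this means in practice, the class below (the names are ours) uses an AtomicBoolean stop flag; its get and set methods have volatile semantics, so the JVM, not the hardware, guarantees cross-thread visibility:

import java.util.concurrent.atomic.AtomicBoolean;

public class AtomicStopFlag implements Runnable {
    // AtomicBoolean reads and writes have volatile semantics, so an
    // update made by one thread must become visible to others,
    // whatever the physical CPU's caches do.
    private final AtomicBoolean stop = new AtomicBoolean(false);

    @Override
    public void run() {
        while (!stop.get()) {
            // spin until another thread calls requestStop()
        }
    }

    public void requestStop() {
        stop.set(true);
    }
}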


