Java multi-threading & Safe Publication
Proportionally, it's probably fair to say that very few programmers sufficiently understand synchronization and concurrency. Who knows how many server applications are out there right now managing financial transactions, medical records, police records, telephony and so on that are full of synchronization bugs and essentially work by accident, or very occasionally fail (never heard of anybody getting a phantom phone call added to their telephone bill?) for reasons that are never really investigated.
Object publication is a particular problem because it's often overlooked, and it's a place where it's quite reasonable for compilers to make optimisations that could result in unexpected behaviour if you don't know about it: in the JIT-compiled code, storing a pointer, then incrementing it and storing the data is a very reasonable thing to do. You might think it's "evil", but at a low level, it's really how you'd expect the JVM spec to be. (Incidentally, I've heard of real-life programs running in JRockit suffering from this problem-- it's not purely theoretical.)
If you know that your application has synchronization bugs but isn't misbehaving in your current JVM on your current hardware, then (a) congratulations; and (b), now is the time to start "walking calmly towards the fire exit", fixing your code and educating your programmers before you need to upgrade too many components.
Confused about safe publishing and visibility in Java, especially with Immutable Objects
This question has been answered a few times before but I feel that many of those answers are inadequate. See:
- https://stackoverflow.com/a/14617582
- https://stackoverflow.com/a/35169705
- https://stackoverflow.com/a/7887675
- Effectively Immutable Object
- etc...
In short, Goetz's statement in the linked JSR 133 FAQ page is more "correct", although not in the way that you are thinking.
When Goetz says that immutable objects are safe to use even when published without synchronization, he means to say that immutable objects that are visible to different threads are guaranteed to retain their original state/invariants, all else remaining the same. In other words, properly synchronized publication is not necessary to maintain state consistency.
In the JSR-133 FAQ, when he says that "you want to ensure that it is seen correctly by all of the other thread" (sic), he is not referring to the state of the immutable object. He means that you must synchronize publication in order for another thread to see the reference to the immutable object. There's a subtle difference in what the two statements are talking about: while JCIP is referring to state consistency, the FAQ page is referring to access to a reference to an immutable object.
The code sample you provided has nothing, really, to do with anything that Goetz says here, but to answer your question, a correctly initialized final field will hold its expected value if the object is properly initialized (beware the difference between initialization and publication). The code sample also synchronizes access to the locations field so as to ensure updates to the final field are thread-safe.
In fact, to elaborate further, I suggest that you look at JCIP listing 3.13 (VolatileCachedFactorizer). Notice that even though OneValueCache is immutable, it is stored in a volatile field. To illustrate the FAQ statement, VolatileCachedFactorizer will not work correctly without volatile. "Synchronization" here refers to using a volatile field in order to ensure that updates made to it are visible to other threads.
A good way to illustrate the first JCIP statement is to remove volatile. In this case, the CachedFactorizer won't work. Consider this: what if one thread set a new cache value, but another thread tried to read the value and the field was not volatile? The reader might not see the updated OneValueCache. BUT, recalling that Goetz refers to the state of the immutable object, IF the reader thread happened to see an up-to-date instance of OneValueCache stored at cache, then the state of that instance would be visible and correctly constructed.
So although it is possible to lose updates to cache, it is impossible to lose the state of the OneValueCache if it is read, because it is immutable. I suggest reading the accompanying text stating that "volatile reference used to ensure timely visibility."
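The pattern discussed above can be sketched roughly as follows. This is a simplified reconstruction for illustration, not the book's listing: the `trialDivision` helper and the exact method shapes are assumptions. The holder is immutable (final fields, defensive copies), and the only shared mutable state is the volatile reference that points at it.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Immutable holder: final fields, defensive copies on the way in and out.
final class OneValueCache {
    private final BigInteger lastNumber;
    private final BigInteger[] lastFactors;

    OneValueCache(BigInteger number, BigInteger[] factors) {
        this.lastNumber = number;
        this.lastFactors = (factors == null) ? null : Arrays.copyOf(factors, factors.length);
    }

    // Returns a copy of the cached factors, or null on a cache miss.
    BigInteger[] getFactors(BigInteger number) {
        if (lastNumber == null || !lastNumber.equals(number)) {
            return null;
        }
        return Arrays.copyOf(lastFactors, lastFactors.length);
    }
}

final class VolatileCachedFactorizer {
    // volatile makes a newly stored reference visible to other threads;
    // the immutability of OneValueCache keeps number and factors consistent.
    private volatile OneValueCache cache = new OneValueCache(null, null);

    BigInteger[] factorize(BigInteger number) {
        BigInteger[] factors = cache.getFactors(number);
        if (factors == null) {
            factors = trialDivision(number);
            cache = new OneValueCache(number, factors);
        }
        return factors;
    }

    // Hypothetical helper (an assumption of this sketch): naive trial division.
    private static BigInteger[] trialDivision(BigInteger n) {
        List<BigInteger> result = new ArrayList<>();
        BigInteger remaining = n;
        BigInteger d = BigInteger.valueOf(2);
        while (d.multiply(d).compareTo(remaining) <= 0) {
            while (remaining.mod(d).signum() == 0) {
                result.add(d);
                remaining = remaining.divide(d);
            }
            d = d.add(BigInteger.ONE);
        }
        if (remaining.compareTo(BigInteger.ONE) > 0) {
            result.add(remaining);
        }
        return result.toArray(new BigInteger[0]);
    }

    public static void main(String[] args) {
        VolatileCachedFactorizer f = new VolatileCachedFactorizer();
        System.out.println(Arrays.toString(f.factorize(BigInteger.valueOf(12)))); // prints [2, 2, 3]
    }
}
```

Dropping `volatile` from the `cache` field would not let a reader observe a half-built `OneValueCache`, but it could leave readers looking at an old cache entry for an unbounded time.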
As a final example, consider a singleton that uses FinalWrapper for thread safety. Note that FinalWrapper is effectively immutable (depending on whether the singleton is mutable), and that the helperWrapper field is in fact non-volatile. Recalling the second FAQ statement, that synchronization is required to access the reference, how can this "correct" implementation possibly be correct!?
In fact, it is possible to do this here because it is not necessary for threads to immediately see the up-to-date value for helperWrapper. If the value that is held by helperWrapper is non-null, then great! Our first JCIP statement guarantees that the state of FinalWrapper is consistent, and that we have a fully initialized Foo singleton that can be readily returned. If the value is actually null, there are 2 possibilities: firstly, it is possible that it is the first call and it has not been initialized; secondly, it could just be a stale value.
In the case that it is the first call, the field itself is checked again in a synchronized context, as suggested by the second FAQ statement. It will find that this value is still null, and will initialize a new FinalWrapper and publish it with synchronization.
In the case that it is just a stale value, by entering the synchronized block, the thread can set up a happens-before order with a preceding write to the field. By definition, if a value is stale, then some writer has already written to the helperWrapper field, and the current thread just has not seen it yet. By entering the synchronized block, a happens-before relationship is established with that previous write, since according to our first scenario, a truly uninitialized helperWrapper will be initialized under the same lock. Therefore, the thread can recover by rereading the field once it has entered a synchronized context, and obtain the most up-to-date, non-null value.
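Since the singleton code is only described here and not shown, the following is a minimal sketch of what such an implementation might look like. The names `FinalWrapper`, `helperWrapper` and `Foo` come from the discussion above; the `getInstance` method name and the static layout are assumptions.

```java
// Wrapper whose only field is final, so any thread that sees a reference
// to the wrapper also sees the fully constructed value behind it.
final class FinalWrapper<T> {
    final T value;

    FinalWrapper(T value) {
        this.value = value;
    }
}

final class Foo {
    // Deliberately NOT volatile: a stale null only costs a trip through the lock.
    private static FinalWrapper<Foo> helperWrapper;

    private Foo() { }

    static Foo getInstance() {
        FinalWrapper<Foo> wrapper = helperWrapper; // single racy read into a local
        if (wrapper == null) {
            // Either nobody has initialized the field yet, or this thread read a stale null.
            synchronized (Foo.class) {
                if (helperWrapper == null) {
                    helperWrapper = new FinalWrapper<>(new Foo());
                }
                // Re-read under the lock: guaranteed non-null and up to date here.
                wrapper = helperWrapper;
            }
        }
        return wrapper.value;
    }

    public static void main(String[] args) {
        System.out.println(Foo.getInstance() == Foo.getInstance()); // prints true
    }
}
```

Copying the field into the local `wrapper` matters: reading `helperWrapper` twice outside the lock would allow the second read to return null even after the first returned non-null.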
I hope that my explanations and the accompanying examples that I have given will clear things up for you.
Safe publication when values are read in synchronized methods
There is no happens-before relationship between the end of a constructor and method invocations, and as such it is possible for one thread to start constructing the instance and make the reference available and for another thread to acquire that reference and start calling the greet() method on a partially constructed object. The synchronization in greet() does not really address that issue.
If you publish an instance via the celebrated double-checked locking pattern, it becomes easier to see how. If there was such a happens-before relationship, it should have been safe even if DCLP is used.
public class Foo {
    private boolean needsGreeting = true;

    public synchronized void greet() {
        if (needsGreeting) {
            System.out.println("Hello.");
            needsGreeting = false;
        }
    }
}

class FooUser {
    private static Foo foo;

    public static Foo getFoo() {
        if (foo == null) {
            synchronized (FooUser.class) {
                if (foo == null) {
                    foo = new Foo();
                }
            }
        }
        return foo;
    }
}
If multiple threads call FooUser.getFoo().greet() at the same time, one thread might still be constructing the Foo instance while another thread finds a non-null Foo reference prematurely, calls greet(), and finds that needsGreeting still holds its default value, false.
An example of this is mentioned in Java Concurrency in Practice (3.5).
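One common repair, which is not part of the answer above, is to declare the shared field volatile so that the write of the reference happens-before any read that sees it (this holds under the Java 5+ memory model). A minimal sketch follows; `SafeFooUser` is a hypothetical name, and `greet()` returns a boolean here purely so the behaviour can be checked.

```java
final class Foo {
    private boolean needsGreeting = true;

    // Returns true only for the call that actually printed the greeting.
    synchronized boolean greet() {
        if (needsGreeting) {
            System.out.println("Hello.");
            needsGreeting = false;
            return true;
        }
        return false;
    }
}

final class SafeFooUser {
    // volatile: a reader that sees the non-null reference also sees
    // everything the constructor wrote, including needsGreeting = true.
    private static volatile Foo foo;

    static Foo getFoo() {
        Foo local = foo; // one volatile read on the fast path
        if (local == null) {
            synchronized (SafeFooUser.class) {
                local = foo;
                if (local == null) {
                    local = new Foo();
                    foo = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        SafeFooUser.getFoo().greet(); // prints "Hello." once
        SafeFooUser.getFoo().greet(); // prints nothing
    }
}
```

If lazy initialization on first use of a class is acceptable, a static initializer (the holder-class idiom) achieves the same safety without any explicit locking.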
Can a thread first acquire an object via safe publication and then publish it unsafely?
Answer: the causality part of the JMM allows Thread 3 to see o as partially constructed.
I finally managed to apply 17.4.8. Executions and Causality Requirements (aka the causality part of the JMM) to this example.
So this is our Java program:
class Obj1 {
    int f1;
}

volatile Obj1 v1;
Obj1 v2;
Thread 1 | Thread 2 | Thread 3
--------------------|----------|-----------------
var o = new Obj1(); | |
o.f1 = 1; | |
v1 = o; | |
| v2 = v1; |
| | var r1 = v2.f1;
And we want to find out if the result (r1 == 0) is allowed.
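For readers who want to run the scenario on a real JVM, a rough harness might look like the sketch below. It is an assumption-laden approximation of the table: the `RepublishRace` class and `runOnce` method are hypothetical names, and each thread reads its input once instead of waiting for it, so Thread 3 may simply see null. Observing 0 in practice is unlikely on most hardware; failing to observe it says nothing about whether the memory model allows it, and a dedicated tool such as jcstress is better suited to this kind of experiment.

```java
final class Obj1 {
    int f1;
}

final class RepublishRace {
    static volatile Obj1 v1; // safe publication from Thread 1 to Thread 2
    static Obj1 v2;          // plain field: unsafe re-publication to Thread 3

    // Runs one trial. Returns -1 if Thread 3 saw v2 == null,
    // otherwise the value of f1 that Thread 3 read (0 or 1).
    static int runOnce() {
        v1 = null;
        v2 = null;
        int[] r1 = { -1 };

        Thread t1 = new Thread(() -> {
            Obj1 o = new Obj1();
            o.f1 = 1;
            v1 = o;
        });
        Thread t2 = new Thread(() -> {
            v2 = v1; // may copy null if Thread 1 has not published yet
        });
        Thread t3 = new Thread(() -> {
            Obj1 seen = v2; // racy read
            if (seen != null) {
                r1[0] = seen.f1;
            }
        });

        t1.start();
        t2.start();
        t3.start();
        try {
            t1.join();
            t2.join();
            t3.join(); // join() makes r1[0] safely readable here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r1[0];
    }

    public static void main(String[] args) {
        int[] counts = new int[3]; // index 0: saw null, 1: f1 == 0, 2: f1 == 1
        for (int i = 0; i < 2000; i++) {
            counts[runOnce() + 1]++;
        }
        System.out.println("null=" + counts[0] + " f1==0: " + counts[1] + " f1==1: " + counts[2]);
    }
}
```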
Turns out, to prove that (r1 == 0) is allowed, we need to find a well-formed execution that gives that result and can be validated with the algorithm given in 17.4.8. Executions and Causality Requirements.
First let's rewrite our Java program in terms of variables and actions as defined in the algorithm.
Let's also show the values for our read and write actions to get the execution E we want to validate:
Initially: W[v1]=null, W[v2]=null, W[o.f1]=0
Thread 1 | Thread 2 | Thread 3
----------|----------|-----------
W[o.f1]=1 | |
Wv[v1]=o | |
| Rv[v1]=o |
| W[v2]=o |
| | R[v2]=o
| | R[o.f1]=0
Notes:
- o represents the instance created by new Obj1(); in the Java code
- W and R represent normal writes and reads; Wv and Rv represent volatile writes and reads
- the read/written value for the action is shown after =
- W[o.f1]=0 is in the initial actions because according to the JLS: "The write of the default value (zero, false, or null) to each variable synchronizes-with the first action in every thread. Although it may seem a little strange to write a default value to a variable before the object containing the variable is allocated, conceptually every object is created at the start of the program with its default initialized values."
Here is a more compact form of E:
W[v1]=null, W[v2]=null, W[o.f1]=0
---------------------------------
W[o.f1]=1 | |
Wv[v1]=o | |
| Rv[v1]=o |
| W[v2]=o |
| | R[v2]=o
| | R[o.f1]=0
Validation of E
According to 17.4.8. Executions and Causality Requirements:
A well-formed execution E = < P, A, po, so, W, V, sw, hb > is validated by committing actions from A. If all of the actions in A can be committed, then the execution satisfies the causality requirements of the Java programming language memory model.
So we need to build, step by step, the set of committed actions (we get a sequence C₀, C₁, ..., where Cₖ is the set of committed actions on the k-th iteration, and Cₖ ⊆ Cₖ₊₁) until we commit all actions A of our execution E.
The JLS section also contains 9 rules that define when an action can be committed.
Step 0: the algorithm always starts with an empty set.
C₀ = ∅
Step 1: we commit only writes.
The reason is that, according to rule 7, a read committed in Cₖ must return a write from Cₖ₋₁, but C₀ is empty.
E₁:
W[v1]=null, W[v2]=null, W[o.f1]=0
----------------------------------
W[o.f1]=1 | |
Wv[v1]=o | |
C₁ = { W[v1]=null, W[v2]=null, W[o.f1]=0, W[o.f1]=1, Wv[v1]=o }
Step 2: now we can commit the read and the write of o in Thread 2.
Since v1 is volatile, Wv[v1]=o happens-before Rv[v1], and the read returns o.
E₂:
W[v1]=null, W[v2]=null, W[o.f1]=0
---------------------------------
W[o.f1]=1 | |
Wv[v1]=o | |
| Rv[v1]=o |
| W[v2]=o |
C₂ = C₁ ∪ { Rv[v1]=o, W[v2]=o }
Step 3: now that we have W[v2]=o committed, we can commit the read R[v2] in Thread 3.
According to rule 6, a read being committed in the current step can only return a write that happens-before it (the value can be changed to a racy write on the next step). R[v2] and W[v2]=o are not ordered by happens-before, so R[v2] reads null.
E₃:
W[v1]=null, W[v2]=null, W[o.f1]=0
---------------------------------
W[o.f1]=1 | |
Wv[v1]=o | |
| Rv[v1]=o |
| W[v2]=o |
| | R[v2]=null
C₃ = C₂ ∪ { R[v2]=null }
Step 4: now R[v2] can read W[v2]=o through a data race, which makes R[o.f1] possible. R[o.f1] reads the default value 0, and the algorithm finishes because all the actions of our execution are committed.
E = E₄:
W[v1]=null, W[v2]=null, W[o.f1]=0
---------------------------------
W[o.f1]=1 | |
Wv[v1]=o | |
| Rv[v1]=o |
| W[v2]=o |
| | R[v2]=o
| | R[o.f1]=0
A = C₄ = C₂∪{ R[v2]=o, R[o.f1]=0 }
As a result, we validated an execution that produces (r1 == 0); therefore, this result is allowed.
It is also worth noting that this causality validation algorithm adds almost no additional restrictions to happens-before.
Jeremy Manson (one of the JMM authors) explains that the algorithm exists to prevent a rather bizarre behavior: so-called "causality loops", where there is a circular chain of actions that cause each other (i.e. when an action causes itself).
In every case other than these causality loops, we use happens-before as in Tom's comment.
Ensuring safe publication and thread safety in java by means of static factories
Yes, instances of this class can be published unsafely. This class is not immutable, so if the instantiating thread makes an instance available to other threads without a memory barrier, those threads may see the instance in a partially constructed or otherwise inconsistent state.
The term you are looking for is effectively immutable: the instance fields could be modified after initialization, but in fact they are not.
Such objects can be used safely by multiple threads, but it all depends on how other threads get access to the instance (i.e., how they are published). If you put these objects on a concurrent queue to be consumed by another thread, no problem. If you assign them to a field visible to another thread in a synchronized block, and notify() a wait()-ing thread which reads them, no problem. If you create all the instances in one thread which then starts new threads that use them, no problem!
But if you just assign them to a non-volatile field and sometime "later" another thread happens to read that field, that's a problem! Both the writing thread and the reading thread need synchronization points so that the write truly can be said to have happened before the read.
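The first of the safe routes above, handing the object over through a concurrent queue, can be sketched like this. The `Reading` and `QueueHandOff` names are hypothetical and exist only for illustration. The guarantee relied on is the documented one for java.util.concurrent collections: actions in a thread before placing an object into a concurrent collection happen-before actions after another thread removes it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Effectively immutable: the field is not final, but nothing modifies it
// after construction.
final class Reading {
    private int value;

    Reading(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }
}

final class QueueHandOff {
    // Publishes one Reading from a producer thread and consumes it on the caller's thread.
    static int consumeOne() {
        BlockingQueue<Reading> queue = new LinkedBlockingQueue<>();

        Thread producer = new Thread(() -> {
            // The constructor's write to 'value' happens-before the take() below.
            queue.offer(new Reading(42));
        });
        producer.start();

        try {
            Reading received = queue.take(); // blocks until the producer has published
            producer.join();
            return received.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(consumeOne()); // prints 42
    }
}
```

Replacing the queue with a plain static field that the consumer polls would remove the happens-before edge, which is exactly the problematic case described above.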
Your code doesn't do any publication, so I can't say if you are doing it safely. You could ask the same question about this object:
class Option {
private boolean value;
Option(boolean value) { this.value = value; }
boolean get() { return value; }
}
If you are doing something "extra" in your code that you think would make a difference to the safe publication of your objects, please point it out.