In C/C++, are volatile variables guaranteed to have eventually consistent semantics between threads?
It's going to depend on your architecture. While it is unusual to require an explicit cache flush or memory sync to ensure memory writes are visible to other threads, nothing precludes it, and I've certainly encountered platforms (including the PowerPC-based device I am currently developing for) where explicit instructions have to be executed to ensure state is flushed.
Note that thread synchronisation primitives like mutexes will perform the necessary work as required, but you don't typically actually need a thread synchronisation primitive if all you want is to ensure the state is visible without caring about consistency - just the sync / flush instruction will suffice.
EDIT: To anyone still in confusion about the volatile keyword - volatile guarantees the compiler will not generate code that explicitly caches data in registers, but this is NOT the same thing as dealing with hardware that transparently caches / reorders reads and writes. Read e.g. this or this, or this Dr Dobbs article, or the answer to this SO question, or just pick your favourite compiler that targets a weakly consistent memory architecture like Cell, write some test code and compare what the compiler generates to what you'd need in order to ensure writes are visible to other processes.
Why is volatile not considered useful in multithreaded C or C++ programming?
The problem with volatile in a multithreaded context is that it doesn't provide all the guarantees we need. It does have a few properties we need, but not all of them, so we can't rely on volatile alone.
However, the primitives we'd have to use for the remaining properties also provide the ones that volatile does, so it is effectively unnecessary.
For thread-safe accesses to shared data, we need a guarantee that:
- the read/write actually happens (that the compiler won't just store the value in a register instead and defer updating main memory until much later)
- that no reordering takes place. Assume that we use a volatile variable as a flag to indicate whether or not some data is ready to be read. In our code, we simply set the flag after preparing the data, so all looks fine. But what if the instructions are reordered so the flag is set first?
volatile does guarantee the first point. It also guarantees that no reordering occurs between different volatile reads/writes. All volatile memory accesses will occur in the order in which they're specified. That is all we need for what volatile is intended for: manipulating I/O registers or memory-mapped hardware, but it doesn't help us in multithreaded code where the volatile object is often only used to synchronize access to non-volatile data. Those accesses can still be reordered relative to the volatile ones.
The solution to preventing reordering is to use a memory barrier, which indicates both to the compiler and the CPU that no memory access may be reordered across this point. Placing such barriers around our volatile variable access ensures that even non-volatile accesses won't be reordered across the volatile one, allowing us to write thread-safe code.
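The barrier idea can be sketched with C++11's std::atomic_thread_fence (a minimal illustration; the names `data`, `flag`, and `fence_demo` are made up for this example, and a relaxed atomic stands in for the flag so the fences alone supply the ordering):

```cpp
#include <atomic>
#include <thread>

// Fence-based variant of the "data ready" flag pattern. The payload is a
// plain int; explicit barriers around the flag accesses provide the ordering.
int data = 0;
std::atomic<bool> flag{false};  // relaxed ops + fences supply the ordering

int fence_demo() {
    std::thread producer([] {
        data = 7;                                             // prepare data
        std::atomic_thread_fence(std::memory_order_release);  // barrier
        flag.store(true, std::memory_order_relaxed);          // set flag
    });
    while (!flag.load(std::memory_order_relaxed))
        ;  // spin until the flag is observed
    std::atomic_thread_fence(std::memory_order_acquire);      // barrier
    int seen = data;  // ordered after the fence, so the write is visible
    producer.join();
    return seen;
}
```

The two fences synchronize through the relaxed flag, so the consumer's read of `data` is guaranteed to see 7 once the flag is observed.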
However, memory barriers also ensure that all pending reads/writes are executed when the barrier is reached, so it effectively gives us everything we need by itself, making volatile unnecessary. We can just remove the volatile qualifier entirely.
Since C++11, atomic variables (std::atomic<T>) give us all of the relevant guarantees.
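As a sketch of that (the names `payload`, `ready`, and `publish_via_atomic` are illustrative, not from the original answers), the flag scenario described above becomes safe with std::atomic: the release store publishes the payload, and the acquire load guarantees it is visible once the flag reads true.

```cpp
#include <atomic>
#include <thread>

// "Status flag" pattern with std::atomic instead of volatile.
int payload = 0;                 // plain, non-atomic shared data
std::atomic<bool> ready{false};  // replaces the volatile flag

int publish_via_atomic() {
    std::thread producer([] {
        payload = 42;                                  // prepare the data...
        ready.store(true, std::memory_order_release);  // ...then set the flag
    });
    while (!ready.load(std::memory_order_acquire))
        ;  // spin until the flag is observed
    // The release/acquire pairing guarantees the payload write is visible
    // here; with a plain volatile flag this would be a data race.
    int seen = payload;
    producer.join();
    return seen;
}
```

Unlike the volatile version, neither the compiler nor the hardware may reorder the payload write past the flag store.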
Does the C++ volatile keyword introduce a memory fence?
Rather than explaining what volatile does, allow me to explain when you should use volatile.
- When inside a signal handler. Because writing to a volatile variable is pretty much the only thing the standard allows you to do from within a signal handler. Since C++11 you can use std::atomic for that purpose, but only if the atomic is lock-free.
- When dealing with setjmp, according to Intel.
- When dealing directly with hardware and you want to ensure that the compiler does not optimize your reads or writes away.
For example:
volatile int *foo = some_memory_mapped_device;
while (*foo)
; // wait until *foo turns false
Without the volatile specifier, the compiler is allowed to completely optimize the loop away. The volatile specifier tells the compiler that it may not assume that 2 subsequent reads return the same value.
Note that volatile has nothing to do with threads. The above example does not work if there is a different thread writing to *foo, because there is no acquire operation involved.
In all other cases, usage of volatile should be considered non-portable and should no longer pass code review, except when dealing with pre-C++11 compilers and compiler extensions (such as msvc's /volatile:ms switch, which is enabled by default under X86/I64).
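The signal-handler case from the first bullet can be sketched as follows (the flag and function names are made up for illustration): volatile std::sig_atomic_t is the classic type for this, and the handler does nothing beyond setting it.

```cpp
#include <csignal>

// Setting a volatile sig_atomic_t is one of the few things the standard
// permits inside a signal handler. The names here are illustrative.
volatile std::sig_atomic_t got_signal = 0;

extern "C" void handle_sigint(int) {
    got_signal = 1;  // the handler's only side effect: set the flag
}

// Installs the handler, raises SIGINT at ourselves, and reports whether
// the flag was observed afterwards (a real program would poll it in a loop).
bool signal_flag_demo() {
    std::signal(SIGINT, handle_sigint);
    std::raise(SIGINT);
    return got_signal == 1;
}
```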
When should the volatile keyword be used in C#?
I don't think there's a better person to answer this than Eric Lippert (emphasis in the original):
In C#, "volatile" means not only "make sure that the compiler and the
jitter do not perform any code reordering or register caching
optimizations on this variable". It also means "tell the processors to
do whatever it is they need to do to ensure that I am reading the
latest value, even if that means halting other processors and making
them synchronize main memory with their caches".

Actually, that last bit is a lie. The true semantics of volatile reads
and writes are considerably more complex than I've outlined here; in
fact they do not actually guarantee that every processor stops what it
is doing and updates caches to/from main memory. Rather, they provide
weaker guarantees about how memory accesses before and after reads and
writes may be observed to be ordered with respect to each other.
Certain operations such as creating a new thread, entering a lock, or
using one of the Interlocked family of methods introduce stronger
guarantees about observation of ordering. If you want more details,
read sections 3.10 and 10.5.3 of the C# 4.0 specification.

Frankly, I discourage you from ever making a volatile field. Volatile
fields are a sign that you are doing something downright crazy: you're
attempting to read and write the same value on two different threads
without putting a lock in place. Locks guarantee that memory read or
modified inside the lock is observed to be consistent, locks guarantee
that only one thread accesses a given chunk of memory at a time, and so
on. The number of situations in which a lock is too slow is very
small, and the probability that you are going to get the code wrong
because you don't understand the exact memory model is very large. I
don't attempt to write any low-lock code except for the most trivial
usages of Interlocked operations. I leave the usage of "volatile" to
real experts.
For further reading see:
- Understand the Impact of Low-Lock Techniques in Multithreaded Apps
- Sayonara volatile
Should InterlockedExchange be used on all setting of a variable?
Is it or is it not atomic to read/write a LONG in Windows without the interlocked functions?
Yes to both, assuming default alignment. This follows from the quoted statement "simple reads and writes to properly-aligned 32-bit variables are atomic operations", because LONG is a 32-bit signed integer in Windows, regardless of the bitness of the OS.
Is it possible in this example that foo will not appear TRUE to either of the InterlockedExchange calls?
No, not possible if both calls are reached. That's because within a single thread the foo = TRUE; write is guaranteed to be visible to the InterlockedExchange call that comes after it. So the InterlockedExchange call in main will see either the TRUE value previously set in main, or the FALSE value reset in the other thread. Therefore, one of the InterlockedExchange calls must read the foo value as TRUE.
However, if the /* do other things */ code in main is an infinite loop while(1); then that would leave only one InterlockedExchange outstanding in other, and it may be possible for that call to see foo as FALSE, for the same reason as...
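As a portable sketch of the reasoning above (using std::atomic<long>::exchange to stand in for the Windows-specific InterlockedExchange; the helper name and values are made up), exactly one of the two exchange calls observes TRUE, because the first exchange atomically replaces it with FALSE:

```cpp
#include <atomic>
#include <thread>

// foo mirrors the LONG from the question; exchange(0) plays the role of
// InterlockedExchange(&foo, FALSE) and returns the previous value.
std::atomic<long> foo{0};

// Returns how many of the two exchange calls observed TRUE (1).
int count_true_observations() {
    foo.store(1);  // foo = TRUE; in main, before the other thread starts
    int seen = 0;
    std::thread other([&] {
        if (foo.exchange(0) == 1) ++seen;  // other's InterlockedExchange
    });
    other.join();
    if (foo.exchange(0) == 1) ++seen;      // main's InterlockedExchange
    return seen;
}
```

Whichever call runs first swaps the TRUE out, so the count is always exactly one.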
If main is busy doing other things so that the first InterlockedExchange call is made from the other thread, does that mean foo is guaranteed to have been written by the main thread and visible to the other thread at that time?
Not necessarily. The foo = TRUE; write is visible to the main thread at the point the secondary thread is created, but may not necessarily be visible to the other thread when it starts, or even when it gets to the InterlockedExchange call.
What is the volatile keyword useful for?
volatile has semantics for memory visibility. Basically, the value of a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it. Without volatile, readers could see some non-updated value.
To answer your question: yes, I use a volatile variable to control whether some code continues a loop. The loop tests the volatile value and continues if it is true. The condition can be set to false by calling a "stop" method. The loop sees false and terminates when it tests the value after the stop method completes execution.
The book "Java Concurrency in Practice," which I highly recommend, gives a good explanation of volatile. This book is written by the same person who wrote the IBM article that is referenced in the question (in fact, he cites his book at the bottom of that article). My use of volatile is what his article calls the "pattern 1 status flag."
If you want to learn more about how volatile works under the hood, read up on the Java memory model. If you want to go beyond that level, check out a good computer architecture book like Hennessy & Patterson and read about cache coherence and cache consistency.