In C/C++, are volatile variables guaranteed to have eventually consistent semantics between threads?
It's going to depend on your architecture. While it is unusual to require an explicit cache flush or memory sync to ensure memory writes are visible to other threads, nothing precludes it, and I've certainly encountered platforms (including the PowerPC-based device I am currently developing for) where explicit instructions have to be executed to ensure state is flushed.
Note that thread synchronisation primitives like mutexes will perform the necessary work as required, but you don't typically actually need a thread synchronisation primitive if all you want is to ensure the state is visible without caring about consistency - just the sync / flush instruction will suffice.
EDIT: To anyone still in confusion about the volatile keyword - volatile guarantees the compiler will not generate code that explicitly caches data in registers, but this is NOT the same thing as dealing with hardware that transparently caches / reorders reads and writes. Read e.g. this or this, or this Dr Dobbs article, or the answer to this SO question, or just pick your favourite compiler that targets a weakly consistent memory architecture like Cell, write some test code and compare what the compiler generates to what you'd need in order to ensure writes are visible to other processes.
Why is volatile not considered useful in multithreaded C or C++ programming?
The problem with volatile in a multithreaded context is that it doesn't provide all the guarantees we need. It does have a few properties we need, but not all of them, so we can't rely on volatile alone.
However, the primitives we'd have to use for the remaining properties also provide the ones that volatile does, so it is effectively unnecessary.
For thread-safe accesses to shared data, we need a guarantee that:
- the read/write actually happens (that the compiler won't just store the value in a register instead and defer updating main memory until much later)
- that no reordering takes place. Assume that we use a volatile variable as a flag to indicate whether or not some data is ready to be read. In our code, we simply set the flag after preparing the data, so all looks fine. But what if the instructions are reordered so the flag is set first?
volatile does guarantee the first point. It also guarantees that no reordering occurs between different volatile reads/writes. All volatile memory accesses will occur in the order in which they're specified. That is all we need for what volatile is intended for: manipulating I/O registers or memory-mapped hardware, but it doesn't help us in multithreaded code where the volatile object is often only used to synchronize access to non-volatile data. Those accesses can still be reordered relative to the volatile ones.
The solution to preventing reordering is to use a memory barrier, which indicates both to the compiler and the CPU that no memory access may be reordered across this point. Placing such barriers around our volatile variable access ensures that even non-volatile accesses won't be reordered across the volatile one, allowing us to write thread-safe code.
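The barrier idea can be sketched with C++11's std::atomic_thread_fence (a minimal illustration; the names `data`, `flag`, and `fence_demo` are made up for this example, and a relaxed atomic stands in for the flag so the fences alone supply the ordering):

```cpp
#include <atomic>
#include <thread>

// Fence-based variant of the "data ready" flag pattern. The payload is a
// plain int; explicit barriers around the flag accesses provide the ordering.
int data = 0;
std::atomic<bool> flag{false};  // relaxed ops + fences supply the ordering

int fence_demo() {
    std::thread producer([] {
        data = 7;                                             // prepare data
        std::atomic_thread_fence(std::memory_order_release);  // barrier
        flag.store(true, std::memory_order_relaxed);          // set flag
    });
    while (!flag.load(std::memory_order_relaxed))
        ;  // spin until the flag is observed
    std::atomic_thread_fence(std::memory_order_acquire);      // barrier
    int seen = data;  // ordered after the fence, so the write is visible
    producer.join();
    return seen;
}
```

The two fences synchronize through the relaxed flag, so the consumer's read of `data` is guaranteed to see 7 once the flag is observed.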
However, memory barriers also ensure that all pending reads/writes are executed when the barrier is reached, so it effectively gives us everything we need by itself, making volatile unnecessary. We can just remove the volatile qualifier entirely.
Since C++11, atomic variables (std::atomic<T>) give us all of the relevant guarantees.
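As a sketch of that (the names `payload`, `ready`, and `publish_via_atomic` are illustrative, not from the original answers), the flag scenario described above becomes safe with std::atomic: the release store publishes the payload, and the acquire load guarantees it is visible once the flag reads true.

```cpp
#include <atomic>
#include <thread>

// "Status flag" pattern with std::atomic instead of volatile.
int payload = 0;                 // plain, non-atomic shared data
std::atomic<bool> ready{false};  // replaces the volatile flag

int publish_via_atomic() {
    std::thread producer([] {
        payload = 42;                                  // prepare the data...
        ready.store(true, std::memory_order_release);  // ...then set the flag
    });
    while (!ready.load(std::memory_order_acquire))
        ;  // spin until the flag is observed
    // The release/acquire pairing guarantees the payload write is visible
    // here; with a plain volatile flag this would be a data race.
    int seen = payload;
    producer.join();
    return seen;
}
```

Unlike the volatile version, neither the compiler nor the hardware may reorder the payload write past the flag store.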
Does the C++ volatile keyword introduce a memory fence?
Rather than explaining what volatile does, allow me to explain when you should use volatile.
- When inside a signal handler. Because writing to a volatile variable is pretty much the only thing the standard allows you to do from within a signal handler. Since C++11 you can use std::atomic for that purpose, but only if the atomic is lock-free.
- When dealing with setjmp, according to Intel.
- When dealing directly with hardware and you want to ensure that the compiler does not optimize your reads or writes away.
For example:
volatile int *foo = some_memory_mapped_device;
while (*foo)
; // wait until *foo turns false
Without the volatile specifier, the compiler is allowed to completely optimize the loop away. The volatile specifier tells the compiler that it may not assume that 2 subsequent reads return the same value.
Note that volatile has nothing to do with threads. The above example does not work if there is a different thread writing to *foo, because there is no acquire operation involved.
In all other cases, usage of volatile should be considered non-portable and should no longer pass code review, except when dealing with pre-C++11 compilers and compiler extensions (such as msvc's /volatile:ms switch, which is enabled by default under X86/I64).
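The signal-handler case from the first bullet can be sketched as follows (the flag and function names are made up for illustration): volatile std::sig_atomic_t is the classic type for this, and the handler does nothing beyond setting it.

```cpp
#include <csignal>

// Setting a volatile sig_atomic_t is one of the few things the standard
// permits inside a signal handler. The names here are illustrative.
volatile std::sig_atomic_t got_signal = 0;

extern "C" void handle_sigint(int) {
    got_signal = 1;  // the handler's only side effect: set the flag
}

// Installs the handler, raises SIGINT at ourselves, and reports whether
// the flag was observed afterwards (a real program would poll it in a loop).
bool signal_flag_demo() {
    std::signal(SIGINT, handle_sigint);
    std::raise(SIGINT);
    return got_signal == 1;
}
```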
When should the volatile keyword be used in C#?
I don't think there's a better person to answer this than Eric Lippert (emphasis in the original):
In C#, "volatile" means not only "make sure that the compiler and the
jitter do not perform any code reordering or register caching
optimizations on this variable". It also means "tell the processors to
do whatever it is they need to do to ensure that I am reading the
latest value, even if that means halting other processors and making
them synchronize main memory with their caches".

Actually, that last bit is a lie. The true semantics of volatile reads
and writes are considerably more complex than I've outlined here; in
fact they do not actually guarantee that every processor stops what it
is doing and updates caches to/from main memory. Rather, they provide
weaker guarantees about how memory accesses before and after reads and
writes may be observed to be ordered with respect to each other.
Certain operations such as creating a new thread, entering a lock, or
using one of the Interlocked family of methods introduce stronger
guarantees about observation of ordering. If you want more details,
read sections 3.10 and 10.5.3 of the C# 4.0 specification.

Frankly, I discourage you from ever making a volatile field. Volatile
fields are a sign that you are doing something downright crazy: you're
attempting to read and write the same value on two different threads
without putting a lock in place. Locks guarantee that memory read or
modified inside the lock is observed to be consistent, locks guarantee
that only one thread accesses a given chunk of memory at a time, and so
on. The number of situations in which a lock is too slow is very
small, and the probability that you are going to get the code wrong
because you don't understand the exact memory model is very large. I
don't attempt to write any low-lock code except for the most trivial
usages of Interlocked operations. I leave the usage of "volatile" to
real experts.
For further reading see:
- Understand the Impact of Low-Lock Techniques in Multithreaded Apps
- Sayonara volatile
Should InterlockedExchange be used on all setting of a variable?
Is it or is it not atomic to read/write a LONG in Windows without the interlocked functions?
Yes to both, assuming default alignment. This follows from the quoted statement "simple reads and writes to properly-aligned 32-bit variables are atomic operations", because LONG is a 32-bit signed integer in Windows, regardless of the bitness of the OS.
Is it possible in this example that foo will not appear TRUE to either of the InterlockedExchange calls?
No, not possible if both calls are reached. That's because within a single thread the foo = TRUE; write is guaranteed to be visible to the InterlockedExchange call that comes after it. So the InterlockedExchange call in main will see either the TRUE value previously set in main, or the FALSE value reset in the other thread. Therefore, one of the InterlockedExchange calls must read the foo value as TRUE.
However, if the /* do other things */ code in main is an infinite loop while(1); then that would leave only one InterlockedExchange outstanding in other, and it may be possible for that call to see foo as FALSE, for the same reason as...
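As a portable sketch of the reasoning above (using std::atomic<long>::exchange to stand in for the Windows-specific InterlockedExchange; the helper name and values are made up), exactly one of the two exchange calls observes TRUE, because the first exchange atomically replaces it with FALSE:

```cpp
#include <atomic>
#include <thread>

// foo mirrors the LONG from the question; exchange(0) plays the role of
// InterlockedExchange(&foo, FALSE) and returns the previous value.
std::atomic<long> foo{0};

// Returns how many of the two exchange calls observed TRUE (1).
int count_true_observations() {
    foo.store(1);  // foo = TRUE; in main, before the other thread starts
    int seen = 0;
    std::thread other([&] {
        if (foo.exchange(0) == 1) ++seen;  // other's InterlockedExchange
    });
    other.join();
    if (foo.exchange(0) == 1) ++seen;      // main's InterlockedExchange
    return seen;
}
```

Whichever call runs first swaps the TRUE out, so the count is always exactly one.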
If main is busy doing other things so that the first InterlockedExchange call is made from the other thread, does that mean foo is guaranteed to have been written by the main thread and visible to the other thread at that time?
Not necessarily. The foo = TRUE; write is visible to the main thread at the point the secondary thread is created, but may not necessarily be visible to the other thread when it starts, or even when it gets to the InterlockedExchange call.
What is the volatile keyword useful for?
volatile has semantics for memory visibility. Basically, the value of a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it. Without volatile, readers could see some non-updated value.
To answer your question: yes, I use a volatile variable to control whether some code continues a loop. The loop tests the volatile value and continues if it is true. The condition can be set to false by calling a "stop" method. The loop sees false and terminates when it tests the value after the stop method completes execution.
The book "Java Concurrency in Practice," which I highly recommend, gives a good explanation of volatile. This book is written by the same person who wrote the IBM article that is referenced in the question (in fact, he cites his book at the bottom of that article). My use of volatile is what his article calls the "pattern 1 status flag."
If you want to learn more about how volatile works under the hood, read up on the Java memory model. If you want to go beyond that level, check out a good computer architecture book like Hennessy & Patterson and read about cache coherence and cache consistency.