Is volatile bool for thread control considered wrong?
volatile can be used for such purposes. However, this is a Microsoft extension to standard C++:
Microsoft Specific
Objects declared as volatile are (...)
- A write to a volatile object (volatile write) has Release semantics; (...)
- A read of a volatile object (volatile read) has Acquire semantics; (...)
This allows volatile objects to be used for memory locks and releases in multithreaded applications. (emphasis added)
That is, as far as I understand, when you use the Visual C++ compiler, a volatile bool is for most practical purposes an atomic<bool>.
It should be noted that newer VS versions add a /volatile switch that controls this behavior, so this only holds if /volatile:ms is active.
Why is volatile not considered useful in multithreaded C or C++ programming?
The problem with volatile in a multithreaded context is that it doesn't provide all the guarantees we need. It does have a few of the properties we need, but not all of them, so we can't rely on volatile alone.
However, the primitives we'd have to use for the remaining properties also provide the ones that volatile does, so it is effectively unnecessary.
For thread-safe accesses to shared data, we need a guarantee that:
- the read/write actually happens (that the compiler won't just store the value in a register instead and defer updating main memory until much later)
- that no reordering takes place. Assume that we use a volatile variable as a flag to indicate whether or not some data is ready to be read. In our code, we simply set the flag after preparing the data, so all looks fine. But what if the instructions are reordered so the flag is set first?
volatile does guarantee the first point. It also guarantees that no reordering occurs between different volatile reads/writes: all volatile memory accesses will occur in the order in which they're specified. That is all we need for what volatile is intended for, namely manipulating I/O registers or memory-mapped hardware. But it doesn't help us in multithreaded code, where the volatile object is often only used to synchronize access to non-volatile data. Those accesses can still be reordered relative to the volatile ones.
The solution to preventing reordering is to use a memory barrier, which indicates both to the compiler and the CPU that no memory access may be reordered across this point. Placing such barriers around our volatile variable access ensures that even non-volatile accesses won't be reordered across the volatile one, allowing us to write thread-safe code.
However, memory barriers also ensure that all pending reads/writes are executed when the barrier is reached, so the barrier alone effectively gives us everything we need, making volatile unnecessary. We can just remove the volatile qualifier entirely.
Since C++11, atomic variables (std::atomic<T>) give us all of the relevant guarantees.
Volatile boolean vs AtomicBoolean
They are just totally different. Consider this example of a volatile integer:
volatile int i = 0;

void incIBy5() {
    i += 5;
}
If two threads call the function concurrently, i might be 5 afterwards, since the compiled code will be somewhat similar to this (except that you cannot synchronize on an int):
void incIBy5() {
    int temp;
    synchronized(i) { temp = i; }
    synchronized(i) { i = temp + 5; }
}
If a variable is volatile, every atomic access to it is synchronized, but it is not always obvious what actually qualifies as an atomic access. With an Atomic* object, it is guaranteed that every method is "atomic".
Thus, if you use an AtomicInteger and getAndAdd(int delta), you can be sure that the result will be 10. In the same way, if two threads both negate a boolean variable concurrently, with an AtomicBoolean you can be sure it has the original value afterwards; with a volatile boolean, you can't.
So whenever you have more than one thread modifying a field, you need to make it atomic or use explicit synchronization.
The purpose of volatile is a different one. Consider this example:
volatile boolean stop = false;

void loop() {
    while (!stop) { ... }
}

void stop() { stop = true; }
If you have a thread running loop() and another thread calling stop(), you might run into an infinite loop if you omit volatile, since the first thread might cache the value of stop. Here, volatile serves as a hint to the compiler to be a bit more careful with optimizations.
Is 'volatile' needed in this multi-threaded C++ code?
You should not depend on volatile to guarantee thread safety, because even though the compiler will guarantee that the variable is always read from memory (and not from a register cache), in multi-processor environments a memory barrier is also required.
Rather, use the correct lock around the shared memory. Locks like a critical section are often extremely lightweight, and in the uncontended case will probably be implemented entirely in user mode. They also contain the necessary memory barriers.
Volatile should only be used for memory-mapped I/O, where multiple reads may return different values; similarly for memory-mapped writes.
Is a volatile boolean switch written to by only one thread thread-safe?
However, this thread is the only thread that ever writes to the variable... Is the variable being volatile enough to make sure that all threads read the right value from it?
If there is only one writer, there is no race condition. You do not need to synchronize around the variable.
You might consider using an AtomicBoolean, but it does not support a toggle() method, so if you had multiple writers toggling the value, you would have to do something like the following:
private final AtomicBoolean isEvenTick = new AtomicBoolean();
...
boolean currentValue;
do {
    currentValue = isEvenTick.get();
} while (!isEvenTick.compareAndSet(currentValue, !currentValue));
What is the volatile keyword useful for?
volatile has semantics for memory visibility. Basically, the value of a volatile field becomes visible to all readers (other threads in particular) after a write operation to it completes. Without volatile, readers could see a stale value.
To answer your question: yes, I use a volatile variable to control whether some code continues a loop. The loop tests the volatile value and continues if it is true. The condition can be set to false by calling a "stop" method. The loop sees false and terminates when it tests the value after the stop method completes execution.
The book "Java Concurrency in Practice," which I highly recommend, gives a good explanation of volatile. It is written by the same person who wrote the IBM article referenced in the question (in fact, he cites his book at the bottom of that article). My use of volatile is what his article calls the "pattern 1 status flag."
If you want to learn more about how volatile works under the hood, read up on the Java memory model. If you want to go beyond that level, check out a good computer architecture book like Hennessy & Patterson and read about cache coherence and cache consistency.
Why is std::atomicbool much slower than volatile bool?
Benchmarking the code from "Olaf Dietsche":

USE ATOMIC
real    0m1.958s
user    0m1.957s
sys     0m0.000s

USE VOLATILE
real    0m1.966s
user    0m1.953s
sys     0m0.010s
If you are using a GCC version older than 4.7:
http://gcc.gnu.org/gcc-4.7/changes.html
Support for atomic operations specifying the C++11/C11 memory model has been added. These new __atomic routines replace the existing __sync built-in routines.
Atomic support is also available for memory blocks. Lock-free instructions will be used if a memory block is the same size and alignment as a supported integer type. Atomic operations which do not have lock-free support are left as function calls. A set of library functions is available on the GCC atomic wiki in the "External Atomics Library" section.
So the solution is to upgrade to GCC 4.7, which brings lock-free atomic support.
Volatile and CreateThread
What volatile does:
- Prevents the compiler from optimizing out any access. Every read/write will result in a read/write instruction.
- Prevents the compiler from reordering the access with other volatiles.
What volatile does not do:
- Make the access atomic.
- Prevent the compiler from reordering with non-volatile accesses.
- Make changes from one thread visible in another thread.
Some non-portable behaviors that shouldn't be relied on in cross-platform C++:
- VC++ has extended volatile to prevent any reordering with other instructions. Other compilers don't, because it negatively affects optimization.
- x86 makes aligned reads/writes of pointer-sized and smaller variables atomic, and immediately visible to other threads. Other architectures don't.
Most of the time, what people really want are fences (also called barriers) and atomic instructions, which are usable if you've got a C++11 compiler, or via compiler- and architecture-dependent functions otherwise.
Fences ensure that, at the point of use, all previous reads/writes are completed. In C++11, fencing is controlled at various points using the std::memory_order enumeration. In VC++ you can use _ReadBarrier(), _WriteBarrier(), and _ReadWriteBarrier() to do this. I'm not sure about other compilers.
On some architectures like x86, a fence is merely a way to prevent the compiler from reordering instructions. On others they might actually emit an instruction to prevent the CPU itself from reordering things.
Here's an example of improper use:
int res1, res2;
volatile bool finished;

void work_thread(int a, int b)
{
    res1 = a + b;
    res2 = a - b;
    finished = true;
}

void spinning_thread()
{
    while (!finished); // spin, waiting for res1/res2 to be set
}
Here, finished is allowed to be reordered to before either res is set! Well, volatile prevents reordering with other volatiles, right? Let's try making each res volatile too:
volatile int res1, res2;
volatile bool finished;

void work_thread(int a, int b)
{
    res1 = a + b;
    res2 = a - b;
    finished = true;
}

void spinning_thread()
{
    while (!finished); // spin, waiting for res1/res2 to be set
}
This trivial example will actually work on x86, but it is inefficient. For one, this forces res1 to be set before res2, even though we don't really care about that... we just want both of them set before finished is. Forcing this ordering between res1 and res2 only prevents valid optimizations, eating away at performance.
For more complex problems, you'd have to make every write volatile. This would bloat your code, be very error prone, and become slow, as it prevents a lot more reordering than you really want. It's not realistic. So we use fences and atomics. They allow full optimization, and only guarantee that memory accesses complete at the point of the fence:
int res1, res2;
std::atomic<bool> finished;

void work_thread(int a, int b)
{
    res1 = a + b;
    res2 = a - b;
    finished.store(true, std::memory_order_release);
}

void spinning_thread()
{
    while (!finished.load(std::memory_order_acquire));
}
This will work on all architectures. The res1 and res2 operations can be reordered as the compiler sees fit. Performing an atomic release ensures that all preceding non-atomic operations are complete and visible to threads that perform an atomic acquire.