Why we need Thread.MemoryBarrier()?
You are going to have a very hard time reproducing this bug. In fact, I would go as far as saying you will never be able to reproduce it using the .NET Framework. The reason is that Microsoft's implementation uses a strong memory model for writes. That means writes are treated as if they were volatile. A volatile write has lock-release semantics, which means that all prior writes must be committed before the current write.
However, the ECMA specification has a weaker memory model. So it is theoretically possible that Mono or even a future version of the .NET Framework might start exhibiting the buggy behavior.
So what I am saying is that it is very unlikely that removing barriers #1 and #2 will have any impact on the behavior of the program. That, of course, is not a guarantee, but an observation based on the current implementation of the CLR only.
Removing barriers #3 and #4 will definitely have an impact. This is actually pretty easy to reproduce. Well, not this example per se, but the following code is one of the more well known demonstrations. It has to be compiled using the Release build and run outside of the debugger. The bug is that the program does not end. You can fix the bug by placing a call to Thread.MemoryBarrier inside the while loop or by marking stop as volatile.
using System;
using System.Threading;

class Program
{
    static bool stop = false;

    public static void Main(string[] args)
    {
        var t = new Thread(() =>
        {
            Console.WriteLine("thread begin");
            bool toggle = false;
            while (!stop)
            {
                toggle = !toggle;
            }
            Console.WriteLine("thread end");
        });
        t.Start();
        Thread.Sleep(1000);
        stop = true;
        Console.WriteLine("stop = true");
        Console.WriteLine("waiting...");
        t.Join();
    }
}
The reason why some threading bugs are hard to reproduce is that the same tactics you use to simulate thread interleaving can actually fix the bug. Thread.Sleep is the most notable example because it generates memory barriers. You can verify that by placing a call inside the while loop and observing that the bug goes away.
You can see my answer here for another analysis of the example from the book you cited.
Why do I need a memory barrier?
Barrier #2 guarantees that the write to _complete gets committed immediately. Otherwise it could remain in a queued state, meaning that the read of _complete in B would not see the change caused by A even though B effectively used a volatile read. Of course, this example does not quite do justice to the problem because A does nothing more after writing to _complete, which means that the write will be committed immediately anyway since the thread terminates early.
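For reference, the example under discussion looks roughly like this (reconstructed from the description of barriers #1–#4 above; the comments number the barriers to match):

```csharp
using System;
using System.Threading;

class Foo
{
    int _answer;
    bool _complete;

    public void A()
    {
        _answer = 123;
        Thread.MemoryBarrier();    // Barrier 1
        _complete = true;
        Thread.MemoryBarrier();    // Barrier 2
    }

    public void B()
    {
        Thread.MemoryBarrier();    // Barrier 3
        if (_complete)
        {
            Thread.MemoryBarrier();    // Barrier 4
            Console.WriteLine(_answer);
        }
    }
}
```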
The answer to your question of whether the if could still evaluate to false is yes, for exactly the reasons you stated. But, notice what the author says regarding this point.
Barriers 1 and 4 prevent this example from writing “0”. Barriers 2 and 3 provide a freshness guarantee: they ensure that if B ran after A, reading _complete would evaluate to true.
The emphasis on "if B ran after A" is mine. It certainly could be the case that the two threads interleave. But the author was ignoring this scenario, presumably to keep his point about how Thread.MemoryBarrier works simple.
By the way, I had a hard time contriving an example on my machine where barriers #1 and #2 would have altered the behavior of the program. This is because the memory model regarding writes was strong in my environment. Perhaps, if I had a multiprocessor machine, was using Mono, or had some other different setup I could have demonstrated it. Of course, it was easy to demonstrate that removing barriers #3 and #4 had an impact.
Need clarification about Thread.MemoryBarrier()
A memory barrier enforces an ordering constraint on reads and writes to and from memory: memory access operations before the barrier happen-before the memory accesses after the barrier.
Barriers 1 and 4 have complementary roles: barrier 1 ensures that the write to _answer happens-before the write to _complete, while barrier 4 ensures that the read from _complete happens-before the read from _answer. Imagine barrier 4 isn't there, but barrier 1 is. While it is guaranteed that 123 is written to _answer before true is written to _complete, some other thread running B() may still have its read operations reordered, and hence it may read _answer before it reads _complete. Similarly, if barrier 1 is removed but barrier 4 is kept: while the read from _complete in B() will always happen-before the read from _answer, _complete could still be written before _answer by some other thread running A().

Barriers 2 and 3 provide a freshness guarantee: if barrier 3 is executed after barrier 2, then the state visible to the thread running A() at the point when it executes barrier 2 becomes visible to the thread running B() at the point when it executes barrier 3. In the absence of either of these two barriers, B() executing after A() has completed might not see the changes made by A(). In particular, barrier 2 prevents the value written to _complete from being cached by the processor running A() and forces the processor to write it out to main memory. Similarly, barrier 3 prevents the processor running B() from relying on its cache for the value of _complete, forcing a read from main memory. Note, however, that a stale cache isn't the only thing that can break the freshness guarantee in the absence of barriers 2 and 3; reordering of operations on the memory bus is another such mechanism.

A memory barrier only ensures that the effects of memory access operations are ordered across the barrier. Other instructions (e.g. incrementing a value in a register) may still be reordered.
When to use 'volatile' or 'Thread.MemoryBarrier()' in threadsafe locking code? (C#)
You use volatile/Thread.MemoryBarrier() when you want to access a variable across threads without locking.
Variables that are atomic, like an int for example, are always read and written as a whole. That means you will never get half of the value from before another thread changed it and the other half from after it changed. Because of that you can safely read and write the value in different threads without synchronising.
However, the compiler may optimize away some reads and writes, which you can prevent with the volatile keyword. For example, if you have a loop like this:
sum = 0;
foreach (int value in list)
{
    sum += value;
}
The compiler may actually do the calculation in a processor register and only write the value to the sum variable after the loop. If you make the sum variable volatile, the compiler will generate code that reads and writes the variable on every change, so that its value is up to date throughout the loop.
Do memory barriers guarantee a fresh read in C#?
It is not guaranteed that you will see both threads write 1. The barrier only guarantees the order of read/write operations, based on this rule:
The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier execute after memory accesses that follow the call to MemoryBarrier.
So this basically means that thread A wouldn't use a value for the variable b that was read before the barrier's call. But it may still cache the value if your code is something like this:
void A() // runs in thread A
{
    a = 1;
    Thread.MemoryBarrier();
    // b may be read and cached here
    // ... some work here ...
    // meanwhile b is changed by the other thread
    Console.WriteLine(b); // the old, stale value of b may be printed
}
Race-condition bugs in parallel code are very hard to reproduce, so I can't provide code that will definitely produce the scenario above, but I suggest you use the volatile keyword for variables shared between threads, as it works exactly as you want: it gives you a fresh read of the variable:
volatile int a = 0;
volatile int b = 0;

void A() // runs in thread A
{
    a = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(b);
}

void B() // runs in thread B
{
    b = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(a);
}
Is this a correct use of Thread.MemoryBarrier()?
Is this a correct use of Thread.MemoryBarrier()?
No. Suppose one thread sets the flag before the loop even begins to execute. The loop could still execute once, using a cached value of the flag. Is that correct? It certainly seems incorrect to me. I would expect that if I set the flag before the first execution of the loop, that the loop executes zero times, not once.
As far as I understand Thread.MemoryBarrier(), having this call inside the while loop will prevent my work thread from getting a cached version of the shouldRun, and effectively preventing an infinite loop from happening. Is my understanding about Thread.MemoryBarrier correct?
The memory barrier will ensure that the processor does not do any reorderings of reads and writes such that a memory access that is logically before the barrier is actually observed to be after it, and vice versa.
If you are hell bent on doing low-lock code, I would be inclined to make the field volatile rather than introducing an explicit memory barrier. "volatile" is a feature of the C# language. A dangerous and poorly understood feature, but a feature of the language. It clearly communicates to the reader of the code that the field in question is going to be used without locks on multiple threads.
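A minimal sketch of that volatile-field approach; Worker, Run, Stop, and shouldRun are hypothetical names:

```csharp
using System.Threading;

class Worker
{
    // volatile clearly signals lock-free cross-thread use, and
    // guarantees the loop eventually observes the write from Stop()
    volatile bool shouldRun = true;

    public void Run()
    {
        while (shouldRun)
        {
            // do one unit of work here
            Thread.Yield();
        }
    }

    public void Stop() => shouldRun = false;
}
```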
is this a reasonable way to ensure that my loop will stop once shouldRun is set to false by any thread?
Some people would consider it reasonable. I would not do this in my own code without a very, very good reason.
Typically low-lock techniques are justified by performance considerations. There are two such considerations:
First, a contended lock is potentially extremely slow; it blocks as long as there is code executing in the lock. If you have a performance problem because there is too much contention then I would first try to solve the problem by eliminating the contention. Only if I could not eliminate the contention would I go to a low-lock technique.
Second, it might be that an uncontended lock is too slow. If the "work" you are doing in the loop takes, say, less than 200 nanoseconds then the time required to check the uncontended lock -- about 20 ns -- is a significant fraction of the time spent doing work. In that case I would suggest that you do more work per loop. Is it really necessary that the loop stops within 200 ns of the control flag being set?
Only in the most extreme of performance scenarios would I imagine that the cost of checking an uncontended lock is a significant fraction of the time spent in the program.
And also, of course, if you are inducing a memory barrier every 200 ns or so, you are also possibly wrecking performance in other ways. The processor wants to make those moving-memory-accesses-around-in-time optimizations for you; if you are forcing it to constantly abandon those optimizations, you're missing out on a potential win.
Explanation of Thread.MemoryBarrier() Bug with OoOP
It doesn't fix any issues. It's a fake fix, rather dangerous in production code, as it may or may not work.
The core problem is in this line
static bool stop = false;
The variable that stops the while loop is not volatile, which means it may or may not be re-read from memory every time. It can be cached, so that only the last value read is presented to the program (which may not be the actual current value).
This code
// Thread.MemoryBarrier() or Console.WriteLine() fixes issue
may or may not fix the issue on different platforms. A memory barrier or a console write just happens to force the application to read fresh values on that particular system. It may not be the same elsewhere.
Additionally, volatile and Thread.MemoryBarrier() only provide weak guarantees, which means they don't provide 100% assurance that a read value will always be the latest on all systems and CPUs.
Eric Lippert says
The true semantics of volatile reads
and writes are considerably more complex than I've outlined here; in
fact they do not actually guarantee that every processor stops what it
is doing and updates caches to/from main memory. Rather, they provide
weaker guarantees about how memory accesses before and after reads and
writes may be observed to be ordered with respect to each other.
Certain operations such as creating a new thread, entering a lock, or
using one of the Interlocked family of methods introduce stronger
guarantees about observation of ordering. If you want more details,
read sections 3.10 and 10.5.3 of the C# 4.0 specification.
Not understanding the purpose of memory barriers in C#
Currently, I understand the problem without memory barriers is that
there's a possibility that B will run before A and B will print
nothing because _complete could be evaluated as false.
No, the problem is compiler, jitter, or CPU instruction reordering.
It can be the case that one of them reorders the

_answer = 123;
_complete = true;

instructions as an optimization, since from the point of view of a single-threaded application their order does not matter.
Now suppose they are reordered as
_complete = true;
_answer = 123;
now:
- Thread 1 sets _complete = true
- Thread 2 reads _complete
- Thread 2 evaluates the if condition (true)
- Thread 2 reads _answer (which is still 0)
- Console.WriteLine(_answer) prints 0
- Thread 1 sets _answer = 123

The code logic is broken.
VB.NET: Do I need to call Thread.MemoryBarrier() before each read if I always complete my writes with Thread.MemoryBarrier()?
You can't remove the barrier on the read side, which is easy to show by example. Let's use this reader:
while (!IsDisposed); //reads _isDisposed
The value of _isDisposed can clearly be cached in a register here, so that new writes never become visible. The loop could then run forever (that is just one possible effect; others, such as long delays, are possible too).
More formally, the reads of _isDisposed can all move "upwards" in time so that they appear to run before the store happens. volatile stores effect a release fence, meaning that nothing can move over them to a later point in time. Things can move over them to previous points in time, though.
Use the Volatile class. Or, use a struct written in C# as a wrapper around the field:
struct VolatileInt32Box { public volatile int Value; }
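A sketch of the Volatile-class option applied to this reader; the type and member names (DisposableResource, WaitUntilDisposed) are made up:

```csharp
using System.Threading;

class DisposableResource
{
    bool _isDisposed;

    public void Dispose()
    {
        // release: publish the flag to other threads
        Volatile.Write(ref _isDisposed, true);
    }

    public void WaitUntilDisposed()
    {
        // acquire: each iteration performs a real read; the value
        // cannot be cached in a register across iterations
        while (!Volatile.Read(ref _isDisposed))
        {
            Thread.Yield();
        }
    }
}
```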
Thread.MemoryBarrier and lock difference for a simple property
is there any difference regarding thread-safeness?
Both ensure that appropriate barriers are set up around the read and write.
result?
In both cases two threads can race to write a value. However, reads and writes cannot move forwards or backwards in time past either the lock or the full fences.
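The two property variants being compared presumably look something like this (Holder and the member names are hypothetical):

```csharp
using System.Threading;

class Holder
{
    readonly object _sync = new object();
    int _lockedValue;
    int _fencedValue;

    // Variant 1: lock around both the read and the write
    public int LockedValue
    {
        get { lock (_sync) { return _lockedValue; } }
        set { lock (_sync) { _lockedValue = value; } }
    }

    // Variant 2: full fences around both the read and the write
    public int FencedValue
    {
        get
        {
            Thread.MemoryBarrier();
            int v = _fencedValue;
            Thread.MemoryBarrier();
            return v;
        }
        set
        {
            Thread.MemoryBarrier();
            _fencedValue = value;
            Thread.MemoryBarrier();
        }
    }
}
```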
performance?
You've written the code both ways. Now run it. If you want to know which is faster, run it and find out! If you have two horses and you want to know which is faster, race them. Don't ask strangers on the Internet which horse they think is faster.
That said, a better technique is set a performance goal, write the code to be clearly correct, and then test to see if you met your goal. If you did, don't waste your valuable time trying to optimize further code that is already fast enough; spend it optimizing something else that isn't fast enough.
A question you didn't ask:
What would you do?
I'd not write a multithreaded program, that's what I'd do. I'd use processes as my unit of concurrency if I had to.
If I had to write a multithreaded program then I would use the highest-level tool available. I'd use the Task Parallel Library, I'd use async-await, I'd use Lazy&lt;T&gt; and so on. I'd avoid shared memory; I'd treat threads as lightweight processes that returned a value asynchronously.
If I had to write a shared-memory multithreaded program then I would lock everything, all the time. We routinely write programs these days that fetch a billion bytes of video over a satellite link and send it to a phone. Twenty nanoseconds spent taking a lock isn't going to kill you.
I am not smart enough to try to write low-lock code, so I wouldn't do that at all. If I had to then I would use that low-lock code to build a higher-level abstraction and use that abstraction. Fortunately I don't have to because someone already has built the abstractions I need.