Volatile Vs. Interlocked Vs. Lock

Worst (won't actually work)

Change the access modifier of counter to public volatile

As other people have mentioned, this on its own isn't actually safe at all. The point of volatile is that multiple threads running on multiple CPUs can and will cache data and re-order instructions.

If it is not volatile, and CPU A increments a value, then CPU B may not actually see that incremented value until some time later, which may cause problems.

If it is volatile, this just ensures the two CPUs see the same data at the same time. It doesn't stop them at all from interleaving their reads and write operations which is the problem you are trying to avoid.
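To see the interleaving problem concretely, here is a minimal sketch (the class name and iteration counts are illustrative) that usually loses increments even though the counter is volatile:

```csharp
using System;
using System.Threading;

class VolatileCounterDemo
{
    // volatile guarantees visibility, but ++ is still a separate
    // read, increment, and write that another thread can interleave.
    static volatile int counter;

    static void Main()
    {
        var threads = new Thread[2];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 100000; j++)
                    counter++; // two threads can read the same value and both write value+1
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();

        // Frequently prints less than 200000 because increments were lost.
        Console.WriteLine(counter);
    }
}
```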

Second Best:

lock(this.locker) this.counter++;

This is safe to do (provided you remember to lock everywhere else that you access this.counter). It prevents any other threads from executing any other code which is guarded by locker.
Using locks also prevents the multi-CPU reordering problems described above, which is great.

The problem is, locking is slow, and if you re-use the locker in some other place which is not really related then you can end up blocking your other threads for no reason.
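As a sketch of that discipline (the class and member names are illustrative): every access to the counter, reads included, must go through the same locker object.

```csharp
using System.Threading;

class LockedCounter
{
    private readonly object locker = new object();
    private int counter;

    public void Increment()
    {
        lock (locker) counter++;
    }

    // Reads must take the same lock, or they can see a stale value.
    public int Read()
    {
        lock (locker) return counter;
    }
}
```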

Best

Interlocked.Increment(ref this.counter);

This is safe, as it effectively does the read, increment, and write in 'one hit' which can't be interrupted. Because of this, it won't affect any other code, and you don't need to remember to lock elsewhere either. It's also very fast (as MSDN says, on modern CPUs, this is often literally a single CPU instruction).

I'm not entirely sure however if it gets around other CPUs reordering things, or if you also need to combine volatile with the increment.

Interlocked notes:

  1. Interlocked methods are concurrently safe on any number of cores or CPUs.
  2. Interlocked methods apply a full fence around the instructions they execute, so reordering does not happen.
  3. Interlocked methods do not need (and do not even support) access to a volatile field: volatile places only a half fence around operations on the field, while Interlocked uses a full fence.
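Putting these notes together, a hedged sketch of a counter built purely on Interlocked (the names are illustrative); Interlocked.Read provides an atomic read of the 64-bit value even on 32-bit platforms:

```csharp
using System.Threading;

class InterlockedCounter
{
    private long counter;

    // Atomic read-increment-write in a single operation, full fence included.
    public void Increment() => Interlocked.Increment(ref counter);

    // Atomic 64-bit read, even on 32-bit platforms where a plain
    // long read could tear.
    public long Read() => Interlocked.Read(ref counter);
}
```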

Footnote: What volatile is actually good for.

As volatile doesn't prevent these kinds of multithreading issues, what's it for? A good example is where you have two threads, one which always writes to a variable (say queueLength), and one which always reads from that same variable.

If queueLength is not volatile, thread A may write five times, but thread B may see those writes as being delayed (or even potentially in the wrong order).

A solution would be to lock, but you could also use volatile in this situation. This would ensure that thread B always sees the most up-to-date value that thread A has written. Note however that this logic only works if you have writers who never read, readers who never write, and the thing you're writing is an atomic value. As soon as you do a single read-modify-write, you need to move to Interlocked operations or use a lock.
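A minimal sketch of that single-writer / single-reader pattern (the names and counts are hypothetical): one thread only writes queueLength, the other only reads it, and each write is a plain atomic store of an int, so volatile alone suffices.

```csharp
using System;
using System.Threading;

class QueueLengthDemo
{
    static volatile int queueLength;

    static void Main()
    {
        var writer = new Thread(() =>
        {
            for (int i = 1; i <= 5; i++)
            {
                queueLength = i;   // plain write, never a read-modify-write
                Thread.Sleep(10);
            }
        });
        var reader = new Thread(() =>
        {
            // volatile guarantees this loop eventually observes 5;
            // without it, the JIT could hoist the read and spin forever.
            while (queueLength != 5) { }
            Console.WriteLine("reader saw " + queueLength);
        });
        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
    }
}
```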

c# - Volatile keyword usage vs lock

I've used volatile where I'm not sure it is necessary.

Let me be very clear on this point:

If you are not 100% clear on what volatile means in C# then do not use it. It is a sharp tool that is meant to be used by experts only. If you cannot describe what all the possible reorderings of memory accesses are allowed by a weak memory model architecture when two threads are reading and writing two different volatile fields then you do not know enough to use volatile safely and you will make mistakes, as you have done here, and write a program that is extremely brittle.

I was pretty sure a lock would be overkill in my situation

First off, the best solution is to simply not go there. If you don't write multithreaded code that tries to share memory then you don't have to worry about locking, which is hard to get correct.

If you must write multithreaded code that shares memory, then the best practice is to always use locks. Locks are almost never overkill. The price of an uncontended lock is on the order of ten nanoseconds. Are you really telling me that ten extra nanoseconds will make a difference to your user? If so, then you have a very, very fast program and a user with unusually high standards.

The price of a contended lock is of course arbitrarily high if the code inside the lock is expensive. Do not do expensive work inside a lock, so that the probability of contention is low.

Only when you have a demonstrated performance problem with locks that cannot be solved by removing contention should you even begin to consider a low-lock solution.

I added "volatile" to make sure that there is no misalignment occurring: reading only 32 bits of the variable and the other 32 bits on another fetch, which could be broken in two by a write from another thread in the middle.

This sentence tells me that you need to stop writing multithreaded code right now. Multithreaded code, particularly low-lock code, is for experts only. You have to understand how the system actually works before you start writing multithreaded code again. Get a good book on the subject and study hard.

Your sentence is nonsensical because:

First off, integers already are only 32 bits.

Second, int accesses are guaranteed by the specification to be atomic! If you want atomicity, you've already got it.

Third, yes, it is true that volatile accesses are always atomic, but that is not because C# makes all volatile accesses into atomic accesses! Rather, C# makes it illegal to put volatile on a field unless the field is already atomic.

Fourth, the purpose of volatile is to prevent the C# compiler, jitter and CPU from making certain optimizations that would change the meaning of your program in a weak memory model. Volatile in particular does not make ++ atomic. (I work for a company that makes static analyzers; I will use your code as a test case for our "incorrect non-atomic operation on volatile field" checker. It is very helpful to me to get real-world code that is full of realistic mistakes; we want to make sure that we are actually finding the bugs that people write, so thanks for posting this.)

Looking at your actual code: volatile is, as Hans pointed out, totally inadequate to make your code correct. The best thing to do is what I said before: do not allow these methods to be called on any thread other than the main thread. That the counter logic is wrong should be the least of your worries. What makes the serialization thread safe if code on another thread is modifying the fields of the object while it is being serialized? That is the problem you should be worried about first.

Volatile vs Interlocked for reads and writes in C#

volatile guarantees that accesses are visible to other cores, as does Interlocked. The difference with interlocked is that it uses a full memory barrier for its guarantees and handles non-atomic operations. Volatile may not use a full memory barrier (depends on platform, e.g. x86/x64 don't need a full memory barrier with volatile...) but only makes atomic operations "thread-safe".

It's generally recommended to avoid volatile because it makes every atomic access to that variable "volatile" (which may not be that big a deal on x86/x64) and kinda hides the fact that access to the variable is different. Something like Interlocked is generally more recommended because it explicitly details the thread-safety concerns on each use of the variable.

Also, you can't use volatile on local variables, so if you wanted to use local variables with multiple threads, then Interlocked may be necessary. For example:

static void Main()
{
    bool complete = false;
    var t = new Thread(() =>
    {
        bool toggle = false;
        while (!complete) toggle = !toggle;
    });
    t.Start();
    Thread.Sleep(1000);
    complete = true;
    t.Join(); // Blocks indefinitely
}
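Since a captured local cannot be declared volatile, one hedged way to fix that example is Volatile.Read/Volatile.Write on the captured variable; the lambda lifts it to a field of a compiler-generated class, so taking a ref to it is legal.

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        bool complete = false;
        var t = new Thread(() =>
        {
            bool toggle = false;
            // Volatile.Read forces a fresh read of the captured variable
            // on every iteration, so the loop observes the update.
            while (!Volatile.Read(ref complete)) toggle = !toggle;
        });
        t.Start();
        Thread.Sleep(1000);
        Volatile.Write(ref complete, true);
        t.Join(); // now returns instead of blocking
        Console.WriteLine("done");
    }
}
```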

Update: To be clear, by "accesses" I mean already-atomic accesses. It should be blatantly obvious that just because a variable is "volatile" doesn't make every single operation on it thread-safe. That's not what I'm saying. e.g. on some platforms x++ is not thread-safe despite using volatile.

Interlocked and volatile

You can safely disregard that warning when you're using Interlocked.Xxx functions (see this question), because they always do volatile operations. So a volatile variable is perfectly OK for shared state. If you want to get rid of the warning at all costs, you actually can do an interlocked read with Interlocked.CompareExchange(ref counter, 0, 0).

Edit: Actually, you need volatile on your state variable only if you are going to write to it directly (i.e. not using Interlocked.Xxx). As jerryjvl mentioned, reads of a variable updated with an interlocked (or volatile) operation will use the most recent value.

lock vs Interlocked.Exchange

Yes, you are better off using Interlocked as it's more efficient since it's mostly translated into a single atomic operation.

However, if you don't mind the clients still reading the old object for a bit you can even do without the Interlocked and just set a new instance.

The client that happen to get the new instance will get the updated data and those that don't will get it in one of the next checks.
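As a sketch of that swap-a-new-instance approach (the class names are hypothetical): readers always see either the old snapshot or the new one, never a half-updated object.

```csharp
using System.Threading;

class ConfigHolder
{
    private sealed class Snapshot
    {
        public readonly int Version;
        public Snapshot(int version) => Version = version;
    }

    private Snapshot current = new Snapshot(0);

    // Publish a fresh immutable instance; the reference swap is atomic.
    public void Publish(int version) =>
        Interlocked.Exchange(ref current, new Snapshot(version));

    // Volatile.Read ensures we don't read a cached stale reference.
    public int ReadVersion() => Volatile.Read(ref current).Version;
}
```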

Why declare a variable as volatile and use Interlocked on it at the same time?

I'll give my two cents, without being 100% sure that what I am about to say is correct. Joe Duffy is a world-class expert in multithreading, but I think that in this implementation he has been overly cautious regarding the cross-thread visibility of the internal state of the LockFreeStack<T> class. The volatility of the Node.m_next field is redundant in my opinion. The Node instances are mutable, but they are only mutated before they are linked in the internal linked list. After that phase they are practically immutable. That mutable phase is performed by a single thread, so there is no chance that another thread may see a stale version of a Node instance.

That leaves only the possibility of instruction re-ordering as a reason for declaring Node.m_next as volatile. Since the n.m_next = h; assignment is sandwiched between reading another volatile field (m_head) and an Interlocked.CompareExchange operation, I would assume that any re-ordering of instructions that would compromise the correctness of the implementation is already prevented, but as I said I am not 100% sure.

I am pretty sure though that the implementation of the LockFreeStack<T> class could be improved performance-wise, at the cost of becoming slightly more allocatey, by making the Node class immutable. Nowadays (C# 9) this can be achieved simply by switching the Node type from a class to a record. This is how such an implementation would look:

class LockFreeStack<T>
{
    record Node(T Value, Node Next);

    Node _head;

    void Push(T value)
    {
        Node h = Volatile.Read(ref _head);
        while (true)
        {
            Node n = new Node(value, h);
            var original = Interlocked.CompareExchange(ref _head, n, h);
            // ReferenceEquals, not ==: records overload == with value
            // equality, but the CAS compared references.
            if (ReferenceEquals(original, h)) break;
            h = original;
        }
    }

    T Pop()
    {
        Node h = Volatile.Read(ref _head);
        while (true)
        {
            if (h == null) throw new Exception("Stack empty");
            var original = Interlocked.CompareExchange(ref _head, h.Next, h);
            if (ReferenceEquals(original, h)) break;
            h = original;
        }
        return h.Value;
    }
}

Notice that the cost of volatility is incurred only once for each Push or Pop operation. One could even argue that the Volatile.Read of the _head field could be omitted altogether, since a possible stale _head value would be corrected anyway by the first Interlocked.CompareExchange operation, costing only an extra iteration of the while (true) loop.

Is there any advantage of using volatile keyword in contrast to use the Interlocked class?

EDIT: question largely rewritten

To answer this question, I dived a bit further in the matter and found out a few things about volatile and Interlocked that I wasn't aware of. Let's clear that out, not only for me, but for this discussion and other people reading up on this:

  • volatile read/write are supposed to be immune to reordering. This only means reading and writing, it does not mean any other action;
  • volatility is not forced on the CPU, i.e., hardware level (x86 uses acquire and release fences on any read/write). It does prevent compiler or CLR optimizations;
  • Interlocked uses atomic assembly instructions for CompareExchange (cmpxchg), Increment (inc) etc;
  • Interlocked does use a lock sometimes: a hardware lock on multi-processor systems; on uni-processor systems there is no hardware lock;
  • Interlocked is different from volatile in that it uses a full fence, where volatile uses a half fence.
  • A read following a write can be reordered when you use volatile. It can't happen with Interlocked. VolatileRead and VolatileWrite have the same reordering issue as volatile (link thanks to Brian Gideon).

Now that we have the rules, we can define an answer to your question:

  • Technically: yes, there are things you can do with volatile that you cannot do with Interlocked:

    1. Syntax: you cannot write a = b where a or b is volatile, but this is obvious;
    2. You can read a different value after you write it to a volatile variable because of reordering. You cannot do this with Interlocked. In other words: you can be less safe with volatile than you can be with Interlocked.
    3. Performance: volatile is faster than Interlocked.
  • Semantically: no, because Interlocked simply provides a superset of operations and is safer to use because it applies full fencing. You can't do anything with volatile that you cannot do with Interlocked and you can do a lot with Interlocked that you cannot do with volatile:

    static volatile int x = 0;
    x++; // non-atomic
    static int y = 0;
    Interlocked.Increment(ref y); // atomic
  • Scope: yes, declaring a variable volatile makes it volatile for every single access. It is impossible to force this behavior any other way, hence volatile cannot be replaced with Interlocked. This is needed in scenarios where other libraries, interfaces or hardware can access your variable and update it anytime, or need the most recent version.

If you'd ask me, this last bit is the actual real need for volatile and may make it ideal where two processes share memory and need to read or write without locking. Declaring a variable as volatile is much safer in this context than forcing all programmers to use Interlocked (which the compiler cannot enforce).


EDIT: The following quote was part of my original answer, I'll leave it in ;-)

A quote from the C# Programming Language standard:

For non-volatile fields, optimization
techniques that reorder instructions
can lead to unexpected and
unpredictable results in multithreaded
programs that access fields without
synchronization such as that provided
by the lock-statement. These
optimizations can be performed by the
compiler, by the runtime system, or by
hardware. For volatile fields, such
reordering optimizations are
restricted:

  • A read of a volatile field is called a volatile read. A volatile read
    has "acquire semantics"; that is, it
    is guaranteed to occur prior to any
    references to memory that occur after
    it in the instruction sequence.

  • A write of a volatile field is called a volatile write. A
    volatile write has "release
    semantics"; that is, it is guaranteed
    to happen after any memory references
    prior to the write instruction in the
    instruction sequence.

Update: question largely rewritten, corrected my original response and added a "real" answer

Why is Interlocked slower than lock?

Your test is flawed for the reason mentioned by Jakob Olsen. Additionally, your test also includes the overhead of calling methods in a class for the lock (obviously, that overhead can't be avoided when calling Interlocked.Increment()).

You should start all the threads and arrange for them to start working after you've started the stopwatch. You can do that by making them wait on a ManualResetEvent.

I've rewritten your test code like so:

using System;
using System.Diagnostics;
using System.Threading;

namespace Demo
{
    static class Program
    {
        private static readonly object _objLocker = new object();
        private static long _shared;
        private static ManualResetEvent _signal = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            int numofthreads = 100;
            TestInterLocked(numofthreads);
            TestLocked(numofthreads);
            Console.ReadLine();
        }

        private static void TestInterLocked(int numofthreads)
        {
            Thread[] threads = new Thread[numofthreads];
            for (int i = 0; i < numofthreads; i++)
            {
                Thread t = new Thread(StartTestInterLocked);
                threads[i] = t;
                t.Start();
            }
            Thread.Sleep(5000); // Make sure threads have had time to start.
            Stopwatch sw = new Stopwatch();
            sw.Start();
            _shared = 0;
            _signal.Set();

            for (int i = 0; i < threads.Length; i++)
            {
                threads[i].Join();
            }

            sw.Stop();
            _signal.Reset();
            Console.WriteLine($"Interlocked finished in : {sw.ElapsedMilliseconds}, value = {_shared}");
        }

        private static void TestLocked(int numofthreads)
        {
            Thread[] threads = new Thread[numofthreads];
            for (int i = 0; i < numofthreads; i++)
            {
                Thread t = new Thread(StartTestLocked);
                threads[i] = t;
                t.Start();
            }
            Thread.Sleep(5000); // Make sure threads have had time to start.
            Stopwatch sw = new Stopwatch();
            sw.Start();
            _shared = 0;
            _signal.Set();

            for (int i = 0; i < threads.Length; i++)
            {
                threads[i].Join();
            }

            sw.Stop();
            _signal.Reset();
            Console.WriteLine($"Locked finished in : {sw.ElapsedMilliseconds}, value = {_shared}");
        }

        private static void StartTestInterLocked()
        {
            _signal.WaitOne();
            int counter = 10000000;
            for (int i = 0; i < counter; i++)
            {
                Interlocked.Increment(ref _shared);
            }
        }

        private static void StartTestLocked()
        {
            _signal.WaitOne();
            int counter = 10000000;
            for (int i = 0; i < counter; i++)
            {
                lock (_objLocker)
                {
                    _shared++;
                }
            }
        }
    }
}

Now the results are (from a RELEASE build):

Interlocked finished in : 11339, value = 1000000000
Locked finished in : 30546, value = 1000000000

As you can see, Interlocked is substantially faster.


