C# - Volatile Keyword Usage VS Lock

c# - Volatile keyword usage vs lock

I've used volatile where I'm not sure it is necessary.

Let me be very clear on this point:

If you are not 100% clear on what volatile means in C# then do not use it. It is a sharp tool that is meant to be used by experts only. If you cannot describe what all the possible reorderings of memory accesses are allowed by a weak memory model architecture when two threads are reading and writing two different volatile fields then you do not know enough to use volatile safely and you will make mistakes, as you have done here, and write a program that is extremely brittle.

I was pretty sure a lock would be overkill in my situation

First off, the best solution is to simply not go there. If you don't write multithreaded code that tries to share memory then you don't have to worry about locking, which is hard to get correct.

If you must write multithreaded code that shares memory, then the best practice is to always use locks. Locks are almost never overkill. The price of an uncontended lock is on the order of ten nanoseconds. Are you really telling me that ten extra nanoseconds will make a difference to your user? If so, then you have a very, very fast program and a user with unusually high standards.

The price of a contended lock is of course arbitrarily high if the code inside the lock is expensive. Do not do expensive work inside a lock, so that the probability of contention is low.

Only when you have a demonstrated performance problem with locks that cannot be solved by removing contention should you even begin to consider a low-lock solution.

I added "volatile" to make sure that there is no misalignment occurring: reading only 32 bits of the variable and the other 32 bits on another fetch which can be broken in two by a write in the middle from another thread.

This sentence tells me that you need to stop writing multithreaded code right now. Multithreaded code, particularly low-lock code, is for experts only. You have to understand how the system actually works before you start writing multithreaded code again. Get a good book on the subject and study hard.

Your sentence is nonsensical because:

First off, integers already are only 32 bits.

Second, int accesses are guaranteed by the specification to be atomic! If you want atomicity, you've already got it.

Third, yes, it is true that volatile accesses are always atomic, but that is not because C# makes all volatile accesses into atomic accesses! Rather, C# makes it illegal to put volatile on a field unless the field is already atomic.

Fourth, the purpose of volatile is to prevent the C# compiler, jitter and CPU from making certain optimizations that would change the meaning of your program in a weak memory model. Volatile in particular does not make ++ atomic. (I work for a company that makes static analyzers; I will use your code as a test case for our "incorrect non-atomic operation on volatile field" checker. It is very helpful to me to get real-world code that is full of realistic mistakes; we want to make sure that we are actually finding the bugs that people write, so thanks for posting this.)

Looking at your actual code: volatile is, as Hans pointed out, totally inadequate to make your code correct. The best thing to do is what I said before: do not allow these methods to be called on any thread other than the main thread. That the counter logic is wrong should be the least of your worries. What makes the serialization thread safe if code on another thread is modifying the fields of the object while it is being serialized? That is the problem you should be worried about first.

Volatile vs. Interlocked vs. lock

Worst (won't actually work)

Change the access modifier of counter to public volatile

As other people have mentioned, this on its own isn't actually safe at all. The point of volatile is that multiple threads running on multiple CPUs can and will cache data and re-order instructions.

If it is not volatile, and CPU A increments a value, then CPU B may not actually see that incremented value until some time later, which may cause problems.

If it is volatile, this just ensures the two CPUs see the same data at the same time. It doesn't stop them at all from interleaving their reads and write operations which is the problem you are trying to avoid.

Second Best:

lock(this.locker) this.counter++;

This is safe to do (provided you remember to lock everywhere else that you access this.counter). It prevents any other threads from executing any other code which is guarded by locker.
Using locks also, prevents the multi-CPU reordering problems as above, which is great.

The problem is, locking is slow, and if you re-use the locker in some other place which is not really related then you can end up blocking your other threads for no reason.

Best

Interlocked.Increment(ref this.counter);

This is safe, as it effectively does the read, increment, and write in 'one hit' which can't be interrupted. Because of this, it won't affect any other code, and you don't need to remember to lock elsewhere either. It's also very fast (as MSDN says, on modern CPUs, this is often literally a single CPU instruction).

~~I'm not entirely sure however if it gets around other CPUs reordering things, or if you also need to combine volatile with the increment.~~

InterlockedNotes:

INTERLOCKED METHODS ARE CONCURRENTLY SAFE ON ANY NUMBER OF COREs OR CPUs.
Interlocked methods apply a full fence around instructions they execute, so reordering does not happen.
Interlocked methods do not need or even do not support access to a volatile field, as volatile is placed a half fence around operations on given field and interlocked is using the full fence.

Footnote: What volatile is actually good for.

As volatile doesn't prevent these kinds of multithreading issues, what's it for? A good example is saying you have two threads, one which always writes to a variable (say queueLength), and one which always reads from that same variable.

If queueLength is not volatile, thread A may write five times, but thread B may see those writes as being delayed (or even potentially in the wrong order).

A solution would be to lock, but you could also use volatile in this situation. This would ensure that thread B will always see the most up-to-date thing that thread A has written. Note however that this logic only works if you have writers who never read, and readers who never write, and if the thing you're writing is an atomic value. As soon as you do a single read-modify-write, you need to go to Interlocked operations or use a Lock.

When should the volatile keyword be used in C#?

I don't think there's a better person to answer this than Eric Lippert (emphasis in the original):

In C#, "volatile" means not only "make sure that the compiler and the
jitter do not perform any code reordering or register caching
optimizations on this variable". It also means "tell the processors to
do whatever it is they need to do to ensure that I am reading the
latest value, even if that means halting other processors and making
them synchronize main memory with their caches".

Actually, that last bit is a lie. The true semantics of volatile reads
and writes are considerably more complex than I've outlined here; in
fact they do not actually guarantee that every processor stops what it
is doing and updates caches to/from main memory. Rather, they provide
weaker guarantees about how memory accesses before and after reads and
writes may be observed to be ordered with respect to each other.
Certain operations such as creating a new thread, entering a lock, or
using one of the Interlocked family of methods introduce stronger
guarantees about observation of ordering. If you want more details,
read sections 3.10 and 10.5.3 of the C# 4.0 specification.

Frankly, I discourage you from ever making a volatile field. Volatile
fields are a sign that you are doing something downright crazy: you're
attempting to read and write the same value on two different threads
without putting a lock in place. Locks guarantee that memory read or
modified inside the lock is observed to be consistent, locks guarantee
that only one thread accesses a given chunk of memory at a time, and so
on. The number of situations in which a lock is too slow is very
small, and the probability that you are going to get the code wrong
because you don't understand the exact memory model is very large. I
don't attempt to write any low-lock code except for the most trivial
usages of Interlocked operations. I leave the usage of "volatile" to
real experts.

For further reading see:

Understand the Impact of Low-Lock Techniques in Multithreaded Apps
Sayonara volatile

Should a lock variable be declared volatile?

If you're only ever accessing the data that the lock "guards" while you own the lock, then yes - making those fields volatile is superfluous. You don't need to make the ownerLock_ variable volatile either. (You haven't currently shown any actual code within the lock statement, which makes it hard to talk about in concrete terms - but I'm assuming you'd actually be reading/modifying some data within the lock statement.)

volatile should be very rarely used in application code. If you want lock-free access to a single variable, Interlocked is almost always simpler to reason about. If you want lock-free access beyond that, I would almost always start locking. (Or try to use immutable data structures to start with.)

I'd only expect to see volatile within code which is trying to build higher level abstractions for threading - so within the TPL codebase, for example. It's really a tool for experts who really understand the .NET memory model thoroughly... of whom there are very few, IMO.

Property with Volatile or Lock

For mor information see the great site by J. Albahari:

Synchronization constructs can be divided into four categories:

Simple blocking methods:

These wait for another thread to finish or for a period of time to elapse. Sleep, Join, and Task.Wait are simple blocking methods.

Locking constructs:

These limit the number of threads that can perform some activity or execute a section of code at a time. Exclusive locking constructs are most common — these allow just one thread in at a time, and allow competing threads to access common data without interfering with each other. The standard exclusive locking constructs are lock (Monitor.Enter/Monitor.Exit), Mutex, and SpinLock. The nonexclusive locking constructs are Semaphore, SemaphoreSlim, and the reader/writer locks.

Signaling constructs:

These allow a thread to pause until receiving a notification from another, avoiding the need for inefficient polling. There are two commonly used signaling devices: event wait handles and Monitor’s Wait/Pulse methods. Framework 4.0 introduces the CountdownEvent and Barrier classes.

Non-blocking synchronization constructs:

These protect access to a common field by calling upon processor primitives. The CLR and C# provide the following nonblocking constructs: Thread.MemoryBarrier, Thread.VolatileRead, Thread.VolatileWrite, the volatile keyword, and the Interlocked class.

The volatile keyword:

The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence. These “half-fences” are faster than full fences because they give the run-time and hardware more scope for optimization.

As it happens, Intel’s X86 and X64 processors always apply acquire-fences to reads and release-fences to writes — whether or not you use the volatile keyword — so this keyword has no effect on the hardware if you’re using these processors. However, volatile does have an effect on optimizations performed by the compiler and the CLR — as well as on 64-bit AMD and (to a greater extent) Itanium processors. This means that you cannot be more relaxed by virtue of your clients running a particular type of CPU.

The effect of applying volatile to fields can be summarized as follows:

First instruction   Second instruction  Can they be swapped?
Read                Read                No
Read                Write               No
Write               Write               No (The CLR ensures that write-write operations are never swapped, even without the volatile keyword)
Write               Read                Yes!

Notice that applying volatile doesn’t prevent a write followed by a read from being swapped, and this can create brainteasers. Joe Duffy illustrates the problem well with the following example: if Test1 and Test2 run simultaneously on different threads, it’s possible for a and b to both end up with a value of 0 (despite the use of volatile on both x and y):

class IfYouThinkYouUnderstandVolatile
{
  volatile int x, y;

  void Test1()        // Executed on one thread
  {
    x = 1;            // Volatile write (release-fence)
    int a = y;        // Volatile read (acquire-fence)
    ...
  }

  void Test2()        // Executed on another thread
  {
    y = 1;            // Volatile write (release-fence)
    int b = x;        // Volatile read (acquire-fence)
    ...
  }
}

The MSDN documentation states that use of the volatile keyword ensures that the most up-to-date value is present in the field at all times. This is incorrect, since as we’ve seen, a write followed by a read can be reordered.

This presents a strong case for avoiding volatile: even if you understand the subtlety in this example, will other developers working on your code also understand it? A full fence between each of the two assignments in Test1 and Test2 (or a lock) solves the problem.

The volatile keyword is not supported with pass-by-reference arguments or captured local variables: in these cases you must use the VolatileRead and VolatileWrite methods.

Is volatile still needed inside lock statements?

Whenever I see a question like this, my knee-jerk reaction is "assume nothing!" The .NET memory model is quite weak and the C# memory model is especially noteworthy for using language that can only apply to a processor with a weak memory model that isn't even supported anymore. Nothing what's there tells you anything what's going to happen in this code, you can reason about locks and memory barriers until you're blue in the face but you don't get anywhere with it.

The x64 jitter is quite clean and rarely throws a surprise. But its days are numbered, it is going to be replaced by Ryujit in VS2015. A rewrite that started with the x86 jitter codebase as a starting point. Which is a concern, the x86 jitter can throw you for a loop. Pun intended.

Best thing to do is to just try it and see what happens. Rewriting your code a little bit and making that loop as tight as possible so the jitter optimizer can do anything it wants:

class Test {
    public bool myBool;
    private static object myLock = new object();
    public int Foo() {
        lock (myLock) {
            int cnt = 0;
            while (!myBool) cnt++;
            return cnt;
        }
    }
}

And testing it like this:

    static void Main(string[] args) {
        var obj = new Test();
        new Thread(() => {
            Thread.Sleep(1000);
            obj.myBool = true;
        }).Start();
        Console.WriteLine(obj.Foo());
    }

Switch to the Release build. Project + Properties, Build tab, tick the "Prefer 32-bit" option. Tools + Options, Debugging, General, untick the "Suppress JIT optimization" option. First run the Debug build. Works fine, program terminates after a second. Now switch to the Release build, run and observe that it deadlocks, the loop never completes. Use Debug + Break All to see that it hangs in the loop.

To see why, look at the generated machine code with Debug + Windows + Disassembly. Focusing on the loop only:

                int cnt = 0;
013E26DD  xor         edx,edx                      ; cnt = 0
                while (myBool) {
013E26DF  movzx       eax,byte ptr [esi+4]         ; load myBool 
013E26E3  test        eax,eax                      ; myBool == true?
013E26E5  jne         013E26EC                     ; yes => bail out
013E26E7  inc         edx                          ; cnt++
013E26E8  test        eax,eax                      ; myBool == true?
013E26EA  jne         013E26E7                     ; yes => loop
                }
                return cnt;

The instruction at address 013E26E8 tells the tale. Note how the myBool variable is stored in the eax register, cnt in the edx register. A standard duty of the jitter optimizer, using the processor registers and avoiding memory loads and stores makes the code much faster. And note that when it tests the value, it still uses the register and does not reload from memory. This loop can therefore never end and it will always hang your program.

Code is pretty fake of course, nobody will ever write this. In practice this tends to work by accident, you'll have more code inside the while() loop. Too much to allow the jitter to optimize the variable way entirely. But there are no hard rules that will tell you when this happens. Sometimes it does pull it off, assume nothing. Proper synchronization should never be skipped. You really are only safe with an extra lock for myBool or an ARE/MRE or Interlocked.CompareExchange(). And if you want to cut such a volatile corner then you must check.

And noted in the comments, try Thread.VolatileRead() instead. You need to use a byte instead of a bool. It still hangs, it is not a synchronization primitive.

Is there any advantage of using volatile keyword in contrast to use the Interlocked class?

EDIT: question largely rewritten

To answer this question, I dived a bit further in the matter and found out a few things about volatile and Interlocked that I wasn't aware of. Let's clear that out, not only for me, but for this discussion and other people reading up on this:

volatile read/write are supposed to be immune to reordering. This only means reading and writing, it does not mean any other action;
volatility is not forced on the CPU, i.e., hardware level (x86 uses acquire and release fences on any read/write). It does prevent compiler or CLR optimizations;
Interlocked uses atomic assembly instructions for CompareExchange (cmpxchg), Increment (inc) etc;
Interlocked does use a lock sometimes: a hardware lock on multi processor systems; in uni-processor systems, there is no hardware lock;
Interlocked is different from volatile in that it uses a full fence, where volatile uses a half fence.
A read following a write can be reordered when you use volatile. It can't happen with Interlocked. VolatileRead and VolatileWrite have the same reordering issue as `volatile (link thanks to Brian Gideon).

Now that we have the rules, we can define an answer to your question:

Technically: yes, there are things you can do with volatile that you cannot do with Interlocked:
1. Syntax: you cannot write a = b where a or b is volatile, but this is obvious;
2. You can read a different value after you write it to a volatile variable because of reordering. You cannot do this with Interlocked. In other words: you can be less safe with volatile then you can be with Interlocked.
3. Performance: volatile is faster then Interlocked.
Semantically: no, because Interlocked simply provides a superset of operations and is safer to use because it applies full fencing. You can't do anything with volatile that you cannot do with Interlocked and you can do a lot with Interlocked that you cannot do with volatile:
```
static volatile int x = 0;
x++;                        // non-atomic
static int y = 0;
Interlocked.Increment(y);   // atomic
```
Scope: yes, declaring a variable volatile makes it volatile for every single access. It is impossible to force this behavior any other way, hence volatile cannot be replaced with Interlocked. This is needed in scenarios where other libraries, interfaces or hardware can access your variable and update it anytime, or need the most recent version.

If you'd ask me, this last bit is the actual real need for volatile and may make it ideal where two processes share memory and need to read or write without locking. Declaring a variable as volatile is much safer in this context then forcing all programmers to use Interlocked (which you cannot force by the compiler).

EDIT: The following quote was part of my original answer, I'll leave it in ;-)

A quote from the the C# Programming Language standard:

For nonvolatile fields,optimization
techniques that consider that reorder
instructions can lead to unexpected
and unpredictable results in
multithreaded programs that access
fields without synchronization such as
that provided by the lock-statement.
These optimizationscan be performed by
the compiler, by the runtime system,
or by hardware. For volatile fields,
such reordering optimizations are
restricted:

A read of a volatile field is called a volatile read. A volatile read
has :acquire semantics"; that is, it
is guaranteed to occur prior to any
references to memory that occur after
it in the instruction sequence.

A write of a volatile field is called a volatile write. A
volatile write has "release
semantics"; that is, it is guaranteed
to happen after any memory references
prior to the write instruction in the
instruction sequence.

Update: question largely rewritten, corrected my original response and added a "real" answer

Is the 'volatile' keyword still broken in C#?

Volatile in its current implementation is not broken despite popular blog posts claiming such a thing. It is however badly specified and the idea of using a modifier on a field to specify memory ordering is not that great (compare volatile in Java/C# to C++'s atomic specification that had enough time to learn from the earlier mistakes). The MSDN article on the other hand was clearly written by someone who has no business talking about concurrency and is completely bogus.. the only sane option is to completely ignore it.

Volatile guarantees acquire/release semantics when accessing the field and can only be applied to types that allow atomic reads and writes. Not more, not less. This is enough to be useful to implement many lock-free algorithms efficiently such as non-blocking hashmaps.

One very simple sample is using a volatile variable to publish data. Thanks to the volatile on x, the assertion in the following snippet cannot fire:

private int a;
private volatile bool x;

public void Publish()
{
    a = 1;
    x = true;
}

public void Read()
{
    if (x)
    {
        // if we observe x == true, we will always see the preceding write to a
        Debug.Assert(a == 1); 
    }
}

Volatile is not easy to use and in most situations you are much better off to go with some higher level concept, but when performance is important or you're implementing some low level data structures, volatile can be exceedingly useful.

Are volatile variables useful? If yes then when?

This question is very confusing. Let me try to break it down.

Are volatile variables useful?

Yes. The C# team would not have added a useless feature.

If yes then when?

Volatile variables are useful in certain highly performance-sensitive multithreaded applications where the application architecture is predicated on sharing memory across threads.

As an editorial aside, I note that it should be rare for normal line-of-business C# programmers to be in any of these situations. First, the performance characteristics we are talking about here are on the order of tens of nanoseconds; most LOB applications have performance requirements measured in seconds or minutes, not in nanoseconds. Second, most LOB C# applications can do their work with only a small number of threads. Third, shared memory is a bad idea and a cause of many bugs; LOB applications which use worker threads should not use threads directly, but rather use the Task Parallel Library to safely instruct worker threads to perform calculations, and then return the results. Consider using the new await keyword in C# 5.0 to facilitate task-based asynchrony, rather than using threads directly.

Any use of volatile in a LOB application is a big red flag and should be heavily reviewed by experts, and ideally eliminated in favour of a higher-level, less dangerous practice.

lock will prevent instruction reordering.

A lock is described by the C# specification as being a special point in the code such that certain special side effects are guaranteed to be ordered in a particular way with respect to entering and leaving the lock.

volatile because will force CPU to always read value from memory (then different CPUs/cores won't cache it and they won't see old values).

What you are describing is implementation details for how volatile could be implemented; there is not a requirement that volatile be implemented by abandoning caches and going back to main memory. The requirements of volatile are spelled out in the specification.

Interlocked operations perform change + assignment in a single atomic (fast) operation.

It is not clear to me why you have parenthesized "fast" after "atomic"; "fast" is not a synonym for "atomic".

How lock will prevent cache problem?

Again: lock is documented as being a special event in the code; a compiler is required to ensure that other special events have a particular order with respect to the lock. How the compiler chooses to implement those semantics is an implementation detail.

Is it implicit a memory barrier in a critical section?

In practice yes, a lock introduces a full fence.

Volatile variables can't be local

Correct. If you are accessing a local from two threads then the local must be a special local: it could be a closed-over outer variable of a delegate, or in an async block, or in an iterator block. In all cases the local is actually realized as a field. If you want such a thing to be volatile then do not use high-level features like anonymous methods, async blocks or iterator blocks! That is mixing the highest level and the lowest level of C# coding and that is a very strange thing to do. Write your own closure class and make the fields volatile as you see fit.

I read something from Eric Lippert about this but I can't find that post now and I don't remember his answer.

Well I don't remember it either, so I typed "Eric Lippert Why can't a local variable be volatile" into a search engine. That took me to this question:

why can't a local variable be volatile in C#?

Perhaps that is what you're thinking of.

This makes me think they're not implemented with an Interlocked.CompareExchange() and friends.

C# implements volatile fields as volatile fields. Volatile fields are a fundamental concept in the CLR; how the CLR implements them is an implementation detail of the CLR.

in what they're different?

I don't understand the question.

What volatile modifier will do for example in this code?

++_volatileField;

It does nothing helpful, so don't do that. Volatility and atomicity are completely different things. Doing a normal increment on a volatile field does not make the increment into an atomic increment.

Moreover what compiler (beside warnings) will do here:

The C# compiler really ought to suppress that warning if the method being called introduces a fence, as this one does. I never managed to get that into the compiler. Hopefully the team will someday.

The volatile field will be updated in an atomic manner. A fence will be introduced by the increment, so the fact that the volatile half-fences are skipped is mitigated.

How is it possible for non volatile fields?

That's an implementation detail of the CLR.

Does they imply barriers too?

Yes, the interlocked operations introduce barriers. Again, this is an implementation detail.

Doesn't this hurt performance a lot (compared to volatile)?

First off, comparing the performance of broken code to working code is a waste of time.

Second, if you do feel like wasting time, you are perfectly capable of measuring the performance of each yourself. Write the code both ways, get out a stopwatch, run it a trillion times each way, and you'll know which is faster.

C# - Volatile Keyword Usage VS Lock