Why Would Try/Finally Rather Than a "Using" Statement Help Avoid a Race Condition

Why would try/finally rather than a using statement help avoid a race condition?

Yeah, there is a possible race in the using statement. The C# compiler transforms

using (var obj = new Foo()) {
    // statements
}

to:

var obj = new Foo();
try {
    // statements
}
finally {
    if (obj != null) obj.Dispose();
}

The race occurs when the thread is aborted right between the obj assignment statement and the try block. Extremely small odds, but not zero. The object won't be disposed when that happens. Note how that code can be rewritten by moving the assignment inside the try block so this race cannot occur. Nothing goes fundamentally wrong when the race does occur; disposing objects is not a strict requirement.
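
A sketch of what "moving the assignment inside the try block" looks like; the null check in the finally also guards against an abort striking before the assignment completes:

Foo obj = null;
try {
    obj = new Foo();   // the assignment now happens inside the protected region
    // statements
}
finally {
    if (obj != null) obj.Dispose();
}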

If you have to choose between making thread aborts work marginally better and writing using statements by hand, the better option is to not get into the habit of using Thread.Abort() in the first place. I can't recommend actually doing this: the using statement has additional safety measures to ensure accidents don't happen; for one, the resource variable is read-only inside the block, so the original object cannot be accidentally replaced before it is disposed, and adding catch clauses is less prone to accidents as well. The using statement exists to reduce the likelihood of bugs; always use it.


Noodling on a bit more about this problem: there's another common C# statement that suffers from the exact same race. It looks like this:

lock (obj) {
    // statements
}

Translated to:

Monitor.Enter(obj);
// <=== Eeeek!
try {
    // statements
}
finally {
    Monitor.Exit(obj);
}

Exact same scenario: the thread abort can strike after the Enter() call and before the try block is entered, which prevents the Exit() call from being made. This is far nastier than a Dispose() call that isn't made, of course; it is almost certainly going to cause a deadlock. The problem is specific to the x64 jitter; the sordid details are described well in this Joe Duffy blog post.

It is very hard to reliably fix this one; moving the Enter() call inside the try block doesn't solve the problem, because then you cannot be sure the Enter() call was actually made, so you cannot reliably call Exit() without possibly triggering an exception. The Monitor.ReliableEnter() method that Duffy was talking about did eventually happen: the .NET 4 version of Monitor got Enter() and TryEnter() overloads that take a ref bool lockTaken argument. Now you know whether it is okay to call Exit().
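
A sketch of the resulting pattern, roughly what compilers targeting .NET 4 and later emit for a lock statement (obj is the same lock object as above):

bool lockTaken = false;
try {
    Monitor.Enter(obj, ref lockTaken);   // lockTaken becomes true atomically with acquiring the lock
    // statements
}
finally {
    if (lockTaken) Monitor.Exit(obj);
}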

Well, scary stuff that goes BUMP in the night when you are not looking. Writing code that is safely interruptible is hard. You'd be wise never to assume that code you didn't write got all of this taken care of. Testing such code is extremely difficult since the race is so rare; you can never be sure.

Java try-finally race condition?

What if there's a thread interruption right between the openResource() call and entering the try clause?

Then the thread won't throw an InterruptedException until it hits some blocking call, and that can't happen before it gets into the try block, because there are no blocking calls in between, assuming openResource() actually returns. From the docs for InterruptedException:

Thrown when a thread is waiting, sleeping, or otherwise occupied, and the thread is interrupted, either before or during the activity. Occasionally a method may wish to test whether the current thread has been interrupted, and if so, to immediately throw this exception.

Note that even if you do put the acquisition inside the try block, that doesn't really prevent any race condition that would otherwise exist - because you'd be relying on the method or constructor returning in the first place. If an exception can happen after the method/constructor returns, why can't it happen just before it returns, but after the resource has been acquired? If that happens, there's nothing you can call close on...

I'd still recommend using the try-with-resources statement in Java 7, but if you look at the JLS section 14.20.3.1 you'll see that the expansion is like your first piece of code:

The meaning of a basic try-with-resources statement:

try ({VariableModifier} R Identifier = Expression ...)
    Block

is given by the following translation to a local variable declaration and a try-catch-finally statement:

{
    final {VariableModifierNoFinal} R Identifier = Expression;
    Throwable #primaryExc = null;

    try ResourceSpecification_tail
        Block
    catch (Throwable #t) {
        ...
        #primaryExc = #t;
        throw #t;
    } finally {
        ...
    }
}

Is C#'s using statement abort-safe?

The book's companion web site has more info on aborting threads here.

In short, the first translation is correct (you can tell by looking at the IL).

The answer to your second question is that there may be scenarios where the variable can legitimately be null. For instance, GetFoo() may return null here, in which case you wouldn't want a NullReferenceException thrown in the implicit finally block:

using (var x = GetFoo())
{
    ...
}

To answer your third question, the only way to make Abort safe (if you're calling Framework code) is to tear down the AppDomain afterward. This is actually a practical solution in many cases (it's exactly what LINQPad does whenever you cancel a running query).
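
A rough sketch of that approach on the .NET Framework (AppDomains don't exist on .NET Core and later; RunQuery stands in for whatever static method does the interruptible work):

var domain = AppDomain.CreateDomain("WorkerDomain");
var worker = new Thread(() => domain.DoCallBack(RunQuery));   // RunQuery: hypothetical static void method
worker.Start();

// To cancel:
worker.Abort();               // may leave locks and resources in an unknown state...
worker.Join();
AppDomain.Unload(domain);     // ...so discard the whole domain afterward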

`using` statement with uninitialized variable not possible, nor is finally, so how to properly dispose it?

Actually, the compiler says that you are using a potentially uninitialized variable, so just initialize it:

DisposableObject worker = null;
try
{
    if (/*condition*/)
        worker = new DisposableObject(/*args*/);
    else
        worker = new DisposableObject(/*other args*/);
    worker.DoStuff();
}
finally
{
    if (worker != null)
        worker.Dispose();
}

does aborting a thread while in 'using' block dispose used instance

Most likely it will, but you can't be sure.

According to the documentation:

When this method [Abort] is invoked on a thread, the system throws a ThreadAbortException in the thread to abort it.

And we know exceptions will still let using statements dispose, as they should (give or take a few exceptions).

On the other hand, if you can end the thread gracefully, for example with a CancellationTokenSource, it would be a lot nicer for your app. It will offer much more control over the actual termination of your thread and the handling of exceptions.
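
A minimal sketch of that cooperative approach (SomeDisposable and the work loop are placeholders):

var cts = new CancellationTokenSource();
var worker = new Thread(() =>
{
    using (var resource = new SomeDisposable())          // placeholder disposable resource
    {
        while (!cts.Token.IsCancellationRequested)
        {
            // do one unit of work
        }
    }   // Dispose always runs: the loop exits normally instead of being aborted mid-flight
});
worker.Start();

// Later, instead of worker.Abort():
cts.Cancel();
worker.Join();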

What is a race condition?

A race condition occurs when two or more threads can access shared data and they try to change it at the same time. Because the thread scheduling algorithm can swap between threads at any time, you don't know the order in which the threads will attempt to access the shared data. Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e. both threads are "racing" to access/change the data.

Problems often occur when one thread does a "check-then-act" (e.g. "check" if the value is X, then "act" to do something that depends on the value being X) and another thread does something to the value in between the "check" and the "act". E.g:

if (x == 5) // The "Check"
{
    y = x * 2; // The "Act"

    // If another thread changed x in between "if (x == 5)" and "y = x * 2" above,
    // y will not be equal to 10.
}

The point being, y could be 10, or it could be anything, depending on whether another thread changed x in between the check and act. You have no real way of knowing.

In order to prevent race conditions from occurring, you would typically put a lock around the shared data to ensure only one thread can access the data at a time. This would mean something like this:

// Obtain lock for x
if (x == 5)
{
    y = x * 2; // Now, nothing can change x until the lock is released.
               // Therefore y = 10
}
// Release lock for x
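
In C#, for instance, the pseudocode above maps onto the lock statement; xLock is an assumed private field, and every thread that touches x must take the same lock for the guarantee to hold:

private static readonly object xLock = new object();

// ...

lock (xLock)
{
    if (x == 5)        // the "check"
    {
        y = x * 2;     // the "act": no thread that also locks xLock can change x here, so y == 10
    }
}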

Does this JavaScript example create “race conditions”? (To the extent that they can exist in JavaScript)

Yes, race conditions can and do occur in JS as well. Just because it is single-threaded doesn't mean race conditions can't happen (although they are rarer). JavaScript is indeed single-threaded, but it is also asynchronous: a logical sequence of instructions is often divided into smaller chunks executed at different times. This makes interleaving possible, and hence race conditions arise.


For a simple example, consider...

var x = 1;

async function foo() {
    var y = x;
    await delay(100); // whatever async here
    x = y + 1;
}

...which is the classical example of the non-atomic increment adapted to JavaScript's asynchronous world.

Now compare the following "parallel" execution:

await Promise.all([foo(), foo(), foo()]);
console.log(x); // prints 2

...with the "sequential" one:

await foo();
await foo();
await foo();
console.log(x); // prints 4

Note that the results are different, i.e. foo() is not "async safe".


Even in JS you sometimes have to use "async mutexes", and your example might be one of those situations, depending on what happens in between (e.g. whether some asynchronous call occurs). Without an asynchronous call inside do more stuff, the mutation appears to occur in a single block of code (bounded by asynchronous calls, but with no asynchronous call inside to allow interleaving), and should be OK, I think. Note that in your example the assignment in a happens after an await, while b is called before the final await.

Which, and why, do you prefer Exceptions or Return codes?

For some languages (e.g. C++), resource leaks should not be a reason

C++ is based on RAII.

If you have code that could fail, return or throw (that is, most normal code), then you should have your pointer wrapped inside a smart pointer (assuming you have a very good reason not to create your object on the stack).

Return codes are more verbose

They are verbose, and tend to develop into something like:

if (doSomething())
{
    if (doSomethingElse())
    {
        if (doSomethingElseAgain())
        {
            // etc.
        }
        else
        {
            // react to failure of doSomethingElseAgain
        }
    }
    else
    {
        // react to failure of doSomethingElse
    }
}
else
{
    // react to failure of doSomething
}

In the end, your code is a collection of indented instructions (I have seen this kind of code in production).

This code could well be translated into:

try
{
    doSomething();
    doSomethingElse();
    doSomethingElseAgain();
}
catch (const SomethingException & e)
{
    // react to failure of doSomething
}
catch (const SomethingElseException & e)
{
    // react to failure of doSomethingElse
}
catch (const SomethingElseAgainException & e)
{
    // react to failure of doSomethingElseAgain
}

This cleanly separates the code from the error processing, which can be a good thing.

Return codes are more brittle

Barring some obscure warning from one compiler (see "phjr"'s comment), they can easily be ignored.

With the above examples, assume that someone forgets to handle a possible error (this happens...). The error is ignored when "returned", and will possibly explode later (e.g. a NULL pointer dereference). The same problem won't happen with an exception.

The error won't be ignored. Sometimes you want it to not explode, though... So you must choose carefully.

Return Codes must sometimes be translated

Let's say we have the following functions:

  • doSomething, which can return an int called NOT_FOUND_ERROR
  • doSomethingElse, which can return a bool "false" (for failed)
  • doSomethingElseAgain, which can return an Error object (with the __LINE__, __FILE__ and half the stack variables)
  • doTryToDoSomethingWithAllThisMess which, well... uses the above functions, and returns an error code of type...

What is the return type of doTryToDoSomethingWithAllThisMess if one of the functions it calls fails?

Return Codes are not a universal solution

Operators cannot return an error code, and neither can C++ constructors.

Return Codes mean you can't chain expressions

The corollary of the above point. What if I want to write:

CMyType o = add(a, multiply(b, c));

I can't, because the return value is already used (and sometimes, it can't be changed). So the return value becomes the first parameter, sent as a reference... Or not.

Exceptions are typed

You can throw different classes for each kind of exception. Resource exceptions (e.g. out of memory) should be light, but anything else can be as heavy as necessary (I like the Java Exception giving me the whole stack).

Each catch can then be specialized.

Don't ever use catch(...) without re-throwing

Usually, you should not hide an error. If you do not re-throw, at the very least, log the error in a file, open a messagebox, whatever...

Exceptions are... NUKES

The problem with exceptions is that overusing them will produce code full of try/catches. But the problem lies elsewhere: who wraps their use of STL containers in try/catch? Still, those containers can throw exceptions.

Of course, in C++, don't ever let an exception exit a destructor.

Exceptions are... synchronous

Be sure to catch them before they bring your thread to its knees or propagate into your Windows message loop.

The solution could be mixing them?

So I guess the solution is to throw when something should not happen, and when something can happen, use a return code or a parameter to enable the user to react to it.

So, the only question is "what is something that should not happen?"

It depends on the contract of your function. If the function accepts a pointer but specifies the pointer must be non-NULL, then it is OK to throw an exception when the user passes a NULL pointer (the question being, in C++, why didn't the function author use a reference instead of a pointer, but...)
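
The same contract idea expressed in C#, to stay in the language of the earlier sections (DoSomething and Foo are made-up names):

void DoSomething(Foo foo)
{
    // Contract: foo must not be null. A null argument is a programming error,
    // so an exception, not a return code, is the appropriate response.
    if (foo == null)
        throw new ArgumentNullException(nameof(foo));

    // normal processing...
}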

Another solution would be to show the error

Sometimes, your problem is that you don't want errors. Using exceptions or error return codes is cool, but... you want to know about it.

At my job, we use a kind of "Assert". Depending on the values in a configuration file, and regardless of the debug/release compile options, it will:

  • log the error
  • open a messagebox with a "Hey, you have a problem"
  • open a messagebox with a "Hey, you have a problem, do you want to debug"

In both development and testing, this enables the user to pinpoint the problem exactly when it is detected, not afterwards (when some code finally cares about the return value, or inside a catch).

It is easy to add to legacy code. For example:

void doSomething(CMyObject * p, int iRandomData)
{
    // etc.
}

leads to code similar to:

void doSomething(CMyObject * p, int iRandomData)
{
    if (iRandomData < 32)
    {
        MY_RAISE_ERROR("Hey, iRandomData " << iRandomData << " is lesser than 32. Aborting processing");
        return;
    }

    if (p == NULL)
    {
        MY_RAISE_ERROR("Hey, p is NULL !\niRandomData is equal to " << iRandomData << ". Will throw.");
        throw std::some_exception();
    }

    if (!p->isOk())
    {
        MY_RAISE_ERROR("Hey, p is NOT Ok!\np is equal to " << p->toString() << ". Will try to continue anyway");
    }

    // etc.
}

(I have similar macros that are active only in debug builds.)

Note that in production the configuration file does not exist, so the client never sees the result of this macro... but it is easy to activate when needed.

Conclusion

When you code using return codes, you're preparing yourself for failure and hoping your fortress of tests is secure enough.

When you code using exceptions, you know that your code can fail, and you put counterfire catches at chosen strategic positions in your code. But usually, your code is more about "what it must do" than "what I fear will happen".

But whenever you code, you must use the best tool at your disposal, and sometimes it is "Never hide an error, and show it as soon as possible". The macro I spoke of above follows this philosophy.


