C++ Move Semantics and Exceptions

Does an exception use move semantics when thrown in C++11?

I have just checked, and the Standard allows

  • omitting the copy or move of an object specified by the operand of a throw expression into the exception object
  • omitting the copy or move of the exception object into the catch clause variable of the same type as the exception object, provided this does not otherwise change the meaning of the program (i.e. if you rethrew, subsequent catch clauses would not suddenly see an exception object modified by the previous catch block).

Since these omissions are allowed, the spec requires that the source of the copy or move first be treated as an rvalue. This means that the respective objects will be moved if possible. Of course, copy and move elision are still allowed as the first choice.


Update

I was notified that treating the exception object as an rvalue when it initializes a catch clause parameter will probably be dropped from the Standard (because it is not possible in general to detect whether omitting a copy/move leaves the behavior of the program unchanged), so I recommend not relying on this (second bullet above).

What you can still rely on is the move of a local variable into the exception object, as in throw x; (first bullet above).
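
The following is a minimal sketch of that first bullet, assuming a conforming C++11 compiler. Tracked, fail and check_throw_moves are hypothetical names introduced only for illustration. The counters record whether the local variable was copied or moved into the exception object; elision would leave both at zero.

```cpp
#include <cassert>

// Hypothetical exception type that counts how it was transferred.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Throws a local variable by name. Its scope ends with the throw,
// so the operand is first treated as an rvalue.
void fail() {
    Tracked x;
    throw x;  // moved into the exception object, or the move is elided
}

// Throws and catches once, then checks that no copy constructor ran.
void check_throw_moves() {
    Tracked::copies = 0;
    Tracked::moves = 0;
    try {
        fail();
    } catch (const Tracked&) {  // by reference: nothing is copied here
    }
    assert(Tracked::copies == 0);
    assert(Tracked::moves <= 1);  // 0 if elided, 1 if moved
}
```

Catching by reference keeps the second bullet out of the picture, so the counters only reflect the throw itself.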

Exception safe code and move semantics

Use std::move_if_noexcept when you are writing exception-sensitive code but still want to use move semantics whenever that is known at compile time to be safe.

See Scott Meyers' talk at GoingNative 2013 for more details on that.

PS: Remember that if your type is not copy constructible, it will be moved regardless of whether its move constructor is noexcept.
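
To make the three cases concrete, here is a small compile-time sketch. SafeMove, RiskyMove, MoveOnly and yields_rvalue are hypothetical names, not part of any library. It inspects the return type of std::move_if_noexcept, which is an rvalue reference when a move will be attempted and a const lvalue reference when the code falls back to a copy.

```cpp
#include <type_traits>
#include <utility>

struct SafeMove {  // move cannot throw
    SafeMove() = default;
    SafeMove(const SafeMove&) = default;
    SafeMove(SafeMove&&) noexcept = default;
};

struct RiskyMove {  // move may throw, but a copy is available
    RiskyMove() = default;
    RiskyMove(const RiskyMove&) {}
    RiskyMove(RiskyMove&&) noexcept(false) {}
};

struct MoveOnly {  // move may throw and there is no copy at all
    MoveOnly() = default;
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&&) noexcept(false) {}
};

// True when std::move_if_noexcept hands back T&& for an lvalue of type T.
template <class T>
constexpr bool yields_rvalue() {
    return std::is_same<decltype(std::move_if_noexcept(std::declval<T&>())),
                        T&&>::value;
}

static_assert(yields_rvalue<SafeMove>(), "noexcept move: moved");
static_assert(!yields_rvalue<RiskyMove>(), "throwing move with a copy: copied");
static_assert(yields_rvalue<MoveOnly>(), "no copy available: moved anyway");
```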

What is move semantics?

I find it easiest to understand move semantics with example code. Let's start with a very simple string class which only holds a pointer to a heap-allocated block of memory:

    #include <cstring>
    #include <algorithm>

    class string
    {
        char* data;

    public:

        string(const char* p)
        {
            size_t size = std::strlen(p) + 1;
            data = new char[size];
            std::memcpy(data, p, size);
        }

Since we chose to manage the memory ourselves, we need to follow the rule of three. I am going to defer writing the assignment operator and only implement the destructor and the copy constructor for now:

        ~string()
        {
            delete[] data;
        }

        string(const string& that)
        {
            size_t size = std::strlen(that.data) + 1;
            data = new char[size];
            std::memcpy(data, that.data, size);
        }

The copy constructor defines what it means to copy string objects. The parameter const string& that binds to all expressions of type string which allows you to make copies in the following examples:

    string a(x);                                    // Line 1
    string b(x + y);                                // Line 2
    string c(some_function_returning_a_string());   // Line 3

Now comes the key insight into move semantics. Note that only in the first line where we copy x is this deep copy really necessary, because we might want to inspect x later and would be very surprised if x had changed somehow. Did you notice how I just said x three times (four times if you include this sentence) and meant the exact same object every time? We call expressions such as x "lvalues".

The arguments in lines 2 and 3 are not lvalues, but rvalues, because the underlying string objects have no names, so the client has no way to inspect them again at a later point in time. rvalues denote temporary objects which are destroyed at the next semicolon (to be more precise: at the end of the full-expression that lexically contains the rvalue). This is important because during the initialization of b and c, we could do whatever we wanted with the source string, and the client couldn't tell a difference!

C++0x introduces a new mechanism called "rvalue reference" which, among other things, allows us to detect rvalue arguments via function overloading. All we have to do is write a constructor with an rvalue reference parameter. Inside that constructor we can do anything we want with the source, as long as we leave it in some valid state:

        string(string&& that)   // string&& is an rvalue reference to a string
        {
            data = that.data;
            that.data = nullptr;
        }

What have we done here? Instead of deeply copying the heap data, we have just copied the pointer and then set the original pointer to null (to prevent 'delete[]' from source object's destructor from releasing our 'just stolen data'). In effect, we have "stolen" the data that originally belonged to the source string. Again, the key insight is that under no circumstance could the client detect that the source had been modified. Since we don't really do a copy here, we call this constructor a "move constructor". Its job is to move resources from one object to another instead of copying them.
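
The overload-based detection of rvalues can also be seen in isolation. This is a small sketch using std::string rather than the class above; binds_as_rvalue, make_string and check_overloads are hypothetical helper names.

```cpp
#include <cassert>
#include <string>

// Overload resolution picks one of these based on the value category
// of the argument.
bool binds_as_rvalue(const std::string&) { return false; }  // lvalues land here
bool binds_as_rvalue(std::string&&) { return true; }        // rvalues land here

std::string make_string() { return "temporary"; }

void check_overloads() {
    std::string x = "named";
    assert(!binds_as_rvalue(x));             // x has a name: lvalue
    assert(binds_as_rvalue(x + "!"));        // unnamed result of operator+
    assert(binds_as_rvalue(make_string()));  // unnamed function result
}
```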

Congratulations, you now understand the basics of move semantics! Let's continue by implementing the assignment operator. If you're unfamiliar with the copy and swap idiom, learn it and come back, because it's an awesome C++ idiom related to exception safety.

        string& operator=(string that)
        {
            std::swap(data, that.data);
            return *this;
        }
    };

Huh, that's it? "Where's the rvalue reference?" you might ask. "We don't need it here!" is my answer :)

Note that we pass the parameter that by value, so that has to be initialized just like any other string object. Exactly how is that going to be initialized? In the olden days of C++98, the answer would have been "by the copy constructor". In C++0x, the compiler chooses between the copy constructor and the move constructor based on whether the argument to the assignment operator is an lvalue or an rvalue.

So if you say a = b, the copy constructor will initialize that (because the expression b is an lvalue), and the assignment operator swaps the contents with a freshly created, deep copy. That is the very definition of the copy and swap idiom -- make a copy, swap the contents with the copy, and then get rid of the copy by leaving the scope. Nothing new here.

But if you say a = x + y, the move constructor will initialize that (because the expression x + y is an rvalue), so there is no deep copy involved, only an efficient move. that is still an independent object from the argument, but its construction was trivial, since the heap data didn't have to be copied, just moved. It wasn't necessary to copy it because x + y is an rvalue, and again, it is okay to move from string objects denoted by rvalues.

To summarize, the copy constructor makes a deep copy, because the source must remain untouched. The move constructor, on the other hand, can just copy the pointer and then set the pointer in the source to null. It is okay to "nullify" the source object in this manner, because the client has no way of inspecting the object again.
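
For readers who want to try this out, here is the class assembled into one piece as a sketch. It is renamed simple_string to avoid confusion with std::string, and the c_str() accessor and the check_simple_string function are additions made only so the effect can be observed. std::move, which this walkthrough does not cover, is used here to turn a named object into an rvalue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class simple_string {
    char* data;

public:
    simple_string(const char* p) {
        std::size_t size = std::strlen(p) + 1;
        data = new char[size];
        std::memcpy(data, p, size);
    }

    ~simple_string() { delete[] data; }  // delete[] on nullptr does nothing

    simple_string(const simple_string& that) {  // deep copy
        std::size_t size = std::strlen(that.data) + 1;
        data = new char[size];
        std::memcpy(data, that.data, size);
    }

    simple_string(simple_string&& that) noexcept : data(that.data) {
        that.data = nullptr;  // the source no longer owns the buffer
    }

    simple_string& operator=(simple_string that) noexcept {  // copy and swap
        std::swap(data, that.data);
        return *this;
    }

    // Added for observation only; returns nullptr for a moved-from object.
    const char* c_str() const { return data; }
};

void check_simple_string() {
    simple_string a("hello");
    simple_string b(a);  // lvalue: copy constructor, a is untouched
    assert(std::strcmp(a.c_str(), "hello") == 0);
    assert(a.c_str() != b.c_str());  // two separate buffers

    const char* buffer = b.c_str();
    simple_string c(std::move(b));  // rvalue: move constructor
    assert(c.c_str() == buffer);    // same heap block, nothing copied
    assert(b.c_str() == nullptr);   // source left empty but destructible

    a = std::move(c);  // the parameter is move-constructed, then swapped
    assert(a.c_str() == buffer);
}
```

Note that a moved-from simple_string may only be destroyed or assigned to; copying it would call std::strlen on a null pointer.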

I hope this example got the main point across. There is a lot more to rvalue references and move semantics which I intentionally left out to keep it simple. If you want more details please see my supplementary answer.

Move constructor is not called when throwing an exception

This is an MSVC bug. From [except.throw]:

Throwing an exception copy-initializes (8.5, 12.8) a temporary object, called the exception object.

That means we do:

ThrowMoveTest __exception_object = move(tmt1);

which should definitely call the move constructor.


Note that the move here is unnecessary and also damaging. [class.copy] stipulates that copy/move construction can be elided

— in a throw-expression (5.17), when the operand is the name of a non-volatile automatic object (other than
a function or catch-clause parameter) whose scope does not extend beyond the end of the innermost
enclosing try-block (if there is one), the copy/move operation from the operand to the exception
object (15.1) can be omitted by constructing the automatic object directly into the exception object

So simply throw tmt1; would have allowed tmt1 to be constructed directly into the exception object, although neither gcc nor clang does this.

And even if the copy/move is not elided:

When the criteria for elision of a copy/move operation are met, but not for an exception-declaration, and the
object to be copied is designated by an lvalue [...] overload resolution
to select the constructor for the copy is first performed as if the object were designated by an rvalue.

So throw tmt1; would still move-construct the exception object.

Why doesn't std::exception have a move constructor?

The descendants of std::exception do own data. For example std::runtime_error owns its what() message. And that message is dynamically allocated because it can be an arbitrarily long message.

However, the copy constructor is marked noexcept (implicitly) because the std::exception copy constructor is noexcept.

    #include <stdexcept>
    #include <type_traits>

    int
    main()
    {
        static_assert(std::is_nothrow_copy_constructible<std::runtime_error>{});
    }

The only way for a class to own a dynamically allocated message, and have a noexcept copy constructor, is for that ownership to be shared (reference counted). So std::runtime_error is essentially a const, reference counted string.

There was simply no motivation to give these types a move constructor: the copy constructor is already very fast, and the exceptional path of a program is only executed in exceptional circumstances. About the only thing a move constructor for std::runtime_error could do is eliminate an atomic increment/decrement. And no one cared.
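
To illustrate the shared-ownership idea, here is a rough sketch built on std::shared_ptr. It is not how any particular standard library implements std::runtime_error; shared_message and check_shared_message are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Owns a heap-allocated message, yet copies without throwing, because
// all copies share one immutable buffer.
class shared_message {
    std::shared_ptr<const std::string> text_;

public:
    // Allocation happens once, here, and may throw.
    explicit shared_message(const std::string& text)
        : text_(std::make_shared<std::string>(text)) {}

    // Copying only bumps a reference count, so it cannot throw.
    shared_message(const shared_message&) noexcept = default;
    shared_message& operator=(const shared_message&) noexcept = default;

    const char* what() const noexcept { return text_->c_str(); }
};

static_assert(std::is_nothrow_copy_constructible<shared_message>::value,
              "copying shares the buffer instead of allocating");

void check_shared_message() {
    shared_message original("disk full");
    shared_message copy(original);
    assert(original.what() == copy.what());  // same buffer, not a duplicate
}
```

With this design a move constructor would save only the reference-count update, which is the point made above.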

Wouldn't having a move-constructor allow cheaply catching by value and thus simplifying the guidelines?

You can already cheaply catch by value. But the guideline exists because exceptions are often part of an inheritance hierarchy, and catching by value would slice the exception:

    #include <exception>
    #include <iostream>
    #include <stdexcept>

    int
    main()
    {
        try
        {
            throw std::runtime_error("my message");
        }
        catch (std::exception e)
        {
            std::cout << e.what() << '\n';
        }
    }

Output (for me):

std::exception
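
For contrast, here is a sketch of the same throw caught by const reference, which is what the guideline recommends. The function names are hypothetical. Because the handler refers to the original std::runtime_error object, the virtual what() override is still in effect.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Catches by const reference, so nothing is sliced.
std::string message_when_caught_by_reference() {
    std::string result;
    try {
        throw std::runtime_error("my message");
    } catch (const std::exception& e) {
        result = e.what();  // dispatches to std::runtime_error::what()
    }
    return result;
}

void check_catch_by_reference() {
    assert(message_when_caught_by_reference() == "my message");
}
```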

Are C++11 move semantics doing something new, or just making semantics clearer?

TL;DR

This is definitely something new and it goes well beyond just being a way to avoid copying memory.

Long Answer: Why it's new and some perhaps non-obvious implications

Move semantics are just what the name implies--that is, a way to explicitly declare instructions for moving objects rather than copying. In addition to the obvious efficiency benefit, this also affords a programmer a standards-compliant way to have objects that are movable but not copyable. Objects that are movable and not copyable convey a very clear boundary of resource ownership via standard language semantics. This was possible in the past, but there was no standard/unified (or STL-compatible) way to do this.

This is a big deal because having a standard and unified semantic benefits both programmers and compilers. Programmers don't have to spend time potentially introducing bugs into a move routine that can reliably be generated by compilers (most cases); compilers can now make appropriate optimizations because the standard provides a way to inform the compiler when and where you're doing standard moves.

Move semantics is particularly interesting because it suits the RAII idiom very well, and RAII is a long-standing cornerstone of C++ best practice. RAII encompasses much more than just this example, but my point is that move semantics is now a standard way to concisely express (among other things) movable-but-not-copyable objects.

You don't always have to explicitly define this functionality in order to prevent copying. A compiler feature known as "copy elision" will eliminate quite a lot of unnecessary copies from functions that pass by value.

Criminally-Incomplete Crash Course on RAII (for the uninitiated)

I realize you didn't ask for a code example, but here's a really simple one that might benefit a future reader who might be less familiar with the topic or the relevance of Move Semantics to RAII practices. (If you already understand this, then skip the rest of this answer)

    // non-copyable class that manages lifecycle of a resource
    // note: non-virtual destructor--probably not an appropriate candidate
    // for serving as a base class for objects handled polymorphically.
    class res_t {
        using handle_t = /* whatever */;
        handle_t* handle; // Pointer to owned resource
    public:
        res_t( const res_t& src ) = delete;            // no copy constructor
        res_t& operator=( const res_t& src ) = delete; // no copy-assignment

        // Declared rather than defaulted: a defaulted move would only copy the
        // raw pointer and leave two owners. The definitions must null src.handle.
        res_t( res_t&& src ) noexcept;            // Move constructor
        res_t& operator=( res_t&& src ) noexcept; // Move-assignment

        res_t();  // Default constructor
        ~res_t(); // Destructor
    };

Objects of this class will allocate/provision whatever resource is needed upon construction and then free/release it upon destruction. Since the resource pointed to by the data member can never accidentally be transferred to another object, the rightful owner of a resource is never in doubt. In addition to making your code less prone to abuse or errors (and easily compatible with STL containers), your intentions will be immediately recognized by any programmer familiar with this standard practice.
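
A filled-in version of such a class might look like the sketch below. A heap-allocated int stands in for the real resource, and int_resource, empty(), value() and store_and_read are hypothetical names chosen for the example. The move operations are written by hand so that the source gives up its pointer and exactly one object frees the resource.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

class int_resource {
    int* handle;  // owned; nullptr after being moved from

public:
    explicit int_resource(int value) : handle(new int(value)) {}
    ~int_resource() { delete handle; }

    int_resource(const int_resource&) = delete;
    int_resource& operator=(const int_resource&) = delete;

    // Take the pointer and leave the source empty.
    int_resource(int_resource&& src) noexcept : handle(src.handle) {
        src.handle = nullptr;
    }
    int_resource& operator=(int_resource&& src) noexcept {
        if (this != &src) {
            delete handle;
            handle = src.handle;
            src.handle = nullptr;
        }
        return *this;
    }

    bool empty() const noexcept { return handle == nullptr; }
    int value() const { return *handle; }  // precondition: !empty()
};

static_assert(!std::is_copy_constructible<int_resource>::value, "not copyable");
static_assert(std::is_nothrow_move_constructible<int_resource>::value, "movable");

// Transfers ownership into a standard container and reads the value back.
int store_and_read(int v) {
    int_resource r(v);
    std::vector<int_resource> owners;
    owners.push_back(std::move(r));  // ownership moves into the vector
    assert(r.empty());               // r no longer owns anything
    return owners.front().value();
}
```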

Sink arguments and move semantics for functions that can fail (strong exception safety)

Apparently this issue was the subject of lively discussion at the recent CppCon 2014. Herb Sutter summarized the latest state of things in his closing talk, Back to the Basics! Essentials of Modern C++ Style (slides).

His conclusion is quite simply: Don't use pass-by-value for sink arguments.

The arguments for using this technique in the first place (as popularized by Eric Niebler's Meeting C++ 2013 keynote C++11 Library design (slides)) seem to be outweighed by the disadvantages. The initial motivation for passing sink arguments by-value was to get rid of the combinatorial explosion for function overloads that results from using const&/&&.

Unfortunately, it seems that this brings a number of unintended consequences. One is a potential efficiency drawback (mainly due to unnecessary buffer allocations). The other is the exception-safety problem from this question. Both are discussed in Herb's talk.

Herb's conclusion is to not use pass-by-value for sink arguments, but instead rely on separate const&/&& (with const& being the default and && reserved for those few cases where optimization is required).

This also matches with what @Potatoswatter's answer suggested. By passing the sink argument via && we might be able to defer the actual moving of the data from the argument to a point where we can give a noexcept guarantee.
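
As a sketch of what the const&/&& pair looks like in practice, consider a setter with a sink parameter. The class and its members are hypothetical names for illustration, and the noexcept on the rvalue overload assumes std::string with the default allocator, whose move assignment does not throw.

```cpp
#include <cassert>
#include <string>
#include <utility>

class employee {
    std::string name_;

public:
    // Default overload: copies. The copy may throw, but the caller's
    // argument is untouched either way.
    void set_name(const std::string& name) { name_ = name; }

    // Optimization for rvalues: the move happens inside the function,
    // at a point where a noexcept guarantee can be given.
    void set_name(std::string&& name) noexcept { name_ = std::move(name); }

    const std::string& name() const noexcept { return name_; }
};

void check_sink_overloads() {
    employee e;
    std::string keep = "Ada";
    e.set_name(keep);  // lvalue: const& overload
    assert(keep == "Ada");
    assert(e.name() == "Ada");

    e.set_name(std::string("Grace"));  // rvalue: && overload
    assert(e.name() == "Grace");
}
```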

I kind of liked the idea of passing sink arguments by-value, but it seems that it does not hold up as well in practice as everyone hoped.

Update after thinking about this for 5 years:

I am now convinced that my motivating example is a misuse of move semantics. After the invocation of processBigData(std::move(b));, I should never be allowed to assume what the state of b is, even if the function exits with an exception. Doing so leads to code that is hard to follow and to maintain.

Instead, if the contents of b should be recoverable in the error case, this needs to be made explicit in the code. For example:

    #include <stdexcept>
    #include <utility>

    class BigDataException : public std::runtime_error {
    private:
        BigData b;
    public:
        // rvalue-qualified: callable only on an expiring exception object
        BigData retrieveDataAfterError() &&;

        // [...]
    };

    BigData b = retrieveData();
    Result r;
    try {
        r = processBigData(std::move(b));
    } catch(BigDataException& e) {
        b = std::move(e).retrieveDataAfterError();
        r = fixEnvironmentAndTryAgain(std::move(b));
    }

If I want to recover the contents of b, I need to explicitly pass them out along the error path (in this case wrapped inside the BigDataException). This approach requires a bit of additional boilerplate, but it is more idiomatic in that it does not require making assumptions about the state of a moved-from object.
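
For completeness, here is a self-contained sketch of that pattern. BigData is assumed to be a std::vector<int>, the failure condition (a negative element) is invented, and processWithRecovery is a hypothetical wrapper that plays the role of the try/catch above.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <utility>
#include <vector>

using BigData = std::vector<int>;  // assumption: stand-in for the real type

class BigDataException : public std::runtime_error {
    BigData b;

public:
    BigDataException(const char* msg, BigData data)
        : std::runtime_error(msg), b(std::move(data)) {}

    // Hands the data back; callable only on an rvalue exception object.
    BigData retrieveDataAfterError() && { return std::move(b); }
};

// Takes ownership of the input. On failure the data travels back to the
// caller inside the exception instead of being left in the argument.
int processBigData(BigData&& input) {
    BigData data = std::move(input);
    bool has_negative = std::any_of(data.begin(), data.end(),
                                    [](int v) { return v < 0; });
    if (has_negative) {
        throw BigDataException("negative value", std::move(data));
    }
    return std::accumulate(data.begin(), data.end(), 0);
}

int processWithRecovery(BigData b) {
    try {
        return processBigData(std::move(b));
    } catch (BigDataException& e) {
        b = std::move(e).retrieveDataAfterError();   // explicit recovery
        std::replace_if(b.begin(), b.end(),
                        [](int v) { return v < 0; }, 0);  // "fix" the input
        return processBigData(std::move(b));
    }
}
```

Nothing here inspects b after it has been moved from; it is only assigned a fresh value before being used again.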
