What Optimization Does Move Semantics Provide If We Already Have Rvo

What optimization does move semantics provide if we already have RVO?

After some digging I found this excellent example of optimization with rvalue references in Stroustrup's FAQ.

Yes, the swap function:

#include <utility>   // std::move

template<class T>
void swap(T& a, T& b)        // "perfect swap" (almost)
{
    T tmp = std::move(a);    // could invalidate a
    a = std::move(b);        // could invalidate b
    b = std::move(tmp);      // could invalidate tmp
}

This generates optimized code for any type that has a move constructor and a move assignment operator.

Edit: RVO also can't optimize something like this (at least on my compiler):

stuff func(const stuff& st)
{
    if(st.x>0)
    {
        stuff ret(2*st.x);
        return ret;
    }
    else
    {
        stuff ret2(-2*st.x);
        return ret2;
    }
}

This function always calls the copy constructor (checked with VC++). If our class can be moved faster than it can be copied, then the move constructor gives us an optimization here.

Efficient use of move semantics together with (N)RVO

I like to measure, so I set up this Object:

#include <iostream>

struct Object
{
    Object() {}
    Object(const Object&) {std::cout << "Object(const Object&)\n";}
    Object(Object&&) {std::cout << "Object(Object&&)\n";}

    Object& makeChanges() {return *this;}
};

And I theorized that some solutions may give different answers for xvalues and prvalues (both of which are rvalues). And so I decided to test both of them (in addition to lvalues):

Object source() {return Object();}

int main()
{
    std::cout << "process lvalue:\n\n";
    Object x;
    Object t = process(x);
    std::cout << "\nprocess xvalue:\n\n";
    Object u = process(std::move(x));
    std::cout << "\nprocess prvalue:\n\n";
    Object v = process(source());
}

Now it is a simple matter of trying all of your possibilities, those contributed by others, and one I threw in myself:

#if PROCESS == 1

Object
process(Object arg)
{
    return arg.makeChanges();
}

#elif PROCESS == 2

Object
process(const Object& arg)
{
    return Object(arg).makeChanges();
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 3

Object
process(const Object& arg)
{
    Object retObj = arg;
    retObj.makeChanges();
    return retObj; 
}

Object
process(Object&& arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 4

Object
process(Object arg)
{
    return std::move(arg.makeChanges());
}

#elif PROCESS == 5

Object
process(Object arg)
{
    arg.makeChanges();
    return arg;
}

#endif

The table below summarizes my results (using clang -std=c++11). The first number is the number of copy constructions and the second number is the number of move constructions:

+----+--------+--------+---------+
|    | lvalue | xvalue | prvalue |    legend: copies/moves
+----+--------+--------+---------+
| p1 |  2/0   |  1/1   |   1/0   |
+----+--------+--------+---------+
| p2 |  2/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p3 |  1/0   |  0/1   |   0/1   |
+----+--------+--------+---------+
| p4 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+
| p5 |  1/1   |  0/2   |   0/1   |
+----+--------+--------+---------+

p3 looks like the best solution to me. However, it does require two overloads: one to process lvalues and one to process rvalues. If for some reason this is problematic, solutions 4 and 5 do the job with only one overload at the cost of one extra move construction for glvalues (lvalues and xvalues). It is a judgement call as to whether one wants to pay an extra move construction to save overloading (and there is no one right answer).

(answered) Why does RVO kick in the last option and not the second?

For RVO to kick in, the return statement needs to look like:

return arg;

If you complicate that with:

return std::move(arg);

or:

return arg.makeChanges();

then RVO gets inhibited.

Is there a better way to do this?

My favorites are p3 and p5. My preference of p5 over p4 is merely stylistic. I shy away from putting move on the return statement when I know it will be applied automatically for fear of accidentally inhibiting RVO. However in p5 RVO is not an option anyway, even though the return statement does get an implicit move. So p5 and p4 really are equivalent. Pick your style.

Had we passed in a temporary, the 2nd and 3rd options would call a move constructor while returning. Is it possible to eliminate that using (N)RVO?

The "prvalue" column vs "xvalue" column addresses this question. Some solutions add an extra move construction for xvalues and some don't.

Should we write `std::move` in the cases when RVO can not be done?

What happens in your example is not linked to RVO but to the ternary operator ?:. If you rewrite your example code using an if statement, the program behaves as expected. Changing the definition of foo to:

Test foo(int param)
  {
  Test test1;
  Test test2;
  if (param > 5)
    return std::move(test2);
  else
    return test1;
  }

will output Test(Test&&).

Here is what happens when you write (param>5)?std::move(test1):test2 instead:

The result of the ternary operator is deduced to be a prvalue ([expr.cond]/5).
test2 then goes through the lvalue-to-rvalue conversion, which causes the copy-initialization required by [expr.cond]/6.
The move construction of the return value is then elided ([class.copy]/31.3).

So in your example code move elision does occur, but only after the copy-initialization needed to form the result of the ternary operator.

C++11 Return value optimization or move?

Use exclusively the first method:

Foo f()
{
  Foo result;
  mangle(result);
  return result;
}

This will already allow the use of the move constructor, if one is available. In fact, a local variable can bind to an rvalue reference in a return statement precisely when copy elision is allowed.

Your second version actively prohibits copy elision. The first version is universally better.

When will a C++11 compiler make RVO and NRVO outperform move semantics and const reference binding?

std::move(build_report()) is wholly unnecessary: build_report() is already an rvalue expression (it is a call of a function that returns an object by value), so the std::wstring move constructor will be used if it has one (it does).

Plus, when you return a local variable, it gets moved if it is of a type that has a move constructor, so no copies will be made, period.

There shouldn't be any functional difference between declaring report as an object or as a const-reference; in both cases you end up with an object (either the named report object or an unnamed object to which the report reference can be bound).

Can we use the return value optimization when possible and fall back on move, not copy, semantics when not?

When the expression in the return statement is a non-volatile automatic duration object, and not a function or catch-clause parameter, with the same cv-unqualified type as the function return type, the resulting copy/move is eligible for copy elision. The standard also goes on to say that, if the only reason copy elision was forbidden was that the source object was a function parameter, and if the compiler is unable to elide a copy, the overload resolution for the copy should be done as if the expression was an rvalue. Thus, it would prefer the move constructor.

OTOH, since you are using the ternary expression, none of the conditions hold and you are stuck with a regular copy. Changing your code to

if(b)
  return x;
return y;

calls the move constructor.

Note that there is a distinction between RVO and copy elision - copy elision is what the standard allows, while RVO is a technique commonly used to elide copies in a subset of the cases where the standard allows copy elision.

When is the move constructor actually called if we have (N)RVO?

First, you should probably make sure that Foo follows the rule of three/five and has move/copy assignment operators. And it is good practice for the move-constructor and move-assignment operator to be noexcept:

struct Foo {
  Foo()                           { std::cout << "Constructed\n"; }
  Foo(const Foo &)                { std::cout << "Copy-constructed\n"; }
  Foo& operator=(const Foo&)      { std::cout << "Copy-assigned\n"; return *this; }
  Foo(Foo &&)            noexcept { std::cout << "Move-constructed\n"; }
  Foo& operator=(Foo &&) noexcept { std::cout << "Move-assigned\n"; return *this; }

  ~Foo()                    { std::cout << "Destructed\n"; }
};

In most cases you can follow the rule of zero and don't need to define any of these special member functions, because the compiler will generate them for you. Defining them is useful here, though, to trace what happens.

(N)RVO is only for function return values. It does not apply, for example, for function parameters. Of course the compiler can apply whatever optimizations it likes under the "as-if" rule so we have to be careful when crafting trivial examples.

Function parameters

There are many cases where the move-constructor or move-assignment operator will be called. But a simple case is if you use std::move to transfer ownership to a function that accepts a parameter by-value or by rvalue-reference:

void takeFoo(Foo foo) {
  // use foo...
}

int main() { 
  Foo foo = makeFoo();

  // set data on foo...

  takeFoo(std::move(foo));
}

Output:

Constructed
Move-constructed
Destructed
Destructed

For use in standard library containers

A very useful case for the move-constructor is if you have a std::vector<Foo>. As you push_back objects into the container it occasionally has to re-allocate and move all the existing objects to new memory. If Foo has a usable move-constructor (and, so that the vector can keep its exception guarantee, one declared noexcept), it will use it instead of copying:

int main() { 
  std::vector<Foo> v;
  std::cout << "-- push_back 1 --\n";
  v.push_back(makeFoo());
  std::cout << "-- push_back 2 --\n";
  v.push_back(makeFoo());
}

Output:

-- push_back 1 --
Constructed
Move-constructed  <-- move new foo into container
Destructed        
-- push_back 2 --
Constructed
Move-constructed  <-- move existing foo to new memory
Move-constructed  <-- move new foo into container
Destructed
Destructed
Destructed
Destructed

Constructor member initializer lists

I find move-constructors useful in constructor member initializer lists. Say you have a class FooHolder that contains a Foo. Then you can define a constructor that takes a Foo by-value and moves it into the member variable:

class FooHolder {
  Foo foo_;
public:
  FooHolder(Foo foo) : foo_(std::move(foo)) {} 
};

int main() { 
  FooHolder fooHolder(makeFoo());
}

Output:

Constructed
Move-constructed
Destructed
Destructed

This is nice because it allows me to define a constructor that accepts lvalues or rvalues without unnecessary copies.

Cases that defeat NRVO

RVO for unnamed temporaries applies almost always (and is guaranteed since C++17), but there are cases that defeat NRVO. For example, if you have two named variables and the choice of which one to return is not known at compile time:

Foo makeFoo(double value) {
  Foo f1;
  Foo f2;
  if (value > 0.5)
    return f1;
  return f2;
}

Foo foo = makeFoo(value);

Output:

Constructed
Constructed
Move-constructed
Destructed
Destructed
Destructed

Or if the return variable is also a function parameter:

Foo appendToFoo(Foo foo) {

  // append to foo...

  return foo;
}

int main() { 
  Foo f1;
  Foo f2 = appendToFoo(f1);
}

Output:

Constructed
Copy-constructed
Move-constructed
Destructed
Destructed
Destructed

Optimizing setters for rvalues

One case for a move-assignment operator is if you want to optimize a setter for rvalues. Say you have a FooHolder that contains a Foo and you want a setFoo member function. Then if you want to optimize for both lvalues and rvalues you should have two overloads. One that takes a reference-to-const and another that takes an rvalue-reference:

class FooHolder {
  Foo foo_;
public:
  void setFoo(const Foo& foo) { foo_ = foo; }
  void setFoo(Foo&& foo) { foo_ = std::move(foo); }
};

int main() { 
  FooHolder fooHolder;  
  Foo f;
  fooHolder.setFoo(f);  // lvalue
  fooHolder.setFoo(makeFoo()); // rvalue
}

Output:

Constructed
Constructed
Copy-assigned  <-- setFoo with lvalue
Constructed
Move-assigned  <-- setFoo with rvalue
Destructed
Destructed
Destructed

Return value optimization (RVO) when using temporaries in C++

If copy elision is used, the X will only be constructed once.

Even if copy elision is disabled (e.g. by passing -fno-elide-constructors to gcc), the move constructor will be used automatically.

You can visualize it yourself:

#include <cstdio>

struct X
{
    X() { printf("default constructed\n"); }
    ~X() { printf("destructed\n"); }
    X(const X&) { printf("copy constructed\n"); }
    X(X&&) { printf("move constructed\n"); }
};

X A() {
    X x;
    return x;
}

void B(X x) {

}

int main() {
    B(A());
}

With RVO it prints


default constructed
destructed

With no RVO it prints


default constructed
move constructed
destructed
move constructed
destructed
destructed

C++11 rvalues and move semantics confusion (return statement)

First example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> &&rval_ref = return_vector();

The first example returns a temporary which is caught by rval_ref. That temporary will have its life extended beyond the rval_ref definition and you can use it as if you had caught it by value. This is very similar to the following:

const std::vector<int>& rval_ref = return_vector();

except that in my rewrite you obviously can't use rval_ref in a non-const manner.

Second example

std::vector<int>&& return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();

In the second example you have created a run time error. rval_ref now holds a reference to the destructed tmp inside the function. With any luck, this code would immediately crash.

Third example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();

Your third example is roughly equivalent to your first. The std::move on tmp is unnecessary and can actually be a performance pessimization as it will inhibit return value optimization.

The best way to code what you're doing is:

Best practice

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> rval_ref = return_vector();

I.e. just as you would in C++03. tmp is implicitly treated as an rvalue in the return statement. It will either be returned via return value optimization (no copy, no move), or, if the compiler decides it cannot perform RVO, vector's move constructor will be used for the return. Only if RVO is not performed and the returned type has no move constructor would the copy constructor be used for the return.
