Copy Constructor VS. Return Value Optimization

What are copy elision and return value optimization?

Introduction

Copy elision is an optimization implemented by most compilers to prevent extra (potentially expensive) copies in certain situations. It makes returning by value or pass-by-value feasible in practice (restrictions apply).

It's the only form of optimization that elides (ha!) the as-if rule - copy elision can be applied even if copying/moving the object has side-effects.

The following example is taken from Wikipedia:

#include <iostream>

struct C {
  C() {}
  // The side effect makes every copy that is not elided visible.
  C(const C&) { std::cout << "A copy was made.\n"; }
};

C f() {
  return C();
}

int main() {
  std::cout << "Hello World!\n";
  C obj = f();
}

Before C++17, depending on the compiler and settings, the following outputs are all valid:

Hello World!
A copy was made.
A copy was made.

Hello World!
A copy was made.

Hello World!

This also means fewer objects can be created, so you also can't rely on a specific number of destructors being called. You shouldn't have critical logic inside copy/move-constructors or destructors, as you can't rely on them being called.

If a call to a copy or move constructor is elided as a permitted (not guaranteed) optimization, that constructor must still exist and must be accessible. This ensures that copy elision does not allow copying objects which are not normally copyable, e.g. because they have a private or deleted copy/move constructor.

As of C++17, copy elision is guaranteed when an object is returned directly, i.e. when the return expression is a prvalue of the return type:

#include <iostream>

struct C {
  C() {}
  C(const C&) { std::cout << "A copy was made.\n"; }
};

C f() {
  return C(); // Definitely performs copy elision
}

C g() {
  C c;
  return c; // Maybe performs copy elision
}

int main() {
  std::cout << "Hello World!\n";
  C obj = f(); // Copy constructor isn't called
}

Understanding copy constructor calls and named return value optimization

UPDATE: addressing the output of the updated program, using return rvo rather than return (rvo);

I am in constructor
I am in constructor
I am in destructor
I am in destructor

The reason you see this is that both objects (MyMethod::rvo and main::rvo) are default-constructed; the latter is then assigned to in a separate step, which you're not logging.

You can get a much better sense of what is going on by outputting the addresses of the objects, and the this pointer values as functions are called:

#include <cstdio>
#include <iostream>

class RVO
{
public:
    RVO() : mem_var(0) {
        // %p expects a void*, hence the explicit casts
        std::printf("%p constructor\n", static_cast<void*>(this));
    }
    RVO(const RVO& c_RVO) : mem_var(c_RVO.mem_var) {
        std::printf("%p copy constructor, rhs %p\n",
                    static_cast<void*>(this),
                    static_cast<const void*>(&c_RVO));
    }
    ~RVO() {
        std::printf("%p destructor\n", static_cast<void*>(this));
    }
    int mem_var;
};

RVO MyMethod(int i)
{
    RVO rvo;
    std::cout << "MyMethod::rvo @ " << &rvo << '\n';
    rvo.mem_var = i;
    return (rvo);
}

int main()
{
    RVO rvo = MyMethod(5);
    std::cout << "main::rvo @ " << &rvo << '\n';
}

The output will also depend on whether you compile with optimisations; you link to Microsoft documentation, so perhaps you're using the Microsoft compiler - try cl /O2.

Why is temporary not destroyed in Mymethod in the second version?

There was no temporary there - the object in main was directly copy-constructed. Stepping you through it:

002AFA4C constructor
MyMethod::rvo @ 002AFA4C   // MyMethod::rvo's constructed

002AFA70 copy constructor, rhs 002AFA4C   // above is copied to 2AFA70
002AFA4C destructor        // MyMethod::rvo's destructed
main::rvo @ 002AFA70       // turns out the copy above was directly to main::rvo
002AFA70 destructor        // main::rvo's destruction

[Alf's comment below] "directly copy-constructed" is not entirely meaningful to me. I think the OP means the rvo local variable

Consider the enhanced output from the first version of the program (without optimisation):

002FF890 constructor  // we find out this is main::rvo below
002FF864 constructor  // this one's MyMethod::rvo
MyMethod::rvo @ 002FF864
002FF888 copy constructor, rhs 002FF864  // 2FF888 is some temporary
002FF864 destructor   // there goes MyMethod::rvo
002FF888 destructor   // there goes the temporary
main::rvo @ 002FF890
002FF890 destructor   // and finally main::rvo

If we tie that back in to the OP's output and annotations...

I am in constructor       // main rvo construction
I am in constructor       //MyMethod rvo construction 
I am in copy constructor  //temporary created inside MyMethod
I am in destructor        //Destroying rvo in MyMethod
I am in destructor        //Destroying temporary in MyMethod
I am in destructor        //Destroying rvo of main

The OP (correctly) refers to the copy-constructed object as a temporary. When I say of the second version of the program "There was no temporary there - the object in main was directly copy-constructed." - I mean that there's no temporary equivalent to that in the first program we analysed directly above, and instead it's main::rvo that's copy-constructed from MyMethod::rvo.

Copy elision and return value optimization versus copy constructor

Yes. Copy elision can change the behavior of your code if your copy constructor (or your move constructor or your destructor) has side effects.

That's the whole point. If it could not change the behavior, there would be no reason to mention it in the standard. Optimizations which don't change behavior are already covered by the as-if rule (1.9/1), which reads:

The semantic descriptions in this International Standard define a
parameterized nondeterministic abstract machine. This International
Standard places no requirement on the structure of conforming
implementations. In particular, they need not copy or emulate the
structure of the abstract machine. Rather, conforming implementations
are required to emulate (only) the observable behavior of the abstract
machine as explained below.

Copy elision is explicitly mentioned in the standard precisely because it potentially violates this rule.

Return Value Optimization and private copy constructors

The basic problem is that return by value might copy. The standard does not require a C++ implementation to apply copy elision wherever it is permitted. That's why the object still has to be copyable: so that the implementation's decision whether to elide doesn't affect whether the code is well-formed.

Anyway, it doesn't necessarily apply to every copy that the user might like it to. For example there is no elision of copy assignment.

I think your options are:

Implement a proper copy. If someone ends up with a slow program due to copying, their profiler will tell them; you don't have to make it your job to stop them if you don't want to.
Implement a proper move, but no copy (C++11 and later).
Change getFoo to take a Foo& (or maybe Foo*) parameter, and avoid a copy by somehow mutating the caller's object. An efficient swap would come in handy for that. This is fairly pointless if getFoo really returns a default-constructed Foo as in your example, since the caller needs to construct a Foo before they call getFoo.
Return a dynamically-allocated Foo wrapped in a smart pointer: either auto_ptr (deprecated in C++11 and removed in C++17) or unique_ptr. Functions defined to create an object and transfer sole ownership to their caller should not return shared_ptr, since it has no release() function.
Provide a copy constructor but make it blow up somehow (fail to link, abort, throw an exception) if it's ever used. The problems with this are (1) it's doomed to fail but the compiler says nothing, and (2) you're enforcing quality of implementation, so your class doesn't work if someone deliberately disables RVO for whatever reason.

I may have missed some.

What is the magic in return value optimization on this?

Because you are compiling as C++17, which guarantees RVO for a returned prvalue even when you add -O0.

C++11 Return value optimization or move?

Use exclusively the first method:

Foo f()
{
  Foo result;
  mangle(result);
  return result;
}

This will already allow the use of the move constructor, if one is available. In fact, a local variable can bind to an rvalue reference in a return statement precisely when copy elision is allowed.

Your second version actively prohibits copy elision. The first version is universally better.

Why does Return Value Optimization not happen if no destructor is defined?

The language rule which allows this in case of returning a prvalue (the second example) is:

[class.temporary]
When an object of class type X is passed to or returned from a function, if X has at least one eligible copy or move constructor ([special]), each such constructor is trivial, and the destructor of X is either trivial or deleted, implementations are permitted to create a temporary object to hold the function parameter or result object.
The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the eligible trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object).
[Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. - end note]

Why does Return Value Optimization not happen [in some cases]?

The motivation for the rule is explained in the note of the quoted rule. Essentially, RVO is sometimes less efficient than no RVO.

If a destructor is defined by enabling the #if above, then the RVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).

In the second case, this is explained by the rule because creating the temporary is only allowed when the destructor is trivial.

In the NRVO case, I suppose this is up to the language implementation.