Why Do We Copy Then Move

Why do we copy then move?

Before I answer your questions, one thing you seem to be getting wrong: taking by value in C++11 does not always mean copying. If an rvalue is passed, that will be moved (provided a viable move constructor exists) rather than being copied. And std::string does have a move constructor.

Unlike in C++03, in C++11 it is often idiomatic to take parameters by value, for the reasons I am going to explain below. Also see this Q&A on StackOverflow for a more general set of guidelines on how to accept parameters.

Why aren't we taking an rvalue-reference to str?

Because that would make it impossible to pass lvalues, such as in:

std::string s = "Hello";
S obj(s); // s is an lvalue, this won't compile!

If S only had a constructor that accepts rvalues, the above would not compile.
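
As a rough sketch of that point, the hypothetical RvalueOnly type below (not the S from the question) declares only a std::string&& constructor, and a type trait confirms that an lvalue cannot be used to construct it:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type for illustration: its only converting constructor
// takes an rvalue reference.
struct RvalueOnly {
    explicit RvalueOnly(std::string&& str) : data(std::move(str)) {}
    std::string data;
};

// An lvalue std::string cannot bind to std::string&&.
static_assert(!std::is_constructible<RvalueOnly, std::string&>::value,
              "lvalues are rejected");
// A temporary (rvalue) std::string can.
static_assert(std::is_constructible<RvalueOnly, std::string>::value,
              "rvalues are accepted");

// Builds an RvalueOnly from a temporary and returns the stored text.
inline std::string make_from_temporary() {
    RvalueOnly obj(std::string("Hello"));
    assert(!obj.data.empty());
    return obj.data;
}
```

Taking the parameter by value instead accepts both kinds of argument with a single constructor.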

Won't a copy be expensive, especially given something like std::string?

If you pass an rvalue, that will be moved into str, and that will eventually be moved into data. No copying will be performed. If you pass an lvalue, on the other hand, that lvalue will be copied into str, and then moved into data.

So to sum it up, two moves for rvalues, one copy and one move for lvalues.
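
To make those counts visible, here is a minimal sketch that swaps std::string for a hypothetical Tracked type that counts its own copies and moves. S mirrors the take-by-value-then-move constructor under discussion; the helper names are made up for the example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts how it is constructed.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Same shape as the constructor under discussion: take by value, then move.
struct S {
    explicit S(Tracked str) : data(std::move(str)) {}
    Tracked data;
};

struct Counts {
    int copies;
    int moves;
};

// Constructs an S from an lvalue and reports the operations performed.
inline Counts from_lvalue() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t;
    S obj(t);             // copy into str, then move into data
    return {Tracked::copies, Tracked::moves};
}

// Constructs an S from std::move(t) and reports the operations performed.
inline Counts from_rvalue() {
    Tracked::copies = Tracked::moves = 0;
    Tracked t;
    S obj(std::move(t));  // move into str, then move into data
    assert(Tracked::copies == 0);
    return {Tracked::copies, Tracked::moves};
}
```

If the argument is a pure temporary such as Tracked{} rather than std::move(t), the first move is normally elided (and must be since C++17), so only one move remains.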

What would be the reason for the author to decide to make a copy then a move?

First of all, as I mentioned above, the first one is not always a copy; and this said, the answer is: "Because it is efficient (moves of std::string objects are cheap) and simple".

Under the assumption that moves are cheap (ignoring SSO here), they can be practically disregarded when considering the overall efficiency of this design. If we do so, we have one copy for lvalues (as we would have if we accepted an lvalue reference to const) and no copies for rvalues (while we would still have a copy if we accepted an lvalue reference to const).

This means that taking by value is as good as taking by lvalue reference to const when lvalues are provided, and better when rvalues are provided.

P.S.: To provide some context, I believe this is the Q&A the OP is referring to.

Implicit move vs copy operations and containment

Keep in mind what it means to "move" data in C++ (assuming we follow the usual conventions). If you move object x to object y, then y receives all the data that was in x and x is... well, we don't care what x is as long as it is still valid for destruction. Often we think of x as losing all of its data, but that is not required. All that is required is that x is valid. If x ends up with the same data as y, we don't care.

Copying x to y causes y to receive all the data that was in x, and x is left in a valid state (assuming the copy operation follows conventions and is not buggy). Thus, copying counts as moving. The reason for defining move operations in addition to copy operations is not to permit something new, but to permit greater efficiency in some cases. Anything that can be copied can be moved unless you take steps to prevent moves.

So what I see is that A is a non-moveable class because of the definition of its copy control operations, so it can only be copied, and for any attempt to move an object of this class the corresponding copy operation is used instead.

What I see is that A is a moveable class (despite the lack of move constructor and move assignment), because of the definition of its copy control operations. Any attempt to move an object of this class will fall back on the corresponding copy operation. If you want a class to be copyable but not movable, you need to delete the move operations, while retaining the copy ones. (Try it. Add A(A&&) = delete; to your definition of A.)
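
A small sketch of both situations, using type traits instead of compiler errors. The class bodies are assumptions, since the original definition of A is not shown here; CopyOnly is a hypothetical variant with the move operations deleted.

```cpp
#include <type_traits>

// Assumed shape of A: copy operations declared, no move operations.
struct A {
    A() = default;
    A(const A&) {}
    A& operator=(const A&) { return *this; }
};

// Hypothetical variant: same copy operations, move operations deleted.
struct CopyOnly {
    CopyOnly() = default;
    CopyOnly(const CopyOnly&) {}
    CopyOnly& operator=(const CopyOnly&) { return *this; }
    CopyOnly(CopyOnly&&) = delete;
    CopyOnly& operator=(CopyOnly&&) = delete;
};

// "Moving" an A compiles: overload resolution falls back to the copy operations.
static_assert(std::is_move_constructible<A>::value, "A moves via copy");
static_assert(std::is_move_assignable<A>::value, "A move-assigns via copy");

// With deleted move operations, an rvalue selects the deleted overload.
static_assert(std::is_copy_constructible<CopyOnly>::value, "still copyable");
static_assert(!std::is_move_constructible<CopyOnly>::value, "not movable");
static_assert(!std::is_move_assignable<CopyOnly>::value, "not move-assignable");
```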

The B class has one member that can be moved or copied, and one member that can be moved but not copied. So B itself can be moved but not copied. When B is moved, the unique_ptr member will be moved as you expect, and the A member will be copied (the fallback for moving objects of type A).
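
Here is a sketch of that behaviour under assumed definitions (the original A and B are not reproduced here, so the member names and the copy counter are made up for illustration):

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Assumed A: copyable, no move operations, counts how often it is copied.
struct A {
    static int copies;
    A() = default;
    A(const A&) { ++copies; }
    A& operator=(const A&) { ++copies; return *this; }
};
int A::copies = 0;

// Assumed B: one member that only copies, one member that only moves.
struct B {
    A a;
    std::unique_ptr<int> p{new int(42)};
};

static_assert(!std::is_copy_constructible<B>::value,
              "the unique_ptr member blocks copying");
static_assert(std::is_move_constructible<B>::value, "B is still movable");

// Moves a B and returns how many times its A member was copied.
inline int copies_of_a_when_moving_b() {
    A::copies = 0;
    B b1;
    B b2(std::move(b1));  // p is moved, a is copied
    assert(b1.p == nullptr && b2.p != nullptr && *b2.p == 42);
    return A::copies;
}
```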

Things get worse for me: if I uncomment the lines of the move operations in B, the initialization above will not compile, complaining about referencing a deleted function, and the same goes for the assignment!

Read the error message more closely. When I replicated this result, the "use of deleted function" error was followed by a note providing more details: the move constructor was deleted because "its exception-specification does not match the implicit exception-specification". Removing the noexcept keywords allowed the code to compile (using gcc 9.2 and 6.1).

Alternatively, you could add noexcept to the copy constructor and copy assignment of A (keeping noexcept on the move operations of B). This is one way to demonstrate that the default move operations of B use the copy operations of A.

What makes moving objects faster than copying?

As @gudok answered before, everything is in the implementation... Then a bit is in user code.

The implementation

Let's assume we're talking about the constructor that initializes an object of the current class from another object of the same class.

The implementation you'll provide will take into account two cases:

the parameter is a l-value, so you can't touch it, by definition
the parameter is a r-value, so, implicitly, the temporary won't live much longer beyond you using it, so, instead of copying its content, you could steal its content

Both are implemented using an overload:

Box::Box(const Box & other)
{
   // copy the contents of other
}

Box::Box(Box && other)
{
   // steal the contents of other
}

The implementation for light classes

Let's say your class contains two integers: You can't steal those because they are plain raw values. The only thing that would seem like stealing would be to copy the values, then set the original to zero, or something like that... Which makes no sense for simple integers. Why do that extra work?

So for light value classes, actually offering two specific implementations, one for l-value, and one for r-values, makes no sense.

Offering only the l-value implementation will be more than enough.

The implementation for heavier classes

But in the case of some heavy classes (i.e. std::string, std::map, etc.), copying implies potentially a cost, usually in allocations. So, ideally, you want to avoid it as much as possible. This is where stealing the data from temporaries becomes interesting.

Assume your Box contains a raw pointer to a HeavyResource that is costly to copy. The code becomes:

Box::Box(const Box & other)
{
   this->p = new HeavyResource(*(other.p)) ; // costly copying
}

Box::Box(Box && other)
{
   this->p = other.p ; // trivial stealing, part 1
   other.p = nullptr ; // trivial stealing, part 2
}

It's plain that one constructor (the copy-constructor, needing an allocation) is much slower than the other (the move-constructor, needing only assignments of raw pointers).
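
For reference, a more complete sketch of such a Box might look like this. HeavyResource, the default constructor, the destructor and the get() accessor are assumptions added to make the example self-contained.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical payload standing in for something costly to copy.
struct HeavyResource {
    std::vector<int> values = std::vector<int>(1000, 7);
};

class Box {
public:
    Box() : p(new HeavyResource) {}
    ~Box() { delete p; }  // deleting a null pointer is a no-op

    // Costly: allocates a new HeavyResource and copies the contents.
    Box(const Box& other)
        : p(other.p ? new HeavyResource(*other.p) : nullptr) {}

    // Cheap: takes over the pointer and leaves the source empty.
    Box(Box&& other) noexcept : p(other.p) { other.p = nullptr; }

    // Assignment is left out to keep the sketch short; because a move
    // constructor is declared, the implicit copy assignment is deleted.

    const HeavyResource* get() const { return p; }

private:
    HeavyResource* p;
};

// True if moving kept the original allocation and emptied the source.
inline bool move_steals_pointer() {
    Box a;
    const HeavyResource* original = a.get();
    assert(original != nullptr);
    Box b(std::move(a));
    return b.get() == original && a.get() == nullptr;
}

// True if copying produced a separate allocation with equal contents.
inline bool copy_allocates() {
    Box a;
    Box b(a);
    return b.get() != a.get() && b.get()->values == a.get()->values;
}
```

In real code a std::unique_ptr member would usually replace the raw pointer, but the raw pointer keeps the "stealing" explicit.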

When is it safe to "steal"?

The thing is: By default, the compiler will invoke the "fast code" only when the parameter is a temporary (it's a bit more subtle, but bear with me...).

Why?

Because the compiler can guarantee you can steal from some object without any problem only if that object is a temporary (or will be destroyed soon after anyway). For the other objects, stealing means you suddenly have an object that is valid, but in an unspecified state, which could be still used further down in the code. Possibly leading to crashes or bugs:

Box box3 = static_cast<Box &&>(box1); // calls the "stealing" constructor
box1.doSomething();         // Oops! You are using an "empty" object!

But sometimes, you want the performance. So, how do you do it?

The user code

As you wrote:

Box box1 = some_value;
Box box2 = box1;            // value of box1 is copied to box2 ... ok
Box box3 = std::move(box1); // ???

What happens for box2 is that, as box1 is a l-value, the first, "slow" copy-constructor is invoked. This is the normal, C++98 code.

Now, for box3, something funny happens: The std::move does return the same box1, but as a r-value reference, instead of a l-value. So the line:

Box box3 = ...

... will NOT invoke copy-constructor on box1.

It will invoke INSTEAD the stealing constructor (officially known as the move-constructor) on box1.

And as your implementation of the move constructor for Box does "steal" the content of box1, at the end of the expression, box1 is in a valid but unspecified state (usually, it will be empty), and box3 contains the (previous) content of box1.

What about the valid but unspecified state of a moved-out class?

Of course, writing std::move on a l-value means you make a promise you won't use that l-value again. Or you will do it, very, very carefully.

Quoting the C++17 Standard Draft (C++11 was: 17.6.5.15):

20.5.5.15 Moved-from state of library types [lib.types.movedfrom]

Objects of types defined in the C++ standard library may be moved from (15.8). Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.

This was about the types in the standard library, but this is something you should follow for your own code.

What it means is that the moved-out value could now hold any value, from being empty, zero, or some random value. E.g. for all you know, your string "Hello" would become an empty string "", or become "Hell", or even "Goodbye", if the implementer feels it is the right solution. It still must be a valid string, though, with all its invariants respected.

So, in the end, unless the implementer (of a type) explicitly committed to a specific behavior after a move, you should act as if you know nothing about a moved-out value (of that type).
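
In practice, "very carefully" usually means calling only operations that have no preconditions, such as assignment or clear(). A minimal sketch with std::string (the function name is made up):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Reuses a moved-from std::string safely by resetting it before reading it.
inline std::string reuse_after_move() {
    std::string source = "Hello";
    std::string sink = std::move(source);  // source: valid but unspecified
    assert(sink == "Hello");

    // Do not rely on what source holds at this point.
    source.clear();        // fine: clear() has no precondition
    assert(source.empty());

    source = "Goodbye";    // fine: assignment gives it a known value again
    return source;
}
```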

Conclusion

As said above, std::move by itself moves nothing. It only tells the compiler: "You see that l-value? Please consider it a r-value, just for a second".
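
A small sketch of what "does nothing" means, using std::string in place of Box (the function names are made up): the cast alone leaves the object untouched, and applied to a const object it even ends up selecting the copy constructor.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the type and value category of the expression.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move yields an rvalue reference");

// Binding the result to a reference runs no constructor, so nothing moves.
inline std::string cast_without_move() {
    std::string box1 = "contents";
    std::string&& ref = std::move(box1);
    assert(&ref == &box1);  // ref is just another name for box1
    return box1;            // still holds "contents"
}

// A const object cannot be stolen from: const std::string&& binds to the
// copy constructor's const std::string& parameter, so this is a copy.
inline std::string move_from_const() {
    const std::string box1 = "contents";
    std::string box3 = std::move(box1);
    return box1 + "|" + box3;
}
```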

So, in:

Box box3 = std::move(box1); // ???

... the user code (i.e. the std::move) tells the compiler the parameter can be considered as a r-value for this expression, and thus, the move constructor will be called.

For the code author (and the code reviewer), the code states that it is OK to steal the content of box1 and move it into box3. The code author then has to make sure box1 is not used anymore (or is used very, very carefully). It is their responsibility.

But in the end, it is the implementation of the move constructor that will make a difference, mostly in performance: If the move constructor actually steals the content of the r-value, then you will see a difference. If it does anything else, then the author lied about it, but this is another problem...

Understanding the reasoning between copy/move constructors and operators

Congratulations, you found a core issue of C++!

There are still a lot of discussions around the behavior you see with your example code.

There are suggestions like:

A&& helper_alt(A a) {
    std::cout << ".." << std::endl;
    return std::move(a);
}

This does what you want and uses only the move assignment, but g++ emits "warning: reference to local variable 'a' returned", because the function returns a reference to its own by-value parameter, so the returned reference can dangle.

Other people have already run into this problem, and it has been filed as a C++ standard core language issue.

Interestingly, the issue was reported back in 2010 but has not been resolved so far...

To answer your question, "In the last case, why is the move constructor being called and then the move assignment operator, instead of just the move assignment operator?": the C++ committee does not have a final answer yet either. To be precise, a proposed resolution exists and has been accepted, but it is not yet part of the language.

From: Comment Status

Amend paragraph 34 to explicitly exclude function parameters from copy elision. Amend paragraph 35 to include function parameters as eligible for move-construction.

Advantages of pass-by-value and std::move over pass-by-reference

Did I understand correctly what is happening here?

Yes.

Is there any upside of using std::move over passing by reference and just calling m_name{name}?

An easy to grasp function signature without any additional overloads. The signature immediately reveals that the argument will be copied - this saves callers from wondering whether a const std::string& reference might be stored as a data member, possibly becoming a dangling reference later on. And there is no need to overload on std::string&& name and const std::string& arguments to avoid unnecessary copies when rvalues are passed to the function. Passing an lvalue

std::string nameString("Alex");
Creature c(nameString);

to the function that takes its argument by value causes one copy and one move construction. Passing an rvalue to the same function

std::string nameString("Alex");
Creature c(std::move(nameString));

causes two move constructions. In contrast, when the function parameter is const std::string&, there will always be a copy, even when passing an rvalue argument. This is clearly an advantage as long as the argument type is cheap to move-construct (this is the case for std::string).

But there is a downside to consider: the reasoning doesn't work for functions that assign the function argument to another variable (instead of initializing it):

void setName(std::string name)
{
    m_name = std::move(name);
}

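will cause a deallocation of the resource that m_name refers to before it's reassigned, whereas copy-assigning from a const std::string& parameter can reuse m_name's existing buffer when it is large enough. I recommend reading Item 41 in Effective Modern C++ and also this question.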
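
For setters, one alternative is to overload on the value category. The sketch below assumes a Creature class with an m_name member; the name() accessor and the helper function are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Creature whose setter is overloaded on value category.
class Creature {
public:
    // Lvalues: copy-assign. The implementation may reuse m_name's existing
    // buffer if its capacity is sufficient, avoiding an allocation.
    void setName(const std::string& name) { m_name = name; }

    // Rvalues: move-assign, so the characters are not copied.
    void setName(std::string&& name) { m_name = std::move(name); }

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

// Exercises both overloads and returns the two names that were stored.
inline std::string set_from_both() {
    Creature c;
    std::string alex("Alex");
    c.setName(alex);                 // picks the const std::string& overload
    std::string first = c.name();
    assert(alex == "Alex");          // the lvalue argument is untouched
    c.setName(std::string("Bob"));   // picks the std::string&& overload
    return first + "," + c.name();
}
```

The price is two overloads per parameter, which is exactly what the by-value signature avoids for constructors.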

Move vs copy performance in custom classes

How (if at all) does creating an instance of a class via move ctor
improve performance, compared to the copy ctor, if the members of said
class are mainly basic types like int etc. aren't those members just
copied like in the copy ctor?

In the case where all the member variables are by-value/POD, there shouldn't be any difference at all.
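
A quick way to check this for a given type is with type traits. Point below is a hypothetical example type: its move constructor is trivial, so "moving" it performs the same member-wise copy as copying it.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical value type holding only built-in members.
struct Point {
    int x;
    int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point can be copied like raw bytes");
static_assert(std::is_trivially_copy_constructible<Point>::value,
              "the copy constructor is trivial");
static_assert(std::is_trivially_move_constructible<Point>::value,
              "the move constructor is trivial as well");

// "Moving" a Point copies its members and leaves the source unchanged.
inline bool source_unchanged_after_move() {
    Point a{1, 2};
    Point b = std::move(a);
    assert(b.x == 1 && b.y == 2);
    return a.x == 1 && a.y == 2;
}
```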

So when does move provide better performance, when dealing with
custom classes?

The move constructor provides an advantage only in the case where the newly-constructed object can "steal" resources from the already-existing object.

For example, imagine that you have a temporary std::string that contains the entire contents of the novel 'War and Peace' -- all 1440 pages of it.

In the classic copy-constructor case, if you wanted to assign that temporary string to a non-temporary std::string (e.g. a member-variable or global-variable or whatever), the program would have to perform the following steps:

Free any previous buffer that the destination std::string might have been holding
Allocate a new buffer that is (1440*chars_per_page) bytes long for the destination std::string to hold
Copy all 1440 pages of data from the temporary std::string's buffer to the destination std::string's buffer
Delete the temporary string's buffer (when the temporary string goes out of scope)

As you can see, this would be inefficient, as we copied a ton of data even though we never actually needed a second copy of the data. Since we have a move constructor implemented for std::string, however, C++11 programs can be smarter and just do this:

Free any previous buffer that the destination std::string might have been holding
Transfer the giant buffer from the temporary std::string to the destination std::string (note that all we do is copy the pointer-to-the-buffer value of the source string to the destination string; in particular we don't need to copy, or even read, any of the actual 1440-pages of data!)
Set the temporary string's pointer-to-the-buffer value to NULL (so it won't try to free it or use it later on)

... and that's it; instead of having to allocate a second big buffer and then copy gobs of data, we were able to achieve the desired end-state simply by swizzling a couple of pointer values. That's a big performance win, and we can get away with doing that because we know that the temporary-string is about to be deleted anyway, so there's no harm in "stealing" the internal data it was holding.
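
The pointer swizzling can be observed from the outside. The sketch below uses move construction rather than assignment and compares buffer addresses. The standard does not promise that the addresses match, but the mainstream standard library implementations behave this way for strings too long for the small-string buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// True if the destination ended up owning the source's original buffer.
inline bool move_reuses_buffer() {
    const std::size_t length = 100000;  // far too long for a small-string buffer
    std::string novel(length, 'x');
    const char* original = novel.data();

    std::string destination = std::move(novel);  // no character data is copied
    assert(destination.size() == length);

    // Assumption: expected on mainstream implementations, not guaranteed.
    return destination.data() == original;
}
```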

I don't know if std::string counts as a 'custom class' exactly, but you can use the same technique in your own classes any time you have a class that dynamically allocates internal state.

Why are copy operations deleted when move operations are declared?

When an object would be moved but no move constructor is declared, the compiler falls back to the copy constructor. In the same situation, if the move constructor is declared as deleted, the program is ill-formed. Thus, if the move constructor were implicitly declared as deleted, a lot of reasonable code involving existing pre-C++11 classes would fail to compile, for example myVector.push_back(MyClass()).

This explains why the move constructor cannot be implicitly declared as deleted when a copy constructor is defined. This leaves the question of why the copy constructor is implicitly declared as deleted when a move constructor is defined.

I don't know the exact motivation of the committee, but I have a guess. If adding a move constructor to existing C++03-style class were to remove a (previously implicitly defined) copy constructor, then existing code using this class may change meaning in subtle ways, due to overload resolution picking unexpected overloads that used to be rejected as worse matches.

Consider:

struct C {
  C(int) {}
  operator int() { return 42; }
};

C a(1);
C b(a);  // (1)

This is a legacy C++03 class. (1) invokes an (implicitly defined) copy constructor. C b((int)a); is also viable, but is a worse match.

Imagine that, for whatever reason, I decide to add a user-declared move constructor to this class. If the presence of a move constructor were to suppress the implicit declaration of the copy constructor, then a seemingly unrelated piece of code at (1) would still compile, but silently change its meaning: it would now invoke operator int() and C(int). That would be bad.

On the other hand, if the copy constructor is implicitly declared as deleted, then (1) fails to compile, alerting me to the problem. I would examine the situation and decide whether I still want a default copy constructor; if so, I would add C(const C&)=default;
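
Both outcomes can be checked with type traits. MoveDeclared and CopyRestored are hypothetical variants of the class C above, renamed so that they can coexist in one file.

```cpp
#include <type_traits>

// Variant of C with a user-declared move constructor: the copy constructor
// is implicitly declared as deleted.
struct MoveDeclared {
    MoveDeclared(int) {}
    MoveDeclared(MoveDeclared&&) {}
    operator int() { return 42; }
};

// Same class with the copy constructor explicitly brought back.
struct CopyRestored {
    CopyRestored(int) {}
    CopyRestored(CopyRestored&&) {}
    CopyRestored(const CopyRestored&) = default;
    operator int() { return 42; }
};

// Line (1), "C b(a);", is rejected: the deleted copy constructor is the best
// match, so the compiler does not quietly go through operator int() and C(int).
static_assert(!std::is_constructible<MoveDeclared, MoveDeclared&>::value,
              "construction from an lvalue is ill-formed");
static_assert(!std::is_copy_constructible<MoveDeclared>::value,
              "copy constructor is deleted");

// With "= default" the class is copyable again.
static_assert(std::is_copy_constructible<CopyRestored>::value,
              "copy constructor restored");
```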

What is std::move(), and when should it be used?

Wikipedia Page on C++11 R-value references and move constructors

In C++11, in addition to copy constructors, objects can have move constructors.

(And in addition to copy assignment operators, they have move assignment operators.)
The move constructor is used instead of the copy constructor if the argument is an rvalue, that is, an expression that can bind to an rvalue reference (Type &&).
std::move() is a cast that produces an rvalue-reference to an object, to enable moving from it.

It's a new C++ way to avoid copies. For example, using a move constructor, a std::vector could just copy its internal pointer to the data into the new object, leaving the moved-from object in a valid moved-from state and thereby avoiding a copy of all the data. This would be valid C++.

Try googling for move semantics, rvalue, perfect forwarding.
