What Does the Standard Library Guarantee About Self Move Assignment

What does the standard library guarantee about self move assignment?

17.6.4.9 Function arguments [res.on.arguments]

1 Each of the following applies to all arguments to functions defined
in the C++ standard library, unless explicitly stated otherwise.

...

If a function argument binds to an rvalue reference parameter, the implementation may assume that this parameter is a unique reference to
this argument. [ Note: If the parameter is a generic parameter of the
form T&& and an lvalue of type A is bound, the argument binds to an
lvalue reference (14.8.2.1) and thus is not covered by the previous
sentence. — end note ] [ Note: If a program casts an lvalue to an
xvalue while passing that lvalue to a library function (e.g. by
calling the function with the argument move(x)), the program is
effectively asking that function to treat that lvalue as a temporary.
The implementation is free to optimize away aliasing checks which
might be needed if the argument was anlvalue. —endnote]

So, the implementation of std::vector<T, A>::operator=(vector&& other) is allowed to assume that other is a prvalue. And if other is a prvalue, self-move-assignment is not possible.

What is likely to happen:

v will be left in a resource-less state (0 capacity). If v already has 0 capacity, then this will be a no-op.

Update

The latest working draft, N4618 has been modified to clearly state that in the MoveAssignable requirements the expression:

t = rv

(where rv is an rvalue), t need only be the equivalent value of rv prior to the assignment if t and rv do not reference the same object. And regardless, rv's state is unspecified after the assignment. There is an additional note for further clarification:

rv must still meet the requirements of the library component that is using it, whether or not t and rv refer to the same object.

What is the rationale for self-assignment-unsafe move assignment operators in the standard library?

Moved from objects in std are supposed to be discarded or assigned to prior to being reused. Anything that is not completely free beyond that is not promised.

Sometimes things are free. Like a move-constructed-from container is empty. Note that some move-assiged-from cases have no such guarantee, as some implementations may choose to move elements instead of buffers. Why the difference? One was a free extra guarantee, the other not.

A branch or other check is not completely free. It takes up a branch prediction slot, and even if predicted is merely almost free.

On top of that, a = std::move(a); is evidence of a logic error. Assign-from a (within std) means you will only assign-to or discard a. Yet here you are wanting it to have specific state on the next line. Either you know you are self-assigning, or you do not. If you do not, you are now moving from object you are also populating and you do not know it.

The principle of "do little things to keep things safe" clashes with "you do not pay for that which you do not use". In this case, the second won.

What does the standard say about move-assignment to self in the case of MoveAssignable?

What does "rv must still meet the requirements of the library component that is using it" refer to?

This is a roundabout way of saying that even though rv's state is unspecified, it must be a valid state. For instance, a type which records an "I'm gone!" state and throws exceptions for subsequent re-assignments is not allowed, if assignment is one of the required-to-be-supported operations.

What if t & rv refer to the same object?

You already quoted the answer to that: "rv's state is unspecified." This is independent of whether t refers to the same object.

Does the description mean:

or the type should handle this case somehow (no crashing allowed), but its result doesn't matter

Yes, this, pretty much.

Is it allowed to self-move an object in C++?

From cpp ref:

Also, the standard library functions called with xvalue arguments may assume the argument is the only reference to the object; if it was constructed from an lvalue with std::move, no aliasing checks are made. However, self-move-assignment of standard library types is guaranteed to place the object in a valid (but usually unspecified) state:

std::vector<int> v = {2, 3, 3};
v = std::move(v); // the value of v is unspecified

Behaviour of move-assignment to self

§17.6.4.9 [res.on.arguments]:

Each of the following applies to all arguments to functions defined in the C++ standard library, unless explicitly stated otherwise.
[...]
If a function argument binds to an rvalue reference parameter, the implementation may assume that
this parameter is a unique reference to this argument. [ Note: If the parameter is a generic parameter of the form T&& and an lvalue of type A is bound, the argument binds to an lvalue reference (14.8.2.1) and thus is not covered by the previous sentence. — end note ] [ Note: If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The implementation is free to optimize away aliasing checks which might be needed if the argument was an lvalue. — end note ]

Since "the implementation may assume that this parameter is a unique reference to this argument", and self-move-assignment would violate this assumption, it has undefined behavior.

C++20 std::move in self assignment

This is not a self-assignment.

A self-assignment would be init = std::move(init);, and you have init = std::move(init) + *first;.

Move assignment operator and `if (this != &rhs)`

First, the Copy and Swap is not always the correct way to implement Copy Assignment. Almost certainly in the case of dumb_array, this is a sub-optimal solution.

The use of Copy and Swap is for dumb_array is a classic example of putting the most expensive operation with the fullest features at the bottom layer. It is perfect for clients who want the fullest feature and are willing to pay the performance penalty. They get exactly what they want.

But it is disastrous for clients who do not need the fullest feature and are instead looking for the highest performance. For them dumb_array is just another piece of software they have to rewrite because it is too slow. Had dumb_array been designed differently, it could have satisfied both clients with no compromises to either client.

The key to satisfying both clients is to build the fastest operations in at the lowest level, and then to add API on top of that for fuller features at more expense. I.e. you need the strong exception guarantee, fine, you pay for it. You don't need it? Here's a faster solution.

Let's get concrete: Here's the fast, basic exception guarantee Copy Assignment operator for dumb_array:

dumb_array& operator=(const dumb_array& other)
{
    if (this != &other)
    {
        if (mSize != other.mSize)
        {
            delete [] mArray;
            mArray = nullptr;
            mArray = other.mSize ? new int[other.mSize] : nullptr;
            mSize = other.mSize;
        }
        std::copy(other.mArray, other.mArray + mSize, mArray);
    }
    return *this;
}

Explanation:

One of the more expensive things you can do on modern hardware is make a trip to the heap. Anything you can do to avoid a trip to the heap is time & effort well spent. Clients of dumb_array may well want to often assign arrays of the same size. And when they do, all you need to do is a memcpy (hidden under std::copy). You don't want to allocate a new array of the same size and then deallocate the old one of the same size!

Now for your clients who actually want strong exception safety:

template <class C>
C&
strong_assign(C& lhs, C rhs)
{
    swap(lhs, rhs);
    return lhs;
}

Or maybe if you want to take advantage of move assignment in C++11 that should be:

template <class C>
C&
strong_assign(C& lhs, C rhs)
{
    lhs = std::move(rhs);
    return lhs;
}

If dumb_array's clients value speed, they should call the operator=. If they need strong exception safety, there are generic algorithms they can call that will work on a wide variety of objects and need only be implemented once.

Now back to the original question (which has a type-o at this point in time):

Class&
Class::operator=(Class&& rhs)
{
    if (this == &rhs)  // is this check needed?
    {
       // ...
    }
    return *this;
}

This is actually a controversial question. Some will say yes, absolutely, some will say no.

My personal opinion is no, you don't need this check.

Rationale:

When an object binds to an rvalue reference it is one of two things:

A temporary.
An object the caller wants you to believe is a temporary.

If you have a reference to an object that is an actual temporary, then by definition, you have a unique reference to that object. It can't possibly be referenced by anywhere else in your entire program. I.e. this == &temporary is not possible.

Now if your client has lied to you and promised you that you're getting a temporary when you're not, then it is the client's responsibility to be sure that you don't have to care. If you want to be really careful, I believe that this would be a better implementation:

Class&
Class::operator=(Class&& other)
{
    assert(this != &other);
    // ...
    return *this;
}

I.e. If you are passed a self reference, this is a bug on the part of the client that should be fixed.

For completeness, here is a move assignment operator for dumb_array:

dumb_array& operator=(dumb_array&& other)
{
    assert(this != &other);
    delete [] mArray;
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

In the typical use case of move assignment, *this will be a moved-from object and so delete [] mArray; should be a no-op. It is critical that implementations make delete on a nullptr as fast as possible.

Caveat:

Some will argue that swap(x, x) is a good idea, or just a necessary evil. And this, if the swap goes to the default swap, can cause a self-move-assignment.

I disagree that swap(x, x) is ever a good idea. If found in my own code, I will consider it a performance bug and fix it. But in case you want to allow it, realize that swap(x, x) only does self-move-assignemnet on a moved-from value. And in our dumb_array example this will be perfectly harmless if we simply omit the assert, or constrain it to the moved-from case:

dumb_array& operator=(dumb_array&& other)
{
    assert(this != &other || mSize == 0);
    delete [] mArray;
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

If you self-assign two moved-from (empty) dumb_array's, you don't do anything incorrect aside from inserting useless instructions into your program. This same observation can be made for the vast majority of objects.

<Update>

I've given this issue some more thought, and changed my position somewhat. I now believe that assignment should be tolerant of self assignment, but that the post conditions on copy assignment and move assignment are different:

For copy assignment:

x = y;

one should have a post-condition that the value of y should not be altered. When &x == &y then this postcondition translates into: self copy assignment should have no impact on the value of x.

For move assignment:

x = std::move(y);

one should have a post-condition that y has a valid but unspecified state. When &x == &y then this postcondition translates into: x has a valid but unspecified state. I.e. self move assignment does not have to be a no-op. But it should not crash. This post-condition is consistent with allowing swap(x, x) to just work:

template <class T>
void
swap(T& x, T& y)
{
    // assume &x == &y
    T tmp(std::move(x));
    // x and y now have a valid but unspecified state
    x = std::move(y);
    // x and y still have a valid but unspecified state
    y = std::move(tmp);
    // x and y have the value of tmp, which is the value they had on entry
}

The above works, as long as x = std::move(x) doesn't crash. It can leave x in any valid but unspecified state.

I see three ways to program the move assignment operator for dumb_array to achieve this:

dumb_array& operator=(dumb_array&& other)
{
    delete [] mArray;
    // set *this to a valid state before continuing
    mSize = 0;
    mArray = nullptr;
    // *this is now in a valid state, continue with move assignment
    mSize = other.mSize;
    mArray = other.mArray;
    other.mSize = 0;
    other.mArray = nullptr;
    return *this;
}

The above implementation tolerates self assignment, but *this and other end up being a zero-sized array after the self-move assignment, no matter what the original value of *this is. This is fine.

dumb_array& operator=(dumb_array&& other)
{
    if (this != &other)
    {
        delete [] mArray;
        mSize = other.mSize;
        mArray = other.mArray;
        other.mSize = 0;
        other.mArray = nullptr;
    }
    return *this;
}

The above implementation tolerates self assignment the same way the copy assignment operator does, by making it a no-op. This is also fine.

dumb_array& operator=(dumb_array&& other)
{
    swap(other);
    return *this;
}

The above is ok only if dumb_array does not hold resources that should be destructed "immediately". For example if the only resource is memory, the above is fine. If dumb_array could possibly hold mutex locks or the open state of files, the client could reasonably expect those resources on the lhs of the move assignment to be immediately released and therefore this implementation could be problematic.

The cost of the first is two extra stores. The cost of the second is a test-and-branch. Both work. Both meet all of the requirements of Table 22 MoveAssignable requirements in the C++11 standard. The third also works modulo the non-memory-resource-concern.

All three implementations can have different costs depending on the hardware: How expensive is a branch? Are there lots of registers or very few?

The take-away is that self-move-assignment, unlike self-copy-assignment, does not have to preserve the current value.

</Update>

One final (hopefully) edit inspired by Luc Danton's comment:

If you're writing a high level class that doesn't directly manage memory (but may have bases or members that do), then the best implementation of move assignment is often:

Class& operator=(Class&&) = default;

This will move assign each base and each member in turn, and will not include a this != &other check. This will give you the very highest performance and basic exception safety assuming no invariants need to be maintained among your bases and members. For your clients demanding strong exception safety, point them towards strong_assign.

What Does the Standard Library Guarantee About Self Move Assignment