What Is the Purpose of std::launder

What is the purpose of std::launder?

std::launder is aptly named, though only if you know what it's for. It performs memory laundering.

Consider the example in the paper:

struct X { const int n; };
union U { X x; float f; };
...

U u = {{ 1 }};

That statement performs aggregate initialization, initializing the first member of U with {1}.

Because n is declared const, the compiler is free to assume that u.x.n will always be 1.

So what happens if we do this:

X *p = new (&u.x) X {2};

Because X is trivially destructible, we need not destroy the old object before creating a new one in its place, so this is perfectly legal code. The new object will have its n member be 2.

So tell me... what will u.x.n return?

The obvious answer will be 2. But that's wrong, because the compiler is allowed to assume that a truly const variable (not merely a const&, but an object variable declared const) will never change. But we just changed it.

[basic.life]/8 spells out the circumstances when it is OK to access the newly created object through variables/pointers/references to the old one. In the C++17 wording, having a const member is one of the disqualifying factors.

So... how can we talk about u.x.n properly?

We have to launder our memory:

assert(*std::launder(&u.x.n) == 2); //Will be true.

Money laundering is used to prevent people from tracing where you got your money from. Memory laundering is used to prevent the compiler from tracing where you got your object from, thus forcing it to avoid any optimizations that may no longer apply.
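The union example above can be put together into one compilable sketch. The helper name replace_and_read is hypothetical, and allocating the union dynamically is an assumption of this sketch (the original snippet does not say where u lives); the rest follows the code shown above, assuming a C++17 compiler.

```cpp
#include <cassert>
#include <new>   // placement new and std::launder (C++17)

struct X { const int n; };
union U { X x; float f; };

// Hypothetical helper: replaces the X inside a heap-allocated U and
// reads the const member back through std::launder.
int replace_and_read(int new_value) {
    U* u = new U{{1}};                         // u->x.n starts out as 1
    X* p = new (&u->x) X{new_value};           // a new X reuses the storage of u->x
    int via_new = p->n;                        // the pointer returned by new refers to the new object
    int via_launder = *std::launder(&u->x.n);  // laundered access through the old name
    delete u;                                  // U is trivially destructible
    assert(via_new == via_launder);
    return via_launder;
}
```

Calling replace_and_read(2) goes through the same steps as the text: the pointer returned by placement new and the laundered pointer both see the new object.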

Another of the disqualifying factors is if you change the type of the object. std::launder can help here too:

alignas(int) char data[sizeof(int)];
new(&data) int;
int *p = std::launder(reinterpret_cast<int*>(&data));

[basic.life]/8 tells us that, if you create a new object in the storage of the old one and its conditions are not met, you cannot access the new object through pointers to the old. std::launder allows us to side-step that.
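As a rough sketch of the buffer case, the snippet can be wrapped in a function that writes a value and reads it back. The function name store_and_load and the use of unsigned char are assumptions made for illustration; the launder call itself mirrors the code above.

```cpp
#include <cassert>
#include <new>   // placement new and std::launder (C++17)

// Hypothetical helper: creates an int inside a raw byte buffer and reads it
// back through a laundered pointer derived from the buffer.
int store_and_load(int value) {
    alignas(int) unsigned char data[sizeof(int)];
    int* from_new = new (data) int(value);   // the int object now lives in data
    // A plain reinterpret_cast would still point at the byte array;
    // std::launder yields a pointer to the int that lives at that address.
    int* p = std::launder(reinterpret_cast<int*>(data));
    assert(p == from_new);                   // same address either way
    return *p;
}
```

When the pointer returned by placement new is still at hand, as from_new is here, using it directly is simpler; std::launder matters when only the buffer address is available.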

Why introduce `std::launder` rather than have the compiler take care of it?

"... depending on whether or not the code does something like in these examples"

Because the compiler cannot always know when data is being accessed "this way".

As things currently stand, the compiler is allowed to assume that, for the following code:

struct foo { int const x; };

void some_func(foo*);

int bar() {
    foo f { 123 };
    some_func(&f);
    return f.x;
}

bar will always return 123. The compiler may generate code that actually accesses the object. But the object model does not require this. f.x is a const object (not a reference/pointer to const), and therefore it cannot be changed. And f is required to always name the same object (indeed, these are the parts of the standard you would have to change). Therefore, the value of f.x cannot be changed by any non-UB means.
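For contrast, here is a hedged sketch of a caller that does expect the object to be replaced behind its back. The body of some_func and the name bar_heap are assumptions, and the object is moved to dynamic storage so that rules specific to automatic variables do not come into play. The point is only that the caller has to opt in explicitly by laundering the pointer.

```cpp
#include <cassert>
#include <new>   // placement new and std::launder (C++17)

struct foo { int const x; };

// Assumed body for the opaque function: it replaces the object it is given.
void some_func(foo* p) {
    assert(p != nullptr);
    new (p) foo{456};
}

// Hypothetical heap-based variant of bar().
int bar_heap() {
    foo* f = new foo{123};
    some_func(f);
    foo* current = std::launder(f);  // explicitly ask for the object that lives there now
    int result = current->x;
    delete current;
    return result;
}
```

Without the std::launder call, reading f->x after some_func(f) would fall under the same assumption the text describes for bar.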

Why is it reasonable to have this pseudo-function for punching a hole in formal language semantics?

This was actually discussed. That paper brings up how long these issues have existed (i.e., since C++03) and how often optimizations made possible by this object model have been employed.

The proposal was rejected on the grounds that it would not actually fix the problem. From this trip report:

However, during discussion it came to light that the proposed alternative would not handle all affected scenarios (particularly scenarios where vtable pointers are in play), and it did not gain consensus.

The report doesn't go into any particular detail on the matter, and the discussions in question are not publicly available. But the proposal itself does point out that it wouldn't allow devirtualizing a second virtual function call, as the first call may have built a new object. So even P0532 would not make std::launder unnecessary, merely less necessary.

Where can I find what std::launder really does?

The purpose of std::launder is not to "suppress warnings" but to remove assumptions that the C++ compiler may have.

Aliasing warnings are trying to inform you that you are possibly doing things whose behaviour is not defined by the C++ standard.

The compiler can and does assume that your code only does things defined by the standard. For example, it can assume that a const object, once constructed, will not change, no matter which pointer is used to read it.

Compilers may use that assumption to skip refetching the value from memory (keeping it in a register instead), or even to calculate the value at compile time and do dead-code elimination based on it. They can assume this because any program where it is false has undefined behaviour, so any program behaviour is acceptable under the C++ standard.

std::launder was crafted to permit you to do things like take a pointer to a truly const value that was legally modified (by creating a new object in its storage, say) and use that pointer after the modification in a defined way, so that it refers to the new object. It also covers other specific, similar situations, but don't assume it just "removes aliasing problems". __builtin_launder is going to be a "noop" function in one sense, but in another sense it changes what kind of assembly code can be generated around it. With it, certain assumptions about what value can be reached from its input cannot be made about its output. And some code that would be UB on the input pointer is not UB on the output pointer.
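The "noop in one sense" part can be observed directly: the returned pointer holds the same address as the argument. The sketch below, with the hypothetical name launder_keeps_address, only demonstrates that runtime side; the effect on what the optimizer may assume is not something a test can show.

```cpp
#include <cassert>
#include <new>   // std::launder (C++17)

// Hypothetical check: std::launder does not move or copy anything,
// it returns a pointer to the object at the same address.
bool launder_keeps_address() {
    int* p = new int(7);
    int* q = std::launder(p);   // no new object is created here
    assert(*q == 7);            // the value is read through the laundered pointer
    bool same_address = (q == p);
    delete p;
    return same_address;
}
```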

It is an expert tool. I, personally, wouldn't use it without doing a lot of standard delving and double checking that I wasn't using it wrong. It was added because there were certain operations someone proved there was no way to reasonably do in a standard compliant way, and it permits a library writer to do it efficiently now.

Reachability with std::launder

The reachability condition basically asks whether it is possible to access a given byte of memory via pointer arithmetic and reinterpret_cast from a given pointer. The technical definition, which is effectively the same, is given on the linked cppreference page:

(bytes are reachable through a pointer that points to an object Y if those bytes are within the storage of an object Z that is pointer-interconvertible with Y, or within the immediately enclosing array of which Z is an element)

x2 is an array of 2 arrays of 10 ints (int x2[2][10] in the cppreference example). Let's call the two inner arrays a and b.

&x2[0][0] is an int* pointing to the first element of a.

&x2[0][0] + 10 is an int* pointing one past the last element of a. The address this pointer value represents is also the address at which b begins. However, one cannot obtain a pointer to b or to one of its elements from it via reinterpret_cast, since &x2[0][0] + 10 doesn't point to any object that is pointer-interconvertible with b or one of its elements.

In terms of the technical definition, the only object pointer-interconvertible with the first element of a is the object itself. Therefore the reachable bytes from a pointer to the first element of a are only the bytes of a, which is the array immediately enclosing this object.

Therefore the reachable bytes through &x2[0][0] are only those of the array a, not including b. E.g. *(&x2[0][0] + 10) = 123; has undefined behavior.

However, if std::launder were to return a pointer p2 to a (of type int(*)[10]), then (*(p2+1))[i] would be a way to access all elements of b. Or in terms of the technical definition, the array immediately enclosing the object p2 points to would be x2, so that all bytes of x2 are reachable.

This means that after the std::launder call, bytes that weren't reachable before would become reachable. Therefore the call has undefined behavior.
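For comparison, a one-dimensional array does not run into this problem, because the array is not itself an element of a larger array. The sketch below uses the hypothetical function name sum_through_array_pointer; the launder expression follows the "OK" case listed next to the x2 example on cppreference, and the undefined case is left as a comment rather than executed.

```cpp
#include <cassert>
#include <new>   // std::launder (C++17)

// Hypothetical helper: obtains a pointer to the whole array from a pointer
// to its first element and sums the elements through it.
int sum_through_array_pointer() {
    int x[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    // OK: every byte reachable through the result (all of x) is already
    // reachable through &x[0], because x immediately encloses x[0].
    auto p = std::launder(reinterpret_cast<int(*)[10]>(&x[0]));
    assert(static_cast<void*>(p) == static_cast<void*>(&x));

    int sum = 0;
    for (int v : *p) sum += v;
    return sum;

    // Not OK (not executed here): with int x2[2][10], laundering
    // reinterpret_cast<int(*)[10]>(&x2[0][0]) would make x2[1] reachable,
    // which it is not through &x2[0][0].
}
```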

How to interpret the precondition of std::launder?

[basic.compound]/3 is not relevant. It specifically says that it applies only for the purpose of pointer arithmetic and comparison. There doesn't actually exist an array for the object.

I think when you call std::launder, there are four objects at the relevant address: obj1, obj1.n, obj2 and obj2.n.
obj1 and obj1.n are pointer-interconvertible, as are obj2 and obj2.n. Other combinations, aside from identical pairs, are not pointer-interconvertible. There are no array objects, and therefore "or the immediately-enclosing array object if Z is an array element." isn't relevant.

When considering reachability from std::launder(p), which points to obj2, only obj2 and obj2.n need to be considered as Z in the quote. obj2.n occupies an (improper) subset of the bytes of obj2, so it adds nothing. The bytes reachable are those in obj2. Except that I considered obj2.n specifically, this is a rephrasing of your considerations.

By exactly the same reasoning, the bytes reachable from p (pointing to obj1) are all those in obj1.

obj1 and obj2 have the same size and therefore occupy exactly the same bytes. Therefore std::launder(p) would not make any bytes reachable that aren't reachable from p.

Does the effect of std::launder last after the expression in which it is called?

cppreference is quite explicit about it:

std::launder has no effect on its argument. Its return value must be used to access the object. Thus, it's always an error to discard the return value.

As for the standard itself, nowhere does it state whether the argument is also laundered, but in my opinion the signature of the function answers that: the pointer is taken by value, not by reference, so it cannot be altered in any way visible to the caller.
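In practice this means storing the returned pointer and using that from then on. A minimal sketch, with the hypothetical names Y and read_twice_through_stored_pointer:

```cpp
#include <cassert>
#include <new>   // placement new and std::launder (C++17)

struct Y { const int n; };   // hypothetical type with a const member

// Hypothetical helper: keeps the laundered pointer and reuses it.
int read_twice_through_stored_pointer() {
    Y* original = new Y{3};
    new (original) Y{5};                 // replace the object; result of new discarded on purpose

    // std::launder(original); on its own line would achieve nothing:
    // `original` is passed by value and is not changed by the call.
    Y* fresh = std::launder(original);   // keep the return value instead
    assert(fresh != nullptr);

    int first = fresh->n;                // reads the new object
    int second = fresh->n;               // the stored pointer stays usable later on
    delete fresh;
    return first + second;
}
```

The laundered pointer is an ordinary pointer value, so it can be stored and reused in later expressions; it is the original variable that gains nothing from the call.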


