Can std::launder Be Used to Convert an Object Pointer to Its Enclosing Array Pointer?

Can std::launder be used to convert an object pointer to its enclosing array pointer?

This depends on whether the enclosing array object is a complete object, and if not, whether you can validly access more bytes through a pointer to that enclosing array object (e.g., because it's an array element itself, or pointer-interconvertible with a larger object, or pointer-interconvertible with an object that's an array element). The "reachable" requirement means that you cannot use launder to obtain a pointer that would allow you to access more bytes than the source pointer value allows, on pain of undefined behavior. This ensures that the possibility that some unknown code may call launder does not affect the compiler's escape analysis.

I suppose some examples could help. Each example below reinterpret_casts an int* pointing to the first element of an array of 10 ints into an int(*)[10]. Since the two are not pointer-interconvertible, the reinterpret_cast does not change the pointer value, and you get an int(*)[10] with the value of "pointer to the first element of (whatever the array is)". Each example then attempts to obtain a pointer to the entire array by calling std::launder on the cast pointer.

int x[10];
auto p = std::launder(reinterpret_cast<int(*)[10]>(&x[0]));

This is OK; you can access all elements of x through the source pointer, and the result of the launder doesn't allow you to access anything else.
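A minimal sketch of this first case as a complete function, in case it helps to see it in use. The function name sum_via_laundered_array and the initial values are made up for illustration, and it assumes C++17 for std::launder.

```cpp
#include <cassert>
#include <new>   // std::launder (C++17)

// Hypothetical helper: sums a complete array of 10 ints, reading it through
// a laundered pointer-to-array instead of through the array name.
int sum_via_laundered_array() {
    int x[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

    int* first = &x[0];
    // The cast keeps the value "pointer to x[0]"; launder turns it into a
    // pointer to the array object x, which lives at the same address.
    auto whole = std::launder(reinterpret_cast<int(*)[10]>(first));
    assert(static_cast<void*>(whole) == static_cast<void*>(first));

    int sum = 0;
    for (int v : *whole) sum += v;   // touches only bytes of x
    return sum;
}
```

The laundered pointer reaches exactly the bytes of x, which were already reachable from first, so the precondition holds.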

int x2[2][10];
auto p2 = std::launder(reinterpret_cast<int(*)[10]>(&x2[0][0]));

This is undefined. You can only access elements of x2[0] through the source pointer, but the result (which would be a pointer to x2[0]) would have allowed you to access x2[1], which you can't through the source.

struct X { int a[10]; } x3, x4[2]; // assume no padding
auto p3 = std::launder(reinterpret_cast<int(*)[10]>(&x3.a[0])); // OK

This is OK. Again, a pointer to x3.a does not let you access any byte that you could not already access through the source pointer.
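Here is a sketch of the x3 case with the "no padding" assumption turned into a compile-time check. The function name and the stored value are hypothetical; the static_assert is my own addition to make the assumption explicit.

```cpp
#include <cassert>
#include <new>   // std::launder (C++17)

struct X { int a[10]; };

// The argument above assumes X has no padding after a. If it did, a pointer
// to x3.a (pointer-interconvertible with x3) would reach more bytes.
static_assert(sizeof(X) == sizeof(int[10]), "sketch assumes X has no padding");

// Hypothetical helper: writes the last element directly and reads it back
// through a laundered pointer to the member array.
int last_via_laundered_member() {
    X x3{};
    x3.a[9] = 42;

    auto p3 = std::launder(reinterpret_cast<int(*)[10]>(&x3.a[0]));
    assert(static_cast<void*>(p3) == static_cast<void*>(&x3));
    return (*p3)[9];
}
```

Note that x3 is a complete object here, not an element of an array of X; the x4 example shows why that matters.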

auto p4 = std::launder(reinterpret_cast<int(*)[10]>(&x4[0].a[0])); 

This is (intended to be) undefined. You would have been able to reach x4[1] from the result because x4[0].a is pointer-interconvertible with x4[0], so a pointer to the former can be reinterpret_cast to yield a pointer to the latter, which then can be used for pointer arithmetic. See https://wg21.link/LWG2859.

struct Y { int a[10]; double y; } x5;
auto p5 = std::launder(reinterpret_cast<int(*)[10]>(&x5.a[0]));

And this is again undefined, because you would have been able to reach x5.y from the resulting pointer (by reinterpret_cast to a Y*) but the source pointer can't be used to access it.
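The difference between X and Y can be expressed as a rough size check. This is only a sketch: the variable template array_member_covers_struct is a name I made up, it assumes a standard-layout struct whose first member is named a, and passing the check is necessary but not sufficient (the x4 case passes it and is still undefined, because x4[0] is an array element).

```cpp
struct X { int a[10]; };
struct Y { int a[10]; double y; };

// Hypothetical check: true when the member array a occupies every byte of S,
// so that a pointer to S::a cannot reach bytes of S outside a.
template <class S>
constexpr bool array_member_covers_struct = sizeof(S) == sizeof(S::a);

// X is assumed to have no padding, as in the example above.
static_assert(array_member_covers_struct<X>, "X is assumed to have no padding");
// Y always has bytes beyond a (at least those of y).
static_assert(!array_member_covers_struct<Y>, "Y has bytes beyond a");
```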

How to interpret the precondition of std::launder?

[basic.compound]/3 is not relevant. It specifically says that it applies only for the purpose of pointer arithmetic and comparison. There doesn't actually exist an array for the object.

I think when you call std::launder, there are four objects at the relevant address: obj1, obj1.n, obj2 and obj2.n.
obj1 and obj1.n are pointer-interconvertible, as are obj2 and obj2.n. Other combinations, aside from identical pairs, are not pointer-interconvertible. There are no array objects, and therefore "or the immediately-enclosing array object if Z is an array element" isn't relevant.

When considering reachability from std::launder(p), which points to obj2, only obj2 and obj2.n need to be considered as Z in the quote. obj2.n occupies an (improper) subset of the bytes of obj2, so it adds nothing. The bytes reachable are those in obj2. Apart from my considering obj2.n explicitly, this is a rephrasing of your considerations.

By exactly the same reasoning, the bytes reachable from p (pointing to obj1) are all those in obj1.

obj1 and obj2 have the same size and therefore occupy exactly the same bytes. Therefore std::launder(p) would not make any bytes reachable that aren't reachable from p.
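For concreteness, here is a sketch of the situation as I understand it, assuming the question is about the usual pattern of replacing an object that has a const member via placement new. The type name X, the values, and the function name are assumptions on my part; only the member name n comes from the discussion above.

```cpp
#include <cassert>
#include <new>   // placement new, std::launder (C++17)

struct X { const int n; };

// Hypothetical helper: creates obj1, replaces it with obj2 in the same
// storage, and reads obj2.n through the laundered original pointer.
int read_after_replacement() {
    X* p = new X{3};          // p points to obj1
    new (p) X{5};             // obj2 now occupies the same bytes as obj1

    // Under the C++17 rules p does not automatically refer to obj2 because
    // of the const member; launder yields a pointer to obj2.
    X* q = std::launder(p);
    assert(static_cast<void*>(q) == static_cast<void*>(p));

    int result = q->n;
    delete q;
    return result;
}
```

Since obj1 and obj2 have the same type and size, q reaches no byte that p did not already reach.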

Reachability with std::launder

The reachability condition basically asks whether it is possible to access a given byte of memory via pointer arithmetic and reinterpret_cast from a given pointer. The technical definition, which is effectively the same, is given on the linked cppreference page:

(bytes are reachable through a pointer that points to an object Y if those bytes are within the storage of an object Z that is pointer-interconvertible with Y, or within the immediately enclosing array of which Z is an element)

x2 is an array of 2 arrays of 10 ints. Let's call the two inner arrays a and b.

&x2[0][0] is an int* pointing to the first element of a.

&x2[0][0] + 10 is an int* pointing one past the last element of a. The address represented by this pointer value is also the address at which b begins. However, one cannot obtain a pointer to b or to one of its elements via reinterpret_cast, since &x2[0][0] + 10 doesn't point to any object that is pointer-interconvertible with b or one of its elements.

In terms of the technical definition, the only object pointer-interconvertible with the first element of a is the object itself. Therefore the reachable bytes from a pointer to the first element of a are only the bytes of a, which is the array immediately enclosing this object.

Therefore the reachable bytes through &x2[0][0] are only those of the array a, not including b. E.g. *(&x2[0][0] + 10) = 123; has undefined behavior.

However, if std::launder were to return a pointer to a (of type int(*)[10]), then (*(p2+1))[i] would be a way to access all elements of b. Or, in terms of the technical definition, the array immediately enclosing the object p2 points to would be x2, so that all bytes of x2 are reachable.

This means that after the std::launder, bytes that weren't reachable before would become reachable. Therefore the call has undefined behavior.
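If what you actually want is an int(*)[10] that can step from a to b, a sketch of a well-defined alternative is to derive the pointer from x2 itself rather than from an int*. The function name and the stored values are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: fills the second row and sums it through a row
// pointer that was obtained from x2 directly, with no launder involved.
int sum_of_second_row() {
    int x2[2][10] = {};
    for (int i = 0; i < 10; ++i) x2[1][i] = i;

    int (*row)[10] = &x2[0];     // points to an element of x2 (the array a)
    int (*next)[10] = row + 1;   // fine: arithmetic stays inside x2
    assert(next == &x2[1]);

    int sum = 0;
    for (int v : *next) sum += v;   // reads the elements of b
    return sum;
}
```

Here row was never limited to the bytes of a, so stepping to b does not extend what it could already reach.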

Using std::launder to get a pointer to an active object member from a pointer to an inactive object?

Let's consider that the ABI is well specified and that we know that size_r[7] is at the same address as short_str[15]

It depends entirely on what that guarantee means exactly.

A compiler is free to guarantee that

Sso.short_str[15]

can be accessed and modified and everything even when Sso.large_str is currently active, and get exactly the semantics you expect.

Or it is free not to give that guarantee.

There is no restriction on the behavior of programs that are ill-formed or exhibit undefined behavior.

As there is no object there, &Sso.short_str[15] isn't pointer-interconvertible with anything. An object that isn't there doesn't have the "same address" as another object.

Launder is defined in terms of a pointer to a pre-existing object. That object is then destroyed, and a new object with the same address is created (which is well defined). std::launder then lets you take the pointer to the object that no longer exists and get a pointer to the existing object.

What you are doing is not that. If you took &short_str[15] when it was engaged, you'd have a pointer to an object. And the ABI could say that this was at the same address as size_r[7]. And now std::launder would be in the domain of validity.

But the compiler could just go a step further and define that short_str[15] refers to the same object as size_r[7] even if it isn't active.

The weakest ABI guarantee that I could see being consistent with your statement would only work if you took the address of short_str[15] while it was active; later, you would engage large_str, and then you could launder from &short_str[15] to &size_r[7]. The strongest ABI guarantee that is consistent with your statement makes the call to std::launder unnecessary. Somewhere in the middle, std::launder would be required.

Wording of array-to-pointer conversion and undefined behaviour

The paragraph is just in general imprecise, in my opinion. It doesn't say what "the array" refers to at all. No array has been introduced before, only array types.

I guess it should probably state explicitly that it refers to the array object result of the glvalue, after temporary materialization if applicable.

Then I think it should also have a requirement that the type of that result object be similar to that of the original expression type. That way the result object will always be an array object and it can't have a "wrong" type in the same sense as for pointer arithmetic which already applies in a manual &pa[0] "decay". (see [expr.add]/6)

But that is only my own view of what would be a reasonable interpretation or improvement. I don't think the current wording makes that clear.


This is not the only part of the standard with imprecise wording in regard to lvalue type mismatches like this. See for example CWG 2535 for a similar situation with member access, where a similar resolution is suggested.


