Why Is C++11's POD "Standard Layout" Definition the Way It Is

Why is C++11's POD standard layout definition the way it is?

One of the later paragraphs allows you to cast the address of a standard-layout class object to a pointer to its first member and back, which is also often done in C:

struct A { int x; };
A a;

// "px" is guaranteed to point to a.x
int *px = (int*) &a;

// guaranteed to point to a
A *pa = (A*)px;

For that to work, the first member and the complete object have to have the same address (the compiler cannot adjust the int pointer by any bytes because it can't know whether it's a member of an A or not).
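The same guarantee can be written down as compile-time checks. This is only a minimal sketch assuming a C++11 compiler; the helper `first_member_shares_address` is a hypothetical name introduced for illustration.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

struct A { int x; };

// The cast guarantee only applies to standard-layout classes.
static_assert(std::is_standard_layout<A>::value, "A must be standard-layout");

// The first member sits at offset 0, so no pointer adjustment is needed.
static_assert(offsetof(A, x) == 0, "first member shares the object's address");

// Hypothetical helper: round-trips object -> first member -> object.
bool first_member_shares_address() {
    A a{42};
    int* px = reinterpret_cast<int*>(&a);  // points to a.x
    A* pa = reinterpret_cast<A*>(px);      // points back to a
    return px == &a.x && pa == &a && *px == 42;
}
```

If `A` ever stops being standard-layout, the first `static_assert` fails at compile time instead of the casts silently misbehaving.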

Finally, what would go wrong if more than one constituent class had data members?

Within a class, members are allocated at increasing addresses in declaration order. However, C++ doesn't dictate the order of allocation for data members across classes. If both the derived class and the base class had data members, the Standard deliberately doesn't define an order for their addresses, so as to give an implementation full flexibility in laying out memory. But for the above cast to work, you need to know which member is "first" in allocation order!
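As a rough illustration, `std::is_standard_layout` reports the effect of this rule. The type names below are hypothetical and exist only for the example.

```cpp
#include <type_traits>  // std::is_standard_layout

struct Base { int b; };
struct Derived : Base { int d; };            // data members in two classes

struct EmptyBase {};
struct OnlyDerived : EmptyBase { int d; };   // data members in one class only

static_assert(std::is_standard_layout<Base>::value,
              "a single class with data members is fine");
static_assert(!std::is_standard_layout<Derived>::value,
              "data members in both base and derived: not standard-layout");
static_assert(std::is_standard_layout<OnlyDerived>::value,
              "an empty base does not break standard layout");
```

Only `Derived` is rejected, because it is the only type here where "the first member" would depend on how the implementation orders base and derived subobjects.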

What would go wrong if the first data member was also a base class?

If the base class has the same type as the first data member, implementations that place the base classes before the derived class objects in memory would need a padding byte before the derived class object's data members (the base class would have size one), to avoid having the same address for both the base class and the first data member (in C++, two distinct objects of the same type always have different addresses). But that would again make it impossible to cast the address of the derived class object to the type of its first data member.

Why does the C++ standard specifically grant leeway regarding memory layout of class data members with different access specifiers?

N2062 is the first C++ paper that deals with changes to C++98/03's POD definition. It was written as a means to resolve core issue 568, which is about PODs and type layouts. It represents the beginning of the design that leads to C++11's standard layout and trivial copyability definitions.

And yet, N2062 never even considers defining the layout of members with different access controls. It doesn't even give a justification for why this restriction is in place. Nor does the final version of that proposal, which actually gives us the trivially-copyable and standard-layout definitions. All versions of these proposals take the access control limitation as a fait accompli, rather than something that could have been changed.

All this suggests that the writer of the proposal had knowledge of at least one compiler/ABI that changes the order of members based on access controls.

Are members of a POD-struct or standard layout type guaranteed to be aligned according to their alignment requirements?

Each element of a POD struct is itself an object, and objects can only be allocated in accordance with the alignment requirements for those objects. The alignment requirements may change, though, due to the fact that something is a sub-object of another object ([basic.align]/1, 2):

1 Object types have alignment requirements (3.9.1, 3.9.2) which place restrictions on the addresses at which an object of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type; stricter alignment can be requested using the alignment specifier (7.6.2).

2 A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to alignof(std::max_align_t) (18.2). The alignment required for a type might be different when it is used as the type of a complete object and when it is used as the type of a subobject. [Example:
struct B { long double d; };
struct D : virtual B { char c; };
When D is the type of a complete object, it will have a subobject of type B, so it must be aligned appropriately for a long double. If D appears as a subobject of another object that also has B as a virtual base class, the
B subobject might be part of a different subobject, reducing the alignment requirements on the D subobject.—end example ] The result of the alignof operator reflects the alignment requirement of the type in the
complete-object case.

[emphasis added]

Although the example refers to a sub-object via inheritance, the normative wording just refers to sub-objects in general, so I believe the same rules apply. On one hand, you can assume that each sub-object is aligned so that it can be accessed. On the other hand, you can't necessarily assume that this will be the same alignment that alignof gives you.

[The reference is from N4296, but I believe the same applies to all recent versions. C++98/03, of course, didn't have alignof at all, but I believe the same basic principle applies--members will be aligned so they can be used, but that alignment requirement isn't necessarily the same as when they're used as independent objects.]
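To see what a particular implementation does, member offsets can be compared against `alignof`. This is a sketch only: the struct `Sample` and the constant names are hypothetical. The `static_assert`s cover the ordering that is guaranteed for a standard-layout struct, while `value_offset_matches_alignof` is a value to inspect, not a promise made by the standard.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical standard-layout struct with members of differing alignment.
struct Sample {
    char   tag;
    double value;
    short  count;
};

static_assert(std::is_standard_layout<Sample>::value, "offsetof needs this");

constexpr std::size_t tag_offset   = offsetof(Sample, tag);
constexpr std::size_t value_offset = offsetof(Sample, value);
constexpr std::size_t count_offset = offsetof(Sample, count);

// Guaranteed: the first member is at offset 0, later members follow in order.
static_assert(tag_offset == 0, "first member is at the start of the object");
static_assert(tag_offset < value_offset && value_offset < count_offset,
              "members appear in declaration order");

// Not guaranteed: whether the member offset is a multiple of alignof(double).
// This is typically true on mainstream 64-bit ABIs, but it may differ elsewhere.
constexpr bool value_offset_matches_alignof =
    value_offset % alignof(double) == 0;
```

Keeping the last check as a plain constant rather than a `static_assert` matches the caveat above: a sub-object's alignment need not equal what `alignof` reports for the complete-object case.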

Standard Layout c++

The reason is that standard layout types effectively mandate the "empty base class optimization" where base classes with no data members take up no space and have the same address as the first data member (if any) of the derived class.

However, attempting to do this when the base class has the same type as the first data member violates the C++ memory model, which requires that distinct objects of the same type have distinct addresses.

From ISO/IEC 14882:2011 1.8 [intro.object]/6:

Two objects that are not bit-fields may have the same address if one is a subobject of the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they shall have distinct addresses

And, effectively mandating the empty base class optimization, 9.2 [class.mem]/20:

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.

It would be impossible for the following types (Type1 and Type2) to be layout-compatible (although they would otherwise be standard-layout classes) without this restriction.

struct S1 {};
struct S2 {};

struct Type1 : S1 {
    S1 s;
    int k;
};

struct Type2 : S1 {
    S2 s;
    int m;
};
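A quick way to see how a compiler classifies these two types is `std::is_standard_layout`. The sketch below repeats the definitions so that it compiles on its own. It reflects the C++11 rule discussed above; it makes no claim about the sizes or offsets a given ABI picks.

```cpp
#include <type_traits>  // std::is_standard_layout

struct S1 {};
struct S2 {};

struct Type1 : S1 {   // base class has the same type as the first data member
    S1 s;
    int k;
};

struct Type2 : S1 {   // base class and first data member have different types
    S2 s;
    int m;
};

static_assert(!std::is_standard_layout<Type1>::value,
              "Type1: base S1 and first member S1 cannot share an address");
static_assert(std::is_standard_layout<Type2>::value,
              "Type2: empty base S1 may share its address with member s");
```

Because `Type1` is excluded, the pointer-to-initial-member guarantee never has to hold for a class whose base and first member would need two distinct addresses.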

Why must members of standard layout classes have the same access control?

Access control very much does affect layout: within one access control level, the addresses of non-static data members increase in declaration order, but there is no requirement on the addresses of members at different access levels with respect to one another.

Since standard layout is about addresses of members, the requirement ensures that all member addresses are in a well-defined order.
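A small sketch of the consequence, using hypothetical type names: a class stays standard-layout as long as all of its non-static data members share one access level, whichever level that is.

```cpp
#include <type_traits>  // std::is_standard_layout

struct AllPublic {
    int a;
    int b;
};

class AllPrivate {
    int a;
    int b;
public:
    int sum() const { return a + b; }   // uses the private members
};

struct Mixed {
    int a;                              // public
    int get_b() const { return b; }
private:
    int b;                              // different access level than a
};

static_assert(std::is_standard_layout<AllPublic>::value,
              "one access level: relative order of a and b is defined");
static_assert(std::is_standard_layout<AllPrivate>::value,
              "all-private is also a single access level");
static_assert(!std::is_standard_layout<Mixed>::value,
              "mixed access levels: relative order of a and b is unspecified");
```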

Why does a class with references not adhere to standard_layout?

The point of the C++ standard's concept of a standard layout class is that an instance of such a class can be reliably accessed as or copied to bytes, which, as the C++11 standard notes in its §9/9, makes such a class "useful for communicating with code written in other programming languages".

However, the C++ standard does not require a reference to use storage at all. It's not an object. You can't take its address. So it can't be (reliably) copied to bytes, or accessed as bytes. And so it's not compatible with the notion of standard layout classes.

Formally, C++11 §9/7 states:

A standard-layout class is a class that:

— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
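The quoted rule can be observed directly with `std::is_standard_layout`. The two struct names below are hypothetical and serve only to contrast a pointer member with a reference member.

```cpp
#include <type_traits>  // std::is_standard_layout

struct WithPointer   { int* p; };   // a pointer is an object with storage
struct WithReference { int& r; };   // a reference is not an object

static_assert(std::is_standard_layout<WithPointer>::value,
              "pointer members are allowed in standard-layout classes");
static_assert(!std::is_standard_layout<WithReference>::value,
              "a reference member rules out standard layout");
```

When a struct must cross a language boundary, storing a pointer instead of a reference keeps it standard-layout.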
