Accessing Same-Type Inactive Member in Unions

Yes, you can read the other member in this particular case.

This is what the C++11/14 standard has to say:

9.5 - Unions

In a union, at most one of the non-static data members can be active
at any time, that is, the value of at most one of the non-static data
members can be stored in a union at any time.

However, the note immediately after that paragraph makes your particular case legal, because one special guarantee is made in order to simplify the use of unions:

[ Note: If a standard-layout union contains several standard-layout
structs that share a common initial sequence (9.2), and if an object
of this standard-layout union type contains one of the standard-layout
structs, it is permitted to inspect the common initial sequence of any
of standard-layout struct members; see 9.2. —end note ]

And your structs do share a common initial sequence:

9.2.16 - Class members

The common initial sequence of two standard-layout
struct (Clause 9) types is the longest sequence of non-static data
members and bit-fields in declaration order, starting with the first
such entity in each of the structs, such that corresponding entities
have layout-compatible types and either neither entity is a bit-field
or both are bit-fields with the same width.
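As a minimal sketch of what this guarantee permits, the following uses two hypothetical structs (`IntMsg` and `FloatMsg`, not from the question) whose common initial sequence is the single `int tag` member. All names here are assumptions made for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical message types: both start with an int tag, then diverge.
struct IntMsg   { int tag; int   value; };
struct FloatMsg { int tag; float value; };

static_assert(std::is_standard_layout<IntMsg>::value, "IntMsg must be standard-layout");
static_assert(std::is_standard_layout<FloatMsg>::value, "FloatMsg must be standard-layout");

union Msg {
    IntMsg   as_int;
    FloatMsg as_float;
};

// Reads the tag through as_int no matter which struct is active. This falls
// under the common-initial-sequence guarantee: `tag` is the first member of
// both structs and has the same type in each.
int tag_of(const Msg& m)
{
    return m.as_int.tag;
}

// Builds a Msg whose active member is as_float, with tag 2.
Msg make_float_msg(float v)
{
    Msg m;
    m.as_float = FloatMsg{2, v};  // as_float becomes the active member
    return m;
}
```

Calling `tag_of(make_float_msg(1.5f))` inspects `as_int.tag` while `as_float` is active. Reading `as_int.value` in the same situation would not be covered, because `int` and `float` are not layout-compatible and so `value` lies outside the common initial sequence.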

Accessing inactive union member and undefined behavior?

The confusion is that C explicitly permits type-punning through a union, whereas C++ (C++11) has no such permission.

C11

6.5.2.3 Structure and union members


95) If the member used to read the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type
punning’’). This might be a trap representation.

The situation with C++:

C++11

9.5 Unions [class.union]


In a union, at most one of the non-static data members can be active at any time, that is, the value of at
most one of the non-static data members can be stored in a union at any time.

C++ does later permit inspecting the common initial sequence of structs held in a union; this does not, however, permit type-punning.

To determine whether union type-punning is allowed in C++, we have to search further. Recall that C99 is a normative reference for C++11 (and C99 has similar language to C11 permitting union type-punning):

3.9 Types [basic.types]


4 - The object representation of an object of type T is the sequence of N unsigned char objects taken up by
the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that
hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object
representation that determines a value, which is one discrete element of an implementation-defined set of
values. [42]

42) The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C.

It gets particularly interesting when we read

3.8 Object lifetime [basic.life]


The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.

So for a primitive type (which ipso facto has trivial initialization) contained in a union, the lifetime of the object encompasses at least the lifetime of the union itself. This allows us to invoke

3.9.2 Compound types [basic.compound]


If an object of type T is located at an address A, a pointer of type cv T* whose value is the
address A is said to point to that object, regardless of how the value was obtained.

Assuming that the operation we are interested in is type-punning, i.e. taking the value of a non-active union member, and given, per the above, that we have a valid reference to the object referred to by that member, that operation is an lvalue-to-rvalue conversion:

4.1 Lvalue-to-rvalue conversion [conv.lval]


A glvalue of a non-function, non-array type T can be converted to a prvalue.
If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.

The question then is whether an object that is a non-active union member is initialized by a store to the active union member. As far as I can tell, it is not. Access to a union through a non-active member is therefore defined, and defined to follow the object and value representation, only if:

  • a union is copied into char array storage and back (3.9:2), or
  • a union is bytewise copied to another union of the same type (3.9:3), or
  • a union is accessed across language boundaries by a program element conforming to ISO/IEC 9899 (so far as that is defined) (3.9:4 note 42).

Access without one of the above interpositions is undefined behaviour. This has implications for the optimisations allowed to be performed on such a program, as the implementation may of course assume that undefined behaviour does not occur.

That is, although we can legitimately form an lvalue to a non-active union member (which is why assigning to a non-active member without construction is OK), it is considered to be uninitialized.
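A commonly used way to reinterpret bits without reading a non-active union member is to copy the object representation with `std::memcpy`, in the spirit of the byte-copy rules in 3.9. The sketch below is illustrative only: the helper name `float_bits` is an assumption, and it presumes a 32-bit `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Returns the object representation of `value` as a 32-bit unsigned integer.
// The bytes are copied, so no non-active union member is ever read.
std::uint32_t float_bits(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On an implementation where `std::numeric_limits<float>::is_iec559` is true, `float_bits(1.0f)` yields `0x3F800000`. Compilers typically turn the `std::memcpy` call into a plain register move, so this form usually costs nothing compared with the union version.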

Accessing inactive union members

The result is probably compiler-specific, because it depends on how the compiler arranges the underlying data in the union. Essentially, accessing an 'inactive' member just interprets the same bytes differently. Interpreting a large int as a smaller one should work in practice, although which byte is read depends on the platform's byte order.

[FF|01] < a uint16

A uint8 just reads the first byte of that data:

[FF|01]
 ^ read
    ^ ignored
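To make the "first byte" idea concrete without going through a union, here is a small sketch that contrasts a value-based narrowing with a look at the first stored byte. The function names are assumptions; inspecting an object through `unsigned char` is one of the accesses the language allows.

```cpp
#include <cassert>
#include <cstdint>

// Value-based narrowing: always yields the low-order 8 bits, on any platform.
std::uint8_t low_byte(std::uint16_t value)
{
    return static_cast<std::uint8_t>(value & 0xFFu);
}

// Storage-based view: returns the byte at the lowest address of `value`.
// Which half of the value that is depends on the platform's byte order.
std::uint8_t first_stored_byte(std::uint16_t value)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    return bytes[0];
}
```

For the value `0x01FF`, `low_byte` always returns `0xFF`, while `first_stored_byte` returns `0xFF` on a little-endian machine and `0x01` on a big-endian one. The diagram above corresponds to the little-endian case.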

Interpreting a float as an int or vice versa is unlikely to work, since the underlying bits won't make sense:

[0x1|0xFF|0x7FFFFF]
          ^ 23-bit mantissa
     ^ 8-bit exponent
 ^ sign bit
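The three fields in the diagram can be separated with shifts and masks once the 32 bits are available as an integer. This sketch assumes the IEEE-754 binary32 layout and that the bits were obtained by some defined route (for example a byte copy); `FloatFields` and `split_binary32` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The three fields of an IEEE-754 binary32 value (assumed layout).
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits
};

// Splits a raw 32-bit pattern into sign, exponent and mantissa.
FloatFields split_binary32(std::uint32_t bits)
{
    FloatFields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, the pattern `0xBFC00000` (the binary32 encoding of -1.5) splits into sign `1`, exponent `0x7F` and mantissa `0x400000`, which shows why reading those same bits as an integer produces a number unrelated to -1.5.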

Inspect inactive member of a union with common initial sequence in constexpr expression

The text that you quoted is correct, but there are additional constraints on accessing an inactive member of a union in a constexpr context. In particular, you are violating this rule:

An expression E is a core constant expression unless the evaluation of E, following the rules of the abstract machine ([intro.execution]), would evaluate one of the following:

an lvalue-to-rvalue conversion that is applied to a glvalue that refers to a non-active member of a union or a subobject thereof;


Note that you can change the active member of a union inside a constexpr context, so you can do this:

constexpr int fun(int n)
{
    type t(n);
    t.b = t.a;  // t.b is now the active member
    return t.b; // ok, reading from active member is fine
}

I believe this is allowed only from C++20.

The relevant rule is this:

An expression E is a core constant expression unless the evaluation of E, following the rules of the abstract machine ([intro.execution]), would evaluate one of the following:

an invocation of an implicitly-defined copy/move constructor or copy/move assignment operator for a union whose active member (if any) is mutable, unless the lifetime of the union object began within the evaluation of E;

(emphasis is mine). Since the lifetime of t begins inside the evaluation of fun, this is allowed.
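For a version of `fun` that can be compiled on its own, the sketch below supplies a hypothetical definition of `type`, since the question's definition is not reproduced here. It is assumed to be a union of two `int` members with a constexpr constructor. The `FUN_CONSTEXPR` macro is also an assumption of this sketch; it only exists so the code still builds as an ordinary function when the compiler is not in C++20 mode.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's `type`: two int members, with a
// constexpr constructor that makes `a` the active member.
union type {
    int a;
    int b;
    constexpr type(int n) : a(n) {}
};

// Changing the active member during constant evaluation needs C++20, so the
// function is only marked constexpr when the compiler is in C++20 mode.
#if __cplusplus >= 202002L
#define FUN_CONSTEXPR constexpr
#else
#define FUN_CONSTEXPR inline
#endif

FUN_CONSTEXPR int fun(int n)
{
    type t(n);
    t.b = t.a;   // reads the active member a, then makes b the active member
    return t.b;  // reads the now-active member b
}

#if __cplusplus >= 202002L
static_assert(fun(42) == 42, "evaluated at compile time in C++20 mode");
#endif
```

By contrast, a body that returned `t.b` directly after constructing `t` would be rejected in a constant expression, because it applies an lvalue-to-rvalue conversion to a non-active member.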

C++ Union Member Access And Undefined Behaviour

Is reading myHeader.wId in the line packetIdFlipped = myHeader.wId << 8 undefined behaviour?

Yes. You assigned to wMake and wMod, making the unnamed struct the active member, so wId is the inactive member and you are not allowed to read from it without first storing a value to it.

and is this what is meant by common initial sequence?

The common initial sequence is when two standard-layout types share the same members in the same order. In

struct foo
{
    int a;
    int b;
};

struct bar
{
    int a;
    int b;
    int c;
};

a and b are of the same type in foo and bar, so they form the common initial sequence of the two structs. If you put objects of foo and bar in a union, it would be safe to read a or b from either object after it is set in one of them.

This is not your case, though, since wId isn't a standard-layout struct.


