Can a C++ Compiler Re-Order Elements in a Struct

It normally can't reorder elements, no.

An exception is if there's an access specifier separating them:

struct Foo {
    A a;
    B b;
    C c;
private:
    D d;
    E e;
    F f;
};

a, b and c are guaranteed to be stored in this order, and d, e and f are guaranteed to be stored in this order. But there are no guarantees about where a, b and c are stored relative to d, e and f.

Another thing to keep in mind is that the compiler can insert as much padding as it likes, even if it doesn't reorder anything.

Here's the relevant part of the standard:

Section 9.2, paragraph 12:

Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1).

Can a C compiler rearrange stack variables?

As there is nothing in the standard prohibiting that for C or C++ compilers, yes, the compiler can do that.

It is different for aggregates (i.e. structs), where the relative order must be maintained, but still the compiler may insert pad bytes to achieve preferable alignment.

IIRC newer MSVC compilers use that freedom in their fight against buffer overflows of locals.

As a side note, in C++, the order of destruction must be reverse order of declaration, even if the compiler reorders the memory layout.

(I can't quote chapter and verse, though, this is from memory.)

Does the order of members in a struct matter?

The order of fields in a struct does matter: the compiler is not allowed to reorder fields, so changing the declaration order can change the size of the struct, because a different amount of padding may be needed.

In this case, however, you are defining a so-called flexible array member, an array whose size is not fixed in the declaration but chosen when the struct is allocated. The rules for flexible array members are that

  • There may never be more than one such member,
  • If present, the flexible member must be the last one in the struct, and
  • The struct must have at least one member in addition to the flexible one.

Take a look at this Q&A for a small illustration on using flexible structure members.
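As a minimal sketch of those three rules, here is a C99 struct with a flexible array member. The names message and message_create are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a fixed header followed by a flexible array member. */
struct message {
    size_t length;   /* at least one ordinary member must be present */
    char   text[];   /* the flexible member: last in the struct, size omitted */
};

/* Allocates a message holding a copy of src, including its terminator.
   Returns NULL if the allocation fails; the caller frees the result. */
struct message *message_create(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    /* sizeof *m covers the header only; the array storage is added by hand. */
    struct message *m = malloc(sizeof *m + len + 1);
    if (m == NULL)
        return NULL;

    m->length = len;
    memcpy(m->text, src, len + 1);
    return m;
}
```

A caller would write struct message *m = message_create("hello"); and release it with free(m) when done.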

Order of fields in C/C++ structs

The general rules about field layout in C are:

  1. The address of the first member is the same as the address of the struct itself. That is, the offsetof of the first member is 0.
  2. The addresses of the members always increase in declaration order. That is, the offsetof of the n-th member is lower than that of the (n+1)-th member.

In C++, of course, that is only guaranteed for a standard-layout type, that is, roughly, a class or struct with no mix of public/private/protected data members, no virtual functions and no members inherited from other classes.
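The two C rules can be checked at compile time. This is a small sketch using C11 static_assert and offsetof; the struct record and its members are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record used only for illustration. */
struct record {
    char   tag;
    double value;
    short  count;
};

/* Rule 1: the first member sits at offset 0. */
static_assert(offsetof(struct record, tag) == 0,
              "first member must be at offset 0");

/* Rule 2: offsets increase in declaration order. */
static_assert(offsetof(struct record, tag) < offsetof(struct record, value),
              "tag must come before value");
static_assert(offsetof(struct record, value) < offsetof(struct record, count),
              "value must come before count");
```

If either rule were violated, the translation unit would fail to compile instead of misbehaving at run time.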

Struct memory layout in C

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type. But it's entirely implementation-specific.

Padding bytes are introduced so every object is properly aligned. Reordering is not allowed.

Practically every remotely modern compiler implements #pragma pack, which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)
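A hedged sketch of what #pragma pack does, using the push/pop form that GCC, Clang and MSVC accept. The struct names are hypothetical, and the sizes assume the exact-width types from stdint.h are available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct natural {          /* default layout chosen by the compiler */
    uint8_t  kind;
    uint32_t value;       /* typically preceded by 3 bytes of padding */
};

#pragma pack(push, 1)     /* nonstandard: request 1-byte packing */
struct packed {
    uint8_t  kind;
    uint32_t value;       /* placed immediately after kind */
};
#pragma pack(pop)         /* restore the previous packing */

static_assert(offsetof(struct packed, value) == 1,
              "value should follow kind with no padding");
static_assert(sizeof(struct packed) == 5,
              "packed struct should contain no padding");
```

Accessing packed.value may be a misaligned access, so on strict-alignment targets the compiler has to emit slower code for it.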

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

size of struct in C different with variables rearranged

What's going on in the background is alignment. Alignment is the requirement that a data type has an address divisible by some unit. The strictest alignment a conforming C implementation can give a type is the size of the type itself.

C compilers tend to ensure certain alignment in struct layouts, even when the requirement doesn't come from the target hardware.

If we have a long that is, say, 4 bytes, followed by a two-byte short, that short can be placed immediately after the long, because offset 4 is more than sufficiently aligned for a two-byte type. The offset after those two members is then 6. But your compiler doesn't consider 6 to be a suitable offset for a 4-byte int; it wants a multiple of 4. Two bytes of padding are inserted to move that int to offset 8.

Of course, the actual numbers are compiler-specific. You have to know the sizes of your types and the alignment requirements and rules.
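To make the arithmetic concrete, here is a sketch that uses fixed-width types so the sizes don't depend on what long and int happen to be. The struct sample and the helper padding_before_c are hypothetical; the offsets in the comments assume int32_t is 4-byte aligned, which is typical but not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sample {
    int32_t a;   /* offset 0, 4 bytes */
    int16_t b;   /* offset 4, 2 bytes, ends at offset 6 */
    int32_t c;   /* moved up to offset 8 where int32_t is 4-byte aligned */
};

/* Whatever the platform, c must land on a multiple of its alignment. */
static_assert(offsetof(struct sample, c) % _Alignof(int32_t) == 0,
              "c must be suitably aligned");

/* Returns the number of padding bytes the compiler put between b and c. */
size_t padding_before_c(void)
{
    return offsetof(struct sample, c)
         - (offsetof(struct sample, b) + sizeof(int16_t));
}
```

On a platform with 4-byte alignment for int32_t, padding_before_c() returns 2.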

Also, does this imply that we should always make sure to reorder struct members to minimize padding?

If minimal structure size is important in your application, then you have to order the members from most strictly aligned to least strictly aligned. If minimal structure size isn't important, then you don't have to care about this.
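A small sketch of that ordering advice, with two hypothetical structs holding the same members. The sizes in the comments are what a typical 64-bit ABI gives; other targets may differ.

```c
#include <assert.h>
#include <stddef.h>

struct declared_order {   /* typically 24 bytes: 7 bytes of padding after  */
    char   flag;          /* flag and 7 more after code                    */
    double amount;
    char   code;
};

struct sorted_order {     /* typically 16 bytes: only 6 bytes of tail padding */
    double amount;        /* most strictly aligned member first */
    char   flag;
    char   code;
};

/* Sorting by alignment should never make the struct larger. */
static_assert(sizeof(struct sorted_order) <= sizeof(struct declared_order),
              "sorted layout should not be larger");
```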

Other concerns may weigh in, like compatibility with an externally imposed layout.

Or incremental growth. If a publicly used structure (referenced by numerous instances of compiled code such as executables and dynamic libraries) is maintained over time across multiple versions, typically new members must be added only at the end. In that case, we don't get the optimal order for minimum size, even if we would like that.

Shouldn't short int be padded to 4 bytes as char of array size 3 is padded to 4 bytes?

No, because the one byte of padding after the char[3] array brings the offset to 4. That offset is more than sufficiently aligned for the placement of a two-byte short. Moreover, no padding is required after that short. Why? The offset after the short is 6. The most strictly aligned member of the structure is that short, with an alignment requirement of 2. 6 is divisible by 2.
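The case from the question, a three-element char array followed by a two-byte integer, can be sketched like this. The struct pair is hypothetical, and the numbers assume int16_t is 2-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pair {
    char    name[3];  /* offsets 0 to 2 */
    int16_t id;       /* offset 4 after one padding byte; ends at 6 */
};

/* The total size only has to be a multiple of the strictest member alignment. */
static_assert(sizeof(struct pair) % _Alignof(int16_t) == 0,
              "size must be a multiple of the alignment of id");
```

With 2-byte alignment for int16_t, sizeof(struct pair) is 6, not 8.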

Here is a situation in which padding would be required after the two-byte short: struct { long x; short y; }. Say long is 4 bytes. Or, let's make it 8, doesn't matter. If we place the 2-byte short after the 8-byte long, we have a size of 10. That causes a problem if we declare an array a of this structure, because a[1].x will be at offset 10 from the base of the array: x is misaligned. The most strictly aligned structure member is x, with an alignment requirement of (say) 8, same as its size. Thus, for the sake of array alignment, the structure must be padded so that its size is divisible by 8: 6 bytes of padding at the end bring the size to 16.

Basically padding before a member is for its own alignment, and padding at the end of a structure is to ensure that all members are aligned in an array, and that is driven by the most strictly aligned member.
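The array argument can be sketched as follows. The struct item and the helper item_stride are hypothetical, and int64_t stands in for an 8-byte long; the size of 16 assumes int64_t is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct item {
    int64_t x;   /* most strictly aligned member */
    int16_t y;   /* ends at offset 10; tail padding follows */
};

/* The size is a multiple of the struct's alignment, so x stays aligned
   in every element of an array. */
static_assert(sizeof(struct item) % _Alignof(struct item) == 0,
              "size must be a multiple of the alignment");

/* Returns the distance in bytes between consecutive array elements. */
size_t item_stride(void)
{
    struct item a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

item_stride() always equals sizeof(struct item), which is why the padding has to be part of the struct's size and cannot be added only when an array is declared.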

Alignment is a hard hardware requirement on some platforms! If, say, a four byte data type is accessed at an address not divisible by four, a CPU exception occurs. On some such platforms, the CPU exception can be handled by the operating system, which implements the misaligned access in software rather than passing a potentially fatal signal to the process. That access is then very expensive, probably requiring on the order of a few hundred instructions. I seem to recall that in the MIPS port of Linux, this is a per-process option: handling misaligned exceptions can be turned on for some non-portable programs (e.g. ones developed for Intel x86) that depend on it, yet not turned on for programs which only perform a misaligned access due to some corruption bug (e.g. uninitialized pointer aimed at valid memory by luck, but at a misaligned address).

On some platforms, the hardware handles misaligned access, but still at somewhat of a cost compared to aligned access. For instance, two memory accesses may have to be made instead of one.

C compilers tend to enforce alignment when allocating struct members and variables even for target machines that don't enforce alignment. This is likely done for reasons such as performance and compatibility.

Can a C compiler add padding before the first element in a structure?

No, that's a place where the standard explicitly forbids placing any padding. The address of the first member of a struct and the address of the struct must be the same.

Section 6.7.2.1 (15) in the n1570 draft of the C2011 standard states:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

(emphasis mine, on the final sentence)
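One practical consequence of that paragraph is that a pointer to a struct and a pointer to its first member can be converted into each other. A minimal sketch, with hypothetical types header and packet:

```c
#include <assert.h>
#include <stddef.h>

struct header { int type; };

struct packet {
    struct header head;   /* first member: shares the address of the struct */
    int payload;
};

static_assert(offsetof(struct packet, head) == 0,
              "no padding before the first member");

/* Returns 1 if the struct and its first member have the same address. */
int first_member_at_start(const struct packet *p)
{
    return (const void *)p == (const void *)&p->head;
}

/* Converts a pointer to the initial member back to the enclosing struct.
   Only valid if h really points at the head of a struct packet. */
struct packet *packet_from_header(struct header *h)
{
    return (struct packet *)h;
}
```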

sizeof() part of a C struct - sort of

Have you looked at the offsetof facility? It returns the offset of a member from the start of a struct. So offsetof(st, x2) returns the offset of x2 from the start of the struct, and in your example offsetof(st, x2) + sizeof(st.x2) will give you the count of bytes of the serialized components.

This is pretty similar to what you are doing now, you just get to ignore the padding after x2 and to use a rarely used piece of C.
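A sketch of that calculation follows. The struct st below is a hypothetical stand-in, since the struct from the question isn't shown here, and ST_SERIALIZED_SIZE is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct in the question. */
struct st {
    uint32_t x1;
    uint8_t  x2;
    /* tail padding usually follows x2 */
};

/* Bytes from the start of the struct through the end of x2.
   The null pointer is never dereferenced: sizeof does not evaluate it. */
#define ST_SERIALIZED_SIZE \
    (offsetof(struct st, x2) + sizeof(((struct st *)0)->x2))

static_assert(ST_SERIALIZED_SIZE <= sizeof(struct st),
              "the serialized prefix cannot exceed the struct");
```

Here ST_SERIALIZED_SIZE is 5, while sizeof(struct st) is typically 8. Note that this only skips the padding after x2; any padding between earlier members is still counted.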

incorrect members order in a C# structure

I'd expect that the root of your problem is that the three byte values

public byte securityCount;
public byte securityCRC;
public byte flag;

cause the next 32-bit values not to be word-aligned, and your two sets of code are adding (or not adding) internal padding differently.

I expect that the different packings look something like this:


C++                                 C#
================================    ================================
[size          ][opcode        ]    [size          ][opcode        ]
[secCnt][secCrc][flag  ][blow0 ]    [secCnt][secCrc][flag  ][blow0 ]
[blow1 ][blow2 ][blow3 ][blow4 ]    [blow1 ][blow2 ][blow3 ][blow4 ]
[blow5 ][blow6 ][blow7 ][seedCou    [blow5 ][blow6 ][blow7 ]..PAD...
nt                     ][seedCRC    [seedCount                     ]
                       ][seedSec    [seedCRC                       ]
urity0                 ][seedSec    [seedSecurity0                 ]
urity1                 ][seedSec    [seedSecurity1                 ]
urity2                 ][seedSec    [seedSecurity2                 ]
urity3                 ][seedSec    [seedSecurity3                 ]
urity4                 ]            [seedSecurity4                 ]

... with C# inserting a byte of padding which causes later values to be one byte off.

You can try using

[StructLayout(LayoutKind.Sequential,Pack=1)]

before your struct definition, which should use the minimum amount of space possible.

Mastering Structs in C# has some good information on how/why this happens.

Do class/struct members always get created in memory in the order they were declared?

C99 §6.7.2.1 clause 13 states:

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.

and goes on to say a bit more about padding and addresses. The C89 equivalent section is §6.5.2.1.

C++ is a bit more complicated. In the 1998 and 2003 standards, there is §9.2 clause 12 (clause 15 in C++11):

Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).


