Is the Size of a Struct Required to Be an Exact Multiple of the Alignment of That Struct

Is the size of a struct required to be an exact multiple of the alignment of that struct?

One definition of alignment size:

The alignment size of a struct is the offset from one element to the next element when you have an array of that struct.

Consider an array of two such structs: both elements must have properly aligned members, so yes, the size has to be a multiple of the alignment. (I'm not sure whether any standard explicitly enforces this, but because the size and alignment of a struct don't depend on whether the struct stands alone or sits inside an array, the same rules apply to both, so it can't really be any other way.)
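The array argument can be checked mechanically. The following is a minimal C++ sketch, assuming a hypothetical struct Sample that does not come from the question. It compares the distance between two neighbouring array elements with sizeof and asserts at compile time that the size is a whole multiple of the alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example type; any struct would do.
struct Sample {
    double d;
    char c;
};

// Compile-time form of the claim: size is a whole multiple of alignment.
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "size must be a multiple of alignment");

// Distance in bytes between two neighbouring array elements.
inline std::size_t array_stride() {
    Sample pair[2] = {};
    const char* first = reinterpret_cast<const char*>(&pair[0]);
    const char* second = reinterpret_cast<const char*>(&pair[1]);
    return static_cast<std::size_t>(second - first);
}

// True when the given object starts on an alignof(Sample) boundary.
inline bool is_aligned(const Sample& s) {
    return reinterpret_cast<std::uintptr_t>(&s) % alignof(Sample) == 0;
}

// Runtime check: the stride equals sizeof, and both elements are aligned.
inline void check_array_layout() {
    Sample pair[2] = {};
    assert(array_stride() == sizeof(Sample));
    assert(is_aligned(pair[0]) && is_aligned(pair[1]));
}
```

Calling check_array_layout() from main should pass silently on any conforming compiler, because arrays place no padding between elements beyond what sizeof already includes.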

Why does the size of a struct need to be a multiple of the largest alignment of any struct member?

Good question. Consider this hypothetical type:

struct A {
    int n;
    bool flag;
};

So, an object of type A would seem to need only five bytes (four for the int plus one for the bool, on a typical platform), but in fact it takes eight. Why?

The answer is seen if you use the type like this:

const size_t N = 100;
A a[N];

If each A were only five bytes, then a[0] would align but a[1], a[2] and most of the other elements would not.
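The arithmetic behind that claim can be spelled out. This is a small sketch, assuming C++14 or later (for the loop inside a constexpr function) and a 4-byte int alignment as on typical platforms. The helper aligned_elements is a hypothetical name introduced for illustration; it counts how many array elements would land on an aligned address for a given stride.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    int n;
    bool flag;
};

// Counts how many of `count` elements would start on an `alignment` boundary
// if they were placed `stride` bytes apart, starting from an aligned base.
constexpr std::size_t aligned_elements(std::size_t stride, std::size_t alignment,
                                       std::size_t count) {
    std::size_t aligned = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if ((i * stride) % alignment == 0) ++aligned;
    }
    return aligned;
}

// With a 5-byte stride and 4-byte alignment, only every fourth element lines up.
static_assert(aligned_elements(5, 4, 100) == 25, "one element in four is aligned");

// With the real stride, sizeof(A), every element lines up.
static_assert(aligned_elements(sizeof(A), alignof(A), 100) == 100,
              "all elements are aligned");

// Runtime variant for strides that are only known while the program runs.
inline void check_all_aligned(std::size_t stride, std::size_t alignment,
                              std::size_t count) {
    assert(aligned_elements(stride, alignment, count) == count);
}
```

So with a hypothetical 5-byte A, 75 of the 100 elements in a[N] would be misaligned, which is why the compiler pads the struct instead.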

But why does alignment even matter? There are several reasons, all hardware-related. One reason is that recently or frequently used memory is cached in cache lines on the CPU silicon for rapid access. An aligned object smaller than a cache line generally fits in a single line, whereas an unaligned object may straddle two lines, wasting cache.

There are actually even more fundamental hardware reasons, having to do with the way byte-addressable data is transferred down a 32- or 64-bit data bus, quite apart from cache lines. Not only will misalignment clog the bus with extra fetches (due as before to straddling), but it will also force registers to shift bytes as they come in. Even worse, misalignment tends to confuse optimization logic (at least, Intel's optimization manual says that it does, though I have no personal knowledge of this last point). So, misalignment is very bad from a performance standpoint.

It usually is worth it to waste the padding bytes for these reasons.

Why isn't sizeof for a struct equal to the sum of sizeof of each member?

This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:

  • Mis-aligned access might be a hard error (often SIGBUS).
  • Mis-aligned access might be a soft error.

    • Either corrected in hardware, for a modest performance-degradation.
    • Or corrected by emulation in software, for a severe performance-degradation.
    • In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
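When data may sit at an arbitrary address (for example inside a byte buffer), one common way to sidestep the failure modes listed above is to copy the bytes out rather than dereference a misaligned pointer. The sketch below assumes that approach; is_aligned_for_u32 and load_u32 are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True when p is suitably aligned for a std::uint32_t.
inline bool is_aligned_for_u32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(std::uint32_t) == 0;
}

// Reads a std::uint32_t from any address, aligned or not. Copying through
// std::memcpy avoids dereferencing a misaligned std::uint32_t pointer.
inline std::uint32_t load_u32(const unsigned char* p) {
    assert(p != nullptr);
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

Compilers usually turn the std::memcpy call into a single load on hardware that tolerates unaligned access, so the safe form tends to cost little there.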

Here's an example using typical settings for an x86 processor (in both 32- and 64-bit modes):

struct X
{
    short s; /* 2 bytes */
             /* 2 padding bytes */
    int i;   /* 4 bytes */
    char c;  /* 1 byte */
             /* 3 padding bytes */
};

struct Y
{
    int i;   /* 4 bytes */
    char c;  /* 1 byte */
             /* 1 padding byte */
    short s; /* 2 bytes */
};

struct Z
{
    int i;   /* 4 bytes */
    short s; /* 2 bytes */
    char c;  /* 1 byte */
             /* 1 padding byte */
};

const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */

One can minimize the size of a structure by sorting its members by alignment, largest first, as in structure Z above. For basic types, sorting by size suffices.
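The padding in X, Y and Z can be inspected rather than taken on trust. Here is a short sketch using the standard offsetof macro; padding_bytes and print_layouts are hypothetical helper names, and the printed numbers depend on the compiler and target, so none are claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct X { short s; int i; char c; };
struct Y { int i; char c; short s; };
struct Z { int i; short s; char c; };

// Padding = total size minus the bytes the three members actually need.
template <typename T>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - (sizeof(short) + sizeof(int) + sizeof(char));
}

// Prints size, alignment, and member offsets for each struct.
inline void print_layouts() {
    std::printf("X: size=%zu align=%zu s@%zu i@%zu c@%zu\n", sizeof(X), alignof(X),
                offsetof(X, s), offsetof(X, i), offsetof(X, c));
    std::printf("Y: size=%zu align=%zu i@%zu c@%zu s@%zu\n", sizeof(Y), alignof(Y),
                offsetof(Y, i), offsetof(Y, c), offsetof(Y, s));
    std::printf("Z: size=%zu align=%zu i@%zu s@%zu c@%zu\n", sizeof(Z), alignof(Z),
                offsetof(Z, i), offsetof(Z, s), offsetof(Z, c));
    // Every int member has to start on an int boundary.
    assert(offsetof(X, i) % alignof(int) == 0);
}
```

On the typical x86 settings described above, padding_bytes<X>() would come out larger than padding_bytes<Z>(), matching the 12-byte and 8-byte sizes in the comments.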

IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.

sizeof giving unexpected result for my structure

The data members of a struct are aligned by default. There might be padding between these data members as well as after the last one. In your case the padding is most likely at the end.

The first data member is a pointer, which in your case requires 4 bytes of memory. The other member is a char that requires only 1 byte, yet the struct is padded up to a multiple of 4. The reason is not that "32 bits is what most computers are most comfortable with", as you say, but that 4 is the alignment of the most strictly aligned member (here the pointer, whose alignment equals its size).

Usually a pragma directive is available that lets you specify a custom alignment. In Visual Studio there is #pragma pack, which might help you in this case. Just make sure you know what you are doing: although you will minimize memory usage, packing might negatively affect the performance of your code.
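As a rough illustration of #pragma pack, the sketch below assumes a compiler that honors the push/pop form (MSVC, GCC and Clang do) and uses a hypothetical stand-in for the struct in the question, with a std::uint32_t in place of the 4-byte pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the struct in the question: a 4-byte member
// followed by a char, laid out with the default alignment rules.
struct Default {
    std::uint32_t value;
    char tag;
};

#pragma pack(push, 1)   // request 1-byte packing for the declarations below
struct Packed {
    std::uint32_t value;
    char tag;
};
#pragma pack(pop)       // restore the previous packing setting

// With 1-byte packing there is no tail padding left.
static_assert(sizeof(Packed) == sizeof(std::uint32_t) + sizeof(char),
              "no padding when packed");
static_assert(alignof(Packed) == 1, "packed struct has byte alignment");

// How many bytes packing saves per element on this platform.
inline std::size_t bytes_saved() {
    assert(sizeof(Default) >= sizeof(Packed));
    return sizeof(Default) - sizeof(Packed);
}
```

The saving comes at a price: in an array of Packed, most value members start at addresses that are not multiples of 4, so the compiler has to emit code that tolerates unaligned access.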

For more information have a look at related questions:

How to minimize the memory usage of a struct-type?

How does sizeof calculate the size of structures

Is the size of a struct required to be an exact multiple of the alignment of that struct?

or even Determining the alignment of C/C++ structures in relation to its members

Reordering bit-fields mysteriously changes size of struct

If we create an instance of struct foo, zero it out, set all bits in a field, and print the bytes, and do this for each field, we see the following:

R: ff 0f 00 00 00 00 00 00 00 00 
G: 00 00 ff 0f 00 00 00 00 00 00
B: 00 00 00 00 ff 0f 00 00 00 00
A: 00 00 00 00 00 00 ff 0f 00 00
X: 00 00 00 00 00 00 00 f0 00 00
Y: 00 00 00 00 00 00 00 00 0f 00

So what appears to be happening is that each 12-bit field starts in a new 16-bit storage unit. The first 4-bit field then fills the remaining bits of the preceding 16-bit unit, and the last field takes up 4 bits in a new unit. This occupies 9 bytes. And since the strictest alignment in the struct, here that of a 2-byte bit-field storage unit, is 2, one byte of padding is added at the end.

So it appears that a 12-bit field, which has a 16-bit base type, is kept within a single 16-bit storage unit instead of being split across multiple storage units.

If we do the same for the modified struct:

X: 0f 00 00 00 00 00 00 00 
R: f0 ff 00 00 00 00 00 00
G: 00 00 ff 0f 00 00 00 00
B: 00 00 00 00 ff 0f 00 00
A: 00 00 00 00 00 00 ff 0f
Y: 00 00 00 00 00 00 00 f0

We see that X takes up 4 bits of the first 16 bit storage unit, then R takes up the remaining 12 bits. The rest of the fields fill out as before. This results in 8 bytes being used, and so requires no additional padding.
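The dumping procedure used for both tables can be sketched as follows. The definitions of foo and foo_reordered are assumptions reconstructed from the byte dumps (four 12-bit fields and two 4-bit fields with a 16-bit base type); the original declarations are not shown here. The helper names are hypothetical too, and the bytes printed depend on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Assumed layouts, guessed from the dumps; not the original declarations.
struct foo {
    std::uint16_t R : 12, G : 12, B : 12, A : 12;
    std::uint16_t X : 4, Y : 4;
};
struct foo_reordered {
    std::uint16_t X : 4;
    std::uint16_t R : 12, G : 12, B : 12, A : 12;
    std::uint16_t Y : 4;
};

// Formats the object representation of value as space-separated hex bytes.
template <typename T>
std::string hex_bytes(const T& value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    assert(out.size() == 3 * sizeof(T) - 1);
    return out;
}

// Zeroes a foo, sets every bit of R, and returns the byte dump.
inline std::string dump_foo_R() {
    foo f;
    std::memset(&f, 0, sizeof f);
    f.R = 0xFFF;
    return hex_bytes(f);
}

// Same procedure for the X field of the reordered struct.
inline std::string dump_reordered_X() {
    foo_reordered f;
    std::memset(&f, 0, sizeof f);
    f.X = 0xF;
    return hex_bytes(f);
}
```

Repeating the same three steps for each remaining field, and printing the results with a label, would produce tables in the shape shown above.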

While the exact details of the ordering of bit-fields are implementation-defined, the C standard does set a few rules.

From section 6.7.2.1p11:

An implementation may allocate any addressable storage unit large
enough to hold a bit-field. If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined. The order of
allocation of bit-fields within a unit (high-order to low-order or
low-order to high-order) is implementation-defined. The alignment of
the addressable storage unit is unspecified.

And 6.7.2.1p15:

Within a structure object, the non-bit-field members and the units in
which bit-fields reside have addresses that increase in the order in
which they are declared.

Is it possible to have a type with a larger alignment than its own size?

The Rust reference has this to say about size and alignment (emphasis mine):

Size and Alignment


[...]

The size of a value is the offset in bytes between successive elements in an array with that item type including alignment padding. The size of a value is always a multiple of its alignment. The size of a value can be checked with the size_of_val function.
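The same relationship can be observed in C++, which is used here only as an analogue of the Rust rule quoted above. In this sketch the hypothetical type Tiny holds a single char but requests 16-byte alignment with alignas, and the compiler grows its size to match.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: one byte of data, but a requested alignment of 16.
struct alignas(16) Tiny {
    char c;
};

static_assert(alignof(Tiny) == 16, "alignment was raised to 16");
static_assert(sizeof(Tiny) % alignof(Tiny) == 0, "size is a multiple of alignment");
static_assert(sizeof(Tiny) >= alignof(Tiny), "size cannot be smaller than alignment");

// Bytes of tail padding added after the single char member.
inline std::size_t tail_padding() {
    assert(sizeof(Tiny) >= sizeof(char));
    return sizeof(Tiny) - sizeof(char);
}
```

Raising the alignment therefore raises the size as well, so a type cannot end up with an alignment larger than its own size.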

Padding of Struct containing only structs

FooContainer is an array. Arrays in both C and C++ are guaranteed to not add padding between their elements. Any padding that may be present is only that which is internal to the element object type itself.

So yes, the sizeof trick is a common technique that is guaranteed to work, so long as the parameter to sizeof is indeed the name of an array, and not a pointer that was obtained by an array-to-pointer conversion.

Having said all that, since you tagged C++, try to avoid raw arrays. The C++ standard library has several alternatives that provide greater safety and more functionality.

And even if you do use a raw array, a better way to obtain the size in C++ would be with the help of the type system itself:

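#include <cstddef> // for std::size_t
// Deduces N from a reference to an array; a pointer argument does not match.
template<typename T, std::size_t N>
constexpr auto array_size(T(&)[N]) { return N; }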

The above is easy to use, like so: int structsincontainer = array_size(FooContainer); It only accepts a reference to an array, so it fails to compile instead of silently building when passed a pointer by accident.
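For comparison, here is a sketch of the sizeof idiom next to two standard library alternatives, assuming C++17 for std::size. Foo and the three-element FooContainer are hypothetical stand-ins, since the original declarations are not shown.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>

struct Foo { int id; char tag; };   // hypothetical element type

Foo FooContainer[3] = {};           // hypothetical raw array

// The classic sizeof idiom: total bytes divided by bytes per element.
inline std::size_t count_with_sizeof() {
    return sizeof(FooContainer) / sizeof(FooContainer[0]);
}

// std::size (C++17) deduces the bound from the array type and, like
// array_size, does not accept a plain pointer.
inline std::size_t count_with_std_size() {
    return std::size(FooContainer);
}

// std::array carries its length in the type, so no trick is needed.
constexpr std::array<Foo, 3> foo_array{};
static_assert(foo_array.size() == 3, "length is part of the type");

// All three approaches should report the same element count.
inline void check_counts_agree() {
    assert(count_with_sizeof() == count_with_std_size());
    assert(count_with_std_size() == foo_array.size());
}
```

The sizeof division works because sizeof(Foo) already includes any tail padding, so the array is exactly element count times element size.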
