Memory Alignment: How to Use alignof/alignas

Memory alignment: how to use alignof / alignas?

Alignment is a restriction on the memory addresses at which the first byte of a value may be stored. It is needed to improve performance on processors and to permit the use of certain instructions that work only on data with a particular alignment: for example, aligned SSE loads and stores need data aligned to 16 bytes, while their AVX counterparts need 32 bytes.

An alignment of 16 means that only memory addresses that are a multiple of 16 are valid.

alignas

forces alignment to the required number of bytes. You can only align to powers of 2: 1, 2, 4, 8, 16, 32, 64, 128, ...

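#include <cstdio>   // std::printf

int main() {
    alignas(16) int a[4];
    alignas(1024) int b[4];
    // %p expects a void*, so cast the array addresses explicitly
    std::printf("%p\n", static_cast<void*>(a));
    std::printf("%p\n", static_cast<void*>(b));
}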

Example output:

0xbfa493e0
0xbfa49000  // note the extra trailing zeros
// binary equivalent
1011 1111 1010 0100 1001 0011 1110 0000
1011 1111 1010 0100 1001 0000 0000 0000 // every trailing zero bit is an extra power of 2

the other keyword

alignof

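is very convenient. You cannot do something like

alignas(16) int a[4]; // as declared in the example above
assert(a % 16 == 0); // check if alignment is to 16 bytes: WRONG, compiler error (% is not defined for pointers)

but you can do

assert(alignof(a) == 16);   // alignof applied to a variable is a GCC/Clang extension;
assert(alignof(b) == 1024); // standard C++ only accepts a type, e.g. alignof(int)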

Note that in reality this is stricter than a simple "%" (modulus) check. We know that something aligned to 1024 bytes is necessarily also aligned to 1, 2, 4, 8, ... bytes, but

 assert(alignof(b) == 32); // fails

So, to be more precise, alignof returns the alignment requirement declared for something (1024 for b), not every power of 2 that its address happens to be a multiple of.
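The "%" test that did not compile above can still be written as a runtime check by converting the pointer to an integer first. The following is only a sketch: is_aligned and demo_alignment_check are hypothetical helper names, not part of the original answer, and it assumes std::uintptr_t is available (it is on all mainstream platforms).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reports whether the address p is a multiple of
// `alignment`, which must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Runtime counterpart of the checks discussed above.
inline bool demo_alignment_check() {
    alignas(16) int a[4] = {};
    alignas(1024) int b[4] = {};
    // Unlike the alignof comparison, the address test also succeeds for
    // every smaller power of two (here 32).
    return is_aligned(a, 16) && is_aligned(b, 1024) && is_aligned(b, 32);
}
```

This checks where an object actually ended up, whereas alignof reports what was requested, so the two answer slightly different questions.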

Also, alignof is a nice way to know in advance the minimum alignment requirement of basic data types (it always returns 1 for char and will probably return 4 for float, etc.).
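As a small illustration of querying basic types, the sketch below prints a few alignments and pins down the parts that can be checked at compile time. Mixed and print_alignments are made-up names for this example, and the printed values are implementation-defined.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical example type mixing two member alignments.
struct Mixed {
    char c;
    int i;
};

static_assert(alignof(char) == 1, "char has the weakest possible alignment");
static_assert(alignof(Mixed) >= alignof(int),
              "a struct is at least as aligned as its strictest member");

// Prints the alignment of a few types; apart from char, the numbers depend
// on the compiler and target.
inline void print_alignments() {
    std::printf("char   : %zu\n", alignof(char));
    std::printf("int    : %zu\n", alignof(int));
    std::printf("float  : %zu\n", alignof(float));
    std::printf("double : %zu\n", alignof(double));
    std::printf("Mixed  : %zu\n", alignof(Mixed));
}
```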

Still legal:

alignas(alignof(float)) float SqDistance;

Something with an alignment of 16 will then be placed at the next available address that is a multiple of 16 (there may be implicit padding after the last used address).
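The "next available multiple of 16" rule and the resulting padding can be sketched with offsetof. Padded and align_up are hypothetical names introduced here, and the concrete numbers in the comments assume a 4-byte float.

```cpp
#include <cstddef>

// Hypothetical helper: rounds offset up to the next multiple of alignment
// (alignment must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

struct Padded {
    char tag;                // occupies offset 0
    alignas(16) float v[4];  // cannot start at offset 1, so padding is inserted
};

// v starts at the first multiple of 16 after tag (offset 16 here).
static_assert(offsetof(Padded, v) == align_up(sizeof(char), 16),
              "v is placed on the next 16-byte boundary");
// The struct inherits the strictest member alignment, and its size is
// rounded up so that array elements stay aligned.
static_assert(alignof(Padded) == 16, "struct alignment follows v");
static_assert(sizeof(Padded) % 16 == 0, "size is a multiple of the alignment");
```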

Practical use cases for alignof and alignas C++ keywords

A common use case for the alignas specifier is for the scenario where you want to pass multiple objects between different threads through a queue (e.g., an event or task queue) while avoiding false sharing. False sharing will result from having multiple threads competing for the same cache line when they are actually accessing different objects. It is usually undesirable due to performance degradation.

For example – assuming that the cache line size is 64 bytes – given the following Event class:

struct Event {
   int event_type_;
};

The alignment of Event will correspond to the alignment of its data member, event_type_. Assuming that the alignment of int is 4 bytes (i.e., alignof(int) evaluates to 4), then up to 16 Event objects can fit into a single cache line. So, if you have a queue like:

std::queue<Event> eventQueue;

where one thread pushes events into the back of the queue and another thread pulls events from the front, both threads may end up competing for the same cache line. This can be avoided by using the alignas specifier on Event:

struct alignas(64) Event {
   int event_type_;
};

This way, an Event object will always be aligned on a cache line boundary, so a cache line will contain at most one Event object. Therefore two or more threads will never be competing for the same cache line when accessing distinct Event objects (if multiple threads are accessing the same Event object, they will obviously compete for the same cache line).
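The effect on layout can be checked directly. The sketch below keeps the 64-byte cache line assumption from the text; PaddedEvent, kCacheLine, cache_line_index and neighbours_on_distinct_lines are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumption carried over from the text

struct alignas(kCacheLine) PaddedEvent {
    int event_type_;
};

// alignas also pads the size, so each object fills one whole line.
static_assert(sizeof(PaddedEvent) == kCacheLine, "one object per cache line");
static_assert(alignof(PaddedEvent) == kCacheLine, "starts on a line boundary");

// Index of the (assumed 64-byte) cache line that an address falls into.
inline std::uintptr_t cache_line_index(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) / kCacheLine;
}

// Two neighbouring array elements never land on the same line.
inline bool neighbours_on_distinct_lines() {
    PaddedEvent events[2] = {};
    return cache_line_index(&events[0]) != cache_line_index(&events[1]);
}
```

The price is memory: each 4-byte payload now occupies 64 bytes, which is usually acceptable for a small number of heavily contended objects.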

What are the alignas and alignof keywords used for?

Some special types must be aligned to more bytes than usual. For example, matrices must be aligned to 16 bytes on x86 for the most efficient copying to the GPU. SSE vector types can behave this way too. As such, if you want to write a container type, you must know the alignment requirements of the type you are trying to contain or allocate.
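To make the container point concrete, here is a minimal sketch of a one-element container whose raw storage takes its alignment from the contained type. Slot and Vec8 are hypothetical names, and this is one possible approach rather than how any particular library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical over-aligned payload, standing in for a SIMD vector type.
struct alignas(32) Vec8 {
    float v[8];
};

// Hypothetical single-slot container: the byte buffer is given the
// alignment of T, so placement new always constructs T at a valid address.
template <class T>
class Slot {
public:
    Slot() = default;
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    template <class... Args>
    T& emplace(Args&&... args) {
        reset();
        ptr_ = ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
        return *ptr_;
    }

    void reset() {
        if (ptr_) {
            ptr_->~T();
            ptr_ = nullptr;
        }
    }

    bool has_value() const { return ptr_ != nullptr; }

    T& value() {
        assert(ptr_ != nullptr);
        return *ptr_;
    }

private:
    alignas(T) unsigned char storage_[sizeof(T)];  // same as alignas(alignof(T))
    T* ptr_ = nullptr;
};
```

Without the alignas(T) on storage_, the buffer would only be guaranteed the alignment of unsigned char, and constructing a Vec8 in it would be undefined behavior.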

aligned_malloc() vs alignas() for Constant Buffers

Yes, you could use it like this:

struct SceneConstantBuffer
{
    alignas(16) DirectX::XMFLOAT4X4 ViewProjection[2];
    alignas(16) DirectX::XMFLOAT4 EyePosition[2];
    alignas(16) DirectX::XMFLOAT3 LightDirection{};
    alignas(16) DirectX::XMFLOAT3 LightDiffuseColor{};
    alignas(16) int NumSpecularMipLevels{ 1 };
};

What won't work is __declspec(align)...

EDIT: If you want to use it on the struct itself something similar to this should work too:

struct alignas(16) SceneConstantBuffer
{
    DirectX::XMMATRIX ViewProjection; // 16-bytes
    ...
    DirectX::XMFLOAT3 LightDiffuseColor{};
};

How to tell the maximum data alignment requirement in C++

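You might be looking for std::max_align_t (declared in <cstddef>), a type whose alignment is at least as strict as that of every scalar type.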
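One way to use it is to test whether a type needs more alignment than std::max_align_t provides. The helper is_over_aligned and the type PageLike below are hypothetical names for this sketch. It assumes alignof(std::max_align_t) is well below 256, which holds on common platforms.

```cpp
#include <cstddef>

// Hypothetical helper: true if T demands more alignment than any scalar type.
template <class T>
constexpr bool is_over_aligned() {
    return alignof(T) > alignof(std::max_align_t);
}

// Hypothetical type with a deliberately large alignment request.
struct alignas(256) PageLike {
    char bytes[256];
};

static_assert(!is_over_aligned<int>(), "scalars never exceed max_align_t");
static_assert(!is_over_aligned<std::max_align_t>(), "by definition");
static_assert(is_over_aligned<PageLike>(), "256 exceeds max_align_t here");
```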

Does the alignas specifier work with 'new'?

Before C++17, if your type is not over-aligned, then yes, the default new will work. "Over-aligned" means that the alignment you specify in alignas is greater than alignof(std::max_align_t). The default new works with non-over-aligned types more or less by accident: the default memory allocator always returns memory aligned to at least alignof(std::max_align_t).

If your type is over-aligned, however, you're out of luck. Neither the default new, nor any global new operator you write, will be able to even know the alignment required of the type, let alone allocate memory appropriate to it. The only way to help this case is to overload the class's operator new, which will be able to query the class's alignment with alignof.

Of course, this won't be useful if that class is used as the member of another class. Not unless that other class also overloads operator new. So something as simple as new pair<over_aligned, int>() won't work.

C++17 adds a number of memory allocation functions (overloads of operator new and operator delete that take a std::align_val_t argument) which are given the alignment of the type being used. These functions are used specifically for over-aligned types (or more specifically, new-extended over-aligned types). So new pair<over_aligned, int>() will work in C++17.

Of course, this only works to the extent that the allocator handles over-aligned types.
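The C++17 behavior can be sketched as follows. This assumes a compiler in C++17 mode (or later) with aligned new enabled. OverAligned, heap_pair_is_aligned and raw_aligned_new_is_aligned are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <new>
#include <utility>

// Hypothetical over-aligned type (64 exceeds the default new alignment on
// common platforms).
struct alignas(64) OverAligned {
    float data[16];
};

// In C++17 the new-expression inside make_unique picks the operator new
// overload that receives the alignment of the pair.
inline bool heap_pair_is_aligned() {
    auto p = std::make_unique<std::pair<OverAligned, int>>();
    return reinterpret_cast<std::uintptr_t>(p.get()) % alignof(OverAligned) == 0;
}

// The same allocation function can also be called directly.
inline bool raw_aligned_new_is_aligned() {
    void* raw = ::operator new(128, std::align_val_t(64));
    const bool ok = reinterpret_cast<std::uintptr_t>(raw) % 64 == 0;
    ::operator delete(raw, std::align_val_t(64));  // must pass the same alignment
    return ok;
}
```

If the same code is compiled as C++14, it may still build, but nothing guarantees that the returned address satisfies the 64-byte requirement.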

Where can I use alignas() in C++11?

You cannot apply an alignment to a typedef. In the C++ model of alignment specifiers, the alignment is an inseparable part of the type itself, and a typedef does not create a new type (it only provides a new name for an existing type) so it is not meaningful to apply an alignment specifier in a typedef declaration.

From [dcl.align] (7.6.2)p1:

An alignment-specifier may be applied to a variable or to a class data member [...]. An alignment-specifier may also be applied to the declaration or definition of a class (in an elaborated-type-specifier (7.1.6.3) or class-head (Clause 9), respectively) and to the declaration or definition of an enumeration (in an opaque-enum-declaration or enum-head, respectively (7.2)).

These are the only places where the standard says an alignment-specifier (alignas(...)) may be applied. Note that this does not include typedef declarations nor alias-declarations.
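For illustration, the sketch below uses alignas in three of the permitted positions (class definition, data member, variable) and shows one possible substitute for an aligned typedef: a small wrapper struct. All names here (Vec4, Particle, AlignedFloat, scratch_buffer) are hypothetical, and the wrapper is only one option, not something the standard prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct alignas(16) Vec4 {           // on a class definition
    float x, y, z, w;
};

struct Particle {
    alignas(16) float position[3];  // on a data member
    float mass;
};

// typedef alignas(16) float aligned_float;  // not a permitted position
struct alignas(16) AlignedFloat {   // a wrapper type can carry the alignment instead
    float value;
};

inline unsigned char* scratch_buffer() {
    alignas(32) static unsigned char buffer[64];  // on a variable
    return buffer;
}

static_assert(alignof(Vec4) == 16, "class-level alignas");
static_assert(offsetof(Particle, position) % 16 == 0, "member-level alignas");
static_assert(alignof(AlignedFloat) == 16, "wrapper keeps the alignment in the type");
```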

Per [dcl.attr.grammar] (7.6.1)p4:

If an attribute-specifier-seq that appertains to some entity or statement contains an attribute that is not allowed to apply to that entity or statement, the program is ill-formed.

This wording was intended to apply to alignas as well as the other forms of attribute that may appear within an attribute-specifier-seq, but was not correctly updated when alignment switched from being a "real" attribute to being a different kind of attribute-specifier-seq.

So: your example code using alignas is supposed to be ill-formed. The C++ standard does not currently explicitly say this, but it also does not permit the usage, so instead it currently would result in undefined behavior (because the standard does not define any behavior for it).
