How to Use alignas() in C++11

Where can I use alignas() in C++11?

You cannot apply an alignment to a typedef. In the C++ model of alignment specifiers, the alignment is an inseparable part of the type itself, and a typedef does not create a new type (it only provides a new name for an existing type) so it is not meaningful to apply an alignment specifier in a typedef declaration.

From [dcl.align] (7.6.2)p1:

An alignment-specifier may be applied to a variable or to a class data member [...]. An alignment-specifier may also be applied to the declaration or definition of a class (in an elaborated-type-specifier (7.1.6.3) or class-head (Clause 9), respectively) and to the declaration or definition of an enumeration (in an opaque-enum-declaration or enum-head, respectively (7.2)).

These are the only places where the standard says an alignment-specifier (alignas(...)) may be applied. Note that this does not include typedef declarations nor alias-declarations.
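As a rough illustration of the rule quoted above, the sketch below applies alignas in some of the permitted positions and leaves the rejected typedef form commented out. The names Vec4, Packet, Point, lookup_table and lookup_table_is_aligned are made up for this example.

```cpp
#include <cstdint>

// Class definition (class-head): the alignment is part of the type Vec4.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

// Class data member: raises the alignment of Packet as a whole.
struct Packet {
    alignas(8) char payload[3];
    char tag;
};

// Variable.
alignas(32) int lookup_table[8];

// Not permitted: an alignment-specifier in a typedef declaration.
// typedef alignas(16) int aligned_int;

// An alias only renames the type, so it reports the same alignment.
using Point = Vec4;

static_assert(alignof(Vec4) == 16, "alignment comes from the class-head");
static_assert(alignof(Point) == alignof(Vec4), "an alias names the same type");
static_assert(alignof(Packet) == 8, "member alignment propagates to the class");

// Runtime check that the variable really sits on a 32-byte boundary.
bool lookup_table_is_aligned() {
    return reinterpret_cast<std::uintptr_t>(lookup_table) % 32 == 0;
}
```

Calling lookup_table_is_aligned() from main should return true on a conforming implementation that supports 32-byte alignment for variables.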

Per [dcl.attr.grammar] (7.6.1)p4:

If an attribute-specifier-seq that appertains to some entity or statement contains an attribute that is not allowed to apply to that entity or statement, the program is ill-formed.

This wording was intended to apply to alignas as well as the other forms of attribute that may appear within an attribute-specifier-seq, but was not correctly updated when alignment switched from being a "real" attribute to being a different kind of attribute-specifier-seq.

So: your example code using alignas is supposed to be ill-formed. The C++ standard does not currently explicitly say this, but it also does not permit the usage, so instead it currently would result in undefined behavior (because the standard does not define any behavior for it).

Memory alignment: how to use alignof / alignas?

Alignment is a restriction on the memory addresses at which a value's first byte may be stored. (It is needed to improve performance on some processors and to permit the use of certain instructions that work only on data with a particular alignment; for example, aligned SSE loads and stores need 16-byte alignment, while the aligned AVX ones need 32 bytes.)

Alignment of 16 means that memory addresses that are a multiple of 16 are the only valid addresses.

alignas

forces alignment to the requested number of bytes. You can only align to powers of 2: 1, 2, 4, 8, 16, 32, 64, 128, ...

#include <cstdio>

int main() {
    alignas(16) int a[4];
    alignas(1024) int b[4];
    // %p expects a void*, so cast explicitly
    std::printf("%p\n", static_cast<void*>(a));
    std::printf("%p\n", static_cast<void*>(b));
}

example output:

0xbfa493e0
0xbfa49000 // note how many more "zeros" now.
// binary equivalent
1011 1111 1010 0100 1001 0011 1110 0000
1011 1111 1010 0100 1001 0000 0000 0000 // each extra trailing zero bit is another power of 2

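The other keyword

alignof

is very convenient. With a and b declared as in the example above, you cannot write something like

assert(a % 16 == 0); // check if alignment is to 16 bytes: WRONG, compiler error (% is not defined for pointers)

unless you first cast the pointer to an integer type such as std::uintptr_t, but you can write

assert(alignof(a) == 16);
assert(alignof(b) == 1024);

(Standard alignof takes a type; applying it to a variable, as here, is accepted by GCC and Clang as an extension.)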

Note that this is stricter than a simple "%" (modulus) check on the address. We know that something aligned to 1024 bytes is necessarily also aligned to 1, 2, 4, 8, ... bytes, but

 assert(alignof(b) == 32); // fails

So, to be more precise, alignof reports the alignment requirement that was declared (here 1024), not every power of 2 that happens to divide the address.

alignof is also a nice way to know in advance the minimum alignment requirement of basic data types (it returns 1 for char, and typically 4 for float, etc.).
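The difference between the declared requirement and a modulus check on the address can be sketched in standard C++ as follows. The helper is_aligned and the type Block are hypothetical names introduced only for this example, and the sketch assumes the implementation supports 1024-byte alignment for objects at namespace scope.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

struct alignas(1024) Block {
    int data[4];
};

Block block;  // namespace-scope object, placed on a 1024-byte boundary

static_assert(alignof(char) == 1, "char has the weakest alignment");
static_assert(alignof(Block) == 1024, "alignof reports the declared requirement");
static_assert(alignof(Block) != 32, "even though every Block address is a multiple of 32");
```

Here is_aligned(&block, 1024) and is_aligned(&block, 32) should both hold at run time, while alignof(Block) is exactly 1024.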

Combining the two keywords is still legal:

alignas(alignof(float)) float SqDistance;

Something with an alignment of 16 will then be placed at the next available address that is a multiple of 16 (there may be implicit padding after the last used address).
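The implicit padding can be made visible inside a struct with offsetof. This is only a sketch: Sample and padding_before_v are names invented here, and the size check assumes a 4-byte float.

```cpp
#include <cstddef>

struct Sample {
    char tag;                // offset 0, occupies 1 byte
    alignas(16) float v[4];  // must start at the next multiple of 16
};

// Number of padding bytes the compiler inserts between tag and v.
constexpr std::size_t padding_before_v() {
    return offsetof(Sample, v) - sizeof(char);
}

static_assert(offsetof(Sample, v) == 16, "v starts at the next multiple of 16");
static_assert(padding_before_v() == 15, "15 bytes of implicit padding after tag");
static_assert(sizeof(Sample) == 32, "assuming 4-byte float: 16 + 4 * 4");
```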

Practical use cases for alignof and alignas C++ keywords

A common use case for the alignas specifier is for the scenario where you want to pass multiple objects between different threads through a queue (e.g., an event or task queue) while avoiding false sharing. False sharing will result from having multiple threads competing for the same cache line when they are actually accessing different objects. It is usually undesirable due to performance degradation.

For example – assuming that the cache line size is 64 bytes – given the following Event class:

struct Event {
int event_type_;
};

The alignment of Event will correspond to the alignment of its data member, event_type_. Assuming that the alignment of int is 4 bytes (i.e., alignof(int) evaluates to 4), then up to 16 Event objects can fit into a single cache line. So, if you have a queue like:

std::queue<Event> eventQueue;

where one thread pushes events into the back of the queue and another thread pulls events from the front, both threads may end up competing for the same cache line. This can be avoided by using the alignas specifier on Event:

struct alignas(64) Event {
int event_type_;
};

This way, an Event object will always be aligned on a cache line boundary, so that a cache line will contain at most one Event object. Therefore two or more threads will never be competing for the same cache line when accessing distinct Event objects (if multiple threads are accessing the same Event object, they will obviously compete for the same cache line).
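The effect on layout can be checked with a few static assertions. This sketch assumes a 64-byte cache line, as in the text; kCacheLineSize, PlainEvent, objects_per_cache_line and event_stride are illustrative names, not part of any library.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLineSize = 64;  // assumption taken from the text

struct PlainEvent {
    int event_type_;
};

struct alignas(kCacheLineSize) Event {
    int event_type_;
};

// How many objects of type T fit into one cache line.
template <typename T>
constexpr std::size_t objects_per_cache_line() {
    return kCacheLineSize / sizeof(T);
}

static_assert(alignof(Event) == kCacheLineSize, "Event starts on a line boundary");
static_assert(sizeof(Event) == kCacheLineSize, "sizeof is padded up to the alignment");
static_assert(objects_per_cache_line<Event>() == 1, "one Event per cache line");

// Distance in bytes between two neighbouring array elements.
std::size_t event_stride() {
    static Event events[2];
    return static_cast<std::size_t>(reinterpret_cast<const char*>(&events[1]) -
                                    reinterpret_cast<const char*>(&events[0]));
}
```

With a 4-byte int, objects_per_cache_line<PlainEvent>() gives the 16 mentioned above. Note that this only describes the type's layout: before C++17, a container that allocates through the default operator new may not honour a 64-byte requirement for its elements.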

Applying alignas() to an entire struct in C

C11 is not very clear on these things, but a consensus has emerged on how this is to be interpreted. C17 will have some of this clarified. The idea behind not allowing types to be aligned is that compatible types should never have different alignment requirements in different compilation units. If you want to force the alignment of a struct type, you have to impose an alignment on its first member. By doing that you create an incompatible type.

The start of the "Constraint" section as voted by the committee reads:

An alignment specifier shall appear only in the declaration specifiers
of a declaration, or in the specifier-qualifier list of a member
declaration, or in the type name of a compound literal. An alignment
specifier shall not be used in conjunction with either of the
storage-class specifiers typedef or register, nor in a declaration of
a function or bit-field.

C++ alignment (when to use alignas)

There are plenty of use cases where alignas is handy in latency-sensitive multithreaded applications, e.g. high-frequency trading applications.

alignas gives tighter control over how your objects are laid out in CPU caches, which can make access to them faster. The main goals, and therefore the use cases for alignas, are:

  1. You want to avoid unnecessary invalidation of your data in cache lines.
  2. You want to optimize CPU reads so that fewer CPU cycles are wasted.

How alignment to cache lines using alignas helps
Use 1 - Avoiding unnecessary invalidation of data in a cache line
You can use alignas to keep the objects used by separate threads on separate cache lines, so that one thread does not inadvertently invalidate a cache line held by another core.

How this happens:
Consider a thread in your process that runs on core 0 and writes to some address xxxx. The cache line containing this address is now loaded into the L1 cache of core 0.
A second thread, running on another core, accesses address xxxx + n bytes. If both addresses happen to be on the same cache line, any write by the second thread will unnecessarily invalidate that cache line on core 0. The first thread is then delayed until the cache line is loaded again. This hurts performance in a multithreaded environment.
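A layout-only sketch of this situation, assuming a 64-byte cache line and an object that itself starts on a cache line boundary. SharedCounters, PaddedCounters and same_cache_line are hypothetical names; the comments describe which thread would write each field.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLineSize = 64;  // assumed cache line size

// Both fields are likely to share one cache line.
struct SharedCounters {
    long produced;  // written only by thread 1
    long consumed;  // written only by thread 2
};

// Each field is pushed onto its own cache line.
struct PaddedCounters {
    alignas(kCacheLineSize) long produced;  // written only by thread 1
    alignas(kCacheLineSize) long consumed;  // written only by thread 2
};

// True if two byte offsets fall into the same cache line, assuming the
// enclosing object starts on a cache line boundary.
constexpr bool same_cache_line(std::size_t a, std::size_t b) {
    return a / kCacheLineSize == b / kCacheLineSize;
}

static_assert(same_cache_line(offsetof(SharedCounters, produced),
                              offsetof(SharedCounters, consumed)),
              "writes by the two threads invalidate each other's line");
static_assert(!same_cache_line(offsetof(PaddedCounters, produced),
                               offsetof(PaddedCounters, consumed)),
              "each thread writes to its own line");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLineSize,
              "the price is two full cache lines");
```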

Use 2 - Keeping an object on as few cache lines as possible
Align your objects to cache line boundaries so that they are not spread across more cache lines than necessary. This saves CPU cycles. E.g. if your object size is 118 bytes, it's better to align it to 64 bytes, since on most processors the cache line size is now 64 bytes.

If you don't, your object may be laid out as follows on 64-byte cache lines. (The example assumes the object has an actual size of 118 bytes and a natural alignment of 4, so its size is padded to a multiple of 4, i.e. 120 bytes.)

Cache line 1 <--- Some other object 60 bytes ---> <--- Your object 4 bytes --->
Cache line 2 <--------------------- Your object 64 bytes --------------------->
Cache line 3 <--- Your object 52 bytes ---> <--- Some other object 12 bytes --->

Since the CPU reads memory in units of whole cache lines, reading your object would touch 3 cache lines. Consider alignas(64) if you want to optimize this. With it, your object always starts on a cache line boundary and occupies exactly 2 cache lines.
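The arithmetic behind this layout can be written down as a small constexpr function. This is a sketch under the same assumptions as the text (64-byte cache lines) plus a 4-byte int and 2-byte short; Record, AlignedRecord and cache_lines_touched are names invented for the example.

```cpp
#include <cstddef>

constexpr std::size_t kCacheLineSize = 64;  // assumed cache line size

// Number of cache lines covered by `size` bytes starting at byte `offset`
// (offset is measured from some cache line boundary).
constexpr std::size_t cache_lines_touched(std::size_t offset, std::size_t size) {
    return (offset + size - 1) / kCacheLineSize - offset / kCacheLineSize + 1;
}

// 118 bytes of data, padded to 120 by the natural alignment of int.
struct Record {
    int words[29];
    short tail;
};

// Same data, forced onto a cache line boundary; sizeof becomes 128.
struct alignas(64) AlignedRecord {
    int words[29];
    short tail;
};

static_assert(sizeof(Record) == 120, "assuming 4-byte int and 2-byte short");
static_assert(sizeof(AlignedRecord) == 128, "rounded up to a multiple of 64");
static_assert(cache_lines_touched(60, sizeof(Record)) == 3, "the layout in the diagram");
static_assert(cache_lines_touched(0, sizeof(AlignedRecord)) == 2, "with alignas(64)");
```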

Caveats
Please note that you need to examine your objects carefully before applying alignas, because careless use leads to more padding and thus to wasted cache space. There are simple techniques, such as ordering the data members to minimize padding, that avoid this waste.
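Both effects, padding caused by member order and padding caused by alignas, can be inspected with sizeof. The types below are made up for illustration, and the concrete sizes in the comments are what one would commonly see on x86-64 rather than guarantees.

```cpp
#include <cstddef>

// Members in an unfortunate order: padding after a and after c.
struct Loose {
    char a;
    double b;
    char c;
};  // commonly 24 bytes on x86-64

// Same members, largest first: the two chars share the tail.
struct Tight {
    double b;
    char a;
    char c;
};  // commonly 16 bytes on x86-64

// Over-aligning a tiny object pads it to a full 64 bytes.
struct alignas(64) AlignedId {
    int id;
};

// Bytes of AlignedId that carry no data.
constexpr std::size_t wasted_bytes() {
    return sizeof(AlignedId) - sizeof(int);
}

static_assert(sizeof(Tight) <= sizeof(Loose), "reordering never costs space here");
static_assert(sizeof(AlignedId) == 64, "alignas(64) rounds the size up to 64");
```

With a 4-byte int, wasted_bytes() evaluates to 60, which is the kind of cost the caveat above refers to.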

Hope this helps and Good Luck!

What are the alignas and alignof keywords used for?

Some special types must be aligned to more bytes than usual; for example, matrices must be aligned to 16 bytes on x86 for the most efficient copying to the GPU. SSE vector types can behave this way too. As such, if you want to write a container type, you must know the alignment requirements of the type you're trying to contain or allocate.
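A minimal sketch of how a container can pick up the alignment of its element type: raw storage is declared with alignas(T) and the object is created with placement new. Slot and Mat4 are hypothetical names, and the class is deliberately reduced to a single element.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

struct alignas(16) Mat4 {
    float m[16];
};

// Holds at most one T in suitably aligned raw storage.
template <typename T>
class Slot {
public:
    Slot() : object_(nullptr) {}
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    // Destroys any existing object, then constructs a new T in the storage.
    template <typename... Args>
    T& emplace(Args&&... args) {
        reset();
        object_ = ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
        return *object_;
    }

    // Destroys the held object, if any.
    void reset() {
        if (object_) {
            object_->~T();
            object_ = nullptr;
        }
    }

private:
    alignas(T) unsigned char storage_[sizeof(T)];  // aligned like T itself
    T* object_;
};

static_assert(alignof(Slot<Mat4>) >= alignof(Mat4), "storage inherits T's alignment");

// True if a Mat4 created inside a Slot sits on a 16-byte boundary.
bool slot_storage_is_aligned() {
    Slot<Mat4> slot;
    Mat4& m = slot.emplace();
    return reinterpret_cast<std::uintptr_t>(&m) % alignof(Mat4) == 0;
}
```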

_Alignas for struct members using clang & C11

-Weverything prints all diagnostic messages required by C as well as some diagnostics not required by C. The diagnostic that is printed here is not required by C: its purpose is informative and your program is already strictly conforming. C says an implementation is free to produce additional diagnostic messages as long as it does not fail to translate the program.

alignas specifier vs __attribute__(aligned), c++11

Judging from the GCC support status, alignment support is not complete in GCC 4.7, but it is in GCC 4.8. alignas is also listed as a newly supported feature on the 4.8 release page.

Also, from the alignment support proposal (3.11):

A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to alignof(std::max_align_t) (18.1).

An extended alignment is represented by an alignment greater than
alignof(std::max_align_t). It is implementation-defined whether any extended
alignments are supported and the contexts in which they are supported (7.1.6). A type
having an extended alignment requirement is an over-aligned type.

And from the same document (7.1.6):

if the constant expression evaluates to an extended alignment and the implementation
does not support that alignment in the context of the declaration, the program is ill-formed

That might be part of the answer too. I don't have access to the full standard at the moment, someone should be able to confirm this.

As for the difference between __attribute__((aligned)) and alignas, I don't think they are semantically different, but one is just a compiler extension while the other is fully defined by the standard.

To answer your last question, alignas is only defined for:

alignas ( constant-expression ) 
alignas ( type-id )
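The two forms can be compared side by side; alignas(type-id) has the same effect as alignas(alignof(type-id)). ByValue and ByType are names chosen only for this sketch.

```cpp
#include <cstddef>

// alignas ( constant-expression )
struct ByValue {
    alignas(16) char bytes[4];
};

// alignas ( type-id ): same as alignas(alignof(double))
struct ByType {
    alignas(double) char bytes[4];
};

static_assert(alignof(ByValue) == 16, "alignment given as a number");
static_assert(alignof(ByType) == alignof(double), "alignment borrowed from a type");
static_assert(sizeof(ByValue) == 16, "size is rounded up to the alignment");
```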

Does the alignas specifier work with 'new'?

Before C++17, if your type is not over-aligned, then yes, the default new will work. "Over-aligned" means that the alignment you specify in alignas is greater than alignof(std::max_align_t). The default new will work with non-over-aligned types more or less by accident; the default memory allocator will always allocate memory with an alignment equal to alignof(std::max_align_t).
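Whether a type counts as over-aligned in this sense can be tested at compile time. The helper is_over_aligned and the type CacheLine are hypothetical; the last assertion assumes alignof(std::max_align_t) is below 64, which is typical but not guaranteed.

```cpp
#include <cstddef>

// True if T needs more alignment than the default allocator promises.
template <typename T>
constexpr bool is_over_aligned() {
    return alignof(T) > alignof(std::max_align_t);
}

struct alignas(64) CacheLine {
    char bytes[64];
};

static_assert(!is_over_aligned<int>(), "fundamental alignment");
static_assert(!is_over_aligned<std::max_align_t>(), "the boundary case");
static_assert(is_over_aligned<CacheLine>(), "assumes alignof(std::max_align_t) < 64");
```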

If your type is over-aligned, however, you're out of luck. Neither the default new, nor any global new operator you write, will be able to even know the alignment required by the type, let alone allocate memory appropriate to it. The only way to help this case is to overload the class's operator new, which will be able to query the class's alignment with alignof.
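One possible shape of such a class-level overload is sketched below. It over-allocates with std::malloc and aligns the pointer by hand, which is just one of several ways to obtain aligned memory and not something the text above prescribes; OverAligned and heap_object_is_aligned are names invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

struct alignas(64) OverAligned {
    int value;

    static void* operator new(std::size_t size) {
        const std::size_t align = alignof(OverAligned);
        // Room for the object, the worst-case adjustment and the saved raw pointer.
        void* raw = std::malloc(size + align + sizeof(void*));
        if (!raw) throw std::bad_alloc();
        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        // Round up to the next multiple of align (align is a power of 2).
        const std::uintptr_t aligned =
            (start + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        // Remember the pointer malloc returned just in front of the object.
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<void*>(aligned);
    }

    static void operator delete(void* p) noexcept {
        if (p) std::free(static_cast<void**>(p)[-1]);
    }
};

// True if a heap-allocated OverAligned sits on a 64-byte boundary.
bool heap_object_is_aligned() {
    OverAligned* p = new OverAligned();
    const bool ok = reinterpret_cast<std::uintptr_t>(p) % alignof(OverAligned) == 0;
    delete p;
    return ok;
}
```

This covers only single objects created with new OverAligned; array forms (operator new[]) would need matching overloads.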

Of course, this won't be useful if that class is used as the member of another class. Not unless that other class also overloads operator new. So something as simple as new pair<over_aligned, int>() won't work.

C++17 adds a number of allocation functions (operator new overloads taking a std::align_val_t argument) which are given the alignment of the type being used. These are used specifically for over-aligned types (or more specifically, types with new-extended alignment). So new pair<over_aligned, int>() will work in C++17.

Of course, this only works to the extent that the allocator handles over-aligned types.
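The C++17 behaviour can be probed with the sketch below. It is guarded by the __cpp_aligned_new feature-test macro so that it still compiles in older language modes, where it simply reports that nothing was checked. OverAligned and heap_pair_alignment_check are illustrative names.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

struct alignas(64) OverAligned {
    int value;
};

// Returns 1 if a heap-allocated pair is 64-byte aligned, 0 if it is not,
// and -1 if the compiler does not advertise aligned new (pre-C++17 mode).
int heap_pair_alignment_check() {
#if defined(__cpp_aligned_new)
    std::unique_ptr<std::pair<OverAligned, int>> p(new std::pair<OverAligned, int>());
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(p.get());
    return address % alignof(OverAligned) == 0 ? 1 : 0;
#else
    return -1;
#endif
}
```

When compiled as C++17 or later, the function is expected to return 1; a result of -1 only means the test was skipped.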

How to use alignas to replace pragma pack?

alignas cannot replace #pragma pack.

GCC accepts the alignas declaration, but still keeps the member properly aligned: satisfying the strictest alignment requirement (in this case, the alignment of long) also satisfies the requirement you specified.

However, GCC is too lenient as the standard actually explicitly forbids this in §7.6.2, paragraph 5:

The combined effect of all alignment-specifiers in a declaration shall not specify an alignment that is less strict than the alignment that would be required for the entity being declared if all alignment-specifiers were omitted (including those in other declarations).
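The difference between the two mechanisms shows up in the member offsets. In this sketch, Natural and Packed are made-up types, #pragma pack is used as the compiler extension it is (supported by GCC, Clang and MSVC), and the final comparison assumes alignof(long) is greater than 1.

```cpp
#include <cstddef>

// Default layout: l is placed at a multiple of alignof(long).
struct Natural {
    char c;
    long l;
};

// Rejected by the rule quoted above when alignof(long) > 1:
// struct Weakened { char c; alignas(1) long l; };

// #pragma pack can lower the member alignment; alignas cannot.
#pragma pack(push, 1)
struct Packed {
    char c;
    long l;
};
#pragma pack(pop)

// Bytes saved by packing compared to the default layout.
constexpr std::size_t bytes_saved_by_packing() {
    return sizeof(Natural) - sizeof(Packed);
}

static_assert(offsetof(Natural, l) % alignof(long) == 0, "l is naturally aligned");
static_assert(offsetof(Packed, l) == 1, "no padding after c");
static_assert(sizeof(Packed) == 1 + sizeof(long), "no tail padding either");
```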


