Do I Really Have to Worry About Alignment When Using Placement New Operator

Do I really have to worry about alignment when using placement new operator?

When you call placement new on a buffer:

A *a = new (buf) A;

you are invoking the built-in void* operator new (std::size_t size, void* ptr) noexcept as defined in:

C++11

18.6.1.3 Placement forms [new.delete.placement]


These functions are reserved, a C++ program may not define functions that displace the versions in the
Standard C++ library (17.6.4). The provisions of (3.7.4) do not apply to these reserved placement forms of
operator new and operator delete.


void* operator new(std::size_t size, void* ptr) noexcept;

Returns: ptr.

Remarks: Intentionally performs no other action.

The provisions of (3.7.4) include the requirement that the returned pointer be suitably aligned. Since those provisions do not apply here, it's fine for void* operator new(std::size_t size, void* ptr) noexcept to return a nonaligned pointer if one is passed in. This doesn't let you off the hook, though:

5.3.4 New [expr.new]


[14] Note: when the allocation function returns a value other than null, it must be a pointer to a block of storage
in which space for the object has been reserved. The block of storage is assumed to be appropriately aligned
and of the requested size.

So if you pass unaligned storage to a placement-new expression you're violating the assumption that the storage is aligned, and the result is UB.
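To stay on the right side of that assumption, the buffer itself can be declared with the alignment of the type it will hold. The following is a minimal sketch, assuming a hypothetical struct A and a hypothetical helper is_aligned_for; neither comes from the original question.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for the A used in the text.
struct A {
    long long value;
};

// Hypothetical helper: true when p satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Constructs an A in a suitably aligned local buffer and reads it back.
long long construct_in_aligned_buffer() {
    alignas(A) unsigned char buf[sizeof(A)];  // storage aligned for A
    assert(is_aligned_for<A>(buf));
    A* a = new (buf) A{42};                   // placement new: no allocation happens
    long long result = a->value;
    a->~A();                                  // trivial here, shown for completeness
    return result;
}
```

Without the alignas specifier, buf would only be guaranteed to be aligned for unsigned char, which is exactly the situation the quoted note rules out.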


Indeed, in your program above, if you replace long long b with __m128 b (after #include <xmmintrin.h>) then the program will segfault, as expected.

When should I worry about alignment?

struct Foo {
char data[3]; // size is 3, my arch is 64-bit (8 bytes)
};

Padding is allowed here, in the struct after the data member--but not before it, and not between the elements of data.

Foo array[4]; // total memory is 3 * 4 = 12 bytes. 

No padding is allowed between elements in the array here. Arrays are required to be contiguous. But, as noted above, padding is allowed inside of a Foo, following its data member. So, sizeof(someFoo.data) must be 3, but sizeof(someFoo) could be larger (for example, 4) on some implementations. Either way, the array occupies exactly 4 * sizeof(Foo) bytes.
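These layout rules can be checked at compile time. The sketch below repeats the Foo definition from above; the function element_stride is a hypothetical helper added only for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Foo {
    char data[3];
};

Foo array[4];

// The member array has no padding between its elements.
static_assert(sizeof(Foo::data) == 3, "data is exactly 3 bytes");
// No padding is allowed before the first member of a standard-layout struct.
static_assert(offsetof(Foo, data) == 0, "data starts at offset 0");
// Trailing padding, if any, is counted in sizeof(Foo).
static_assert(sizeof(Foo) >= 3, "Foo is at least as large as its member");
// Array elements are contiguous: no padding between them.
static_assert(sizeof(array) == 4 * sizeof(Foo), "array is contiguous");

// Hypothetical helper: distance in bytes between two neighbouring elements.
std::size_t element_stride() {
    const char* first = reinterpret_cast<const char*>(&array[0]);
    const char* second = reinterpret_cast<const char*>(&array[1]);
    return static_cast<std::size_t>(second - first);
}
```

Whatever value sizeof(Foo) has on a given implementation, the stride between elements equals it, so every element is correctly aligned for Foo.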

void testArray() {
    Foo * foo1 = &array[0];
    Foo * foo2 = &array[1]; // is foo2 pointing to a non-aligned location?
    // should I expect issues here?
}

Again, perfectly fine -- the compiler must allow this [1].

For your memory pool, the prognosis isn't nearly as good though. You've allocated an array of char, which has to be sufficiently aligned to be accessed as char, but accessing it as any other type is not guaranteed to work. The implementation isn't allowed to impose any alignment limits on accessing data as char in any case though.

Typically for a situation like this, you create a union of all the types you care about, and allocate an array of that. This guarantees that the data is aligned to be used as an object of any type in the union.
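As a rough illustration of the union approach, here is a sketch assuming a hypothetical set of types (char, short, int, long, double) that the pool needs to hold. The names Slot, pool and store_double_in_slot are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical union of every type the pool must be able to hold.
union Slot {
    char c;
    short s;
    int i;
    long l;
    double d;
};

// A union is at least as strictly aligned as its most demanding member.
static_assert(alignof(Slot) >= alignof(double), "Slot is aligned for double");
static_assert(alignof(Slot) >= alignof(long), "Slot is aligned for long");

Slot pool[16];  // every element is aligned for every member type

// Places a double into one slot and checks the address and the stored value.
bool store_double_in_slot() {
    void* p = &pool[3];
    bool aligned = reinterpret_cast<std::uintptr_t>(p) % alignof(double) == 0;
    double* d = new (p) double(1.5);
    return aligned && *d == 1.5;
}
```

The cost of this design is that every slot is as large as the largest member, so small objects waste space.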

Alternatively, you can allocate your block dynamically -- both malloc and ::operator new guarantee that the block they return is aligned to be used as any type with a fundamental (not over-aligned) alignment requirement.

Edit: changing the pool to use vector<char> improves the situation, but only slightly. It means the first object you allocate will work, because the block of memory held by the vector will be allocated (indirectly) with ::operator new (since you haven't specified otherwise). Unfortunately, that doesn't help much -- the second allocation may be completely misaligned.

For example, let's assume each type requires "natural" alignment -- i.e., alignment to a boundary equal to its own size. A char can be allocated at any address. We'll assume short is 2 bytes, and requires an even address and int and long are 4 bytes and require 4-byte alignment.

In this case, consider what happens if you do:

char *a = Foo.Allocate<char>();
long *b = Foo.Allocate<long>();

The block we started with had to be aligned for any type, so it was definitely an even address. When we allocate the char, we use up only one byte, so the next available address is odd. We then allocate enough space for a long, but it's at an odd address, so attempting to dereference it gives UB.
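One common way to avoid this is to round the next free address up to the alignment of the requested type before handing it out. The sketch below uses std::align for that. The Pool class shown here is hypothetical and is not the pool from the question; it never frees and only illustrates the alignment step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical bump allocator that aligns each allocation.
class Pool {
public:
    Pool() = default;
    Pool(const Pool&) = delete;             // cursor_ points into buffer_
    Pool& operator=(const Pool&) = delete;

    template <typename T>
    T* allocate() {
        void* p = cursor_;
        std::size_t space = remaining_;
        // std::align moves p forward to the next address suitable for T and
        // reduces space by the bytes skipped, or returns nullptr if T won't fit.
        if (std::align(alignof(T), sizeof(T), p, space) == nullptr)
            return nullptr;
        cursor_ = static_cast<unsigned char*>(p) + sizeof(T);
        remaining_ = space - sizeof(T);
        return new (p) T();
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[256];
    unsigned char* cursor_ = buffer_;
    std::size_t remaining_ = sizeof(buffer_);
};

// Repeats the char-then-long sequence from the text and checks the result.
bool char_then_long_is_aligned() {
    Pool pool;
    char* a = pool.allocate<char>();
    long* b = pool.allocate<long>();
    if (a == nullptr || b == nullptr)
        return false;
    *b = 7;
    return reinterpret_cast<std::uintptr_t>(b) % alignof(long) == 0 && *b == 7;
}
```

The bytes skipped between the char and the long are simply wasted, which is the usual price of alignment in a bump allocator.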


[1] Mostly anyway -- ultimately, a compiler can reject just about anything under the guise of an implementation limit having been exceeded. I'd be surprised to see a real compiler have a problem with this though.

What are the alignment limitations of the standard global default operator new?

5.3.4/1 [expr.new]

It is implementation-defined whether over-aligned types are supported (3.11).

One important thing here: over-aligned means more strictly aligned than any built-in type. For example, on a 64-bit machine where no built-in type requires more than 8-byte alignment, over-aligned means having an alignment strictly greater than 8.

Therefore, over-alignment is only of concern when using vector types, such as those required for SSE or AVX instructions, or some variants of C/C++ (like OpenCL). In day-to-day programming, the types you craft from the built-in types are never over-aligned.

§3.11 Alignment [basic.align]

3/ An extended alignment is represented by an alignment greater than alignof(std::max_align_t). It is implementation-defined whether any extended alignments are supported and the contexts in which they are supported (7.6.2). A type having an extended alignment requirement is an over-aligned type.

9/ If a request for a specific extended alignment in a specific context is not supported by an implementation, the program is ill-formed. Additionally, a request for runtime allocation of dynamic storage for which the requested alignment cannot be honored shall be treated as an allocation failure.

Furthermore, it is customary for new to return memory aligned to alignof(std::max_align_t). This is because the regular ::operator new is only aware of the size of the object to allocate, not of its alignment, and therefore needs to satisfy the strongest fundamental alignment requirement possible in the program.
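Both points can be probed with a few lines of code. In this sketch, is_over_aligned and CacheLine are hypothetical names, and the alignas(64) request assumes an implementation that supports that extended alignment, which the standard leaves implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: a type is over-aligned when it needs more than
// alignof(std::max_align_t).
template <typename T>
constexpr bool is_over_aligned() {
    return alignof(T) > alignof(std::max_align_t);
}

// Hypothetical over-aligned type; assumes 64-byte alignment is supported
// and exceeds alignof(std::max_align_t) on the target platform.
struct alignas(64) CacheLine {
    char bytes[64];
};

// Checks that a block large enough to hold a std::max_align_t comes back
// from the global ::operator new aligned for that type.
bool new_result_is_max_aligned() {
    void* p = ::operator new(sizeof(std::max_align_t));
    bool ok = reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
    ::operator delete(p);
    return ok;
}
```

The request size is deliberately sizeof(std::max_align_t) rather than 1, because an allocator only has to align a block for objects that could actually fit in it.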

On the other hand, beware of a char array allocated on the stack: there is no guarantee about its alignment beyond what char itself requires.

Placement new and alignment in C++

Yes, the assertion will hold. Any new expression creating a single object must request exactly sizeof(Test) bytes of storage from the allocation function; and so it must place the object at the start of that storage in order to have enough room.
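A small sketch of that kind of check follows, assuming a hypothetical Test struct with a single int member, since the original question's code is not reproduced here.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the Test type in the question.
struct Test {
    int x;
};

// The object created by a single-object placement new starts at the
// address that was passed in.
bool object_starts_at_buffer() {
    alignas(Test) unsigned char storage[sizeof(Test)];
    Test* t = new (storage) Test{5};
    return static_cast<void*>(t) == static_cast<void*>(storage) && t->x == 5;
}
```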

Note: This is based on the specification of a new-expression in C++11. It looks like C++14 will change the wording, so the answer may be different in the future.

Is it well-defined/legal to placement-new multiple times at the same address?

Performing placement-new several times on the same block of memory is perfectly fine. Moreover, however strange it might sound, you are not even required to destruct the object that already resides in that memory (if any). The standard explicitly allows that in 3.8/4:

4 A program may end the lifetime of any object by reusing the storage
which the object occupies or by explicitly calling the destructor for
an object of a class type with a non-trivial destructor. For an object
of a class type with a non-trivial destructor, the program is not
required to call the destructor explicitly before the storage which
the object occupies is reused or released;[...]

In other words, it is your responsibility to take into account the consequences of not calling the destructor for some object.

However, calling the destructor on the same object twice as you do in your code is not allowed. Once you created the second object in the same region of memory, you effectively ended the lifetime of the first object (even though you never called its destructor). Now you only need to destruct the second object.
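The following sketch shows the pattern described above: two objects constructed at the same address, one destructor call. The Tracked type and its counters are hypothetical and exist only to make the bookkeeping visible.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts constructor and destructor calls.
struct Tracked {
    static int constructed;
    static int destroyed;
    Tracked() { ++constructed; }
    ~Tracked() { ++destroyed; }
};
int Tracked::constructed = 0;
int Tracked::destroyed = 0;

// Constructs two objects in the same storage and destroys only the second.
bool reuse_storage_destroys_once() {
    Tracked::constructed = 0;
    Tracked::destroyed = 0;
    alignas(Tracked) unsigned char buf[sizeof(Tracked)];

    new (buf) Tracked;                    // first object
    Tracked* second = new (buf) Tracked;  // reusing the storage ends the first
                                          // object's lifetime; no destructor runs
    second->~Tracked();                   // destroy only the object that is alive

    return Tracked::constructed == 2 && Tracked::destroyed == 1;
}
```

Skipping the first destructor is harmless here only because Tracked owns no resources. For a type that does, whatever its destructor would have released is leaked.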

What uses are there for placement new?

Placement new allows you to construct an object in memory that's already allocated.

You may want to do this for optimization when you need to construct multiple instances of an object, and it is faster not to re-allocate memory each time you need a new instance. Instead, it might be more efficient to perform a single allocation for a chunk of memory that can hold multiple objects, even though you don't want to use all of it at once.

DevX gives a good example:

Standard C++ also supports placement new operator, which constructs an object on a pre-allocated buffer. This is useful when building a memory pool, a garbage collector or simply when performance and exception safety are paramount (there's no danger of allocation failure since the memory has already been allocated, and constructing an object on a pre-allocated buffer takes less time):

char *buf  = new char[sizeof(string)]; // pre-allocated buffer
string *p = new (buf) string("hi"); // placement new
string *q = new string("hi"); // ordinary heap allocation

You may also want to be sure there can be no allocation failure at a certain part of critical code (for instance, in code executed by a pacemaker). In that case you would want to allocate memory earlier, then use placement new within the critical section.

Deallocation in placement new

You should not deallocate each object that is using the memory buffer individually. Instead, call the destructors of your objects manually, and then delete[] only the original buffer. For a good suggestion on this, please see Stroustrup's FAQ entry: Is there a "placement delete"?
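In code, the order is: construct with placement new, use the object, call its destructor, then release the buffer. This is a minimal sketch built around the DevX snippet above; the function name build_and_tear_down and the alias String are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Builds a std::string in a pre-allocated buffer, then tears it down in
// the correct order. Returns the string length for checking.
std::size_t build_and_tear_down() {
    using String = std::string;

    // A char array from new[] is suitably aligned for any object that fits in it.
    char* buf = new char[sizeof(String)];
    String* p = new (buf) String("hi");  // placement new: construct only

    std::size_t len = p->size();

    p->~String();   // 1. end the object's lifetime manually
    delete[] buf;   // 2. release the buffer the way it was allocated
    return len;
}
```

Calling delete p instead would be wrong, because p was never returned by an ordinary new expression.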

Why should I use placement new?

The normal (non-placement) new is basically equivalent to doing

T* ptr = static_cast<T*>(malloc(sizeof(T)));
new(ptr) T;

Of course, the reality looks a bit different due to error checking and such, but the result is more or less the same (though not identical: you can't delete a pointer allocated that way; instead you need to call the destructor explicitly (ptr->~T()) and then release the memory using free).
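Spelled out with the error checking and the matching teardown, the equivalence might look like the sketch below. Widget, create_widget, destroy_widget and round_trip are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical type used to illustrate the two-step construction.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Roughly what an ordinary new expression does: allocate, then construct.
Widget* create_widget(int id) {
    void* raw = std::malloc(sizeof(Widget));  // malloc memory is suitably aligned
    if (raw == nullptr)
        throw std::bad_alloc();
    try {
        return new (raw) Widget(id);          // construct in the allocated block
    } catch (...) {
        std::free(raw);                       // constructor threw: give memory back
        throw;
    }
}

// The matching teardown: destroy explicitly, then free. Never use delete here.
void destroy_widget(Widget* w) {
    if (w == nullptr)
        return;
    w->~Widget();
    std::free(w);
}

// Creates a widget, reads its id, and cleans up.
int round_trip(int id) {
    Widget* w = create_widget(id);
    int result = w->id;
    destroy_widget(w);
    return result;
}
```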

So placement new should indeed be faster than non-placement new, since it doesn't need to allocate the memory. However, the problem is that the memory needs to be allocated somewhere. So you have essentially replaced one call to new with a call to placement new and some code for the allocation somewhere (if not, why would you use new in the first place?). It should be obvious that this is less convenient and more bug-prone.

Now of course you can write a faster allocation method, but for that you typically need to make some sort of tradeoff. It's not going to be easy to write an allocator which is faster without either using more memory (extra data for faster identification of free blocks) or making it very specific (writing fast allocation for a single object size is much easier than a general one). In the end it is typically not worth the effort (for scenarios where it is worth the effort, it has likely already been done, so you could use an existing allocator, which likely uses placement new internally).

There are of course uses for placement new (sometimes you do have the memory preallocated), but that is simply not the common case.


