Array Placement-New Requires Unspecified Overhead in the Buffer

Array placement-new requires unspecified overhead in the buffer?

Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.
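
Under that resolution, the library's non-allocating form can be handed a buffer of exactly N * sizeof(T) bytes. Here is a minimal sketch, assuming a compiler that implements the defect report (the helper name array_starts_at_buffer is made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: builds three strings in a buffer with no slack and
// reports whether the array begins at the first byte of that buffer.
bool array_starts_at_buffer()
{
    using String = std::string;
    constexpr std::size_t N = 3;
    alignas(String) unsigned char buffer[N * sizeof(String)];

    // Calls the library's operator new[](std::size_t, void*).
    String* p = new (buffer) String[N];
    const bool no_offset =
        static_cast<void*>(p) == static_cast<void*>(buffer);

    p[0] = "placement";
    assert(p[0] == "placement");

    // The storage is not heap memory, so there is no delete[] here:
    // each element is destroyed by hand, last one first.
    for (std::size_t i = N; i != 0; --i)
        p[i - 1].~String();

    return no_offset;
}
```

Calling array_starts_at_buffer() from main should return true on an implementation that adds no overhead for this form.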

Original Answer

Don't use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler or platform, though it is typically stable for any given platform. For example, the Itanium ABI specifies it.

If you don't know the answer to this question, write your own placement array new that can check this at run time:

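#include <cstddef>
#include <iostream>
#include <new>
#include <string>

// Placement array new that refuses to write past the end of the buffer.
inline
void*
operator new[](std::size_t n, void* p, std::size_t limit)
{
    if (n <= limit)
        std::cout << "life is good\n";
    else
        throw std::bad_alloc();
    return p;
}

int main()
{
    alignas(std::string) char buffer[100];
    std::string* p = new(buffer, sizeof(buffer)) std::string[3];

    // The buffer is not heap memory, so destroy the elements by hand.
    for (std::size_t i = 3; i != 0; --i)
        p[i - 1].~basic_string();
}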

By varying the array size and inspecting n in the example above, you can infer y, the array allocation overhead, for your platform: it is n minus the array size times sizeof(std::string). For my platform y is 1 word. The size of a word varies depending on whether I'm compiling for a 32 bit or 64 bit architecture.
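
To make the "inspect n" step concrete, here is a rough sketch that records n and subtracts the payload size. SizeProbe and measure_overhead are names invented for this illustration, and the 64 bytes of slack is an assumption, not something any standard promises:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical probe: carries the buffer limit in, and the requested size out.
struct SizeProbe
{
    std::size_t limit;
    std::size_t requested;
};

// Placement allocation function: records n and refuses to overrun the buffer.
inline void* operator new[](std::size_t n, void* p, SizeProbe& probe)
{
    probe.requested = n;
    if (n > probe.limit)
        throw std::bad_alloc();
    return p;
}

// Matching placement deallocation function, used only if a constructor throws.
inline void operator delete[](void*, void*, SizeProbe&) noexcept {}

// Returns y, the overhead in bytes, observed for std::string[N].
template <std::size_t N>
std::size_t measure_overhead()
{
    using String = std::string;
    // Assumed slack of 64 bytes; the probe throws if that is not enough.
    alignas(std::max_align_t) unsigned char buffer[N * sizeof(String) + 64];
    SizeProbe probe{sizeof(buffer), 0};

    String* p = new (buffer, probe) String[N];
    for (std::size_t i = N; i != 0; --i)
        p[i - 1].~String();

    return probe.requested - N * sizeof(String);
}
```

Printing measure_overhead<3>() and measure_overhead<5>() on your own toolchain shows whether the overhead depends on the element count there.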

Can placement new for arrays be used in a portable way?

Personally I'd go with the option of not using placement new on the array and instead use placement new on each item in the array individually. For example:

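#include <cstdio>
#include <new>

// Stand-in element type for illustration; any class works here.
struct A
{
  int value = 0;
};

int main()
{
  const int NUMELEMENTS = 20;

  // Raw storage only; no A objects exist yet.
  char *pBuffer = new char[NUMELEMENTS * sizeof(A)];
  A *pA = reinterpret_cast<A*>(pBuffer);

  // Construct each element in place. Placement new returns a pointer,
  // so its result must not be assigned to pA[i].
  for (int i = 0; i < NUMELEMENTS; ++i)
  {
    new (pA + i) A();
  }

  std::printf("Buffer address: %p, Array address: %p\n",
              static_cast<void*>(pBuffer), static_cast<void*>(pA));

  // don't forget to destroy!
  for (int i = 0; i < NUMELEMENTS; ++i)
  {
    pA[i].~A();
  }

  delete[] pBuffer;

  return 0;
}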

Regardless of the method you use, make sure you manually destroy each of the items in the array before you delete pBuffer; otherwise you could end up with leaks ;)

Note: I haven't compiled this, but I think it should work (I'm on a machine that doesn't have a C++ compiler installed). It still indicates the point :) Hope it helps in some way!

Edit:

The reason it needs to keep track of the number of elements is so that it can iterate through them when you call delete on the array and make sure the destructors are called on each of the objects. If it doesn't know how many there are it wouldn't be able to do this.
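
A small sketch of that bookkeeping at work, using a made-up Tracked type whose destructor bumps a counter. delete[] receives only a pointer, yet it runs one destructor per element:

```cpp
#include <cassert>
#include <cstddef>

// Counter bumped by every Tracked destructor (illustrative global).
int g_destroyed = 0;

// Hypothetical type with a non-trivial destructor.
struct Tracked
{
    ~Tracked() { ++g_destroyed; }
};

// Returns how many destructors delete[] ran for an array of `count` elements.
int destructors_run_by_delete(std::size_t count)
{
    g_destroyed = 0;
    Tracked* p = new Tracked[count];  // the implementation records count somewhere
    delete[] p;                       // no count is passed here
    return g_destroyed;
}
```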

Placement new is writing more bytes than array size

Placement new of an array may require more space than N * sizeof(Object)

(presumably because the compiler has to be able to call the destructors correctly with delete[]).

5.3.4 [expr.new]:
new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).
Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

C++ placement new

why is it using a char array to provide memory space for the placement new?

Why not? char is the smallest type that C++ defines, and sizeof(char) is 1 by definition, so it is exactly one byte in size on every implementation. Therefore, it makes a good type to use when you need to allocate a block of memory of a certain size.

C++ also has very specific mechanics about how arrays of char and unsigned char (and only those) are allocated. A new char[*], for example, will not be aligned merely to the alignment of char. It will be aligned to the strictest fundamental alignment of any object type that fits in the array. Thus, you could use it to allocate memory and then construct any type with a fundamental alignment into that memory.

Also the last line in the code above is allocating memory for an array of double, how is that possible when the original memory space contains a char array?

It is not allocating anything. It is constructing an array, using the memory you have given it. That's what placement new does, it constructs an object in the memory provided.

If the placement new is using the memory space of the char array, does this mean when we allocate the double array it overwrites the char array in that memory?

Yes.
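
A quick way to see both points (the alignment of the allocation and the overwrite) is the sketch below. It uses unsigned char rather than char, constructs the doubles one at a time rather than with array placement new, and the function name is invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Returns true if the buffer was aligned for double and its filler bytes
// were replaced by the object representations of the doubles.
bool doubles_replace_chars()
{
    constexpr std::size_t N = 4;
    unsigned char* raw = new unsigned char[N * sizeof(double)];
    std::memset(raw, 0xFF, N * sizeof(double));  // recognizable filler

    const bool aligned =
        reinterpret_cast<std::uintptr_t>(raw) % alignof(double) == 0;

    // Construct each double in the storage; nothing is allocated here.
    double expected[N];
    for (std::size_t i = 0; i != N; ++i)
    {
        expected[i] = 1.5 * static_cast<double>(i);
        new (raw + i * sizeof(double)) double(expected[i]);
    }

    // The bytes now hold the doubles, not the 0xFF filler.
    const bool overwritten =
        std::memcmp(raw, expected, sizeof(expected)) == 0;

    delete[] raw;  // double is trivially destructible, nothing else to undo
    return aligned && overwritten;
}
```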

treating memory returned by operator new(sizeof(T) * N) as an array

The C++ standard has an open issue here: it describes the underlying representation of objects not as an "array" but as a "sequence" of unsigned char objects. Still, everyone treats it as an array (which is the intent), so it is safe to write code like:

char* storage = static_cast<char*>(operator new(sizeof(T)*size));
// ...
char* p = storage + sizeof(T)*i;  // precondition: 0 <= i < size
new (p) T(element);

as long as void* operator new(size_t) returns a properly aligned value. Using sizeof-multiplied offsets to keep the alignment is safe.
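
For a version of that fragment that can actually be compiled, here is a sketch with T fixed to std::string and three made-up elements. It ignores exception safety for brevity, and the function name is an assumption of the example:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Builds three strings in storage from operator new and returns their total length.
std::size_t total_length_via_raw_storage()
{
    using T = std::string;
    constexpr std::size_t size = 3;
    const char* const words[size] = {"raw", "storage", "demo"};

    char* storage = static_cast<char*>(::operator new(sizeof(T) * size));

    T* elems[size];
    for (std::size_t i = 0; i != size; ++i)
    {
        char* p = storage + sizeof(T) * i;  // precondition: 0 <= i < size
        elems[i] = new (p) T(words[i]);     // keep the pointer placement new returns
    }

    std::size_t total = 0;
    for (std::size_t i = 0; i != size; ++i)
        total += elems[i]->size();

    // Destroy in reverse order, then release the raw storage.
    for (std::size_t i = size; i != 0; --i)
        elems[i - 1]->~T();
    ::operator delete(storage);

    return total;
}
```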

In C++17, there is a macro __STDCPP_DEFAULT_NEW_ALIGNMENT__, which specifies the maximum safe alignment for "normal" void* operator new(size_t), and void* operator new(std::size_t size, std::align_val_t alignment) should be used if a larger alignment is required.

In earlier versions of C++, there is no such distinction, which means that void* operator new(size_t) needs to be implemented in a way that is compatible with the alignment of any object.
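
As an illustration of the C++17 route, the sketch below allocates storage for an over-aligned type through the std::align_val_t overload. CacheLine and the function name are invented for the example, and it assumes a C++17 compiler on which 64 exceeds the default new alignment:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical over-aligned type; 64 is assumed to exceed
// __STDCPP_DEFAULT_NEW_ALIGNMENT__ on the target platform.
struct alignas(64) CacheLine
{
    unsigned char bytes[64];
};

// Returns true if the aligned allocation honored alignof(CacheLine).
bool overaligned_allocation_is_aligned()
{
    const std::align_val_t al = static_cast<std::align_val_t>(alignof(CacheLine));

    void* mem = ::operator new(sizeof(CacheLine) * 2, al);
    const bool ok =
        reinterpret_cast<std::uintptr_t>(mem) % alignof(CacheLine) == 0;

    CacheLine* first = new (mem) CacheLine{};  // construct into the aligned storage
    first->~CacheLine();

    // Storage from the aligned overload must be released with the matching one.
    ::operator delete(mem, al);
    return ok;
}
```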

As to being able to do pointer arithmetic directly on T*, I am not sure it needs to be required by the standard. However, it is hard to implement the C++ memory model in such a way that it would not work.

Dynamic allocation with C++'s placement new

First off, let's make sure we all agree on the separation of memory allocation and object construction. With that in mind, let's assume we have enough memory for an array of objects:

void * mem = std::malloc(sizeof(Foo) * N);

Now, you cannot use placement array-new, because it may require an unspecified amount of extra storage for bookkeeping. The correct thing to do is construct each element separately:

for (std::size_t i = 0; i != N; ++i)
{
    new (static_cast<Foo*>(mem) + i) Foo;
}

(The cast is only needed for the pointer arithmetic. The actual pointer required by placement-new is just a void pointer.)

This is exactly how the standard library containers work, by the way, and how the standard library allocators are designed. The point is that you already know the number of elements, because you used it in the initial memory allocation. Therefore, you have no need for the magic provided by C++ array-new, which is all about storing the array size somewhere and calling constructors and destructors.

Destruction mirrors construction: destroy each element first, then release the memory:

for (std::size_t i = 0; i != N; ++i)
{
    (static_cast<Foo*>(mem) + i)->~Foo();
}

std::free(mem);

One more thing you must know about, though: Exception safety. The above code is in fact not correct unless Foo has a no-throwing constructor. To code it correctly, you must also store an unwind location:

std::size_t cur = 0;
try
{
    for (std::size_t i = 0; i != N; ++i, ++cur)
    {
        new (static_cast<Foo*>(mem) + i) Foo;
    }
}
catch (...)
{
    for (std::size_t i = 0; i != cur; ++i)
    {
        (static_cast<Foo*>(mem) + i)->~Foo();
    }
    throw;
}
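
To see the unwind in action, here is a sketch with a stand-in Foo whose third construction throws. The counters and the function name are assumptions of the example, and it swallows the exception where real code would rethrow, so that the result can be inspected:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical element type: the third constructor call throws.
struct Foo
{
    static int live;      // objects currently alive
    static int attempts;  // constructor calls so far
    Foo()
    {
        if (++attempts == 3)
            throw std::runtime_error("third Foo fails");
        ++live;
    }
    ~Foo() { --live; }
};
int Foo::live = 0;
int Foo::attempts = 0;

// Tries to construct n Foo objects; returns how many had to be rolled back.
std::size_t rolled_back_after_failure(std::size_t n)
{
    Foo::live = 0;
    Foo::attempts = 0;
    void* mem = std::malloc(sizeof(Foo) * n);
    if (!mem)
        throw std::bad_alloc();
    Foo* base = static_cast<Foo*>(mem);

    std::size_t cur = 0;  // number of fully constructed elements
    std::size_t rolled_back = 0;
    try
    {
        for (; cur != n; ++cur)
            new (base + cur) Foo;
        for (std::size_t i = n; i != 0; --i)  // normal teardown
            base[i - 1].~Foo();
    }
    catch (const std::runtime_error&)
    {
        for (std::size_t i = cur; i != 0; --i)  // unwind only what was built
        {
            base[i - 1].~Foo();
            ++rolled_back;
        }
    }
    std::free(mem);
    assert(Foo::live == 0);
    return rolled_back;
}
```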

Is it OK not to call the destructor on placement new allocated objects?

The standard has a rule in section 3.8 [basic.life] that covers this:

A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.

Lots of experts are in agreement that "depends on the side effects produced by the destructor" is far too vague to be useful. Many interpret it as a tautology meaning "If the program has undefined behavior when the destructor side effects are not evaluated, then failing to call the destructor causes undefined behavior". See Observable behavior and undefined behavior -- What happens if I don't call a destructor?

If your type has a trivial destructor (which appears to be the case in your example), then calling it (or failing to call it) has no effect whatsoever -- calling a trivial destructor does not even end the life of the object.

The lifetime of an object o of type T ends when:
if T is a class type with a non-trivial destructor, the destructor call starts, or
the storage which the object occupies is released, or is reused by an object that is not nested within o.

That is, if T doesn't have a non-trivial destructor, the only way to end the lifetime of object o is to release or reuse its storage.
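
A short sketch of the trivial-destructor case, with a made-up Point type: the storage is reused and later released without any destructor call, and nothing is lost because the destructor would do nothing anyway.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Hypothetical trivially destructible type.
struct Point
{
    int x;
    int y;
};
static_assert(std::is_trivially_destructible<Point>::value,
              "Point must have a trivial destructor for this example");

// Reuses one buffer for two Point objects without calling ~Point().
int reuse_without_destructor_call()
{
    alignas(Point) unsigned char storage[sizeof(Point)];

    Point* a = new (storage) Point{1, 2};
    int sum = a->x + a->y;

    // No a->~Point() here: reusing the storage ends the first object's lifetime.
    Point* b = new (storage) Point{10, 20};
    sum += b->x + b->y;

    return sum;  // the storage is released at scope exit, again with no destructor call
}
```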
