Is std::unique_ptr<T> Required to Know the Full Definition of T

Is std::unique_ptr<T> required to know the full definition of T?

Adopted from here.

Most templates in the C++ standard library require that they be instantiated with complete types. However, shared_ptr and unique_ptr are partial exceptions: some, but not all, of their members can be instantiated with incomplete types. The motivation is to support idioms such as pimpl using smart pointers without risking undefined behavior.

Undefined behavior can occur when you have an incomplete type and you call delete on it:

class A;
A* a = ...;
delete a;

The above is legal code. It will compile. Your compiler may or may not emit a warning for code like the above. When it executes, bad things will probably happen. If you're very lucky your program will crash. However, a more probable outcome is that your program will silently leak memory, as ~A() won't be called.

Using auto_ptr<A> in the above example doesn't help. You still get the same undefined behavior as if you had used a raw pointer.

Nevertheless, using incomplete classes in certain places is very useful! This is where shared_ptr and unique_ptr help. Use of one of these smart pointers will let you get away with an incomplete type, except where it is necessary to have a complete type. And most importantly, when it is necessary to have a complete type, you get a compile-time error if you try to use the smart pointer with an incomplete type at that point.

No more undefined behavior:

If your code compiles, then you've used a complete type everywhere you need to.

class A
{
    class impl;
    std::unique_ptr<impl> ptr_;  // ok!

public:
    A();
    ~A();
    // ...
};
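The compile-time error mentioned above usually comes from a check inside the deleter. The following is a minimal sketch of that idea, not the library's actual code: checked_delete, Tracked and demo_checked_delete are hypothetical names introduced for illustration.

```cpp
// Hypothetical helper, similar in spirit to what std::default_delete does:
// refuse to compile when T is incomplete instead of risking undefined behavior.
template <class T>
void checked_delete(T* p) {
    static_assert(sizeof(T) > 0, "cannot delete an incomplete type");
    delete p;
}

// Hypothetical type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Creates and destroys one object; returns the live count afterwards.
int demo_checked_delete() {
    Tracked* t = new Tracked;
    checked_delete(t);   // fine: Tracked is complete here
    return Tracked::live;
}
```

Calling checked_delete on a pointer to a class that is only forward declared would fail to compile, which is the behavior the smart pointers give you.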

shared_ptr and unique_ptr require a complete type in different places. The reasons are obscure, having to do with a dynamic deleter vs a static deleter. The precise reasons aren't important. In fact, in most code it isn't really important for you to know exactly where a complete type is required. Just code, and if you get it wrong, the compiler will tell you.

However, in case it is helpful to you, here is a table which documents several members of shared_ptr and unique_ptr with respect to completeness requirements. If the member requires a complete type, then the entry has a "C"; otherwise the table entry is filled with "I".

Complete type requirements for unique_ptr and shared_ptr

                            unique_ptr      shared_ptr
+------------------------+---------------+---------------+
|          P()           |       I       |       I       |
|  default constructor   |               |               |
+------------------------+---------------+---------------+
|      P(const P&)       |      N/A      |       I       |
|    copy constructor    |               |               |
+------------------------+---------------+---------------+
|         P(P&&)         |       I       |       I       |
|    move constructor    |               |               |
+------------------------+---------------+---------------+
|          ~P()          |       C       |       I       |
|       destructor       |               |               |
+------------------------+---------------+---------------+
|         P(A*)          |       I       |       C       |
+------------------------+---------------+---------------+
|  operator=(const P&)   |      N/A      |       I       |
|    copy assignment     |               |               |
+------------------------+---------------+---------------+
|     operator=(P&&)     |       C       |       I       |
|    move assignment     |               |               |
+------------------------+---------------+---------------+
|        reset()         |       C       |       I       |
+------------------------+---------------+---------------+
|       reset(A*)        |       C       |       C       |
+------------------------+---------------+---------------+

Any operations requiring pointer conversions require complete types for both unique_ptr and shared_ptr.

The unique_ptr<A>{A*} constructor can get away with an incomplete A only if the compiler is not required to set up a call to ~unique_ptr<A>(). For example if you put the unique_ptr on the heap, you can get away with an incomplete A. More details on this point can be found in BarryTheHatchet's answer here.
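The shared_ptr column of the table can be illustrated with a small single-file sketch. The names Node, destroyed, make_node and drop are hypothetical. The point is that shared_ptr captures its deleter when it is constructed from a raw pointer, so only that constructor needs the complete type.

```cpp
#include <memory>

struct Node;                          // incomplete from here on

int destroyed = 0;                    // hypothetical counter for this demo

std::shared_ptr<Node> make_node();    // defined below, where Node is complete

// Node is still incomplete here, yet reset() and the destructor of the
// parameter compile: the deleter was stored when the shared_ptr was created.
void drop(std::shared_ptr<Node> p) { p.reset(); }

struct Node {
    ~Node() { ++destroyed; }
};

// shared_ptr<Node>(Node*) needs the complete type (row P(A*) in the table).
std::shared_ptr<Node> make_node() { return std::shared_ptr<Node>(new Node); }
```

Replacing shared_ptr with unique_ptr in drop would be rejected by the compiler, because reset() and ~unique_ptr both require a complete Node.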

Why does unique_ptr<T>::~unique_ptr need the definition of T?

The destructor of unique_ptr<Bar> calls Bar::~Bar when it deletes the Bar it owns. So ~unique_ptr<Bar> needs to see Bar::~Bar.

But template methods are only instantiated at point of use.

The unique ptr is destroyed by the Foo in Foo::~Foo. If ~Foo lives where it can see the definition of ~Bar, all is good.

If you leave it to be generated by the compiler, it 'lives' in the definition of Foo, where it cannot see ~Bar.

If you declare it in the class and then write Foo::~Foo() = default or Foo::~Foo() {} in the .cpp file after #include <bar.h>, it can see ~Bar at the point where ~std::unique_ptr<Bar> is called, and all is good.

This matters in practice because how Bar is destroyed differs depending on whether ~Bar is virtual and whether Bar has base classes, and if ~Bar is private or protected it might be illegal to call.

Using unique_ptr<T> with a forward declaration leads to C2027 and C2338

You need to define the constructor in the .cpp file, not just the destructor.

If the Outer constructor fails, it will need to destroy m_ptr, and that requires knowing the definition of class Inner.

std::unique_ptr with an incomplete type won't compile

Here are some examples of std::unique_ptr with incomplete types. The problem lies in destruction.

If you use pimpl with unique_ptr, you need to declare a destructor:

class foo
{
    class impl;
    std::unique_ptr<impl> impl_;

public:
    foo();   // You may need a default constructor to be defined elsewhere

    ~foo();  // Implement (with {}, or with = default;) where impl is complete
};

because otherwise the compiler generates a default one, and it needs the complete definition of foo::impl for this.
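Put together, the pimpl pattern might look like the single-file sketch below. Widget, Impl and value() are hypothetical names; in a real project the first part would live in a header and the second part in a .cpp file.

```cpp
#include <memory>

// --- what would normally be widget.h ---
class Widget {
    class Impl;                    // incomplete here
    std::unique_ptr<Impl> impl_;   // OK: the member may have an incomplete type

public:
    Widget();                      // declared only; defined where Impl is complete
    ~Widget();                     // likewise
    int value() const;
};

// --- what would normally be widget.cpp ---
class Widget::Impl {
public:
    int value = 42;
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;       // ~unique_ptr<Impl> is instantiated here
int Widget::value() const { return impl_->value; }
```

Both special members are defined after Impl is complete, so the constructor's exception cleanup and the destructor can each see how to destroy impl_.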

If you have template constructors, then you're screwed, even if you don't construct the impl_ member:

template <typename T>
foo::foo(T bar)
{
    // Here the compiler needs to know how to
    // destroy impl_ in case an exception is
    // thrown!
}

At namespace scope, using unique_ptr will not work either:

class impl;
std::unique_ptr<impl> impl_;

since the compiler must know here how to destroy this static duration object. A workaround is:

class impl;
struct ptr_impl : std::unique_ptr<impl>
{
    ~ptr_impl();  // Implement (empty body) elsewhere
} impl_;

What's the reasoning behind std::unique_ptr<T>'s constructor from T* being explicit?

unique_ptr takes ownership of the passed pointer. Taking ownership should be explicit: you don't want some pointer to 'magically' become owned (and deleted) by some class (that was one of the issues with the deprecated std::auto_ptr).

For example:

void fun(std::unique_ptr<X> a) { .... }
X x;
fun(&x); // BOOM, deleting object on stack, fortunately it does not compile
fun(std::unique_ptr<X>(&x)); // compiles, but it's explicit and error is clearly visible

Please note that std::move is not required in a return statement (a special language exception: local variables named in a return statement can be treated as 'moved').

Also, in C++14 you can use std::make_unique to make it less awkward:

return std::make_unique<some_data>(some_data_argument1, arg2);

(it can be also easily added to C++11 - read here)
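The two return styles mentioned above can be sketched as follows. The constructor of some_data and the factory names make_data and make_data14 are hypothetical, added only for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>

struct some_data {                       // hypothetical payload type
    std::string name;
    int count;
    some_data(std::string n, int c) : name(std::move(n)), count(c) {}
};

// C++11 style: ownership is taken explicitly, the local is returned as-is.
std::unique_ptr<some_data> make_data(const std::string& name, int count) {
    std::unique_ptr<some_data> p(new some_data(name, count));
    return p;                            // no std::move needed for a local
}

// C++14 and later: make_unique hides the new expression.
std::unique_ptr<some_data> make_data14(const std::string& name, int count) {
    return std::make_unique<some_data>(name, count);
}
```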

std::unique_ptr<T[]> API prohibits derived-to-base pointer conversions

A hole in the type system exists whenever the compiler fails to catch a cast from one type to another, incompatible type.

Imagine you have two simple classes:

class A
{
    char i;
};

class B : public A
{
    char j;
};

Let's for simplicity ignore things like padding etc. and assume that objects of type A are 1 byte and objects of type B are 2 bytes.

Now when you have an array of type A or an array of type B, they will look like this:

A a[4]:

=================
| 0 | 1 | 2 | 3 |
|-------|-------|
| i | i | i | i |
=================

B b[4]:

=================================
|   0   |   1   |   2   |   3   |
|-------|-------|-------|-------|
| i | j | i | j | i | j | i | j |
=================================

Now imagine you have pointers to these arrays and then cast one to the other, this would obviously lead to problems:

a cast to B[4]:

=================================
|   0   |   1   |   2   |   3   |
|-------|-------|-------|-------|
| i | j | i | j | x | x | x | x |
=================================

The first two objects in the array will interpret the i member of the 2nd and 4th A as their j member. The objects at indices 2 and 3 access unallocated memory.

b cast to A[4]:

=================
| 0 | 1 | 2 | 3 |
|-------|-------|
| i | i | i | i | x | x | x | x |
=================

Here it's the other way around: all 4 objects now alternately interpret the i and the j of 2 B instances as their i members. And half of the array is lost.

Now imagine deleting such a cast array. Which destructors will be called? What memory will be freed? You are in deep hell at this point.

But wait, there's more.

Imagine you have 3 classes like this:

class A
{
    char i;
};

class B1 : public A
{
    float j;
};

class B2 : public A
{
    int k;
};

And now you create an array of B1 pointers:

B1* b1[4];

If you cast that array to an array of A pointers you could think, "well this is fine, right"?

A** a = <evil_cast_shenanigans>(b1);

I mean, you can safely access each member as pointer to A:

char foo = a[0]->i; // This is valid

But what you can also do, is this:

a[0] = new B2{};   // Uh, oh.

This is a valid assignment, and no compiler will complain, but you must not forget that we are actually working on an array that was created as an array of pointers to B1 objects. Its first element now points at a B2 object, which you can now access as a B1 without the compiler saying a thing.

float bar = b1[0]->j;   // Ouch.

So again you are in deep hell, and the compiler won't be able to warn you, except if that upcasting isn't allowed in the first place.

Why would std::unique_ptr API prohibit derived-to-base pointer conversions?

I hope the above explanations give good reasons why.

How could it prohibit the conversions?

It simply doesn't provide any API to do conversions. The shared_ptr API has conversion functions like static_pointer_cast, the unique_ptr API doesn't.
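The difference between the single-object and array forms can be checked at compile time with type traits. This is a small sketch under stated assumptions: the classes mirror the A and B above (with a virtual destructor added so the single-object conversion is well behaved), and the constant names are hypothetical.

```cpp
#include <memory>
#include <type_traits>

struct A { virtual ~A() = default; char i = 0; };
struct B : A { char j = 0; };

// Single objects: derived-to-base conversion is allowed.
constexpr bool single_upcast_allowed =
    std::is_convertible<std::unique_ptr<B>, std::unique_ptr<A>>::value;

// Arrays: converting unique_ptr<B[]> to unique_ptr<A[]> is rejected.
constexpr bool array_upcast_allowed =
    std::is_convertible<std::unique_ptr<B[]>, std::unique_ptr<A[]>>::value;

// Arrays: constructing unique_ptr<A[]> from a raw B* is rejected as well.
constexpr bool array_from_derived_ptr_allowed =
    std::is_constructible<std::unique_ptr<A[]>, B*>::value;

static_assert(single_upcast_allowed, "unique_ptr<B> converts to unique_ptr<A>");
static_assert(!array_upcast_allowed, "unique_ptr<B[]> must not convert to unique_ptr<A[]>");
static_assert(!array_from_derived_ptr_allowed, "unique_ptr<A[]> must not adopt a B*");
```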

How it works: pointer / unique_ptr without new

Short answer: It doesn't work.

This reference says that the default constructor of std::unique_ptr creates an empty unique pointer, meaning it has no associated object.

The reason why this code prints hello is because this statement

std::cout << "hello";

doesn't need anything of Bar. It could just as well be a static method. Maybe the compiler inlines the function and replaces s.use() with the std::cout-statement. But even if it does call the method, you won't notice any errors since it doesn't access the memory of Bar at all.

Make a slight change to your class and you will see what I mean:

class Bar
{
public:
    Bar() : data(10) {}
    ~Bar() {}
    void print() {
        std::cout << "hello, data is: " << data;
    }

    int data;
};

Now, print accesses invalid memory, because you never called new (or even better: make_unique). It may even work and print something to the console, but the output of data will be garbage. If you're lucky, the application will crash.

Another reason why it appears to work (thanks Stas):

std::unique_ptr defines operator->, which simply returns the contained pointer, but does not check if the pointer points to valid memory. So pteste-> won't throw an exception.
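Since operator-> performs no check, the caller has to test the pointer before dereferencing it. A minimal sketch, assuming a simplified Bar and a hypothetical helper read_data that returns -1 as an arbitrary sentinel:

```cpp
#include <memory>

struct Bar {
    int data = 10;
};

// Returns the stored value, or -1 (an arbitrary sentinel for this demo)
// when the pointer owns nothing.
int read_data(const std::unique_ptr<Bar>& p) {
    if (!p) {            // operator bool: false for an empty unique_ptr
        return -1;
    }
    return p->data;      // safe: p owns a Bar
}
```

A default-constructed std::unique_ptr<Bar> takes the first branch; one initialized with new Bar (or std::make_unique<Bar>() in C++14) takes the second.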

Is having a reference/ptr to std::unique_ptr owned object safe when the unique_ptr is in a vector?

Since the ClassWithFooMember objects “in” such a vector are allocated separately, your foo_ptr (or a pointer to the entire ClassWithFooMember object) will remain valid regardless of any operations on v so long as the (anonymous) ClassWithFooMember object exists. For instance, sorting v or causing it to reallocate would be harmless. v.erase(v.begin()) would destroy it, of course, but even then you might first have written either of

auto p=std::move(v.front());
auto *q=v.front().release();

which would let the object live on after destroying v entirely.

All this is true regardless of the container type; this is the benefit paid for by the additional memory and time overhead used for the separate allocation. Neither is it specific to std::unique_ptr (although that’s generally a good choice here for other reasons); std::vector<T*> would have the same behavior, including that it would be safe to retain a T* (or a pointer into a T) but not a T*& (or T**) referring to the vector element itself. (The corresponding unsafe thing in your case would be to hold a std::unique_ptr<ClassWithFooMember>& or std::unique_ptr<ClassWithFooMember>*, which you generally shouldn’t be doing anyway.)
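The stability claim can be tried out with a short sketch. ClassWithFooMember here is a hypothetical stand-in with a single int member, and pointer_survives_reallocation is a name invented for this demo.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct ClassWithFooMember {
    int foo;
    explicit ClassWithFooMember(int f) : foo(f) {}
};

using Ptr = std::unique_ptr<ClassWithFooMember>;

// Returns true if a pointer to foo taken before reallocation and sorting
// still refers to the same live object afterwards.
bool pointer_survives_reallocation() {
    std::vector<Ptr> v;
    v.push_back(Ptr(new ClassWithFooMember(1000)));
    int* foo_ptr = &v.front()->foo;      // points into the separately allocated object

    for (int i = 0; i < 1000; ++i) {
        v.push_back(Ptr(new ClassWithFooMember(i)));   // forces reallocations of v
    }

    std::sort(v.begin(), v.end(),
              [](const Ptr& a, const Ptr& b) { return a->foo < b->foo; });

    // 1000 is the largest value, so the original object is now the last element.
    return foo_ptr == &v.back()->foo && *foo_ptr == 1000;
}
```

Holding a Ptr& or Ptr* to v.front() across the same operations would not be safe, because those refer to storage inside the vector itself.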

Why is std::unique_ptr not compatible with the assignment operator?

Both of them are initialization, not assignment. The first is copy-initialization, the second is direct-initialization. The constructor of std::unique_ptr taking a raw pointer is marked explicit, so it can be used in direct-initialization but not in copy-initialization.

explicit unique_ptr( pointer p ) noexcept;

Direct-initialization is more permissive than copy-initialization: copy-initialization only considers non-explicit constructors and non-explicit user-defined conversion functions, while direct-initialization considers all constructors and all user-defined conversion functions.
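The distinction can be observed with type traits, without writing code that fails to compile. The constant and function names below are hypothetical, chosen for this sketch.

```cpp
#include <memory>
#include <type_traits>

// Copy-initialization (std::unique_ptr<int> p = new int(1);) needs an implicit
// conversion from int*, which the explicit constructor does not provide.
constexpr bool copy_init_ok =
    std::is_convertible<int*, std::unique_ptr<int>>::value;

// Direct-initialization (std::unique_ptr<int> p(new int(1));) may use it.
constexpr bool direct_init_ok =
    std::is_constructible<std::unique_ptr<int>, int*>::value;

int direct_initialized_value() {
    std::unique_ptr<int> p(new int(7));   // direct-initialization compiles
    return *p;
}
```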


