Should I Store Entire Objects, or Pointers to Objects in Containers

Should I store entire objects, or pointers to objects in containers?

Since people are chiming in on the efficiency of using pointers:

If you're considering a std::vector, updates are few, you iterate over the collection often, and the type is non-polymorphic, then storing object "copies" will be more efficient, since you'll get better locality of reference.

On the other hand, if updates are common, storing pointers will save the copy/relocation costs.

Considerations when choosing between storing values vs pointers in std:: containers

I would store objects by value unless I need them through a pointer. Possible reasons:

I need to store object hierarchy in a container (to avoid slicing)
I need shared ownership

There are possibly other reasons, but reasoning by size is only valid for some containers (std::vector, for example), and even there you can make the cost of moving objects minimal (by reserving enough room in advance, for example). Your example for object size with std::map does not make any sense, as std::map does not relocate objects when growing.

Note: the return type of a method should not reflect the way you store the object in a container; rather, it should be based on the method's semantics, i.e. what you would do if the object is not found.

Map of Pointers versus Map of Structures/Containers (C++)

As I see it, there are a number of factors involved in deciding whether to use pointers vs. objects:

1. Do you or don't you need polymorphism?

If you want to maintain a container of base class objects, but then store objects of various derived classes in it, you must use pointers. A container of base class values would slice each derived object down to its base part on insertion, so virtual function calls would no longer reach the derived overrides.

2. The size of the objects you store and their suitability for copy operations

One of the key reasons why pointers may be preferable to objects is that various operations performed on the container involve making copies of the objects stored in it. This is the case for many storing operations (e.g. std::vector<>::push_back() or std::map<>::insert() called with an existing object), some retrieval operations (e.g. std::vector<>::operator[], and then storing the object in a local variable), and some of the operations carried out by the container "internally", e.g. re-allocation of a vector when it grows beyond its capacity (the elements are typically moved if the type has a noexcept move constructor, and copied otherwise). Node-based containers such as std::map<> and std::unordered_map<> do not copy their elements when they grow or rehash. Note that the copy operations may be less significant depending on how you choose the container and how you make use of it (e.g. using std::vector<>::reserve() to allocate sufficient space, using std::vector<>::emplace_back() for storage, and never making a local copy of a retrieved element may mean that no copy is ever made).

However, if you expect a large number of copies to be made (or if profiling existing code reveals that many copies are made), using pointers instead of objects can obviously help as pointers are small and well-aligned in memory. Then again, this will not make much sense if the objects you store are actually smaller than pointers.

3. Other operations you perform on the container and its contents

Even if the objects you are dealing with are larger than pointers and you expect a significant amount of copy operations, using pointers is not necessarily preferable. Consider a situation where you store a large number of medium-size objects (say, 16 bytes each) and you frequently need to iterate over the entire container and perform some sort of statistical calculation. When you store these objects directly in a vector, you get great cache efficiency during iteration: as you retrieve one object, an entire cache line is fetched from memory, making the retrieval of the next few objects much faster. This is generally not the case when pointers are used; on the contrary, after retrieving an element, the pointer must be dereferenced, causing another load from a memory region that is possibly not cached.

So clearly, it all depends on the type and size of objects you store, and the type and frequency of the operations you carry out. If the objects you are dealing with are various types of windows, buttons and menus of a GUI application, you will most likely want to use pointers and take advantage of polymorphism. If, on the other hand, you are dealing with huge structures of compact elements, all identical in size and shape, and the operations you perform involve frequent iteration or bulk copying, storing objects directly is preferable. There may also be situations where the decision is hard to make without trying both and deciding based on the results of memory and time benchmarks.

As a final note, if you end up using pointers, consider whether the container you are building is the ultimate owner of the objects you are allocating on the heap, or just maintains temporary pointers. If the container is the owner of those objects, you will be well-advised to use smart pointers rather than raw ones.

Performance of container of objects vs performance of container of pointers

For a Plain Old Data (POD) type, a vector of that type is always more efficient than a vector of pointers to that type as long as sizeof(POD) <= sizeof(POD*).

Almost always, the same holds as long as sizeof(POD) <= 2 * sizeof(POD*), due to superior memory locality and lower total memory usage compared to dynamically allocating the pointed-to objects.

This kind of analysis will hold true up until sizeof(POD) crosses some threshold for your architecture, compiler and usage that you would need to discover experimentally through benchmarking. The above only puts lower bounds on that size for POD types.

It is difficult to say anything definitive about all non-POD types, as their operations (e.g. default constructor, copy constructor, assignment, etc.) can be as inexpensive as a POD's or arbitrarily more expensive.

Containers in C++: pointers vs references

Maybe you saw "T& at(key)" for your map, or something else that mentioned references? Container member functions do take values by reference, but that's just for efficiency. The values are then copied into the container.

If you decide to put values into the container (option 1), then your item has to be copied when inserted, but then can be modified by reference:

blah &a = map1["a"];
a.foo = somethingelse;
//  No need to do this: map1["a"] = a;

Or, shorter:

map1["a"].foo = somethingelse;

When you do this with values, the map owns the object and will destroy it when the map is destroyed (among other times).

If you store raw pointers, you must manage the memory yourself. I wouldn't advise that. I would instead consider putting shared_ptr or unique_ptr into your map. Use shared_ptr if you need the value to stay alive outside of the map even after the map is destroyed.

map<string,shared_ptr<blah>> map3;
shared_ptr<blah> myPtr = make_shared<blah>();
map3["a"] = myPtr;

Here, I can still use myPtr even after the map goes away. After everything pointing to the object is gone, the object will be deleted.

How to store object of different class types into one container in modern c++?

You can store different object types in a std::variant. If you do so, there is no need to have a common interface and use virtual functions.

Example:

#include <iostream>
#include <variant>
#include <vector>

class A
{
    public:
        void DoSomething() { std::cout << "DoSomething from A" << std::endl; }
};

class B
{
    public:
        void DoSomething() { std::cout << "DoSomething from B" << std::endl; }
};

int main()
{
    std::vector< std::variant< A, B > > objects;

    objects.push_back( A{} );
    objects.push_back( B{} );

    for ( auto& obj: objects )
    {
        std::visit( [](auto& object ){ object.DoSomething(); }, obj);
    }
}

But this solution also has drawbacks. Access via std::visit may be slow. Sometimes, e.g., gcc generates very bad code in such situations (the jump table is generated at runtime, no idea why!). You always call the function via table access, which takes some additional time. And storing objects in a std::variant always consumes the size of the biggest class in the variant, plus some space for the tag variable inside the variant.

The "old" way is to store raw or, better, smart pointers in the vector and simply call the common interface functions via the base pointer. The drawback here is the additional vtable pointer in each instance (which is typically the same size as the tag variable in the std::variant). The indirection through the vtable to call the function also comes with a (small) cost.

Example with smart pointer of base type and vector:

#include <iostream>
#include <memory>
#include <vector>

class Interface
{
    public:
        virtual void DoSomething() = 0;
        virtual ~Interface() = default;
};

class A: public Interface
{
    public:
        void DoSomething() override { std::cout << "DoSomething from A" << std::endl; }
        ~A() override { std::cout << "Destructor called for A" << std::endl; }
};

class B: public Interface
{
    public:
        void DoSomething() override { std::cout << "DoSomething from B" << std::endl; }
        ~B() override { std::cout << "Destructor called for B" << std::endl; }
};

int main()
{
    std::vector< std::shared_ptr<Interface>> pointers;

    pointers.emplace_back( std::make_shared<A>() );
    pointers.emplace_back( std::make_shared<B>() );

    for ( auto& ptr: pointers )
    {
        ptr->DoSomething();
    }
}

If std::unique_ptr is sufficient for you, you can use that instead. It depends on whether your design needs to pass the pointers around (share ownership) or not.

Hint: If you are using pointers to a base class type, never forget to make your destructors virtual! See also: When to use virtual destructors

In your case I would vote for smart pointers of the base class type in a simple vector!

BTW:

virtual auto ObjType(void) -> TYPES

That looks ugly to me! There is no need for auto here, as the return type is known before you write the function parameter list. A trailing return type is needed in cases where template parameters must be deduced to define the return type, but not here! Please do not always use auto!

Pointers to objects in a set or in a vector - does it matter?

There's nothing wrong with storing pointers in a standard container - be it a vector, set, map, or whatever. You just have to be aware of who owns that memory and make sure that it's released appropriately. When choosing a container, choose the container that makes the most sense for your needs. vector is great for random access and appending but not so great for inserting elsewhere in the container. list deals with insertions extremely well, but it doesn't have random access. Sets ensure that there are no duplicates in the container and it's sorted (though the sorting isn't very useful if the set holds pointers and you don't give a comparator function) whereas a map is a set of key-value pairs, so sorting and access is done by key. Etc. Etc. Every container has its pros and cons and which is best for a particular situation depends entirely on that situation.

As for pointers, again, having pointers in containers is fine. The issue that you need to worry about is who owns the memory and therefore must worry about freeing it. If there is a clear object that owns what a particular pointer points to, then it should probably be that object which frees it. If it's essentially the container which owns the memory, then you need to make sure that you delete all of the pointers in the container before the container is destroyed.

If you are concerned about multiple pointers to the same data floating around, or there is no clear owner for a particular chunk of memory, then smart pointers are a good solution. std::shared_ptr (originally Boost's shared_ptr, part of the standard library since C++11) would probably be a good one to use. Many would suggest that you should always use shared pointers, but there is some overhead involved, and whether it's best for your particular application will depend entirely on your application.

Ultimately, you need to be aware of the strengths and weaknesses of the various container types and determine what the best container is for whatever you're doing. The same goes for how to deal with pointer management. You need to write your program in a way that it's clear who owns a particular chunk of memory and make sure that that owner frees it when appropriate. Shared pointers are just one solution for that (albeit an excellent one). What the best solution is depends on the particulars of your program.
