Type Erasure Techniques

What is type erasure in C++?

Here's a very simple example of type erasure in action:

#include <iostream>
#include <memory>
#include <utility>

// Type erasure side of things

class TypeErasedHolder
{
  struct TypeKeeperBase
  {
    virtual ~TypeKeeperBase() {}
  };

  template <class ErasedType>
  struct TypeKeeper : TypeKeeperBase
  {
    ErasedType storedObject;

    TypeKeeper(ErasedType&& object) : storedObject(std::move(object)) {}
  };

  std::unique_ptr<TypeKeeperBase> held;

public:
  template <class ErasedType>
  TypeErasedHolder(ErasedType objectToStore) : held(new TypeKeeper<ErasedType>(std::move(objectToStore)))
  {}
};

// Client code side of things

struct A
{
  ~A() { std::cout << "Destroyed an A\n"; }
};

struct B
{
  ~B() { std::cout << "Destroyed a B\n"; }
};

int main()
{
  TypeErasedHolder holders[] = { A(), A(), B(), A() };
}


As you can see, TypeErasedHolder can store objects of an arbitrary type, and destruct them correctly. The important point is that it does not impose any restrictions on the types supported⁽¹⁾: they don't have to derive from a common base, for example.

⁽¹⁾ Except for being movable, of course.

C++ Techniques: Type-Erasure vs. Pure Polymorphism

C++ style virtual method based polymorphism:

You have to use classes to hold your data.
Every class has to be built with your particular kind of polymorphism in mind.
Every class has a common binary-level dependency, which restricts how the compiler creates the instance of each class.
The data you are abstracting must explicitly implement an interface that describes your needs.

C++ style template based type erasure (with virtual method based polymorphism doing the erasure):

You have to use templates to talk about your data.
Each chunk of data you are working on may be completely unrelated to other options.
The type erasure work is done within public header files, which bloats compile time.
Each type erased has its own template instantiated, which can bloat binary size.
The data you are abstracting need not be written as being directly dependent on your needs.
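To make the contrast concrete, here is a minimal sketch of both styles side by side. All names (Shape, Square, Rect, AnyShape, total_area) are hypothetical and picked only for illustration; this is one way to lay it out, not the only one.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Style 1: virtual-method polymorphism. Every class has to opt in by
// deriving from the interface.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

// Style 2: type erasure. Rect knows nothing about any interface;
// it merely happens to have an area() member.
struct Rect {
    double w, h;
    double area() const { return w * h; }
};

class AnyShape {
    struct Concept {                       // internal interface
        virtual ~Concept() = default;
        virtual double area() const = 0;
    };
    template <class T>
    struct Model : Concept {               // one instantiation per erased type
        T value;
        explicit Model(T v) : value(std::move(v)) {}
        double area() const override { return value.area(); }
    };
    std::unique_ptr<Concept> self;
public:
    template <class T>
    AnyShape(T v) : self(new Model<T>(std::move(v))) {}
    double area() const { return self->area(); }
};

// Works on any mix of erased types, related or not.
double total_area(const std::vector<AnyShape>& shapes) {
    double sum = 0;
    for (const auto& s : shapes) sum += s.area();
    return sum;
}
```

Note that AnyShape accepts both Rect (which derives from nothing) and Square (which happens to derive from Shape); all it requires is an area() member.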

Now, which is better? Well, that depends on whether the above things are good or bad in your particular situation.

As an explicit example, std::function<...> uses type erasure, which allows it to take function pointers, function references, the output of a whole pile of template-based functions that generate types at compile time, myriads of functors which have an operator(), and lambdas. All of these types are unrelated to one another. And because they aren't tied to having a virtual operator(), when they are used outside of the std::function context the abstraction they represent can be compiled away. You couldn't do this without type erasure, and you probably wouldn't want to.
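A small sketch of that point: three unrelated callables stored behind the same std::function type. The names twice, AddN and make_callables are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

int twice(int x) { return 2 * x; }             // plain function

struct AddN {                                   // hand-written functor
    int n;
    int operator()(int x) const { return x + n; }
};

// Builds a list of unrelated callables, all erased to std::function<int(int)>.
std::vector<std::function<int(int)>> make_callables() {
    std::vector<std::function<int(int)>> fs;
    fs.push_back(&twice);                       // function pointer
    fs.push_back(AddN{10});                     // functor
    fs.push_back([](int x) { return x * x; });  // lambda
    return fs;
}
```

None of the three types derives from anything or has a virtual operator(); called directly (outside std::function), each is an ordinary call the compiler is free to inline.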

On the other hand, just because several classes each have a method called DoFoo doesn't mean those methods all do the same thing. With polymorphism, it isn't just any DoFoo you are calling, but the DoFoo from a particular interface.

As for your sample code... your GetSomeText should be virtual ... override in the polymorphism case.

There is no need to reference count just because you are using type erasure. There is no need not to use reference counting just because you are using polymorphism.

Your Object could wrap T*s like how you stored vectors of raw pointers in the other case, with manual destruction of their contents (equivalent to having to call delete). Your Object could wrap a std::shared_ptr<T>, and in the other case you could have a vector of std::shared_ptr<T>. Your Object could contain a std::unique_ptr<T>, equivalent to having a vector of std::unique_ptr<T> in the other case. Your Object's ObjectModel could extract copy constructors and assignment operators from the T and expose them to Object, allowing full-on value semantics for your Object, which corresponds to a vector of T in your polymorphism case.
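The last option (value semantics) could look roughly like the sketch below. The question's Object and ObjectModel are not reproduced here, so Value, Counter and bump are stand-in names I made up; the point is only that clone() forwards to T's copy constructor.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Value {
    struct Concept {
        virtual ~Concept() = default;
        virtual std::unique_ptr<Concept> clone() const = 0;
        virtual int bump() = 0;
    };
    template <class T>
    struct Model : Concept {
        T obj;
        explicit Model(T o) : obj(std::move(o)) {}
        std::unique_ptr<Concept> clone() const override {
            // Uses T's own copy constructor, so copies are deep.
            return std::unique_ptr<Concept>(new Model(obj));
        }
        int bump() override { return obj.bump(); }
    };
    std::unique_ptr<Concept> self;
public:
    template <class T>
    explicit Value(T o) : self(new Model<T>(std::move(o))) {}
    // Copying clones the erased object; a moved-from Value stays empty.
    Value(const Value& other) : self(other.self ? other.self->clone() : nullptr) {}
    Value(Value&&) noexcept = default;
    Value& operator=(Value other) noexcept {
        self = std::move(other.self);
        return *this;
    }
    int bump() { return self->bump(); }
};

// Some unrelated type with state, used to show that copies are independent.
struct Counter {
    int n = 0;
    int bump() { return ++n; }
};
```

After a copy, the two Value objects hold separate Counter instances, so bumping one does not affect the other.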

Type erasure for methods differing in return types

Type erasure can be, and has been, implemented in C++ in different contexts. The most common approach, which is used in boost::any, std::function< signature >, std::thread and others, is based on a non-polymorphic class that is the type erased object, which contains a pointer to an interface type. Internally, during construction, assignment or whenever the user type is erased, an implementation of the interface is instantiated and stored.

As a motivating simplified example, consider that we wanted to create a printable type that can be used to print any type that implements operator<< to std::cout using type erasure. For that we need the type printable, the internal interface printable_impl_base, and the actual implementations:

#include <iostream>
#include <memory>

// regular polymorphic hierarchy:
struct printable_impl_base {
   virtual ~printable_impl_base() {}
   virtual void print() const = 0;
};
template <typename T>
struct printable_impl : printable_impl_base {
   T copy_to_print;
   printable_impl( T const & o ) : copy_to_print( o ) {}
   virtual void print() const {
      std::cout << copy_to_print << std::endl;
   }
};

// type erasure is performed in printable:
class printable {
   std::shared_ptr<printable_impl_base> p;
public:
   template <typename T>
   printable( T obj ) : p( new printable_impl<T>(obj) ) {}
   void print() const {
      p->print();
   }
};

Note that the pattern is very similar to a regular polymorphic hierarchy, with the difference that an interface object is added that is a value type (borrowing the term value type from C#), that holds the actual polymorphic objects inside.
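A usage sketch, with the classes restated compactly so that it compiles on its own. One deliberate change, which is my assumption and not part of the original: print_to takes a std::ostream& instead of writing to std::cout, so the result can be checked. print_all is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct printable_impl_base {
    virtual ~printable_impl_base() {}
    virtual void print_to(std::ostream& os) const = 0;
};

template <typename T>
struct printable_impl : printable_impl_base {
    T copy_to_print;
    printable_impl(T const& o) : copy_to_print(o) {}
    void print_to(std::ostream& os) const override {
        os << copy_to_print << '\n';
    }
};

// The value-type wrapper: copyable, and holds the polymorphic object inside.
class printable {
    std::shared_ptr<printable_impl_base> p;
public:
    template <typename T>
    printable(T obj) : p(new printable_impl<T>(obj)) {}
    void print_to(std::ostream& os) const { p->print_to(os); }
};

// Prints a heterogeneous list through the erased interface.
std::string print_all(const std::vector<printable>& items) {
    std::ostringstream out;
    for (const auto& item : items) item.print_to(out);
    return out.str();
}
```

The vector holds an int, a std::string and a double side by side; the only requirement on each is that operator<< exists for it.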

Looking at it this way, it seems kind of simplistic and useless, but that is the fuel that drives boost::any (the internal interface is only a typeid), std::function< void () > (the internal interface is that it implements void operator()), or shared_ptr<> (the interface is the deleter method, that relinquishes the resource).
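The shared_ptr case is easy to observe: a std::shared_ptr<void> still runs the right destructor, because the deleter is captured when the pointer is constructed. Tracked and destroy_through_void_pointer are names invented for this sketch.

```cpp
#include <cassert>
#include <memory>

struct Tracked {
    int* counter;
    explicit Tracked(int* c) : counter(c) {}
    ~Tracked() { ++*counter; }   // records that the destructor ran
};

// Returns how many Tracked destructors ran while a shared_ptr<void>
// owned the object.
int destroy_through_void_pointer() {
    int destroyed = 0;
    {
        std::shared_ptr<void> p(new Tracked(&destroyed));
        // The static type of p says nothing about Tracked, yet the control
        // block remembers a deleter that deletes through a Tracked*.
    }
    return destroyed;
}
```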

There is one more, quite specific form of type erasure for the case where the only thing you need to do with the erased object is destroy it: create a temporary and bind it to a constant reference. This is very specific, though; if you want, you can read about it here: http://drdobbs.com/cpp/184403758
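A minimal sketch of that trick, with made-up names (Base, Derived, run_scope). The reference has the base type, but the temporary keeps its full type, so the derived destructor runs at the end of the scope even though Base has no virtual destructor.

```cpp
#include <cassert>

struct Base {
protected:
    ~Base() {}                   // non-virtual on purpose
};

struct Derived : Base {
    int* counter;
    explicit Derived(int* c) : counter(c) {}
    ~Derived() { ++*counter; }   // records that the destructor ran
};

// Returns the number of Derived destructors that ran for the scope below.
int run_scope() {
    int destroyed = 0;
    {
        // The temporary's lifetime is extended to that of the reference.
        const Base& ref = Derived(&destroyed);
        (void)ref;
        assert(destroyed == 0);  // still alive inside the scope
    }
    return destroyed;
}
```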

In the specific case that you are talking about in the question it is a bit more complex, because you don't want to erase a single type, but rather a couple of them. The Iterable interface must erase the type of the container that it internally holds, and in doing so it has to provide its own iterators that have to perform type erasure on the iterators from the container. Still, the idea is basically the same, just more work to do to implement.
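To give an idea of the extra work, here is a stripped-down sketch limited to containers of int and to the operations a range-based for loop needs. The question's Iterable is not shown here, so AnyIntIterator, IntIterable and sum are hypothetical names, and the iterator is move-only to keep the code short.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <utility>
#include <vector>

// Erases the concrete iterator type.
class AnyIntIterator {
    struct Concept {
        virtual ~Concept() = default;
        virtual int deref() const = 0;
        virtual void advance() = 0;
        virtual bool equals(const Concept& other) const = 0;
    };
    template <class It>
    struct Model : Concept {
        It it;
        explicit Model(It i) : it(std::move(i)) {}
        int deref() const override { return *it; }
        void advance() override { ++it; }
        bool equals(const Concept& other) const override {
            // Assumption: both iterators come from the same container.
            return it == static_cast<const Model&>(other).it;
        }
    };
    std::unique_ptr<Concept> self;
public:
    template <class It>
    explicit AnyIntIterator(It i) : self(new Model<It>(std::move(i))) {}
    int operator*() const { return self->deref(); }
    AnyIntIterator& operator++() { self->advance(); return *this; }
    bool operator!=(const AnyIntIterator& o) const { return !self->equals(*o.self); }
};

// Erases the container type and hands out erased iterators.
class IntIterable {
    struct Concept {
        virtual ~Concept() = default;
        virtual AnyIntIterator begin() const = 0;
        virtual AnyIntIterator end() const = 0;
    };
    template <class C>
    struct Model : Concept {
        C container;
        explicit Model(C c) : container(std::move(c)) {}
        AnyIntIterator begin() const override { return AnyIntIterator(container.begin()); }
        AnyIntIterator end() const override { return AnyIntIterator(container.end()); }
    };
    std::unique_ptr<Concept> self;
public:
    template <class C>
    IntIterable(C c) : self(new Model<C>(std::move(c))) {}
    AnyIntIterator begin() const { return self->begin(); }
    AnyIntIterator end() const { return self->end(); }
};

// Client code sees neither the container type nor the iterator type.
int sum(const IntIterable& items) {
    int total = 0;
    for (int x : items) total += x;
    return total;
}
```

A production version would also need copyable iterators, element types other than int, and a safer equality check than the static_cast.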

Template type erasure

Doing this with a compile-time check is, unfortunately, not feasible. You can, however, provide that functionality with a runtime check.

A map's value type can only be one single type, and Foo<T> is a different type for each T. However, we can work around this by giving every Foo<T> a common base class, keeping a map of pointers to it, and using a virtual function to dispatch call() to the appropriate subclass.

For this though, the type of the argument must also always be the same. As mentioned by @MSalters, std::any can help with that.
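As a quick illustration of what "runtime check" means with std::any (C++17): the stored type is only verified when you cast. The helpers holds and cast_throws are hypothetical names for this sketch.

```cpp
#include <any>
#include <cassert>

// Returns true if arg currently holds exactly a T.
template <class T>
bool holds(const std::any& arg) {
    // The pointer form of any_cast returns nullptr on a type mismatch.
    return std::any_cast<T>(&arg) != nullptr;
}

// Returns true if the by-value any_cast<T> throws for this argument.
template <class T>
bool cast_throws(const std::any& arg) {
    try {
        (void)std::any_cast<T>(arg);
        return false;
    } catch (const std::bad_any_cast&) {
        return true;
    }
}
```

Note that the match must be exact: an any holding an int does not convert to float on the way out.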

Finally, we can wrap all that using the pimpl pattern so that it looks like there's just a single neat Foo type:

#include <cassert>
#include <string>
#include <functional>
#include <any>
#include <unordered_map>
#include <memory>

struct Foo {
public:
  template<typename T, typename FunT>
  void set(FunT fun) {
      pimpl_ = std::make_unique<FooImpl<T, FunT>>(std::move(fun));
  }

  // Using operator()() instead of call() makes this a functor, which
  // is a little more flexible.
  void operator()(const std::any& arg) {
      assert(pimpl_);
      pimpl_->call(arg);
  }
  
private:
    struct IFooImpl {
      virtual ~IFooImpl() = default;
      virtual void call( const std::any& arg ) const = 0; 
    };

    template <class Arg, typename FunT>
    struct FooImpl : IFooImpl
    {
        FooImpl(FunT fun) : fun_(std::move(fun)) {}
        
        void call( const std::any& arg ) const override {
            fun_(std::any_cast<Arg>(arg));
        }

    private:
        FunT fun_;
    };

  std::unique_ptr<IFooImpl> pimpl_;
};


// Usage sample
#include <iostream>

void bar(int v) {
    std::cout << "bar called with: " << v << "\n";
}

int main() {
    std::unordered_map<std::string, Foo> table;

    table["aaa"].set<int>(bar);

    // Even works with templates/generic lambdas!
    table["bbb"].set<float>([](auto x) {
        std::cout << "bbb called with " << x << "\n";
    });

    table["aaa"](14);
    table["bbb"](12.0f);
}


How are self-destructing type erasure classes like std::function implemented?

std::function lost its allocators in C++17 in part because of problems with type erased allocators. However, the general pattern is to rebind the allocator to whatever type you're using to do the type erasure, store the original allocator in the type erased thing, and rebind the allocator again when deleting the type erased thing.

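#include <functional>   // invoke (C++17)
#include <memory>       // allocator_traits
#include <type_traits>  // decay_t
#include <utility>      // forward
using namespace std;    // the snippet uses unqualified standard names

template<class Ret, class... Args>
struct Call_base {
    virtual Ret Call(Args&&...) = 0;
    virtual void DeleteThis() = 0;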
protected:
    ~Call_base() {}
};

template<class Allocator, class Fx, class Ret, class... Args>
struct Call_fn : Call_base<Ret, Args...> {
    Allocator a;
    decay_t<Fx> fn;

    Call_fn(Allocator a_, Fx&& fn_)
        : a(a_), fn(forward<Fx>(fn_))
        {}

    virtual Ret Call(Args&&... vals) override {
        return invoke(fn, forward<Args>(vals)...);
    }
    virtual void DeleteThis() override {
        // Rebind the allocator to an allocator to Call_fn:
        using ReboundAllocator = typename allocator_traits<Allocator>::
            template rebind_alloc<Call_fn>;
        ReboundAllocator aRebound(a);
        allocator_traits<ReboundAllocator>::destroy(aRebound, this);
        aRebound.deallocate(this, 1);
    }
};

template<class Allocator, class Fx, class Ret, class... Args>
Call_base<Ret, Args...> * Make_call_fn(Allocator a, Fx&& fn) {
    using TypeEraseType = Call_fn<Allocator, Fx, Ret, Args...>;
    using ReboundAllocator = typename allocator_traits<Allocator>::
        template rebind_alloc<TypeEraseType>;
    ReboundAllocator aRebound(a);
    auto ptr = aRebound.allocate(1); // throws
    try {
        allocator_traits<ReboundAllocator>::construct(aRebound, ptr, a, forward<Fx>(fn));
    } catch (...) {
        aRebound.deallocate(ptr, 1);
        throw;
    }

    return ptr;
}
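The rebind step on its own, separated from the function-wrapper machinery, can be sketched as follows. Node and round_trip are hypothetical names, and the sketch assumes the allocator's pointer type is a raw pointer, as it is for std::allocator.

```cpp
#include <cassert>
#include <memory>

struct Node {
    int value;
    explicit Node(int v) : value(v) {}
};

// Allocates, constructs, reads, destroys and frees one Node using an
// allocator that was originally declared for some other type.
template <class Allocator>
int round_trip(Allocator original, int v) {
    using Rebound = typename std::allocator_traits<Allocator>::
        template rebind_alloc<Node>;
    using Traits = std::allocator_traits<Rebound>;

    Rebound alloc(original);                // converting constructor
    Node* p = Traits::allocate(alloc, 1);   // may throw
    try {
        Traits::construct(alloc, p, v);
    } catch (...) {
        Traits::deallocate(alloc, p, 1);    // do not leak if Node's ctor throws
        throw;
    }
    int result = p->value;
    Traits::destroy(alloc, p);
    Traits::deallocate(alloc, p, 1);
    return result;
}
```

Calling it with a std::allocator<char> shows that the allocator's original value type does not matter once it has been rebound.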

How to apply the type erasure technique to existing types?

Boost has a full-on type erasure library (Boost.TypeErasure) that lets you pick what you erase.

However, a simple case like the above can be done as follows:

// The interface detailing what we support.  Move is always supported.
// only need clone if we want to copy:
struct IImpl {
  virtual std::unique_ptr<IImpl> clone() const = 0;
  virtual int f() = 0;
  virtual ~IImpl() {}
};

// The Impl<T> class writes custom code to type erase a type T.
// you can specialize it for extreme cases: ie, suppose for some type X
// you want f() to invoke g() -- then you can specialize Impl to do that.
// (that is a bit of a toy example, and a bad one, but imagine std::function
// specialized for method pointers such that `this` is turned into the first
// argument)
template<class T>
struct Impl:IImpl {
  T t;
  virtual std::unique_ptr<IImpl> clone() const override {
    return std::unique_ptr<IImpl>( new Impl(t) );
  }
  virtual int f() override {
    return t.f();
  }
  virtual ~Impl() {}
  template<typename...Us>
  explicit Impl( Us&&... us ): t(std::forward<Us>(us)...) {}
  // copy is handled by clone.  move is handled by unique_ptr:
  Impl( Impl const& ) = delete;
  Impl& operator=( Impl const& ) = delete;
};

// the value-semantics class that type-erases:
struct TypeErased {
  std::unique_ptr<IImpl> pImpl; // where the work is mostly done
  int f() { return pImpl->f(); } // forward to where the work is mostly done
  template<typename T, typename... Us> // pass T explicitly, allow construction from other types
  void emplace( Us&&... us ) { pImpl.reset( new Impl<T>(std::forward<Us>(us)...) ); }
  template<typename T> // like std::function, sucks in similar ways
  explicit TypeErased( T&& t ): pImpl( new Impl<typename std::decay<T>::type>(std::forward<T>(t)) ) {}
  TypeErased(TypeErased&&) = default;
  TypeErased(TypeErased const&o): pImpl( o.pImpl?o.pImpl->clone():nullptr ) {}
  TypeErased(TypeErased const&&o):TypeErased(o) {} // delegate to const&, no need to cast here
  TypeErased(TypeErased&o):TypeErased( const_cast<TypeErased const&>(o) ) {} // delegate to const&

  TypeErased& operator=(TypeErased&&) = default; // moving the unique_ptr does the right thing
  TypeErased& operator=(TypeErased const&o) { // copy-swap idiom
    TypeErased tmp(o);
    this->swap(tmp);
    return *this;
  }
  void swap( TypeErased& o ) {
    std::swap( pImpl, o.pImpl );
  }
};

// You can make this a template on other types, but I'll omit it, as it just fuzzies things up:
struct C {
  C(A a): erased(a) {}
  C(B b): erased(b) {}
  int g() {
    return erased.f();
  }
  TypeErased erased;
};

not compiled, but I read it over again and got rid of most of the typos.