Virtual Method Tables

Virtual method tables

The C# virtual function table works basically the same as the C++ one, so any resources which describe how the C++ virtual function table works should help you pretty well with the C# one as well.

For example, Wikipedia's description is not bad.

How is virtual function table generated in subclass

Simple answer is: it depends on the compiler. The standard does not specify how this should be implemented, only how it should behave.

In practice, implementations tend to simply have a derived vtable which begins with the same structure as the base vtable, than appends the derived class methods on the end (that is to say new methods in the derived class, not overrides).

The vtable pointer just points the the beginning of the whole table. If the object is being accessed through a base pointer type, then no one should ever look beyond the end of the base class methods.

If the pointer is of the derived type, then the same pointer will allow access further down the table to virtual methods declared in the derived class.

ADDENDUM: Multiple Inheritance

Multiple inheritance follows the same basic concepts, but quickly becomes complex for obvious reasons. But there's one important feature to bear in mind.

A multiply derived class has one vtable pointer for each of its base classes, pointing to different vtables or different locations in the same vtable (implementation dependent).

But it's important to remember it's one per immediate base class per object.

Thus if you had a multiply derived class with one int of data and three immediate base classes, the size of each object would actually be 16 bytes (on a 32-bit system; more on 64-bit). 4 for the int and four each for each of the vtable pointers. Plus, of course, the sizes of each of the base classes themselves.

That means that in C++, interfaces are not cheap. (Obviously there are no true interfaces in C++, but a base class with no data and only pure virtual methods emulates it.) Each such interface costs the size of a pointer per object.

In languages like C# and Java, where interfaces are part of the language, there is a slightly different mechanism whereby all interfaces are routed through a single vtable pointer. This is slightly slower, but means only one vtable pointer per object, however many interfaces are implemented.

I'd still follow an interface style approach in C++ for design reasons, but always be aware of this additional overhead.

(And none of this even touches on virtual inheritance.)

No Virtual Function Table for abstract class?

These "interfaces" abstract classes probably have need no user written code in any their constructors and destructors (these either have an empty body and no ctor-init-list, or simply are never user defined); let's call these pure interface classes.

[Pure interface class: concept related but not identical to Java interfaces that are (were?) defined as having zero implementation code, in any member function. But be careful with analogies, as Java interfaces inheritance semantic isn't the same as C++ abstract classes inheritance semantic.]

It means that in practice no used object ever has pure interface class type: no expression ever refers to an object with pure interface type. Hence, no vtable is ever needed, so the vtable, which may have been generated during compilation, isn't included in linked code (the linker can see the symbol of the pure interface class vtable isn't used).

Why does C++ RTTI require a virtual method table?

From a language perspective, the answer is: it doesn't. Nowhere in the C++ standard does it say how virtual functions are to be implemented. The compiler is free to make sure the correct function is called however it sees fit.

So, what would be gained by replacing the vptr (not the vtable) with an id and dropping the vtable? (replacing the vtable with an id doesn't really help anything whatsoever, once you have resolved vptr, you already know the run-time type)

How does the runtime know which function to actually call?

Consider:

template <int I>
struct A {
  virtual void foo() {}
  virtual void bar() {}
  virtual ~A() {}
};

template <int I>
struct B : A<I> {
  virtual void foo() {}
};

Suppose your compiler gives A<0> the ... lets call it vid ... 0 and A<1> the vid 1. Note that A<0> and A<1> are completely unrelated classes at this point. What happens if you say a0.foo() where a0 is an A<0>? At runtime a non-virtual function would just result in a statically dispatched call. But for a virtual function, the address of the function-to-call must be determined at runtime.

If all you had was vid 0 you'd still have to encode which function you want. This would result in a forest of if-else branches, to figure out the correct function pointer.

if (vid == 0) {
  if (fid == 0) {
    call A<0>::foo();
  } else if (fid == 1) {
    call A<0>::bar();
  } /* ... */
} else if (vid == 1) {
  if (fid == 0) {
    call A<1>::foo();
  } else if (fid == 1) {
    call A<1>::bar();
  } /* ... */
} /* ... */

This would get out of hand. Hence, the table. Add an offset that identifies the foo() function to the base of A<0>'s vtable and you have the address of the actual function to call. If you have a B<0> object on your hands instead, add the offset to that class' table's base pointer.

In theory compilers could emit if-else code for this but it turns out a pointer addition is faster and the resulting code smaller.

How are virtual functions and vtable implemented?

How are virtual functions implemented at a deep level?

From "Virtual Functions in C++":

Whenever a program has a virtual function declared, a v - table is constructed for the class. The v-table consists of addresses to the virtual functions for classes that contain one or more virtual functions. The object of the class containing the virtual function contains a virtual pointer that points to the base address of the virtual table in memory. Whenever there is a virtual function call, the v-table is used to resolve to the function address. An object of the class that contains one or more virtual functions contains a virtual pointer called the vptr at the very beginning of the object in the memory. Hence the size of the object in this case increases by the size of the pointer. This vptr contains the base address of the virtual table in memory. Note that virtual tables are class specific, i.e., there is only one virtual table for a class irrespective of the number of virtual functions it contains. This virtual table in turn contains the base addresses of one or more virtual functions of the class. At the time when a virtual function is called on an object, the vptr of that object provides the base address of the virtual table for that class in memory. This table is used to resolve the function call as it contains the addresses of all the virtual functions of that class. This is how dynamic binding is resolved during a virtual function call.

Can the vtable be modified or even directly accessed at runtime?

Universally, I believe the answer is "no". You could do some memory mangling to find the vtable but you still wouldn't know what the function signature looks like to call it. Anything that you would want to achieve with this ability (that the language supports) should be possible without access to the vtable directly or modifying it at runtime. Also note, the C++ language spec does not specify that vtables are required - however that is how most compilers implement virtual functions.

Does the vtable exist for all objects, or only those that have at least one virtual function?

I believe the answer here is "it depends on the implementation" since the spec doesn't require vtables in the first place. However, in practice, I believe all modern compilers only create a vtable if a class has at least 1 virtual function. There is a space overhead associated with the vtable and a time overhead associated with calling a virtual function vs a non-virtual function.

Do abstract classes simply have a NULL for the function pointer of at least one entry?

The answer is it is unspecified by the language spec so it depends on the implementation. Calling the pure virtual function results in undefined behavior if it is not defined (which it usually isn't) (ISO/IEC 14882:2003 10.4-2). In practice it does allocate a slot in the vtable for the function but does not assign an address to it. This leaves the vtable incomplete which requires the derived classes to implement the function and complete the vtable. Some implementations do simply place a NULL pointer in the vtable entry; other implementations place a pointer to a dummy method that does something similar to an assertion.

Note that an abstract class can define an implementation for a pure virtual function, but that function can only be called with a qualified-id syntax (ie., fully specifying the class in the method name, similar to calling a base class method from a derived class). This is done to provide an easy to use default implementation, while still requiring that a derived class provide an override.

Does having a single virtual function slow down the whole class or only the call to the function that is virtual?

This is getting to the edge of my knowledge, so someone please help me out here if I'm wrong!

I believe that only the functions that are virtual in the class experience the time performance hit related to calling a virtual function vs. a non-virtual function. The space overhead for the class is there either way. Note that if there is a vtable, there is only 1 per class, not one per object.

Does the speed get affected if the virtual function is actually overridden or not, or does this have no effect so long as it is virtual?

I don't believe the execution time of a virtual function that is overridden decreases compared to calling the base virtual function. However, there is an additional space overhead for the class associated with defining another vtable for the derived class vs the base class.

Additional Resources:

http://www.codersource.net/published/view/325/virtual_functions_in.aspx (via way back machine)

http://en.wikipedia.org/wiki/Virtual_table

http://www.codesourcery.com/public/cxx-abi/abi.html#vtable

Are virtual tables part of the C++ standard?

No they are not defined by the standard. They are instead implementation concepts, rather like a stack or a heap.

The standard is helpful in permitting the implementation of polymorphism to be carried out in a certain way, for example, the address of the first member variable of a class doesn't need to be the address of an instance of that class if that class is a polymorphic type.

Virtual Method Tables