C++ Virtual Table Layout of MI (Multiple Inheritance)

Question on multiple inheritance, virtual base classes, and object size in C++

Let's look at the class layout of the two cases.

Without the virtual, you have two base classes ("X" and "Y") with an integer each, and each of those classes has its own embedded "Base" base class, which also has an integer. That is 4 integers of 32 bits each, totalling your 16 bytes.

Offset  Size  Type  Scope  Name
0       4     int   Base   a
4       4     int   X      x
8       4     int   Base   a
12      4     int   Y      y
16                         size (Z members would come at the end)

(Edit: I've written a program in DJGPP to get the layout and tweaked the table to account for it.)
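For anyone who wants to check the non-virtual case on their own compiler, here is a minimal sketch. The class names mirror the ones discussed above, but the member names and the constant kIntsInZ are my own assumptions, and the exact size is implementation-specific (the figure in the comment assumes a 4-byte int and a mainstream ABI).

```cpp
#include <cstddef>

// Hypothetical reconstruction of the non-virtual hierarchy from the question.
struct Base { int a; };
struct X : Base { int x; };   // X embeds its own copy of Base
struct Y : Base { int y; };   // Y embeds a second, separate copy of Base
struct Z : X, Y {};           // Z therefore contains two Base subobjects

// Number of int-sized slots in Z. On common ABIs this is 4:
// Base::a (via X), X::x, Base::a (via Y), Y::y.
constexpr std::size_t kIntsInZ = sizeof(Z) / sizeof(int);

// Because there are two copies, a plain z.a would be ambiguous;
// each copy has to be named through X or Y.
inline void set_both(Z& z, int viaX, int viaY) {
    z.X::a = viaX;
    z.Y::a = viaY;
}
```

Printing sizeof(Z) and kIntsInZ should reproduce the 16 bytes in the table above on a typical 32-bit or 64-bit target.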

Now let's talk about virtual base classes: they replace the embedded instance of the base with a pointer (or offset) to a shared instance. Your "Z" class has only one "Base" subobject, and both the "X" and "Y" parts point to it. Therefore, you have integers in X, Y, and Base, but you only have the one Base. That means you have three integers, or 12 bytes. But X and Y also each need a pointer to the shared Base (otherwise they wouldn't know where to find it). On a 32-bit machine, two pointers add another 8 bytes. This totals the 20 that you see. The memory layout might look something like this (I haven't verified it; the ARM, i.e. The Annotated C++ Reference Manual, has an example where the ordering is X, Y, Z, then Base):

Offset  Size  Type         Scope  Name  Value (sort of)
0       4     Base offset  X      ?     16 (or ptr to vtable)
4       4     int          X      x
8       4     Base offset  Y      ?     16 (or ptr to vtable)
12      4     int          Y      y
16      4     int          Base   a
20                                size (Z members would come before the Base)
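The sharing described above can be observed directly. The following is only a sketch under the same assumed class names; shares_one_base, kHasBookkeeping and demo are hypothetical helpers of mine, and how the shared Base is located (offset field, vtable slot, separate table) is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reconstruction of the virtual-base variant.
struct Base { int a; };
struct X : virtual Base { int x; };
struct Y : virtual Base { int y; };
struct Z : X, Y {};

// True when the Base reached through the X part is the very same
// subobject as the Base reached through the Y part.
inline bool shares_one_base() {
    Z z;
    Base* viaX = static_cast<X*>(&z);  // located through X's bookkeeping
    Base* viaY = static_cast<Y*>(&z);  // located through Y's bookkeeping
    return viaX == viaY;
}

// Three ints alone would be 3 * sizeof(int); anything beyond that is the
// per-branch bookkeeping (plus padding) used to find the shared Base.
constexpr bool kHasBookkeeping = sizeof(Z) > 3 * sizeof(int);

inline void demo() {
    assert(shares_one_base());
    assert(kHasBookkeeping);
}
```

Note that z.a is no longer ambiguous here, since there is only one Base to refer to.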

So the memory difference is a combination of two things: one less integer and two more pointers. Contrary to another answer, I don't believe vtables play any direct role in this, since there are no virtual functions.

Edit: ppinsider has provided more information on the gcc case, in which he demonstrates that gcc implements the pointer to the virtual base class by making use of an otherwise empty vtable (i.e., no virtual functions). That way, if there were virtual functions, it wouldn't require an additional pointer in the class instance, requiring more memory. I suspect the downside is an additional indirection to get to the base class.

We might expect all compilers to do this, but perhaps not. The ARM page 225 discusses virtual base classes without mentioning vtables. Page 235 specifically addresses "virtual base classes with virtual functions" and has a diagram indicating a memory layout where there are pointers from the X and Y parts that are separate from the pointers to the vtable. I would advise anyone not to take for granted that the pointer to Base will be implemented in terms of a table.

cpp class size with virtual pointer and inheritance

No. With single inheritance, every object whose class has virtual functions holds exactly one vtable pointer, regardless of whether the virtual functions are declared in the class itself or inherited from a base class that has them.

What the compiler generates is a table of function pointers, one per class (not per instance/object). In your example you have a table for class A, one for class B, and so on. Every object/instance then holds a vtable pointer, and this pointer points only to the table. If you call a virtual function through a pointer to the class, the call goes first through the vtable pointer and then through the vtable itself.

As a result every instance of a class keeps only a single vtable pointer, pointing to the vtable for this class. As you can see, this results in the same size for every instance of every class you wrote.
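A small sketch of that claim, using made-up classes A, B and C (the question's own classes are not shown here, so these are assumptions): each class adds virtual functions, yet on typical implementations the object size stays at one pointer.

```cpp
#include <cstddef>

// Hypothetical single-inheritance chain; every level adds a virtual function.
struct A {
    virtual void f() {}
    virtual ~A() = default;
};
struct B : A {
    void f() override {}
    virtual void g() {}   // new slot goes into B's vtable, not into the object
};
struct C : B {
    void g() override {}
    virtual void h() {}   // likewise: a bigger table, not a bigger object
};

// On common ABIs all three are just one vtable pointer wide.
constexpr bool kSameSize = sizeof(A) == sizeof(B) && sizeof(B) == sizeof(C);
```

The tables for A, B and C differ in length and content, but that cost is paid once per class rather than once per object.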

In the case of multiple inheritance, you will get multiple vtable pointers. A more detailed answer is already given here:
vtable and multiple inheritance
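To see the multiple-inheritance case, a sketch with two unrelated polymorphic bases is enough. B1, B2, D and kTwoSlots are hypothetical names, and the exact sizes are implementation-specific.

```cpp
#include <cstddef>

// Two unrelated polymorphic bases (hypothetical names).
struct B1 {
    virtual void f() {}
    virtual ~B1() = default;
};
struct B2 {
    virtual void g() {}
    virtual ~B2() = default;
};

// D contains a complete B1 part and a complete B2 part,
// and each part keeps its own vtable pointer.
struct D : B1, B2 {
    void f() override {}
    void g() override {}
};

// On typical implementations D is at least as large as both parts together.
constexpr bool kTwoSlots = sizeof(D) >= sizeof(B1) + sizeof(B2);
```

On a typical 64-bit target sizeof(B1) and sizeof(B2) are 8 each and sizeof(D) is 16, i.e. two vtable pointers.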

BTW: The standard does not guarantee that there is a vtable or a vtable pointer at all; every compiler can do what it wants as long as the result has the semantics we expect. But the double indirection, from the vtable pointer to a function pointer in the table and then to the function, is the typical implementation.

Virtual tables memory location

First of all, Derived2 is a different type than Base1, so it needs some other information apart from the virtual function table. Second, at least Derived2's destructor is a different function from Base1's, so even if the table held only the virtual functions, that entry would have to be different.
I am not sure how MSVC implements RTTI on polymorphic types, but there has to be some identification of the type beyond the virtual functions, e.g. to enable dynamic_casts. So that first entry could very well be the pointer to the RTTI. I have no MSVC around at the moment, but you could try this:

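struct Base {
    virtual void foo() {}
    virtual void bar() {}
    virtual ~Base() {}
};

// Derived must inherit from Base, otherwise "Base* b2 = new Derived" does not compile.
struct Derived : Base {
    void foo() override {}    // overrides Base::foo; bar is inherited unchanged
    ~Derived() override {}
};

int main() {
    Base* b1 = new Base;
    Base* b2 = new Derived;
    // Set a breakpoint here and compare b1->__vfptr with b2->__vfptr.
    delete b2;
    delete b1;
}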

Now inspect the first four or five elements of the __vfptr of each of the two created objects. My guess is that you will see one entry that is the same in both: the pointer to Base::bar. The others (the pointers to RTTI, foo and the destructor) should be different.

Here comes some guesswork: maybe you can see that the pointers point into different regions of memory, since the RTTI pointers might point into the data segment, while the virtual function pointers point into the code segment.

Update: there need not be an entry for RTTI in the vtable itself - it might be possible that some compilers implement RTTI just by comparing the addresses of the vtables.

Can derived classes have more than one pointer to a virtual table?

A vtable contains the address of each virtual function of the class at a known offset.

[Remark: In practice, unlike a regular class, vtables have members at negative offsets, much like a pointer into the middle of an array. That is just a convention and doesn't change implementation freedom much. The only real point is that the placement of each piece of information in a vtable is fixed by a convention (the ABI), and compilers that follow the same ABI produce compatible code for polymorphic classes.]

What happens when you have additional functions in a derived class? (not just the functions "inherited" from the base class)

Once you accept the idea that a pointer to a structure both points to the whole object and to its first member, you have the idea that a pointer to derived class points to a base class that is appropriately located at offset zero. So you can have the exact same pointer value, as represented as a void*, that can be used alternatively for a derived object or a base under this convention for single inheritance.

Now you can apply that to any data structure and even to a vtable which is really not a table (array of elements of the same type, or values that can be interpreted in the same way) but a record (of objects of unrelated type or meaning); you can see that a vtable for such derived class can just be derived from the vtable of its unique base in the exact same way.

(Note that if you compile C++ to C, you might run into type aliasing rules when you do such things. Of course assembly has no such problem, nor naively compiled "high level assembler" C.)

So for single inheritance the base is integrated and optimized into the derived class:

  • for data members of the instance (of a class type)
  • and for the virtual functions members, that is the data members of the vtable (or members of the meta class if you imagine one).

Note that placing the base at offset zero allows you to place vtable base at zero offset, which in turn allows you to use the same vptr but does not imply it; conversely sharing the vptr with a base implies that the base vtable is at offset zero (vtable layout = meta class level) so the base must be at offset zero (data members layout = class level).

And multiple inheritance is really single inheritance plus something extra, because one base class is always treated as privileged: it is placed at offset zero, so the pointers are the same and its vtable can be placed at offset zero too (because the pointers are the same); the other bases get no such treatment.

As we see, all but one of the inherited polymorphic classes are placed at a non zero offset in multiple inheritance. Each one carries an additional "inherited" vptr in the derived class; that (hidden) pointer member must be correctly filled by any derived constructor.

These additional vptrs are for base classes that sit at a nonzero offset, so a pointer to such an inherited base must be adjusted (add a positive constant to convert to the base pointer, subtract it to convert back). That a compiler needs to produce code to perform an implicit conversion is a trivial remark (converting an integer to a floating-point type is a much more involved task). What matters here is the conversion of "this" between a function call made on a given base type and the function the call lands in, which may be an overrider in a base or a derived class: the adjustment depends on function overriding, which is only known per class (per instance of a meta type). So each such vptr needs to point to distinct vtable information: one that knows how to deal with these base-to-derived pointer conversions.

As instances of the "meta type", vtables have all the information needed to do all the pointer adjustments automatically. (These depend on the specific class types involved, and on no other factor.)
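The pointer adjustment described above is easy to observe. This is a minimal sketch with hypothetical classes First, Second and Derived; the helper names are mine, and the actual offset value depends on the ABI, so the code only checks that it is nonzero and that the conversion round-trips.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy: two polymorphic bases with data.
struct First  { virtual ~First()  = default; int a = 1; };
struct Second { virtual ~Second() = default; int b = 2; };
struct Derived : First, Second { int c = 3; };

// Byte distance from the start of the Derived object to its Second part.
// The implicit Derived* -> Second* conversion is where the compiler adds it.
inline std::ptrdiff_t second_base_offset(Derived* d) {
    Second* s = d;
    return reinterpret_cast<char*>(s) - reinterpret_cast<char*>(d);
}

// Converting back subtracts the same constant, recovering the original pointer.
inline bool round_trips(Derived* d) {
    Second* s = d;
    return static_cast<Derived*>(s) == d;
}

inline void demo() {
    Derived d;
    assert(second_base_offset(&d) != 0);  // Second is not at offset zero
    assert(round_trips(&d));
}
```

On common ABIs First is the base that ends up at offset zero and shares the vptr with Derived, while Second sits further in and brings its own vptr, which is the zero offset versus arbitrary offset distinction drawn in this answer.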

So at implementation level, the two types of inheritances are:

  • zero offset inheritance; sharing the vptr; called a primary base class in some vtable and ABI descriptions;
  • arbitrary offset inheritance; having another vptr; called secondary base class.

This is for the basic stuff. Virtual inheritance is a lot more subtle at the implementation level, and even the concept of primary isn't so clear, as virtual bases can be "primary" of a derived class only in some more derived classes!

Virtual multiple inheritance and casting

"Virtual" always means "determined at runtime". A virtual function is located at runtime, and a virtual base is also located at runtime. The whole point of virtuality is that the actual target in question is not knowable statically.

Therefore, it is impossible to determine the most-derived object of which you are given a virtual base at compile time, since the relationship between the base and the most-derived object is not fixed. You have to wait until you know what the actual object is before you can decide where it is in relation to the base. That's what the dynamic cast is doing.
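A short sketch of what that means in code, using a hypothetical diamond (Base, X, Y, Z and the helper to_most_derived are assumed names, not taken from the question): going down from a virtual base needs dynamic_cast, because a static_cast from a virtual base to a derived class is rejected by the compiler.

```cpp
#include <cassert>

// Hypothetical diamond with a polymorphic virtual base.
struct Base { virtual ~Base() = default; };
struct X : virtual Base {};
struct Y : virtual Base {};
struct Z : X, Y {};

// Try to get from a Base* to the Z it belongs to.
// static_cast<Z*>(b) would not compile here, since Base is a virtual base of Z;
// dynamic_cast looks up the actual object type at run time instead.
inline Z* to_most_derived(Base* b) {
    return dynamic_cast<Z*>(b);
}

inline void demo() {
    Z z;
    Base* fromZ = &z;                         // upcast is implicit
    assert(to_most_derived(fromZ) == &z);     // the object really is a Z

    X x;
    Base* fromX = &x;
    assert(to_most_derived(fromX) == nullptr);  // this Base lives in an X, not a Z
}
```

The same Base* type can point into objects where Base sits at different positions relative to the derived part, which is why the answer can only be found at run time.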


