C++ Object Size with Virtual Methods

C++ object size with virtual methods

This is all implementation-defined. I'm using VC10 Beta2. To understand how virtual functions are implemented, it helps to know about an undocumented switch in the Visual Studio compiler, /d1reportSingleClassLayoutXXX. I'll get to that in a second.

The basic rule is that the vtable pointer needs to be located at offset 0 for any pointer to an object. This implies multiple vtable pointers for multiple inheritance.

Couple questions here, I'll start at the top:

Does it mean that only one vptr is there even both of class B and class A have virtual function? Why there is only one vptr?

This is how virtual functions work: you want the base class and derived class to share the same vtable pointer (pointing to the implementation in the derived class).
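A minimal sketch of the single-inheritance case, assuming a typical vtable-based compiler. The classes and function names here are made up for illustration and are not the ones from the question. The derived class overrides one function and adds another, yet the object still has room for only one hidden pointer.

```cpp
#include <cassert>

// Hypothetical classes for illustration only.
struct A {
    virtual ~A() = default;
    virtual int id() const { return 1; }
    int a = 0;
};

struct B : A {
    int id() const override { return 2; }    // replaces A's entry in B's vtable
    virtual int extra() const { return 3; }  // new entry in that same vtable
    int b = 0;
};

// If B carried a second vptr, it would need at least this much space.
// Holds on the mainstream ABIs I know of; it is not a language guarantee.
static_assert(sizeof(B) < sizeof(A) + sizeof(void*) + sizeof(int),
              "B appears to carry more than one vptr");

// Calls id() through a base reference; the object's vptr selects the override.
inline int id_through_base(const A& obj) {
    return obj.id();
}

// True when the A subobject starts at the same address as the whole B object,
// which is what lets both views share one vtable pointer.
inline bool base_at_offset_zero() {
    B obj;
    const void* whole = &obj;
    const void* base = static_cast<const A*>(&obj);
    return whole == base;
}

// Convenience check a reader can call from main().
inline void check_single_vptr() {
    B obj;
    assert(id_through_base(obj) == 2);
    assert(base_at_offset_zero());
}
```

Calling check_single_vptr() from main should pass silently on common compilers. A failure would only mean your implementation lays things out differently, which the standard allows.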

It seems that in this case, two vptrs are in the layout.....How does this happen? I think the two vptrs one is for class A and another is for class B....so there is no vptr for the virtual function of class C?

This is the layout of class C, as reported by /d1reportSingleClassLayoutC:

class C size(20):
        +---
        | +--- (base class A)
 0      | | {vfptr}
 4      | | a
        | +---
        | +--- (base class B)
 8      | | {vfptr}
12      | | b
        | +---
16      | c
        +---

You are correct, there are two vtable pointers, one for each base class. This is how it works in multiple inheritance; if the C* is cast to a B*, the pointer value gets adjusted by 8 bytes. A vtable pointer still needs to be at offset 0 of the B subobject for virtual function calls through a B* to work.

The vtable in the above layout for class A is treated as class C's vtable (when called through a C*).
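The pointer adjustment can be observed directly. The following is a sketch with hypothetical classes A, B and C shaped like the ones in the layout dump. The helper offset_in_C is my own name, and the actual offsets depend on the compiler and on pointer size (8 in the 32-bit dump above, typically 16 on a 64-bit target).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes mirroring the shape of the layout shown above.
struct A { virtual ~A() = default; virtual void fa() {} int a = 0; };
struct B { virtual ~B() = default; virtual void fb() {} int b = 0; };
struct C : A, B { int c = 0; };

// Byte offset of the Base subobject inside a C object, measured at run time.
template <typename Base>
std::ptrdiff_t offset_in_C() {
    C obj;
    C* whole = &obj;
    Base* part = whole;  // implicit upcast; the compiler adds the offset here
    assert(static_cast<C*>(part) == whole);  // the downcast subtracts it again
    return reinterpret_cast<const char*>(part) -
           reinterpret_cast<const char*>(whole);
}
```

On the implementations I am aware of, offset_in_C<A>() is 0 and offset_in_C<B>() is at least the size of one pointer, since the A subobject (with its own vptr) comes first.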

The sizeof B is 16 bytes -------------- Without virtual it should be 4 + 4 + 4 = 12. why there is another 4 bytes here? What's the layout of class B ?

This is the layout of class B in this example:

class B size(20):
        +---
 0      | {vfptr}
 4      | {vbptr}
 8      | b
        +---
        +--- (virtual base A)
12      | {vfptr}
16      | a
        +---

As you can see, there is an extra pointer to handle virtual inheritance. Virtual inheritance is complicated.

The sizeof D is 32 bytes -------------- it should be 16(class B) + 12(class C) + 4(int d) = 32. Is that right?

No, 36 bytes. Same deal with the virtual inheritance. Layout of D in this example:

class D size(36):
        +---
        | +--- (base class B)
 0      | | {vfptr}
 4      | | {vbptr}
 8      | | b
        | +---
        | +--- (base class C)
        | | +--- (base class A)
12      | | | {vfptr}
16      | | | a
        | | +---
20      | | c
        | +---
24      | d
        +---
        +--- (virtual base A)
28      | {vfptr}
32      | a
        +---

My question is , why there is an extra space when virtual inheritance is applied?

Virtual base class pointer, it's complicated. Base classes are "combined" in virtual inheritance. Instead of having a base class embedded into a class, the class will have a pointer to the base class object in the layout. If you have two base classes using virtual inheritance (the "diamond" class hierarchy), they will both point to the same virtual base class in the object, instead of having a separate copy of that base class.
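To see the "combined" base class for yourself, you can compare addresses. This is only a sketch with made-up class names. It builds the same diamond twice, once with ordinary inheritance and once with virtual inheritance, and reaches Base through each side.

```cpp
#include <cassert>

struct Base { int value = 0; };

// Ordinary diamond: each side embeds its own copy of Base.
struct LeftCopy : Base {};
struct RightCopy : Base {};
struct TwoBases : LeftCopy, RightCopy {};

// Virtual diamond: both sides refer to one shared Base.
struct LeftShared : virtual Base {};
struct RightShared : virtual Base {};
struct OneBase : LeftShared, RightShared {};

// The bookkeeping for the virtual base costs space on typical compilers.
static_assert(sizeof(LeftShared) > sizeof(LeftCopy),
              "expected virtual inheritance to add overhead");

// Reach Base through each side of the ordinary diamond.
inline bool two_bases_are_distinct() {
    TwoBases obj;
    const Base* viaLeft = static_cast<const LeftCopy*>(&obj);
    const Base* viaRight = static_cast<const RightCopy*>(&obj);
    return viaLeft != viaRight;
}

// Reach Base through each side of the virtual diamond.
inline bool one_base_is_shared() {
    OneBase obj;
    const Base* viaLeft = static_cast<const LeftShared*>(&obj);
    const Base* viaRight = static_cast<const RightShared*>(&obj);
    return viaLeft == viaRight;
}

// Convenience check a reader can call from main().
inline void check_diamonds() {
    assert(two_bases_are_distinct());
    assert(one_base_is_shared());
}
```

The address comparison follows from the language rules. The static_assert on the sizes is an expectation about mainstream implementations, not something the standard promises.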

What's the underneath rule for the object size in this case?

Important point; there are no rules: the compiler can do whatever it needs to do.

And a final detail; to make all these class layout diagrams I am compiling with:

cl test.cpp /d1reportSingleClassLayoutXXX

Where XXX is a substring match of the structs/classes you want to see the layout of. Using this you can explore the effects of various inheritance schemes yourself, as well as why/where padding is added, etc.

sizeof class with int , function, virtual function in C++?

First off, a virtual function is not a pointer with 8 bytes. In C++ nothing but sizeof(char) is guaranteed to be any number of bytes.

Second, only the first virtual function in a class increases its size (compiler-dependent, but on most - if not all - it's like this). All subsequent methods do not. Non-virtual functions do not affect the class's size.

This happens because a class instance doesn't hold pointers to methods themselves, but to a virtual function table, which is one per class.

So if you had:

class A
{
   virtual void foo();
};

and

class B
{
   virtual void goo();
   virtual void test();
   static void m();
   void x();
};

you would have sizeof(A) == sizeof(B).
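You can let the compiler confirm this. Below is a small sketch with stand-in classes (the names are mine, and the functions get empty bodies so the snippet is self-contained). The static_asserts reflect what vtable-based compilers do in practice, not a guarantee from the standard.

```cpp
// Stand-ins for the two classes above.
struct OneVirtual {
    virtual void foo() {}
};

struct ManyFunctions {
    virtual void goo() {}
    virtual void test() {}
    static void m() {}
    void x() {}
};

// A non-polymorphic pair, to show that plain member functions cost nothing.
struct IntOnly { int value = 0; };
struct IntWithFunctions {
    int value = 0;
    void x() {}
    static void m() {}
};

static_assert(sizeof(OneVirtual) == sizeof(ManyFunctions),
              "extra virtual functions should not grow the object");
static_assert(sizeof(IntOnly) == sizeof(IntWithFunctions),
              "non-virtual functions should not grow the object");
```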

And now:

Why sizeof(A) is 1 and sizeof(C) is 1 too?

A and C have size 1 just because it's not allowed for a class to be of size 0. The functions have nothing to do with it. It's just a dummy byte.
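A short sketch of the "dummy byte" rule, using hypothetical class names. The reason size 0 is not allowed is that two distinct objects, such as neighbouring array elements, must have distinct addresses.

```cpp
struct Empty {};

struct OnlyFunctions {
    void f() {}
    static int g() { return 0; }
};

// Guaranteed by the language: a complete object is never zero-sized.
static_assert(sizeof(Empty) >= 1, "objects cannot have size 0");

// Typical, though the exact value is up to the compiler.
static_assert(sizeof(OnlyFunctions) == sizeof(Empty),
              "member functions should not add to the size");

// Neighbouring array elements need different addresses.
inline bool elements_are_distinct() {
    Empty pair[2];
    return &pair[0] != &pair[1];
}
```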

Why sizeof(H) is 16 but sizeof(G) is 4?

G has only one member that accounts for memory - the int. And on your platform, sizeof(int) == 4. H, besides the int, also has a pointer to the vftable (virtual function table, see above). The size of this pointer, the size of int and the alignment are compiler-specific.

Why sizeof(E) is 16 but sizeof(F) is 4?

Explained above - non virtual methods don't take up memory in the class.

Why sizeof(D) is 8 but sizeof(E) is 16?

D only contains the vftable pointer, which is apparently 8 bytes on your platform. E also has an int, and because the pointer must be aligned to 8 bytes, the whole object is padded up to a multiple of 8. The vftable pointer usually comes first, so it's something like:

class E

  8 bytes for vftable pointer   | 4 bytes for int   |  4 padding bytes  |
| v | v | v | v | v | v | v | v | x  | x  | x  | x  |    |    |    |    |
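If you want to measure the padding instead of guessing, sizeof and alignof are enough. This is a sketch that assumes the class has exactly one hidden pointer, and the names are mine. On a typical 64-bit target padding_bytes comes out as 4, and on a typical 32-bit target as 0.

```cpp
#include <cstddef>

// Hypothetical class shaped like E: one virtual function plus one int.
struct WithVptrAndInt {
    virtual void f() {}
    int value = 0;
};

// Bytes not accounted for by one pointer plus one int (assumes a single vptr).
constexpr std::size_t padding_bytes =
    sizeof(WithVptrAndInt) - sizeof(void*) - sizeof(int);

// Always true: the size of a type is a multiple of its alignment.
static_assert(sizeof(WithVptrAndInt) % alignof(WithVptrAndInt) == 0,
              "size must be a multiple of alignment");

// Expected on vtable-based compilers: the hidden pointer sets the alignment.
static_assert(alignof(WithVptrAndInt) >= alignof(void*),
              "expected pointer alignment or stricter");
```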

Virtual inheritance in C++, size of the grand child's object is heavy?

The most common way to implement virtual functions and virtual inheritance is through virtual tables (or vtables). A reference to the table is added to each class as an invisible member variable, which adds to the size.

The vtables are usually stored separately which means the invisible member will be a pointer to the table, and on a 64-bit system the pointer size is typically 8 bytes (64 bits), and since you have two virtual classes you have two pointers leading to 16 bytes of extra data.

As for the two bytes in the second case, it is probably because an object can't really be empty. To be able to get an object's size, and more importantly to be able to place objects in memory, they need to take up some space; typically a single byte will be enough. If you create an instance of base and get that instance's size, it will probably be one.

Why just two bytes instead of just one? You have to check what the compiler does, but the double-inheritance might have something to do with it.

Since the virtual classes already have a size through their invisible vtable pointers, they don't need the extra "empty class" padding.

how to determine sizeof class with virtual functions?

This is of course implementation-dependent. And it would make a terrible interview question. A good C++ programmer can just trust sizeof to be right and let the compiler worry about those vtable things.

But what's going on here is that a typical vtable-based implementation needs two vtables in objects of class C or D. Each base class needs its own vtable. The new virtual methods added by C and D can be handled by extending the vtable format from one base class, but the vtables used by A and B can't be combined.

In pseudo-C-code, here's how a most derived object of type D looks on my implementation (g++ 4.4.5 Linux x86):

void* D_vtable_part1[] = { (void*) 0, &D_typeinfo, &A::f1, &C::f3, &D::f4 };
void* D_vtable_part2[] = { (void*) -4, &D_typeinfo, &B::f2 };

struct D {
  void** vtable_A;
  void** vtable_B;
};

D d = { D_vtable_part1 + 1, D_vtable_part2 + 1 };
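The pseudo-code above is not meant to compile. For a version you can actually run, here is a deliberately simplified emulation of the same mechanism using plain function pointers. It is a sketch with invented names (Shape, ShapeVTable and so on) and leaves out the typeinfo and offset entries that a real compiler-generated vtable has.

```cpp
#include <cassert>

struct Shape;  // forward declaration so the table can mention it

// One slot per "virtual function".
struct ShapeVTable {
    int (*area)(const Shape*);
    int (*sides)(const Shape*);
};

struct Shape {
    const ShapeVTable* vptr;  // plays the role of the hidden vtable pointer
    int w;
    int h;
};

inline int rect_area(const Shape* s) { return s->w * s->h; }
inline int rect_sides(const Shape*) { return 4; }
inline int tri_area(const Shape* s) { return s->w * s->h / 2; }
inline int tri_sides(const Shape*) { return 3; }

// One table per "class", shared by every object of that class.
const ShapeVTable rect_vtable = {rect_area, rect_sides};
const ShapeVTable tri_vtable = {tri_area, tri_sides};

inline Shape make_rect(int w, int h) { return Shape{&rect_vtable, w, h}; }
inline Shape make_tri(int w, int h) { return Shape{&tri_vtable, w, h}; }

// "Virtual call": load the table from the object, then call through the slot.
inline int area(const Shape& s) {
    assert(s.vptr != nullptr);
    return s.vptr->area(&s);
}

inline int sides(const Shape& s) {
    assert(s.vptr != nullptr);
    return s.vptr->sides(&s);
}
```

Note that adding a third function to ShapeVTable grows the table, not the Shape object. That is the same reason extra virtual functions do not change sizeof for a real class.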

reduce size of object (wasted) in Multi virtual inheritance

EDIT: based on latest update to the question and some chatting

Here's the most compact version that keeps virtual inheritance in all your classes.

#include <cstdint>
#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    BaseFields data;
};
class HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[data.hpIdx] -= 1;
    }
};
class FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[data.flyPowerIdx] += power;
    }
};
class BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 24
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 24
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 32
}

A much smaller version that drops all the virtual class stuff:

#include <cstdint>
#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
};

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

class BaseComponent {
public: // or protected
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
protected:
    void damage() {
        hp[hpIdx] -= 1;
    };
    void addFlyPower(float power) {
        flyPower[flyPowerIdx] += power;
    }
    void suicidalFly() {
        damage();
        addFlyPower(5);
    };
};
class HpOO : public BaseComponent {
public:
    using BaseComponent::damage;
};
class FlyableOO : public BaseComponent {
public:
    using BaseComponent::addFlyPower;
};
class BirdOO : public BaseComponent {
public:
    using BaseComponent::damage;
    using BaseComponent::addFlyPower;
    using BaseComponent::suicidalFly;
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 12
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 12
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 12
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 12
    // accessing example
    constexpr int8_t BirdTypeId = 5;
    BaseComponent x;
    if( x.typeId == BirdTypeId ) {
        auto y = reinterpret_cast<BirdOO *>(&x);
        y->suicidalFly();
    }
}

This example assumes your derived classes do not have overlapping functionality with diverging effects. If they do, you have to add virtual functions to your base class, for an extra overhead of 12 bytes (or 8 if you pack the class).

And quite possibly the smallest version that still keeps the virtuals:

#include <cstdint>
#include <iostream>
#include <vector>

using namespace std;

struct BaseFields {
    int entityId{};
    int16_t componentId{};
    int8_t typeId{};
    int16_t hpIdx;
    int16_t flyPowerIdx;
};

#define PACKED [[gnu::packed]]

vector<int> hp; // this will contain all the hit points, dynamically resizable, logic up to you
vector<float> flyPower; // this will contain all the fly powers, dynamically resizable, logic up to you

vector<BaseFields> baseFields;

class PACKED BaseComponent {
public: // or protected
    int16_t baseFieldIdx{};
};
class PACKED HpOO : public virtual BaseComponent {
public:
    void damage() {
        hp[baseFields[baseFieldIdx].hpIdx] -= 1;
    }
};
class PACKED FlyableOO : public virtual BaseComponent {
public:
    void addFlyPower(float power) {
        flyPower[baseFields[baseFieldIdx].flyPowerIdx] += power;
    }
};
class PACKED BirdOO : public virtual HpOO, public virtual FlyableOO {
public:
    void suicidalFly() {
        damage();
        addFlyPower(5);
    }
};

int main (){
    std::cout<<"Base="<<sizeof(BaseComponent)<<std::endl; // 2
    std::cout<<"C="<<sizeof(HpOO)<<std::endl; // 16 or 10
    std::cout<<"D="<<sizeof(FlyableOO)<<std::endl; // 16 or 10
    std::cout<<"E="<<sizeof(BirdOO)<<std::endl; // 24 or 18
}

The first number is for the unpacked structure, the second for the packed one.

You can also pack the hpIdx and flyPowerIdx into the entityId using the union trick:

union {
    int32_t entityId{};
    struct {
    int16_t hpIdx;
    int16_t flyPowerIdx;
    };
};

In the above example, if you don't use packing and you move the whole BaseFields structure into the BaseComponent class, the sizes remain the same.

END EDIT

Virtual inheritance just adds one pointer size to the class, plus alignment of the pointer (if needed). You can't get around that if you actually need a virtual class.

The question you should be asking yourself is whether you actually need it. Depending on your access methods to this data that might not be the case.

If what you really need is a set of common methods that are callable from all your classes, you can give the base class virtual functions and use ordinary (non-virtual) inheritance, which takes a bit less space than your original design:

#include <iostream>

class Base{
    public: int id=0;
    virtual ~Base();
    // virtual void Function();

};
class B : public  Base{
    public: int fieldB=0;
    // void Function() override;
};
class C : public  B{
    public: int fieldC=0;
};
class D : public  B{
    public: int fieldD=0;
};
class E : public  C, public  D{

};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; //16
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 24
    std::cout<<"D="<<sizeof(D)<<std::endl; // 24
    std::cout<<"E="<<sizeof(E)<<std::endl; // 48
}

If there are cache misses but the CPU still has capacity to process the results, you can further decrease the size by using compiler-specific attributes to make the data structure as small as possible (the next example works in gcc):

#include<iostream>

class [[gnu::packed]] Base {
    public:
    int id=0;
    virtual ~Base();
    virtual void bFunction() { /* do nothing */ };
    virtual void cFunction() { /* do nothing */ }
};
class [[gnu::packed]] B : public Base{
    public: int fieldB=0;
    void bFunction() override { /* implementation */ }
};
class [[gnu::packed]] C : public B{
    public: int fieldC=0;
    void cFunction() override { /* implementation */ }
};
class [[gnu::packed]] D : public B{
    public: int fieldD=0;
};
class [[gnu::packed]] E : public C, public D{

};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 12
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 20
    std::cout<<"D="<<sizeof(D)<<std::endl; // 20
    std::cout<<"E="<<sizeof(E)<<std::endl; //40
}

This saves an additional 8 bytes at the price of possibly some CPU overhead (but it might help if memory is the issue).

Additionally if there is really a single function you are calling for each of your classes you should only have that as a single function which you override whenever necessary.

#include<iostream>

class [[gnu::packed]] Base {
public:
    virtual ~Base();
    virtual void specificFunction() { /* implementation for Base class */ };
    int id=0;
};

class [[gnu::packed]] B : public Base{
public:
    void specificFunction() override { /* implementation for B class */ }
    int fieldB=0;
};

class [[gnu::packed]] C : public B{
public:
    void specificFunction() override { /* implementation for C class */ }
    int fieldC=0;
};

class [[gnu::packed]] D : public B{
public:
    void specificFunction() override { /* implementation for D class */ }
    int fieldD=0;
};

class [[gnu::packed]] E : public C, public D{
    void specificFunction() override {
        // implementation for E class, example:
        C::specificFunction();
        D::specificFunction();
    }
};

This would also let you avoid having to figure out which class each object belongs to before calling the appropriate function.

Furthermore, assuming your original virtual class inheritance idea is what works best for your application you could restructure your data so that it's more easily accessible for caching purposes while also decreasing the size of your classes and having your functions accessible at the same time:

#include <cstdint>
#include <iostream>
#include <array>

using namespace std;

struct BaseFields {
    int id{0};
};

struct BFields {
    int fieldB;
};

struct CFields {
    int fieldC;
};

struct DFields {
    int fieldD;
};

array<BaseFields, 1024> baseData;
array<BFields, 1024> bData;
array<CFields, 1024> cData;
array<DFields, 1024> dData;

struct indexes {
    uint16_t baseIndex; // index where data for Base class is stored in baseData array
    uint16_t bIndex; // index where data for B class is stored in bData array
    uint16_t cIndex;
    uint16_t dIndex;
};

class Base{
    indexes data;
};
class B : public virtual Base{
    public: void bFunction(){
        //do something about "fieldB"
    }
};
class C : public virtual B{
    public: void cFunction(){
        //do something about "fieldC"
    }
};
class D : public virtual B{
};
class E : public virtual C, public virtual D{};

int main (){
    std::cout<<"Base="<<sizeof(Base)<<std::endl; // 8
    std::cout<<"B="<<sizeof(B)<<std::endl; // 16
    std::cout<<"C="<<sizeof(C)<<std::endl; // 16
    std::cout<<"D="<<sizeof(D)<<std::endl; // 16
    std::cout<<"E="<<sizeof(E)<<std::endl; // 24
}

Obviously this is just an example, and it assumes you don't have more than 1024 objects at any point. You can increase that number, but above 65536 you'd have to use a bigger integer type to store the indexes, and below 256 you can use uint8_t instead.
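The example above only stores the indexes and never shows a lookup, so here is a sketch of how a member function might go through an index to reach its data. The names BHandle, bPool and kCapacity are hypothetical and only meant to illustrate the idea.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct BFields { int fieldB = 0; };

constexpr std::size_t kCapacity = 1024;   // hypothetical pool size
std::array<BFields, kCapacity> bPool{};   // all fieldB values, stored contiguously

// A 2-byte handle stands in for the data that would otherwise live in the object.
struct BHandle {
    std::uint16_t index;

    // Increments the fieldB slot this handle refers to.
    void bump() const {
        assert(index < kCapacity);        // guard against stale or wrong indexes
        bPool[index].fieldB += 1;
    }

    // Reads the fieldB slot this handle refers to.
    int value() const {
        assert(index < kCapacity);
        return bPool[index].fieldB;
    }
};

static_assert(sizeof(BHandle) == sizeof(std::uint16_t),
              "the handle should be as small as its index");
static_assert(kCapacity <= 65536,
              "uint16_t can address at most 65536 slots");
```

The second static_assert is there so that growing the pool past what the index type can address fails at compile time instead of silently wrapping around.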

Furthermore, if one of the structures above adds very little overhead to its parent, you could reduce the number of arrays you use to store the data. If there's very little difference in the size of the objects, you can just store all the data in a single structure and have more localized memory accesses. That all depends on your application, so I can't give more advice here other than to benchmark what works best for your case.

Have fun and enjoy C++.

size of derived class in virtual inheritance

Virtual inheritance means that the virtual base classes only exist once instead of multiple times. That is why the 8 bytes from ClassA are only in ClassD once. Virtual inheritance itself requires a certain overhead, and hence you get an additional pointer. The exact implementation, and therefore the exact overhead, is not specified by the C++ standard and may vary depending on the hierarchy you are creating.

Why does virtual keyword increase the size of derived a class?

The point of virtual inheritance is to allow sharing of base classes. Here's the problem:

struct base { int member; virtual void method() {} };
struct derived0 : base { int d0; };
struct derived1 : base { int d1; };
struct join : derived0, derived1 {};
join j;
j.method();
j.member;
(base *)&j;
dynamic_cast<base *>(&j);

The last 4 lines are all ambiguous. You have to say explicitly whether you want the base inside the derived0, or the base inside derived1.

If you change the second and third line as follows, the problem goes away:

struct derived0 : virtual base { int d0; };
struct derived1 : virtual base { int d1; };

Your j object now only has one copy of base, not two, so the last 4 lines stop being ambiguous.

But think about how that has to be implemented. Normally, in a derived0, the d0 comes right after the member, and in a derived1, the d1 comes right after the member. But with virtual inheritance, they both share the same member, so you can't have both d0 and d1 come right after it. So you're going to need some form of extra indirection. That's where the extra pointer comes from.
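Here is a compilable sketch of the virtual version, with one extra class (plain0, my own addition) for a size comparison. The size relation is what I would expect from mainstream compilers rather than something the standard guarantees.

```cpp
#include <cassert>

struct base {
    virtual ~base() = default;
    virtual void method() {}
    int member = 0;
};

// Ordinary inheritance, for comparison: base is embedded directly.
struct plain0 : base { int d0 = 0; };

// Virtual inheritance: base is reached through extra bookkeeping.
struct derived0 : virtual base { int d0 = 0; };
struct derived1 : virtual base { int d1 = 0; };
struct join : derived0, derived1 {};

static_assert(sizeof(derived0) > sizeof(plain0),
              "expected the virtual base to cost extra space");

// With a single shared base, none of these accesses is ambiguous any more.
inline int touch_shared_base() {
    join j;
    j.member = 42;
    j.method();
    base* b = &j;  // only one base subobject to convert to
    assert(b->member == j.member);
    return b->member;
}
```

If you remove the two virtual keywords, touch_shared_base stops compiling, because j.member, j.method() and the conversion to base* all become ambiguous again.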

If you want to know exactly what the layout is, it depends on your target platform and compiler. Just "gcc" isn't enough. But for many modern non-Windows targets, the answer is defined by the Itanium C++ ABI, which is documented at http://mentorembedded.github.com/cxx-abi/abi.html#vtable.

Size of class with virtual function

You're working on a platform where pointers are aligned to 8 bytes. Since the virtual table pointer is typically the first thing in the layout of an object, the object too must be aligned to 8 bytes. So 4 padding bytes are inserted after the int member, and that's why you get a size of 16 (8 bytes for the virtual table pointer, 4 for the int and 4 padding bytes).
