Dual Emission of Constructor Symbols

Dual emission of constructor symbols

We'll start by declaring that GCC follows the Itanium C++ ABI.

According to the ABI, the mangled name for your Thing::foo() is easily parsed:

_Z     | N      | 5Thing  | 3foo | E          | v
prefix | nested | `Thing` | `foo`| end nested | parameters: `void`

You can read the constructor names similarly, as below. Notice that the constructor "name" isn't spelled out; a C clause appears in its place:

_Z     | N      | 5Thing  | C1          | E          | i
prefix | nested | `Thing` | Constructor | end nested | parameters: `int`

But what's this C1? Your duplicate has C2. What does this mean?

Well, this is quite simple too:

  <ctor-dtor-name> ::= C1   # complete object constructor
                   ::= C2   # base object constructor
                   ::= C3   # complete object allocating constructor
                   ::= D0   # deleting destructor
                   ::= D1   # complete object destructor
                   ::= D2   # base object destructor

Wait, why is this simple? This class has no base. Why does it have a "complete object constructor" and a "base object constructor" for each?

This Q&A implies to me that this is simply a by-product of polymorphism support, even though it's not actually required in this case.
Note that c++filt used to include this information in its demangled output, but doesn't any more.
This forum post asks the same question, and the only response doesn't do any better at answering it, except for the implication that GCC could avoid emitting two constructors when polymorphism is not involved, and that this behaviour ought to be improved in the future.
This newsgroup posting describes a problem with setting breakpoints in constructors due to this dual-emission. It's stated again that the root of the issue is support for polymorphism.

In fact, this is listed as a GCC "known issue":

G++ emits two copies of constructors and destructors.
In general there are three types of constructors (and
destructors).
The complete object constructor/destructor.
The base object constructor/destructor.
The allocating constructor/deallocating destructor.
The first two are different, when virtual base classes are
involved.

The meaning of these different constructors seems to be as follows:

The "complete object constructor". It additionally constructs virtual base classes.
The "base object constructor". It creates the object itself, as well as data members and non-virtual base classes.
The "allocating object constructor". It does everything the complete object constructor does, plus it calls operator new to actually allocate the memory... but apparently this is not usually seen.

If you have no virtual base classes, [the first two] are
identical; GCC will, on sufficient optimization levels, actually alias
the symbols to the same code for both.

Constructor and destructor assembly of c++

This is part of the ABI for your platform and is outside the scope of the C++ standard. Both constructors and destructors can generate multiple symbols in the binary. For example, the Itanium C++ ABI defines up to three variants of each constructor and destructor:

complete object constructor
base object constructor
complete object allocating constructor
deleting destructor
complete object destructor
base object destructor

The different symbols take on slightly different responsibilities as the implementation might need to do different things depending on how the object is being created/destroyed. In your particular case, the code is simple enough that all constructors might generate exactly the same code, but they need to be there to comply with the ABI, and the ABI has them to enable more complex use cases.

For example, a complete object constructor will initialize virtual bases, while the base object constructor will skip this part of construction. If there is multiple/virtual inheritance and virtual functions, the vptr in the complete object may have to jump through different sets of intermediate tables depending on how this subobject is being instantiated.

If you want an explanation beyond "the ABI mandates it", take a look at the documentation for your particular ABI. You can also read Inside the C++ Object Model, which, although old, gives a good description of the problems to be solved and some of the solutions.

Why constructor/destructor are defined like this in g++ produced assembly code?

There is no table here. .globl and .set are so-called assembler directives or pseudo ops. They signal something to the assembler, but do not necessarily result in production of actual code or data. From the docs:

.global symbol, .globl symbol

.global makes the symbol visible to ld. If you define symbol in your
partial program, its value is made available to other partial programs
that are linked with it. Otherwise, symbol takes its attributes from a
symbol of the same name from another file linked into the same
program.

.set symbol, expression

Set the value of symbol to expression. This changes symbol's value and
type to conform to expression.

So the fragment you quote just ensures that the constructor is available for linking in case it's referenced by other compilation units. The only effect you normally see in the final ELF file is the presence of those symbols in the symbol table (if it has not been stripped).

Now, you may be curious about why you have two different names for the constructor (e.g. _ZN8ComputerC1Ev and _ZN8ComputerC2Ev). The answer is somewhat complicated so I will refer you to another SO question which addresses it in some detail:

Dual emission of constructor symbols

Scala error: constructor is defined twice

You have two constructors which are essentially the same.

def this(name: Int) {
  this(name, 5)
}

def this(sub: Int) {
  this(5, sub)
}

The signature of each constructor must be different. Using a different parameter name does not make the two constructors distinct, because both still take a single Int.
