Performance Penalty for Working with Interfaces in C++

Performance penalty for working with interfaces in C++?

Short Answer: No.

Long Answer:
It is not the base class or the number of ancestors a class has in its hierarchy that affects its speed. The only thing that matters is the cost of a method call.

A non-virtual method call has a cost (but it can be inlined).

A virtual method call has a slightly higher cost, as you need to look up the method to call before you call it (but this is a simple table lookup, not a search). Since all methods on an interface are virtual by definition, you pay this cost on every call made through the interface.

Unless you are writing some hyper speed-sensitive application, this should not be a problem. The extra clarity that you will receive from using an interface usually makes up for any perceived speed decrease.
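
To make the two kinds of call concrete, here is a minimal sketch. The names Shape, Square, area_via_interface and area_direct are made up for illustration; what the compiler actually emits depends on the compiler and optimization level.

```cpp
// Hypothetical interface: every member function is pure virtual.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

// 'final' tells the compiler that nothing can override Square::area,
// which allows calls made on a Square to be resolved at compile time.
struct Square final : Shape {
    explicit Square(int side) : side_(side) {}
    int area() const override { return side_ * side_; }
    int side() const { return side_; }  // non-virtual, easy to inline
private:
    int side_;
};

// Call through the interface: the target is normally found through the
// object's vtable at run time (one indexed load, then an indirect call).
int area_via_interface(const Shape& s) { return s.area(); }

// Call on the concrete type: the target is known statically, so the
// compiler is free to inline it.
int area_direct(const Square& s) { return s.area(); }
```

Both functions return the same value for the same object; only the dispatch mechanism differs.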

C/C++: Is there a performance penalty for using the adapter pattern?

If your insertEdge just forwards the call to stinger_insert_edge_pair, there would (most probably) be no difference in the generated code between the plain call to stinger_insert_edge_pair and g->insertEdge (provided you remove the virtual specifier).
Comparing the assembly generated for the plain call and for the adapter call would give you a fair idea of the overhead your adapter brings in.

Does insertEdge have to be virtual? Are you planning to have subclasses of Graph? Then again, the cost of a virtual function call is almost negligible compared to the cost of executing the function itself.
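
A minimal sketch of such a forwarding adapter follows. The question's real C function and its signature are not shown here, so c_graph and c_graph_insert_edge are invented stand-ins, not the actual library API.

```cpp
// Hypothetical stand-in for the wrapped C library; invented for this example.
struct c_graph {
    long edge_count;
};

extern "C" int c_graph_insert_edge(c_graph* g, long from, long to) {
    (void)from;  // a real library would store the edge
    (void)to;
    ++g->edge_count;
    return 1;    // assumed convention: 1 means "inserted"
}

// Adapter: insertEdge is non-virtual and defined in the class body, so it
// is implicitly inline and the compiler can collapse it into a direct call
// to the C function.
class Graph {
public:
    int insertEdge(long from, long to) {
        return c_graph_insert_edge(&impl_, from, to);
    }
    long edgeCount() const { return impl_.edge_count; }

private:
    c_graph impl_{0};
};
```

Compiling this with optimizations on and inspecting the assembly for a call to g.insertEdge(...) is one way to check whether the wrapper disappears on your toolchain.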

Overhead of implementing an interface

I couldn't resist and tested it, and it looks like there is almost no overhead.

Participants are:

Interface IFoo    defining a method
class Foo: IFoo implements IFoo
class Bar implements the same method as Foo, but no interface involved

So I defined

Foo realfoo = new Foo();
IFoo ifoo = new Foo();
Bar bar = new Bar();

and called the method, which does 20 string concatenations, 10,000,000 times on each variable.

realfoo: 723 milliseconds
ifoo: 732 milliseconds
bar: 728 milliseconds

If the method does nothing, the actual calls stand out a bit more.

realfoo: 48 milliseconds
ifoo: 62 milliseconds
bar: 49 milliseconds
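
The test above was written in a managed language. A rough C++ analogue of the same experiment could look like the sketch below. It is not the code that produced the numbers above: the step method, the Timing struct and time_calls are all assumptions made for illustration, and an optimizing compiler may devirtualize or collapse such a trivial loop, so treat any timings with care.

```cpp
#include <chrono>
#include <cstdint>

struct IFoo {
    virtual ~IFoo() = default;
    virtual void step(std::uint64_t& acc) = 0;
};

// Implements the interface.
struct Foo : IFoo {
    void step(std::uint64_t& acc) override { acc += 1; }
};

// Same method body, no interface involved.
struct Bar {
    void step(std::uint64_t& acc) { acc += 1; }
};

struct Timing {
    std::uint64_t result;  // returned so the work is observable
    double milliseconds;
};

// Calls target.step() 'iterations' times and measures the elapsed time.
// T is deduced from the static type of the argument, so passing an IFoo&
// exercises the call through the interface.
template <typename T>
Timing time_calls(T& target, std::uint64_t iterations) {
    std::uint64_t acc = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        target.step(acc);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {acc, elapsed.count()};
}
```

Typical usage would declare Foo realfoo; IFoo& ifoo = realfoo; Bar bar; and then compare time_calls(realfoo, n), time_calls(ifoo, n) and time_calls(bar, n) for a large n.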

How to synchronize C & C++ libraries with minimal performance penalty?

Your wrapper itself will be inlined; however, your method calls to the C library typically will not. (That would require link-time optimization, which is technically possible but, AFAIK, rudimentary at best in today's tools.)

Generally, a function call as such is not very expensive. The cycle cost has decreased considerably in recent years, and the call can be predicted easily, so the call penalty as such is negligible.

However, inlining opens the door to more optimizations: if you have v = a + b + c, your wrapper class forces the generation of stack variables, whereas for inlined calls, the majority of the data can be kept in the FPU stack. Also, inlined code allows simplifying instructions, considering constant values, and more.

So while the "measure before you invest" rule holds true, I would expect some room for improvement here.


A typical solution is to bring the C implementation into a form in which it can be used either as inline functions or as a "C" body:

// V3impl.inl
void V3DECL v3_add(VECTOR3 *out, VECTOR3 lhs, VECTOR3 rhs)
{
// here you maintain the actual implementations
// ...
}

// C header
#define V3DECL
void V3DECL v3_add(VECTOR3 *out, VECTOR3 lhs, VECTOR3 rhs);

// C body
#include "V3impl.inl"

// CPP Header
#define V3DECL inline
namespace v3core {
#include "V3impl.inl"
} // namespace

class Vector3D { ... }

This likely makes sense only for selected methods with comparatively simple bodies. I'd move the methods to a separate namespace for the C++ implementation, as you will usually not need them directly.

(Note that inline is just a compiler hint; it doesn't force the method to be inlined.
But that's good: if the code size of an inner loop exceeds the instruction cache, inlining easily hurts performance.)

Whether the pass/return-by-reference can be resolved depends on the strength of your compiler. I've seen many where
foo(X * out)
forces stack variables, whereas
X foo()
does keep values in registers.
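
As a small illustration of the two signatures, here is a sketch with a made-up Vec3 type and two hypothetical helpers, add_out and add_value. Which one generates better code depends on your compiler and calling convention, so compare the assembly rather than trusting either form.

```cpp
// Hypothetical three-component vector, used only for this example.
struct Vec3 {
    float x, y, z;
};

// Out-parameter style: the result is written through a pointer, which
// can force the destination to live in memory unless the call is inlined.
inline void add_out(Vec3* out, Vec3 lhs, Vec3 rhs) {
    out->x = lhs.x + rhs.x;
    out->y = lhs.y + rhs.y;
    out->z = lhs.z + rhs.z;
}

// Value-return style: the compiler decides where the result lives and
// may be able to keep it in registers.
inline Vec3 add_value(Vec3 lhs, Vec3 rhs) {
    return Vec3{lhs.x + rhs.x, lhs.y + rhs.y, lhs.z + rhs.z};
}
```

Both helpers compute the same sum; they differ only in how the result is handed back to the caller.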

Does the usage of interfaces slow down programs?

Although Billy points out that this is a lot like the other post on SO, I think it's not exactly the same... mainly because of the way this question is worded.

Because Olga talks about a "decision", I almost thought that she was getting mixed up between using interfaces vs. using a derived class, and determining if the pointer to the object is of a particular class via dynamic_cast.

If you are talking about using dynamic_cast, then from what I understand (and this is not based on concrete performance numbers), you will get a pretty significant performance hit.

If you are talking about using interfaces, well, then I feel that the minor hit in doing a vtable lookup and extra call(s) is far outweighed by a better software design.
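
The two approaches being contrasted can be sketched as follows. Animal, Dog, Bird and both helper functions are invented for the example, and no timings are claimed here.

```cpp
struct Animal {
    virtual ~Animal() = default;
    virtual int legs() const = 0;
};

struct Dog : Animal {
    int legs() const override { return 4; }
};

struct Bird : Animal {
    int legs() const override { return 2; }
};

// Interface approach: one virtual call, and the object itself supplies
// the answer.
int legs_virtual(const Animal& a) { return a.legs(); }

// Type-test approach: each dynamic_cast consults run-time type
// information, and the chain grows with every new derived class.
int legs_by_cast(const Animal& a) {
    if (dynamic_cast<const Dog*>(&a) != nullptr) return 4;
    if (dynamic_cast<const Bird*>(&a) != nullptr) return 2;
    return 0;  // unknown type
}
```

Besides any run-time cost, the cast chain has to be edited whenever a new derived class appears, while the virtual call does not.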

Performance of direct virtual call vs. interface call in C#

I think the article "Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects" will answer your questions. In particular, see the section "Interface Vtable Map and Interface Map", and the following section on Virtual Dispatch.

It's probably possible for the JIT compiler to figure things out and optimize the code for your simple case, but not in the general case. For example, if you have

IFoo f2 = GetAFoo();

and GetAFoo is defined as returning an IFoo, then the JIT compiler wouldn't be able to optimize the call.

Performance impact of changing to generic interfaces

You're worried about performance - but do you have any grounds for that concern? My guess is that you haven't benchmarked the code at all. Always benchmark before replacing readable, clean code with more performant code.

In this case the call to Console.WriteLine will utterly dominate the performance anyway.

While I suspect there may be a theoretical difference in performance between using List<T> and IEnumerable<T> here, I suspect the number of cases where it's significant in real world apps is vanishingly small.

It's not even as if the sequence type is being used for many operations - there's a single call to GetEnumerator() which is declared to return IEnumerator<T> anyway. As the list gets larger, any difference in performance between the two will get even smaller, because it will only have any impact at all at the very start of the loop.

Ignoring the analysis though, the thing to take out of this is to measure performance before you base coding decisions on it.

As for what happens behind the scenes - you'd have to dig into the deep details of exactly what's in the metadata in each case. I suspect that in the case of an interface there's one extra level of redirection, at least in theory - the CLR would have to work out where in the target object's type the vtable for IEnumerable<T> was, and then call into the appropriate method's code. In the case of List<T>, the JIT would know the right offset into the vtable to start with, without the extra lookup. This is just based on my somewhat hazy understanding of JITting, thunking, vtables and how they apply to interfaces. It may well be slightly wrong, but more importantly it's an implementation detail.

Virtual functions and performance - C++

A good rule of thumb is:

It's not a performance problem until you can prove it.

The use of virtual functions will have a very slight effect on performance, but it's unlikely to affect the overall performance of your application. Better places to look for performance improvements are in algorithms and I/O.

An excellent article that talks about virtual functions (and more) is Member Function Pointers and the Fastest Possible C++ Delegates.


