Are Exceptions in C++ Really Slow

Are Exceptions in C++ really slow

The main model used today for exceptions (Itanium ABI, VC++ 64 bits) is the Zero-Cost model exceptions.

The idea is that instead of losing time by setting up a guard and explicitly checking for the presence of exceptions everywhere, the compiler generates a side table that maps any point that may throw an exception (Program Counter) to the a list of handlers. When an exception is thrown, this list is consulted to pick the right handler (if any) and stack is unwound.

Compared to the typical if (error) strategy:

the Zero-Cost model, as the name implies, is free when no exceptions occur
it costs around 10x/20x an if when an exception does occur

The cost, however, is not trivial to measure:

The side-table is generally cold, and thus fetching it from memory takes a long time
Determining the right handler involves RTTI: many RTTI descriptors to fetch, scattered around memory, and complex operations to run (basically a dynamic_cast test for each handler)

So, mostly cache misses, and thus not trivial compared to pure CPU code.

Note: for more details, read the TR18015 report, chapter 5.4 Exception Handling (pdf)

So, yes, exceptions are slow on the exceptional path, but they are otherwise quicker than explicit checks (if strategy) in general.

Note: Andrei Alexandrescu seems to question this "quicker". I personally have seen things swing both ways, some programs being faster with exceptions and others being faster with branches, so there indeed seems to be a loss of optimizability in certain conditions.

Does it matter ?

I would claim it does not. A program should be written with readability in mind, not performance (at least, not as a first criterion). Exceptions are to be used when one expects that the caller cannot or will not wish to handle the failure on the spot, and pass it up the stack. Bonus: in C++11 exceptions can be marshalled between threads using the Standard Library.

This is subtle though, I claim that map::find should not throw but I am fine with map::find returning a checked_ptr which throws if an attempt to dereference it fails because it's null: in the latter case, as in the case of the class that Alexandrescu introduced, the caller chooses between explicit check and relying on exceptions. Empowering the caller without giving him more responsibility is usually a sign of good design.

In what ways do C++ exceptions slow down code when there are no exceptions thown?

There is a cost associated with exception handling on some platforms and with some compilers.

Namely, Visual Studio, when building a 32-bit target, will register a handler in every function that has local variables with non-trivial destructor. Basically, it sets up a try/finally handler.

The other technique, employed by gcc and Visual Studio targeting 64-bits, only incurs overhead when an exception is thrown (the technique involves traversing the call stack and table lookup). In cases where exceptions are rarely thrown, this can actually lead to a more efficient code, as error codes don't have to be processed.

How slow are .NET exceptions?

I'm on the "not slow" side - or more precisely "not slow enough to make it worth avoiding them in normal use". I've written two short articles about this. There are criticisms of the benchmark aspect, which are mostly down to "in real life there'd be more stack to go through, so you'd blow the cache etc" - but using error codes to work your way up the stack would also blow the cache, so I don't see that as a particularly good argument.

Just to make it clear - I don't support using exceptions where they're not logical. For instance, int.TryParse is entirely appropriate for converting data from a user. It's inappropriate when reading a machine-generated file, where failure means "The file isn't in the format it's meant to be, I really don't want to try to handle this as I don't know what else might be wrong."

When using exceptions in "only reasonable circumstances" I've never seen an application whose performance was significantly impaired by exceptions. Basically, exceptions shouldn't happen often unless you've got significant correctness issues, and if you've got significant correctness issues then performance isn't the biggest problem you face.

Does try-catch block decrease performance

TL;DR NO, exceptions are usually faster on the non-exceptional path compared to error code handling.

Well, the obvious remark is compared to what ?

Compared to not handling the error, it obviously decrease performance; but is performance worth the lack of correctness ? I would argue it is not, so let us supposed that you meant compared to an error code checked with an if statement.

In this case, it depends. There are multiple mechanisms used to implement exceptions. Actually, they can be implemented with a mechanism so close to an if statement that they end up having the same cost (or slightly higher).

In C++ though, all major compilers (gcc introduced it in 4.x serie, MSVC uses it for 64 bits code) now use the Zero-Cost Exception Model. If you read this technical paper that Need4Sleep linked to, it is listed as the table-driven approach. The idea is that for each point of the program that may throw you register in a side-table some bits and pieces that will allow you to find the right catch clause. Honestly, it is a tad more complicated implementation-wise than the older strategies, however the Zero Cost name is derived by the fact that it is free should no exception be thrown. Contrast this to a branch misprediction on a CPU. On the other hand, when an exception is thrown, then the penalty is huge because the table is stored in a cold zone (so likely requires a round-trip to RAM or worse)... but exceptions are exceptional, right ?

To sum up, with modern C++ compilers exceptions are faster than error codes, at the cost of larger binaries (due to the static tables).

For the sake of exhaustiveness, there is a 3rd alternative: abortion. Instead of propagating an error via either status or exception, it is possible to abort the process instead. This is only suitable in a restricted number of situations, however it optimizes better than either alternative.

How do exceptions work (behind the scenes) in c++

Instead of guessing, I decided to actually look at the generated code with a small piece of C++ code and a somewhat old Linux install.

class MyException
{
public:
    MyException() { }
    ~MyException() { }
};

void my_throwing_function(bool throwit)
{
    if (throwit)
        throw MyException();
}

void another_function();
void log(unsigned count);

void my_catching_function()
{
    log(0);
    try
    {
        log(1);
        another_function();
        log(2);
    }
    catch (const MyException& e)
    {
        log(3);
    }
    log(4);
}

I compiled it with g++ -m32 -W -Wall -O3 -save-temps -c, and looked at the generated assembly file.

    .file   "foo.cpp"
    .section    .text._ZN11MyExceptionD1Ev,"axG",@progbits,_ZN11MyExceptionD1Ev,comdat
    .align 2
    .p2align 4,,15
    .weak   _ZN11MyExceptionD1Ev
    .type   _ZN11MyExceptionD1Ev, @function
_ZN11MyExceptionD1Ev:
.LFB7:
    pushl   %ebp
.LCFI0:
    movl    %esp, %ebp
.LCFI1:
    popl    %ebp
    ret
.LFE7:
    .size   _ZN11MyExceptionD1Ev, .-_ZN11MyExceptionD1Ev

_ZN11MyExceptionD1Ev is MyException::~MyException(), so the compiler decided it needed a non-inline copy of the destructor.

.globl __gxx_personality_v0
.globl _Unwind_Resume
    .text
    .align 2
    .p2align 4,,15
.globl _Z20my_catching_functionv
    .type   _Z20my_catching_functionv, @function
_Z20my_catching_functionv:
.LFB9:
    pushl   %ebp
.LCFI2:
    movl    %esp, %ebp
.LCFI3:
    pushl   %ebx
.LCFI4:
    subl    $20, %esp
.LCFI5:
    movl    $0, (%esp)
.LEHB0:
    call    _Z3logj
.LEHE0:
    movl    $1, (%esp)
.LEHB1:
    call    _Z3logj
    call    _Z16another_functionv
    movl    $2, (%esp)
    call    _Z3logj
.LEHE1:
.L5:
    movl    $4, (%esp)
.LEHB2:
    call    _Z3logj
    addl    $20, %esp
    popl    %ebx
    popl    %ebp
    ret
.L12:
    subl    $1, %edx
    movl    %eax, %ebx
    je  .L16
.L14:
    movl    %ebx, (%esp)
    call    _Unwind_Resume
.LEHE2:
.L16:
.L6:
    movl    %eax, (%esp)
    call    __cxa_begin_catch
    movl    $3, (%esp)
.LEHB3:
    call    _Z3logj
.LEHE3:
    call    __cxa_end_catch
    .p2align 4,,3
    jmp .L5
.L11:
.L8:
    movl    %eax, %ebx
    .p2align 4,,6
    call    __cxa_end_catch
    .p2align 4,,6
    jmp .L14
.LFE9:
    .size   _Z20my_catching_functionv, .-_Z20my_catching_functionv
    .section    .gcc_except_table,"a",@progbits
    .align 4
.LLSDA9:
    .byte   0xff
    .byte   0x0
    .uleb128 .LLSDATT9-.LLSDATTD9
.LLSDATTD9:
    .byte   0x1
    .uleb128 .LLSDACSE9-.LLSDACSB9
.LLSDACSB9:
    .uleb128 .LEHB0-.LFB9
    .uleb128 .LEHE0-.LEHB0
    .uleb128 0x0
    .uleb128 0x0
    .uleb128 .LEHB1-.LFB9
    .uleb128 .LEHE1-.LEHB1
    .uleb128 .L12-.LFB9
    .uleb128 0x1
    .uleb128 .LEHB2-.LFB9
    .uleb128 .LEHE2-.LEHB2
    .uleb128 0x0
    .uleb128 0x0
    .uleb128 .LEHB3-.LFB9
    .uleb128 .LEHE3-.LEHB3
    .uleb128 .L11-.LFB9
    .uleb128 0x0
.LLSDACSE9:
    .byte   0x1
    .byte   0x0
    .align 4
    .long   _ZTI11MyException
.LLSDATT9:

Surprise! There are no extra instructions at all on the normal code path. The compiler instead generated extra out-of-line fixup code blocks, referenced via a table at the end of the function (which is actually put on a separate section of the executable). All the work is done behind the scenes by the standard library, based on these tables (_ZTI11MyException is typeinfo for MyException).

OK, that was not actually a surprise for me, I already knew how this compiler did it. Continuing with the assembly output:

    .text
    .align 2
    .p2align 4,,15
.globl _Z20my_throwing_functionb
    .type   _Z20my_throwing_functionb, @function
_Z20my_throwing_functionb:
.LFB8:
    pushl   %ebp
.LCFI6:
    movl    %esp, %ebp
.LCFI7:
    subl    $24, %esp
.LCFI8:
    cmpb    $0, 8(%ebp)
    jne .L21
    leave
    ret
.L21:
    movl    $1, (%esp)
    call    __cxa_allocate_exception
    movl    $_ZN11MyExceptionD1Ev, 8(%esp)
    movl    $_ZTI11MyException, 4(%esp)
    movl    %eax, (%esp)
    call    __cxa_throw
.LFE8:
    .size   _Z20my_throwing_functionb, .-_Z20my_throwing_functionb

Here we see the code for throwing an exception. While there was no extra overhead simply because an exception might be thrown, there is obviously a lot of overhead in actually throwing and catching an exception. Most of it is hidden within __cxa_throw, which must:

Walk the stack with the help of the exception tables until it finds a handler for that exception.
Unwind the stack until it gets to that handler.
Actually call the handler.

Compare that with the cost of simply returning a value, and you see why exceptions should be used only for exceptional returns.

To finish, the rest of the assembly file:

    .weak   _ZTI11MyException
    .section    .rodata._ZTI11MyException,"aG",@progbits,_ZTI11MyException,comdat
    .align 4
    .type   _ZTI11MyException, @object
    .size   _ZTI11MyException, 8
_ZTI11MyException:
    .long   _ZTVN10__cxxabiv117__class_type_infoE+8
    .long   _ZTS11MyException
    .weak   _ZTS11MyException
    .section    .rodata._ZTS11MyException,"aG",@progbits,_ZTS11MyException,comdat
    .type   _ZTS11MyException, @object
    .size   _ZTS11MyException, 14
_ZTS11MyException:
    .string "11MyException"

The typeinfo data.

    .section    .eh_frame,"a",@progbits
.Lframe1:
    .long   .LECIE1-.LSCIE1
.LSCIE1:
    .long   0x0
    .byte   0x1
    .string "zPL"
    .uleb128 0x1
    .sleb128 -4
    .byte   0x8
    .uleb128 0x6
    .byte   0x0
    .long   __gxx_personality_v0
    .byte   0x0
    .byte   0xc
    .uleb128 0x4
    .uleb128 0x4
    .byte   0x88
    .uleb128 0x1
    .align 4
.LECIE1:
.LSFDE3:
    .long   .LEFDE3-.LASFDE3
.LASFDE3:
    .long   .LASFDE3-.Lframe1
    .long   .LFB9
    .long   .LFE9-.LFB9
    .uleb128 0x4
    .long   .LLSDA9
    .byte   0x4
    .long   .LCFI2-.LFB9
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI3-.LCFI2
    .byte   0xd
    .uleb128 0x5
    .byte   0x4
    .long   .LCFI5-.LCFI3
    .byte   0x83
    .uleb128 0x3
    .align 4
.LEFDE3:
.LSFDE5:
    .long   .LEFDE5-.LASFDE5
.LASFDE5:
    .long   .LASFDE5-.Lframe1
    .long   .LFB8
    .long   .LFE8-.LFB8
    .uleb128 0x4
    .long   0x0
    .byte   0x4
    .long   .LCFI6-.LFB8
    .byte   0xe
    .uleb128 0x8
    .byte   0x85
    .uleb128 0x2
    .byte   0x4
    .long   .LCFI7-.LCFI6
    .byte   0xd
    .uleb128 0x5
    .align 4
.LEFDE5:
    .ident  "GCC: (GNU) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)"
    .section    .note.GNU-stack,"",@progbits

Even more exception handling tables, and assorted extra information.

So, the conclusion, at least for GCC on Linux: the cost is extra space (for the handlers and tables) whether or not exceptions are thrown, plus the extra cost of parsing the tables and executing the handlers when an exception is thrown. If you use exceptions instead of error codes, and an error is rare, it can be faster, since you do not have the overhead of testing for errors anymore.

In case you want more information, in particular what all the __cxa_ functions do, see the original specification they came from:

Itanium C++ ABI

What is the real overhead of try/catch in C#?

I'm not an expert in language implementations (so take this with a grain of salt), but I think one of the biggest costs is unwinding the stack and storing it for the stack trace. I suspect this happens only when the exception is thrown (but I don't know), and if so, this would be decently sized hidden cost every time an exception is thrown... so it's not like you are just jumping from one place in the code to another, there is a lot going on.

I don't think it's a problem as long as you are using exceptions for EXCEPTIONAL behavior (so not your typical, expected path through the program).

Is code with try-catch still slow when nothing is thrown

The answer is that across a wide range of compilers, languages and technologies, the performance cost of a try-catch (or similar) when no exeception occurs is minimal and should be disregarded.

The reason is that (in general) all a try really does is mark the stack in a particular way, and that's a cheap operation. When an exception is raised is when the work is done. Then the system has to scan the stack looking for catch blocks and that can be expensive.

This also applies to constructs like try-catch-finally (in languages that have it). The finally block is simply executed by a normal transfer of control.

This is a rather generic answer and specific compilers may behave differently but the principle remains: try is cheap and raising exceptions is expensive. Feel free to use try blocks whenever your design needs it without fear of performance consequences.

Are Exceptions in C++ Really Slow