How Do Exceptions Work (Behind the Scenes) in C++

Instead of guessing, I decided to actually look at the generated code with a small piece of C++ code and a somewhat old Linux install.

class MyException {
public:
    MyException() { }
    ~MyException() { }
};

void my_throwing_function(bool throwit) {
    if (throwit)
        throw MyException();
}

void another_function();
void log(unsigned count);

void my_catching_function() {
    log(0);
    try {
        log(1);
        another_function();
        log(2);
    } catch (const MyException& e) {
        log(3);
    }
    log(4);
}

I compiled it with g++ -m32 -W -Wall -O3 -save-temps -c, and looked at the generated assembly file.

.file "foo.cpp"
.section .text._ZN11MyExceptionD1Ev,"axG",@progbits,_ZN11MyExceptionD1Ev,comdat
.align 2
.p2align 4,,15
.weak _ZN11MyExceptionD1Ev
.type _ZN11MyExceptionD1Ev, @function
pushl %ebp
movl %esp, %ebp
popl %ebp
.size _ZN11MyExceptionD1Ev, .-_ZN11MyExceptionD1Ev

_ZN11MyExceptionD1Ev is MyException::~MyException(), so the compiler decided it needed a non-inline copy of the destructor.

.globl __gxx_personality_v0
.globl _Unwind_Resume
.align 2
.p2align 4,,15
.globl _Z20my_catching_functionv
.type _Z20my_catching_functionv, @function
_Z20my_catching_functionv:
pushl %ebp
movl %esp, %ebp
pushl %ebx
subl $20, %esp
movl $0, (%esp)
call _Z3logj
movl $1, (%esp)
call _Z3logj
call _Z16another_functionv
movl $2, (%esp)
call _Z3logj
.L5:
movl $4, (%esp)
call _Z3logj
addl $20, %esp
popl %ebx
popl %ebp
ret
.L12:
subl $1, %edx
movl %eax, %ebx
je .L16
.L14:
movl %ebx, (%esp)
call _Unwind_Resume
.L16:
movl %eax, (%esp)
call __cxa_begin_catch
movl $3, (%esp)
call _Z3logj
call __cxa_end_catch
.p2align 4,,3
jmp .L5
.L11:
movl %eax, %ebx
.p2align 4,,6
call __cxa_end_catch
.p2align 4,,6
jmp .L14
.size _Z20my_catching_functionv, .-_Z20my_catching_functionv
.section .gcc_except_table,"a",@progbits
.align 4
.byte 0xff
.byte 0x0
.byte 0x1
.uleb128 .LEHB0-.LFB9
.uleb128 .LEHE0-.LEHB0
.uleb128 0x0
.uleb128 0x0
.uleb128 .LEHB1-.LFB9
.uleb128 .LEHE1-.LEHB1
.uleb128 .L12-.LFB9
.uleb128 0x1
.uleb128 .LEHB2-.LFB9
.uleb128 .LEHE2-.LEHB2
.uleb128 0x0
.uleb128 0x0
.uleb128 .LEHB3-.LFB9
.uleb128 .LEHE3-.LEHB3
.uleb128 .L11-.LFB9
.uleb128 0x0
.byte 0x1
.byte 0x0
.align 4
.long _ZTI11MyException

Surprise! There are no extra instructions at all on the normal code path. The compiler instead generated extra out-of-line fixup code blocks, referenced via a table at the end of the function (which is actually put in a separate section of the executable). All the work is done behind the scenes by the standard library, based on these tables (_ZTI11MyException is typeinfo for MyException).

OK, that was not actually a surprise for me, I already knew how this compiler did it. Continuing with the assembly output:

.align 2
.p2align 4,,15
.globl _Z20my_throwing_functionb
.type _Z20my_throwing_functionb, @function
pushl %ebp
movl %esp, %ebp
subl $24, %esp
cmpb $0, 8(%ebp)
jne .L21
movl $1, (%esp)
call __cxa_allocate_exception
movl $_ZN11MyExceptionD1Ev, 8(%esp)
movl $_ZTI11MyException, 4(%esp)
movl %eax, (%esp)
call __cxa_throw
.size _Z20my_throwing_functionb, .-_Z20my_throwing_functionb

Here we see the code for throwing an exception. While there was no extra overhead simply because an exception might be thrown, there is obviously a lot of overhead in actually throwing and catching an exception. Most of it is hidden within __cxa_throw, which must:

  • Walk the stack with the help of the exception tables until it finds a handler for that exception.
  • Unwind the stack until it gets to that handler.
  • Actually call the handler.

Compare that with the cost of simply returning a value, and you see why exceptions should be used only for exceptional returns.
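
To watch the unwinding and handler steps from the outside, here is a minimal sketch. The names Tracer, DemoError, run_unwind_demo and check_unwind_order are made up for illustration and are not part of the code above. It records the order in which things happen when an exception crosses two stack frames.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log so the order of operations can be inspected.
static std::vector<std::string> events;

// Records its own destruction, standing in for any local with a destructor.
struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n)) {}
    ~Tracer() { events.push_back("destroy " + name); }
};

struct DemoError {};

void inner() {
    Tracer t("inner");
    events.push_back("throw");
    throw DemoError();
}

void middle() {
    Tracer t("middle");
    inner();
    events.push_back("not reached");  // skipped: inner() never returns normally
}

// Runs the scenario and returns the recorded order of events.
std::vector<std::string> run_unwind_demo() {
    events.clear();
    try {
        middle();
    } catch (const DemoError&) {
        events.push_back("handler");
    }
    return events;
}

// Usage: both frames are unwound, innermost first, before the handler body runs.
void check_unwind_order() {
    const std::vector<std::string> expected = {
        "throw", "destroy inner", "destroy middle", "handler"};
    assert(run_unwind_demo() == expected);
}
```

None of the destructor calls appear in the source of inner() or middle(); they are the out-of-line cleanup blocks that the tables point to.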

To finish, the rest of the assembly file:

.weak _ZTI11MyException
.section .rodata._ZTI11MyException,"aG",@progbits,_ZTI11MyException,comdat
.align 4
.type _ZTI11MyException, @object
.size _ZTI11MyException, 8
.long _ZTVN10__cxxabiv117__class_type_infoE+8
.long _ZTS11MyException
.weak _ZTS11MyException
.section .rodata._ZTS11MyException,"aG",@progbits,_ZTS11MyException,comdat
.type _ZTS11MyException, @object
.size _ZTS11MyException, 14
.string "11MyException"

The typeinfo data.

.section .eh_frame,"a",@progbits
.long .LECIE1-.LSCIE1
.long 0x0
.byte 0x1
.string "zPL"
.uleb128 0x1
.sleb128 -4
.byte 0x8
.uleb128 0x6
.byte 0x0
.long __gxx_personality_v0
.byte 0x0
.byte 0xc
.uleb128 0x4
.uleb128 0x4
.byte 0x88
.uleb128 0x1
.align 4
.long .LEFDE3-.LASFDE3
.long .LASFDE3-.Lframe1
.long .LFB9
.long .LFE9-.LFB9
.uleb128 0x4
.long .LLSDA9
.byte 0x4
.long .LCFI2-.LFB9
.byte 0xe
.uleb128 0x8
.byte 0x85
.uleb128 0x2
.byte 0x4
.long .LCFI3-.LCFI2
.byte 0xd
.uleb128 0x5
.byte 0x4
.long .LCFI5-.LCFI3
.byte 0x83
.uleb128 0x3
.align 4
.long .LEFDE5-.LASFDE5
.long .LASFDE5-.Lframe1
.long .LFB8
.long .LFE8-.LFB8
.uleb128 0x4
.long 0x0
.byte 0x4
.long .LCFI6-.LFB8
.byte 0xe
.uleb128 0x8
.byte 0x85
.uleb128 0x2
.byte 0x4
.long .LCFI7-.LCFI6
.byte 0xd
.uleb128 0x5
.align 4
.ident "GCC: (GNU) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)"
.section .note.GNU-stack,"",@progbits

Even more exception handling tables, and assorted extra information.

So, the conclusion, at least for GCC on Linux: the cost is extra space (for the handlers and tables) whether or not exceptions are thrown, plus the extra cost of parsing the tables and executing the handlers when an exception is thrown. If you use exceptions instead of error codes, and an error is rare, it can be faster, since you do not have the overhead of testing for errors anymore.

In case you want more information, in particular what all the __cxa_ functions do, see the original specification they came from:

  • Itanium C++ ABI

Behavior of C++ exceptions escaping into a C program

From what I see about C++ exceptions, in this example which I took from MSDN, GCC seems to include the following assembly in the catch statement:

call __cxa_end_catch
jmp .L37
movq %rax, %rbx
call __cxa_end_catch
movq %rbx, %rax
movq %rax, %rdi
call _Unwind_Resume

This makes use of what I can only assume are calls into the C++ runtime's exception support (e.g. _Unwind_Resume). So if the C code links against your library, these symbols have to be provided at link time, which means that control is going to enter the C++ runtime to deal with the exceptions.

However, I don't yet know what the C++ library requires in order to do its job. I would expect it to be self contained but I'm not certain of it.

Edit: The answer to this question likely lies in the answers to the following two existing questions (and their interpretation):

  1. How is the C++ exception handling runtime implemented?
  2. How are exceptions implemented under the hood?

Edit 2: From this answer, it seems that __cxa_throw uses a table to keep track of the available handlers. I would assume that when the table is exhausted, which in our case occurs when we enter C code, the function would call std::terminate. Hence, the C++ runtime (against which you must have linked) should take care of this for you without you needing to put up a catch-all clause.

Since I'm still uncertain I will write up a test of this theory and update the answer with the results.
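
Until that test exists, one way to avoid depending on the answer is to keep exceptions from ever reaching C frames. This is a minimal sketch, assuming a hypothetical library function parse_positive and a hypothetical C-callable wrapper parse_positive_c (neither name comes from the question). The wrapper converts every exception into an error code at the boundary.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical C++ implementation that may throw.
static int parse_positive(int value) {
    if (value <= 0)
        throw std::invalid_argument("value must be positive");
    return value * 2;
}

// Hypothetical C-callable wrapper: no exception may leave this function.
// Returns 0 on success and a non-zero code on failure.
extern "C" int parse_positive_c(int value, int* result) noexcept {
    if (result == nullptr)
        return 3;   // bad argument from the caller
    try {
        *result = parse_positive(value);
        return 0;
    } catch (const std::exception&) {
        return 1;   // a standard C++ exception was thrown
    } catch (...) {
        return 2;   // anything else
    }
}

// Usage, written as the C caller would see it.
void demo_boundary() {
    int out = 0;
    assert(parse_positive_c(21, &out) == 0 && out == 42);
    assert(parse_positive_c(-1, &out) == 1);  // out is left untouched
}
```

With this in place the C caller only ever sees an int, so it does not matter what the unwinder would have done with frames that carry no C++ exception tables.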

How is the C++ exception handling runtime implemented?

Implementations may differ, but there are some basic ideas that follow from requirements.

The exception object itself is an object created in one function and destroyed in a caller thereof. Hence, it's typically not feasible to create the object on the stack. On the other hand, many exception objects are not very big. Ergo, one can reserve e.g. a 32-byte buffer and overflow to the heap if a bigger exception object is actually needed.
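
As a toy illustration of that buffer-plus-heap idea (not how any particular runtime does it; all names here are invented, and a real implementation would also have to cope with threads and nested exceptions):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical fixed buffer for small exception objects (32 bytes, as in the text).
constexpr std::size_t kSmallBufferSize = 32;
alignas(std::max_align_t) static unsigned char small_buffer[kSmallBufferSize];
static bool small_buffer_in_use = false;

// Hands out the fixed buffer when the object fits and the buffer is free,
// otherwise falls back to the heap.
void* allocate_exception_storage(std::size_t size) {
    if (size <= kSmallBufferSize && !small_buffer_in_use) {
        small_buffer_in_use = true;
        return small_buffer;
    }
    return std::malloc(size);
}

bool is_in_small_buffer(const void* p) { return p == small_buffer; }

void free_exception_storage(void* p) {
    if (is_in_small_buffer(p))
        small_buffer_in_use = false;
    else
        std::free(p);
}

// Usage: a 16-byte object lands in the buffer, a 64-byte one on the heap.
void demo_storage() {
    void* small = allocate_exception_storage(16);
    void* large = allocate_exception_storage(64);
    assert(is_in_small_buffer(small));
    assert(large != nullptr && !is_in_small_buffer(large));
    free_exception_storage(large);
    free_exception_storage(small);
}
```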

As for the actual transfer of control, two strategies exist. One is to record enough information in the stack itself to unwind the stack. This is basically a list of destructors to run and exception handlers that might catch the exception. When an exception happens, run back the stack executing those destructors until you find a matching catch.

The second strategy moves this information into tables outside the stack. Now, when an exception occurs, the call stack is used to find out which scopes are entered but not exited. Those are then looked up in the static tables to determine where the thrown exception will be handled, and which destructors run in between. This means there is less exception overhead on the stack; return addresses are needed anyway. The tables are extra data, but the compiler can put them in a demand-loaded segment of the program.
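
A minimal sketch of what such a static table lookup could look like. The record layout (start, length, landing pad, action) is loosely modelled on the four-field entries GCC emits in .gcc_except_table, but the struct, the offsets and the function names are all invented for illustration.

```cpp
#include <cassert>

// One row of a hypothetical call-site table.
struct CallSite {
    unsigned start;        // offset of the first covered instruction
    unsigned length;       // number of bytes covered
    unsigned landing_pad;  // offset of the handler or cleanup code, 0 = none
    unsigned action;       // 0 = cleanup only, >0 = index of a catch clause
};

// Made-up table for one function: only the second and fourth regions
// have somewhere to land.
static const CallSite kTable[] = {
    {0x00, 0x10, 0x00, 0},
    {0x10, 0x20, 0x80, 1},
    {0x30, 0x08, 0x00, 0},
    {0x38, 0x10, 0x90, 0},
};

// Returns the row covering `pc`, or nullptr if no row covers it.
const CallSite* find_call_site(unsigned pc) {
    for (const CallSite& cs : kTable)
        if (pc >= cs.start && pc - cs.start < cs.length)
            return &cs;
    return nullptr;
}

// Landing pad for `pc`, or 0 when the unwinder should move on to the caller.
unsigned landing_pad_for(unsigned pc) {
    const CallSite* cs = find_call_site(pc);
    return cs ? cs->landing_pad : 0;
}

// Usage: the "program counter" comes from the return address on the call stack.
void demo_lookup() {
    assert(landing_pad_for(0x05) == 0);      // covered, but nothing to run here
    assert(landing_pad_for(0x18) == 0x80);   // inside the region with a catch
    assert(landing_pad_for(0x3c) == 0x90);   // cleanup-only landing pad
    assert(landing_pad_for(0x100) == 0);     // outside the function
}
```

Nothing in this table is touched unless an exception is actually thrown, which is why the normal code path pays only in space.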

How is an exception transferred to find a handler?

These are all implementation details, to be decided during the (non-trivial) process of designing an exception handling mechanism. I can only give a sketch of how one might (or might not) choose to implement this.

If you want a detailed description of one implementation, you could read the specification for the Itanium ABI used by GCC and other popular compilers.

1 - The exception object is stored in an unspecified place, which must last until the exception has been handled. Pointers or references are passed around within the exception handling code like any other variable, before being passed to the handler (if it takes a reference) by some mechanism similar to passing a function argument.

2 - There are two common approaches: a static data structure mapping the program location to information about the stack frame; or a dynamic stack-like data structure containing information about active handlers and non-trivial stack objects that need destroying.

In the first case, on throwing it will look at that information to see if there are any local objects to destroy, and any local handlers; if not, it will find the function return address on the local stack frame and apply the same process to the calling function's stack frame until a handler is found. Once the handler is found, the CPU registers are updated to refer to that stack frame, and the program can jump to the handler's code.

In the second case it will pop entries from the stack structure, using them to tell it how to destroy stack objects, until it finds a suitable handler. Once the handler is found, and all unwound stack objects destroyed, it can use longjmp or a similar mechanism to jump to the handler.
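
The dynamic variant can be sketched with setjmp/longjmp. This is only a toy under stated assumptions: HandlerFrame, toy_throw, risky and call_with_handler are invented names, and the sketch deliberately has no locals with destructors, because jumping over those with longjmp is undefined behaviour in C++. A real implementation would also push entries for the objects that need destroying.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical handler record: one is pushed for each active "try" region.
struct HandlerFrame {
    std::jmp_buf env;
    HandlerFrame* prev;
};

static HandlerFrame* handler_top = nullptr;  // top of the handler stack
static int pending_error = 0;                // stands in for the exception object

// "Throw": pop the innermost handler and jump to it.
[[noreturn]] void toy_throw(int code) {
    HandlerFrame* frame = handler_top;
    assert(frame != nullptr && "no active handler");
    handler_top = frame->prev;
    pending_error = code;
    std::longjmp(frame->env, 1);
}

int risky(int x) {
    if (x < 0)
        toy_throw(42);
    return x + 1;
}

// "Try/catch": push a handler, run the body, pop the handler on normal exit.
int call_with_handler(int x) {
    HandlerFrame frame;
    frame.prev = handler_top;
    handler_top = &frame;
    if (setjmp(frame.env) == 0) {
        int result = risky(x);
        handler_top = frame.prev;   // normal path: remove our handler again
        return result;
    }
    return -pending_error;          // "handler": reached via longjmp
}
```

Here call_with_handler(1) returns 2 through the normal path and call_with_handler(-1) returns -42 through the handler. Note that the push and pop happen on every call, even when nothing is thrown, which is the overhead the table-based scheme avoids.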

Other approaches are possible.

3 - The exception handling code will use some kind of data structure to identify a type, allowing it to compare the type being thrown with the type for a handler. This is somewhat complicated by inheritance; the test can't be a simple comparison. I don't know the details for any particular implementation.
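
What can be shown without knowing the implementation is the observable effect of that matching. In this small sketch (classify is a made-up helper), a handler for a base class catches a derived exception, and handlers are tried in the order they are written:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Reports which handler matched for a few different thrown types.
std::string classify(int which) {
    try {
        if (which == 0) throw std::out_of_range("index");   // derives from logic_error
        if (which == 1) throw std::runtime_error("generic"); // derives from exception
        if (which == 2) throw 7;                             // not a class at all
        return "none";
    } catch (const std::logic_error&) {
        return "logic_error";
    } catch (const std::exception&) {
        return "exception";
    } catch (...) {
        return "unknown";
    }
}

// Usage: the thrown type is compared against each handler's type,
// taking base classes into account.
void demo_matching() {
    assert(classify(0) == "logic_error");
    assert(classify(1) == "exception");
    assert(classify(2) == "unknown");
    assert(classify(3) == "none");
}
```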

How does this implementation of chaining exceptions work?

Clever code - kudos to potatoswatter on this one. I think that I would have to find some way around the last item though.

  1. throw; rethrows the active exception. It is only valid while an exception is being handled, that is, while a catch block is active somewhere up the call stack. I can't recall where I came across that tidbit, but it was probably on SO in the context of some other question. The bare throw gives us access to the current exception by catching it in the chained_exception constructor. In other words, prev in the constructor is a reference to the exception that we are currently processing.

  2. You are correct here. This prevents double deletion.

  3. The sentinel exception, the one thrown in main, should never be deleted. The one identifying attribute of this exception is that its link member is NULL.

  4. This is the part that I don't like but cannot think of an easy way around. The only visible chained_exception constructor can only be called when a catch block is active. IIRC, a bare throw with no exception being handled is a no-no (it calls std::terminate). So, the workaround is to throw in main and put all of your code in the catch block.

Now, if you try this method in multi-threaded code, make sure that you understand (4) very well. You will have to replicate this in your thread entry point.
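
The bare-throw trick from (1) can be shown in isolation. This is not the chained_exception code from the question, just a minimal sketch with invented names (describe_current_exception, run_and_describe) of a helper that gets at the exception being handled by rethrowing it:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Must only be called while a catch block is active somewhere up the stack;
// otherwise the bare throw calls std::terminate.
std::string describe_current_exception() {
    try {
        throw;  // rethrows the exception currently being handled
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown exception";
    }
}

std::string run_and_describe(bool standard) {
    try {
        if (standard)
            throw std::runtime_error("disk full");
        throw 42;
    } catch (...) {
        // The catch block is active here, so the helper may use a bare throw.
        return describe_current_exception();
    }
}

// Usage: the helper sees the original exception even though it was caught with (...).
void demo_rethrow() {
    assert(run_and_describe(true) == "disk full");
    assert(run_and_describe(false) == "unknown exception");
}
```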

Does all exception handling have to use dynamic lookup?

Yes, that is how exception handling works.

FWIW, exception handling as we understand it today was invented in the CLU language in the 70s and developed further in ML in the early 80s. From those it spread into other languages like C++, mostly only with variations on how exceptions are constructed and matched.

It is also worth noting that exception handling is just a special case of a more recently invented generalised mechanism called effect handlers, which is much richer and can express all sorts of other control structures, like coroutines, generators, async/await, even backtracking and more. Its main addition over exception handling is that a handler can resume the throwing computation, passing back a value. Like exception handling, all its applications crucially rely on the dynamic extent of handlers.

How to handle failed methods: by using exceptions or making the methods return bool?

The main benefit with exceptions is that they are non-local. You can catch an exception several invocation layers away from where it was thrown. That way, code in between doesn't have to care about exceptions (except ensuring proper cleanup during unwinding, i.e. being exception safe), which makes it less likely that an exceptional situation gets forgotten. But this benefit comes at a price: stack unwinding is more complicated than simply returning a value. The return value approach is the simpler mechanism, and it is usually cheaper when the failure actually occurs.

So I'd use these criteria to choose: if for some reason the only reasonable place to deal with a problem is directly at the location where the function was called, and if you are fairly certain that every caller will include some kind of error handling code in any case, and is not likely to forget doing so, then a return value would be best. Otherwise, I'd go for an exception.
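
To make the trade-off concrete, here is the same made-up computation written both ways (all function names are invented for the example). Notice how the return-value version has to test and forward the failure in the middle layer, while the exception version has no error handling there at all.

```cpp
#include <cassert>
#include <stdexcept>

// Return-value style: failure is reported through the bool.
bool checked_divide(int a, int b, int& out) {
    if (b == 0)
        return false;
    out = a / b;
    return true;
}

// Exception style: failure is reported by throwing.
int divide_or_throw(int a, int b) {
    if (b == 0)
        throw std::domain_error("division by zero");
    return a / b;
}

// Middle layer, return-value style: every call must be checked and forwarded.
bool average_ratio_checked(int a, int b, int c, int& out) {
    int first = 0, second = 0;
    if (!checked_divide(a, c, first)) return false;
    if (!checked_divide(b, c, second)) return false;
    out = (first + second) / 2;
    return true;
}

// Middle layer, exception style: errors pass through untouched.
int average_ratio_throwing(int a, int b, int c) {
    return (divide_or_throw(a, c) + divide_or_throw(b, c)) / 2;
}

// Usage: the caller decides where the failure is finally dealt with.
void demo_styles() {
    int out = 0;
    assert(average_ratio_checked(10, 6, 2, out) && out == 4);
    assert(!average_ratio_checked(1, 2, 0, out));
    assert(average_ratio_throwing(10, 6, 2) == 4);
    bool caught = false;
    try {
        average_ratio_throwing(1, 2, 0);
    } catch (const std::domain_error&) {
        caught = true;
    }
    assert(caught);
}
```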

