How to Handle SIGABRT Signal

How to Handle SIGABRT signal?

As others have said, you cannot have abort() return and let execution continue normally. What you can do, however, is protect a piece of code that might call abort() with a structure akin to try/catch. The protected code stops at the abort() call, but the rest of the program can continue. Here is a demo:

#include <csetjmp>
#include <csignal>
#include <cstdlib>
#include <iostream>

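// Jump target used to escape from the SIGABRT handler.
jmp_buf env;

void on_sigabrt (int signum)
{
  // Restore the default action so that a later abort() terminates as usual.
  signal (signum, SIG_DFL);
  // Returning from the handler would let abort() finish killing the process,
  // so jump back to the setjmp() call instead. Note that longjmp() skips
  // destructors, so the protected code must not rely on them.
  longjmp (env, 1);
}

void try_and_catch_abort (void (*func)(void))
{
  if (setjmp (env) == 0) {
    // First pass: install the handler and run the protected code.
    signal(SIGABRT, &on_sigabrt);
    (*func)();
    signal (SIGABRT, SIG_DFL);
  }
  else {
    // Reached through longjmp() when func called abort().
    std::cout << "aborted\n";
  }
}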

void do_stuff_aborted ()
{
  std::cout << "step 1\n";
  abort();
  std::cout << "step 2\n";
}

void do_stuff ()
{
  std::cout << "step 1\n";
  std::cout << "step 2\n";
}    

int main()
{
  try_and_catch_abort (&do_stuff_aborted);
  try_and_catch_abort (&do_stuff);
}
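
Plain longjmp() is not required to restore the signal mask that was in effect at the setjmp() call, so SIGABRT may stay blocked after jumping out of the handler. Below is a minimal sketch of the same idea using the POSIX pair sigsetjmp()/siglongjmp(), which can save and restore the mask. The names g_abort_env, jump_out_of_abort and call_catching_abort are made up for this illustration, and the sketch assumes a POSIX system and a single thread.

```cpp
#include <cassert>
#include <cstdlib>
#include <setjmp.h>   // sigjmp_buf, sigsetjmp, siglongjmp (POSIX)
#include <signal.h>   // sigaction (POSIX)

// Hypothetical global jump target; this makes the helper non-reentrant
// and unsuitable for use from several threads at once.
static sigjmp_buf g_abort_env;

extern "C" void jump_out_of_abort(int)
{
    // Jump back to sigsetjmp(); the saved signal mask is restored as well.
    siglongjmp(g_abort_env, 1);
}

// Runs func and reports whether it called abort().
// Destructors of objects created inside func are skipped on abort.
bool call_catching_abort(void (*func)())
{
    struct sigaction sa {};
    struct sigaction old {};
    sa.sa_handler = jump_out_of_abort;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGABRT, &sa, &old);
    assert(rc == 0);
    (void)rc;

    bool aborted;
    if (sigsetjmp(g_abort_env, 1) == 0) {   // 1 = save the signal mask
        func();
        aborted = false;
    } else {
        aborted = true;                     // arrived here via siglongjmp()
    }
    sigaction(SIGABRT, &old, nullptr);      // put the previous handler back
    return aborted;
}
```

Calling call_catching_abort(&do_stuff_aborted) should return true and call_catching_abort(&do_stuff) should return false. The warnings elsewhere on this page still apply: the state left behind by the aborted code may be inconsistent.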

SIGABRT handler. Do some cleanup before crash

You can restore the default SIGABRT action inside the handler and then send the signal again, so that the second delivery terminates the process:

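#include <signal.h>
#include <unistd.h>

void mysigabort(int signum)
{
    // whatever you want (only async-signal-safe calls are safe here)
    signal(signum, SIG_DFL);    // restore the default action
    kill(getpid(), signum);     // send the signal again; or abort() ?
}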
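
For completeness, here is a hedged sketch of how such a handler could be installed with sigaction(). The names cleanup_and_reraise and install_abort_cleanup are invented for this example; the SA_RESETHAND flag asks the system to restore the default action when the handler is entered, so the re-raised signal terminates the process. The handler limits itself to write() and raise(), which are on the async-signal-safe list in signal-safety(7).

```cpp
#include <signal.h>   // sigaction, raise (POSIX)
#include <unistd.h>   // write, STDERR_FILENO

// Hypothetical handler: do minimal cleanup, then let the process die.
extern "C" void cleanup_and_reraise(int signum)
{
    static const char msg[] = "SIGABRT received, cleaning up\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    // The default action is already back in place (SA_RESETHAND),
    // so this second delivery terminates the process.
    raise(signum);
}

// Hypothetical helper: call once at startup. Returns true on success.
bool install_abort_cleanup()
{
    struct sigaction sa {};
    sa.sa_handler = cleanup_and_reraise;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    return sigaction(SIGABRT, &sa, nullptr) == 0;
}
```

After install_abort_cleanup() succeeds, a later abort() should print the message and then end the process with SIGABRT as usual, so a parent shell or debugger still sees the crash.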

When does a process get SIGABRT (signal 6)?

abort() sends the SIGABRT signal to the calling process; this is basically how abort() works.

abort() is usually called by library functions that detect an internal error or some seriously broken constraint. For example, glibc's malloc() calls abort() if it detects that its internal structures have been damaged, for instance by a heap overflow.

Keep running the program after SIGABRT c++ signal

I use a third-party library in my C++ program which, under certain circumstances, emits the SIGABRT signal

If you have the source code of that library, you need to correct the bug (and the bug could be in your code).

BTW, SIGABRT probably happens because abort(3) gets called indirectly (perhaps because you violated some conventions or invariants of that library, which might use assert(3), which in turn calls abort). I guess that in Caffe the various CHECK* macros could indirectly call abort. I leave you to investigate that.

If you don't have the source code or don't have the capacity or time to fix that bug in that third party library, you should give up using that library and use something else.

In many cases, you should trust external libraries more than your own code. You are probably abusing or misusing that library. Read its documentation carefully and make sure that your own code uses the library correctly and respects its invariants and conventions. The bug is probably in your own code, at some other place.

I want to keep running my program

This is impossible (or very unreliable, so unreasonable). I guess that your program has some undefined behavior. Be very scared, and work hard to avoid UB.

You need to improve your debugging skills. Learn how to make better use of the gdb debugger, valgrind, the GCC sanitizers (instrumentation options such as -fsanitize=address and -fsanitize=undefined), and so on.

You should not try to handle SIGABRT, even though in principle you can (if you do, read signal(7) and signal-safety(7) carefully, along with the hints about handling Unix signals in Qt). I strongly recommend not even trying to catch SIGABRT.

How to get signal to catch SIGABRT

The behaviour you describe is as documented.

From the Visual Studio documentation for abort (emphasis mine):

By default, when an app is built with the debug runtime library, the abort routine displays an error message before SIGABRT is raised. [...] To suppress the message, use _set_abort_behavior to clear the _WRITE_ABORT_MSG flag.

How to catch SIGABRT in multithread environment?

I need to catch SIGABRT, SIGSEGV and probably other signals to prevent my process from being killed

This is an exercise in futility. After SIGABRT or SIGSEGV is raised, you (in general) have no idea about the state of the process: it may have a corrupted heap, stack, global data internal to your test framework, global data internal to the C runtime system, and so on. Continuing such a process is exceedingly likely to lead to further crashes at random (correct) places in the code.

The only sane way to handle this in a test framework is to fork and have the parent process handle child error exits, report them and continue running additional tests.
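
A minimal sketch of that fork-based approach, assuming a POSIX system. The helper run_isolated and the two sample tests are hypothetical names used only for illustration; a real test framework would add timeouts, output capture and reporting.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs test() in a child process.
// Returns 0 if the child exited normally with status 0, the signal
// number if the child was killed by a signal, and -1 otherwise.
int run_isolated(void (*test)())
{
    assert(test != nullptr);
    pid_t pid = fork();
    if (pid < 0)
        return -1;                  // fork failed
    if (pid == 0) {
        test();                     // child: a crash here kills only the child
        _exit(0);                   // skip atexit handlers and stdio flushing
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;              // waitpid failed for another reason
    }
    if (WIFSIGNALED(status))
        return WTERMSIG(status);    // e.g. SIGABRT or SIGSEGV
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

// Sample tests, hypothetical.
void aborting_test() { std::abort(); }
void passing_test() {}
```

Here run_isolated(aborting_test) returns SIGABRT and run_isolated(passing_test) returns 0, and in both cases the parent stays alive with its own memory untouched, so it can report the result and move on to the next test.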

Is SIGABRT a thread direct signal?

There is no such thing as "direct signal". SIGABRT may be sent to the process from outside, or it can be raised inside the process.

What happens if I only use the main thread to catch the SIGABRT (or SIGSEGV) signal?

SIGSEGV and SIGABRT (when not sent from outside) are delivered to the thread that caused the invalid memory operation (or raised the signal).

In addition, there is no way to "only use the main thread": a signal disposition set with sigaction is shared by all threads in the process (though each thread can have its own signal mask).
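
The following sketch illustrates both points, assuming a POSIX system with pthreads. The handler is installed from the calling thread, the signal is raised with raise() in a worker thread, and the handler records which thread it ran on. All names (g_handler_thread, g_handled, record_thread, handler_runs_on_raising_thread) are hypothetical. Because the signal comes from raise() rather than abort(), the process keeps running once the handler returns.

```cpp
#include <csignal>
#include <pthread.h>
#include <signal.h>
#include <thread>

static pthread_t g_handler_thread;                 // thread that ran the handler
static volatile std::sig_atomic_t g_handled = 0;   // set once the handler has run

extern "C" void record_thread(int)
{
    g_handler_thread = pthread_self();   // pthread_self() is async-signal-safe
    g_handled = 1;
}

// Returns true if the handler ran on the worker thread that raised SIGABRT,
// even though the handler was installed from the calling thread.
bool handler_runs_on_raising_thread()
{
    struct sigaction sa {};
    struct sigaction old {};
    sa.sa_handler = record_thread;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGABRT, &sa, &old) != 0)   // applies to the whole process
        return false;

    g_handled = 0;
    pthread_t worker_id {};
    std::thread worker([&worker_id] {
        worker_id = pthread_self();
        raise(SIGABRT);                       // delivered to this thread
    });
    worker.join();

    sigaction(SIGABRT, &old, nullptr);        // restore the previous disposition
    return g_handled == 1 && pthread_equal(g_handler_thread, worker_id) != 0;
}
```

On systems where the thread library is separate from libc, this needs to be built with -pthread.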
