Handle SIGSEGV in Linux

How to write a signal handler to catch SIGSEGV?

When your signal handler returns (assuming it doesn't call exit or longjmp or something that prevents it from actually returning), the code will continue at the point the signal occurred, reexecuting the same instruction. Since at this point, the memory protection has not been changed, it will just throw the signal again, and you'll be back in your signal handler in an infinite loop.

So to make it work, you have to call mprotect in the signal handler. Unfortunately, as Steven Schansker notes, mprotect is not on POSIX's list of async-signal-safe functions, so you can't safely call it from the signal handler. So, as far as POSIX is concerned, you're screwed.

Fortunately on most implementations (all modern UNIX and Linux variants as far as I know), mprotect is a system call, so is safe to call from within a signal handler, so you can do most of what you want. The problem is that if you want to change the protections back after the read, you'll have to do that in the main program after the read.

Another possibility is to do something with the third argument to the signal handler, which points at an OS and arch specific structure that contains info about where the signal occurred. On Linux, this is a ucontext structure, which contains machine-specific info about the $PC address and other register contents where the signal occurred. If you modify this, you change where the signal handler will return to, so you can change the $PC to be just after the faulting instruction so it won't re-execute after the handler returns. This is very tricky to get right (and non-portable too).

edit

The ucontext structure is defined in <ucontext.h>. Within the ucontext the field uc_mcontext contains the machine context, and within that, the array gregs contains the general register context. So in your signal handler:

ucontext_t *u = (ucontext_t *)unused;
unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_RIP];

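will give you the pc where the exception occurred. You can read it to figure out what instruction it was that faulted, and do something different. (REG_RIP is the x86-64 instruction pointer; glibc only exposes the REG_* names when _GNU_SOURCE is defined.)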

As far as the portability of calling mprotect in the signal handler is concerned, any system that follows either the SVID spec or the BSD4 spec should be safe -- they allow calling any system call (anything in section 2 of the manual) in a signal handler.

Both registering signal handler for SIGSEGV and still being able to create full crash dump from OS

You can reset the sigaction after having handled the signal. Then the faulting instruction will re-run after returning from the handler, and fault again, leading to core dump.

Here's an example:

#include <signal.h>
#include <unistd.h>

struct sigaction oldSA;

void handler(int signal)
{
    const char msg[] = "Caught, should dump core now\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);

    /* Restore the previous disposition so the re-executed fault dumps core. */
    sigaction(SIGSEGV, &oldSA, NULL);
}

int main()
{
    struct sigaction sa={0};
    sa.sa_handler=handler;
    sigaction(SIGSEGV, &sa, &oldSA);

    int* volatile p=NULL;
    *p=5; // cause segfault
}

Example run:

$ gcc test.c -o test && ./test
Caught, should dump core now
Segmentation fault (core dumped)

What happens if I catch SIGSEGV and the signal handler causes another SIGSEGV?

By default, a signal is masked while its handler runs, so it can't be delivered recursively. If a masked signal is then generated by program execution (invalid memory access, division by zero, etc.), the behavior is undefined:

If SIGBUS, SIGFPE, SIGILL, or SIGSEGV are generated while they are
blocked, the result is undefined, unless the signal was generated by
kill(2), sigqueue(3), or raise(3).

On my system, it causes the process to crash.

With SA_NODEFER there is no masking, so the signal can be handled recursively until the stack overflows. Adding SA_RESETHAND restores the default action (crash for SIGSEGV) as soon as the handler is entered.

I adapted your example into a simple test program, so you can verify this behavior:

#include<signal.h>
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>

volatile char *ptr;

static void DisasterSignals(int signal)
{
    /* We cannot save the situation, the purpose of catching the signal is
       only to do something clever to aid debugging before we go. */
    write(1, "11\n", 3);
    *ptr = 1;
    write(1, "13\n", 3);
    abort(); /* This should give us the expected core dump (if we survive to this point) */
}

struct sigaction sa = {}; /* initialised to all zero (I vote for GCC style breach of standard here) */

int main()
{
    sa.sa_handler = DisasterSignals;
    sa.sa_flags = /*SA_RESETHAND | */SA_NODEFER; /* To have or have not */
    sigaction(SIGSEGV, &sa, NULL);

    write(1, "25\n", 3);
    *ptr = 1;
}

Why am I getting continuous SIGSEGV in the C code below

The signal handler returns to the instruction that triggered it, namely *a = 5, which faults again, so the program loops.

You have several problems including the use of printf inside a signal handler.

There are safe and not-safe ways of dealing with this

NOTES

Using signal(2) is not recommended for signal handling in general.

Handling SIGSEGV is even more complicated because of the way the signal semantics work. Quoting from the man page:

The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().

So the first thing you should do is use sigaction.

Next, handling SIGSEGV is a weird beast:

How to write a signal handler to catch SIGSEGV?

and

Does linux allow any system call to be made from signal handlers?

have good answers and get into specific details. There are external links in some of the answers given there.

How to do this using signal(2)

Well :-) let's say you want to use signal(2) and you want to play with this in a weird way....

You can use sigsetjmp and siglongjmp.

sigsetjmp marks the point that siglongjmp should jump to. The first time sigsetjmp is called (to set the point) it returns 0. When siglongjmp jumps to it (which makes sigsetjmp return a second time), it returns the value passed to siglongjmp, or 1 if that value is 0.

Which means we can do this:

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <setjmp.h>

sigjmp_buf env;

void sighandler(int signum)
{
    const char msg[] = "Skipping signal\n";

    (void)signum;
    write(2, msg, sizeof(msg) - 1); /* - 1: don't write the trailing NUL */
    siglongjmp(env, 1);             /* makes sigsetjmp return 1 */
}

int main()
{
    int *volatile a = NULL; /* volatile, so the compiler can't optimize the bad store away */

    signal(SIGSEGV, sighandler);
    if(!sigsetjmp(env, 1)) { /* 1: save the signal mask so siglongjmp restores it */
        printf("setting value of a\n");
        *a = 5;
    }
    else {
        printf("returned to sigsetjmp, but now we skip it!\n");
    }
    return 0;
}

Get the register causing a segmentation fault in a signal handler

The load/store address might not be in any single register; it could be the result of an addressing mode like [rdi + rax*4 + 100] or something.

There is no easy solution to print what a full debugger would, other than running your program under a debugger to catch the fault in the first place, like a normal person. Or let it generate a coredump for you to analyze offline, if you need to debug crashes that happened on someone else's system.

The Linux kernel chooses to dump instruction bytes starting at the code address of the fault (or actually somewhat before it for context), and the contents of all registers. Disassembly to see the faulting instruction can be done after the fact, from the crashlog, along with seeing register contents, without needing to include a disassembler in the kernel itself. See What is "Code" in Linux Kernel crash messages? for an example of what Linux does, and of manually picking it apart instead of using decodecode.

When will Linux kernel reset the signal handler for SIGSEGV to SIG_DFL?

It depends on how you register the signal handler.
With sigaction and without the SA_RESETHAND flag, there will be no resetting to SIG_DFL (although returning from a signal handler run in response to a SIGSEGV delivered due to a segmentation fault is technically UB).
With SA_RESETHAND it will get reset, and if you register the handler with signal, then whether the handler will be reset or not is unspecified (so don't use signal()).

Example:

#include <signal.h>
#include <unistd.h>

int volatile*a;
void h(int Sig) { write(1,"h\n", 2); }
int main()
{
    //sigaction(SIGSEGV,&(struct sigaction){.sa_handler=h}, 0); //won't reset the handler, will likely loop
    sigaction(SIGSEGV,&(struct sigaction){.sa_handler=h,.sa_flags=SA_RESETHAND}, 0); //will reset the handler
    //signal(SIGSEGV,h); //may or may not reset the handler
    *a=1;
    return 0;
}

Segfault in SIGSEGV handler

From the man page for sigaction(2):

#include <signal.h>

int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);

....

The sigaction structure is defined as something like

struct sigaction {
    void (*sa_handler)(int);
    void (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};

....

sa_mask gives a mask of signals which should be blocked during execution of the signal handler. In addition, the signal which triggered the handler will be blocked, unless the SA_NODEFER flag is used.

Because you didn't set the SA_NODEFER flag when setting up your signal handler, it won't get called again if another segfault occurs while still in the signal handler. Once you exit, the signal which was previously blocked will then be delivered.


