When Does a Process Handle a Signal

When Are Signals Handled?

Theoretically, say process A is running and sends a signal to process B. When process B starts running, it might never enter kernel mode, and thus never see the signal and handle it.

Linux is a preemptive multitasking operating system. This means that the kernel gives every process a time slice, and the CPU receives a hardware interrupt at regular intervals, which returns it to kernel mode so that the kernel can, for example, give a time slice to a different process. Pending signals are checked on the way back from kernel mode to user mode, so each such interrupt also gives process B a chance to notice its signal.

Therefore, the situation you describe (that a process will run forever in user-mode and never reach kernel-mode) will never occur in a preemptive multitasking operating system such as Linux.
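As a rough illustration of the point above, here is a minimal sketch in which a process spins in a loop that makes no system calls and still gets its handler invoked. The names spin_until_alarm, on_alarm and alarm_seen are made up for this example, and SIGALRM from alarm() stands in for a signal sent by another process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler. volatile sig_atomic_t is the type that may be
   written safely from a signal handler. */
static volatile sig_atomic_t alarm_seen = 0;

static void on_alarm(int signo) {
    (void)signo;
    alarm_seen = 1;
}

/* Hypothetical helper: spins in user mode until SIGALRM has been handled.
   Returns 1 once the handler has run. */
int spin_until_alarm(unsigned seconds) {
    struct sigaction sa;
    sa.sa_handler = on_alarm;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    alarm_seen = 0;
    alarm(seconds);   /* ask the kernel to send SIGALRM later */
    while (!alarm_seen) {
        /* busy loop: the process never voluntarily enters the kernel */
    }
    return 1;
}
```

Calling spin_until_alarm(1) from main() should return after about a second, even though the loop itself never asks the kernel for anything.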

When does Windows signal a process handle?

As another answer points out, the process handle gets signalled when the process has stopped execution, and the operating system might take a bit longer to release DLLs.

You are right that relying on Sleep(100) is a bad idea. You should rather wrap overwriting your DLLs in a loop like this:

BOOL UpdateDll(LPCTSTR dll_name, WHATEVER whatever) {
    int tries = 150;
    while (tries--) {
        if (TryUpdateDll(dll_name, whatever))
            return TRUE;
        Sleep(200);
    }
    return FALSE;
}

This keeps trying to overwrite your DLL for roughly 30 seconds (150 tries, 200 ms apart) and then gives up. 30 seconds should be enough even when the system is under heavy load, and the limit still protects your updater from hanging forever. (If UpdateDll returns FALSE, be sure to present a meaningful error message to your user, stating the name of the offending DLL.)

If you are messing with COM, a call to CoFreeUnusedLibraries before quitting might also be helpful. (http://msdn.microsoft.com/en-us/library/ms679712.aspx) Frankly I don't know if COM might hold on to DLLs even after your process exits, but better be safe.

Bottom line is that there's a lot of strangeness in the Win32 API. You don't have to deal with every case as long as you can find an acceptable solution. Obviously Sleep(100) might break, but a 30-second polling loop seems acceptable to me.

When several signals arrive at a process, in what order does the process handle them?

SIGCONT has special semantics.

Regardless of whether SIGCONT is caught, is ignored, or has default disposition, its generation will clear all pending stop signals and resume execution of a stopped process. [IEEE Std 1003.1-2017] Again, this resumption happens before any other signals are delivered, and even before SIGCONT's handler (if any) is invoked.

(This special “dispositionless” semantic makes sense. In order for a process to execute a signal handler, the process must itself be executing.)

POSIX is clearer than APUE here, saying that "[t]he default action for SIGCONT is to resume execution at the point where the process was stopped, after first handling any pending unblocked signals."
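The "dispositionless" resume can be checked with a small sketch like the one below. It assumes a POSIX system with fork() and waitpid(); the function name ignored_sigcont_still_resumes and the exit code 42 are arbitrary choices for this example. The child ignores SIGCONT, stops itself, and can only reach _exit(42) if the ignored SIGCONT still resumes it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a stopped child that ignores SIGCONT
   is nevertheless resumed by it, 0 otherwise. */
int ignored_sigcont_still_resumes(void) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        signal(SIGCONT, SIG_IGN);  /* disposition: ignore */
        raise(SIGSTOP);            /* stop ourselves */
        _exit(42);                 /* reached only after being resumed */
    }

    int status = 0;
    /* WUNTRACED makes waitpid() report the stop, not just termination. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return 0;
    if (kill(pid, SIGCONT) != 0)
        return 0;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```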

As others have mentioned, the actual order in which pending signals are delivered is implementation-specific. Linux, at least, delivers basic UNIX signals in ascending numeric order.

To demonstrate all this, consider the following code. It STOPs a process, then sends it several signals, then CONTinues it, having installed handlers for all catchable signals so we can see what is handled when:

#define _POSIX_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

static int signals[] = { SIGSTOP, SIGURG, SIGUSR1, SIGHUP, SIGCONT, 0 };

static void
handler(int signo) {
    // XXX not async-signal-safe
    printf("<signal %d>\n", signo);
}

int
main(int argc, char **argv) {
    int *sig = signals;
    struct sigaction sa = { .sa_flags = 0, .sa_handler = handler };

    sigfillset(&sa.sa_mask);

    sig++; // can't catch SIGSTOP
    while (*sig) {
        sigaction(*sig, &sa, NULL); // XXX error check
        sig++;
    }

    if (fork() == 0) { // XXX error check
        sleep(2); // faux synchronization - let parent pause()

        sig = signals;
        while (*sig) {
            printf("sending signal %d\n", *sig);
            kill(getppid(), *sig);
            sig++;
        }
        exit(0);
    }

    pause();

    return 0;
}

For me, this prints

sending signal 19
sending signal 23
sending signal 10
sending signal 1
sending signal 18
<signal 1>
<signal 10>
<signal 18>
<signal 23>

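A smaller variation on the same idea, without stopping the process, is to block a few signals, raise them out of numeric order, and then unblock them. The sketch below does that; deliver_pending_batch, record and seen are names invented for this example. Like the program above, it fills sa_mask so that each handler finishes before the next signal is delivered. The order recorded in seen is implementation-specific, so the function only reports how many signals were handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <time.h>

#define MAX_SEEN 8
static volatile sig_atomic_t seen[MAX_SEEN];   /* signal numbers, in handling order */
static volatile sig_atomic_t seen_count = 0;

static void record(int signo) {
    if (seen_count < MAX_SEEN)
        seen[seen_count++] = signo;
}

/* Hypothetical helper: returns how many of the three signals were handled. */
int deliver_pending_batch(void) {
    static const int batch[] = { SIGURG, SIGUSR1, SIGHUP };
    struct sigaction sa;
    sigset_t block;

    sa.sa_handler = record;
    sa.sa_flags = 0;
    sigfillset(&sa.sa_mask);        /* no handler interrupts another */
    sigemptyset(&block);
    for (int i = 0; i < 3; i++) {
        int rc = sigaction(batch[i], &sa, NULL);
        assert(rc == 0);
        sigaddset(&block, batch[i]);
    }

    seen_count = 0;
    sigprocmask(SIG_BLOCK, &block, NULL);
    for (int i = 0; i < 3; i++)
        raise(batch[i]);            /* blocked, so each one stays pending */
    sigprocmask(SIG_UNBLOCK, &block, NULL);

    /* POSIX only promises that at least one pending signal is delivered
       before sigprocmask() returns, so wait briefly for the rest. */
    struct timespec ts = { 0, 1000000 };   /* 1 ms */
    for (int i = 0; i < 1000 && seen_count < 3; i++)
        nanosleep(&ts, NULL);
    return seen_count;
}
```

After the call, printing seen[0] through seen[2] from ordinary (non-handler) code shows the order your system chose.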
What happens to threads when a process receives a signal?

Some intro to threads and signals (signal(7))

  • Signal Disposition is per-process

The signal disposition is a per-process attribute: in a multithreaded application, the disposition of a particular signal is the same for all threads.

  • Signal may be process-directed or thread-directed

Process-directed Signals: A process-directed signal is one that is targeted at (and thus pending for) the process as a whole. A process-directed signal may be delivered to any one of the threads that do not currently have the signal blocked. If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal.

Thread-directed Signals: A thread-directed signal is one that is targeted at a specific thread. The set of pending signals that a thread obtains with sigpending(2) consists of the union of the set of pending process-directed signals and the set of signals pending for the calling thread.

  • Asynchronous and Synchronous Signal Handling

You can configure how your program deals with signals. You can ignore them (a few, such as SIGKILL and SIGSTOP, can't be ignored), register a signal handler which will be invoked when that specific signal is received (asynchronous), or block the signal and deal with it later (synchronous).

Coming to your case,

"The question is: what happens to threads when signal handling function is running?"

The signal is delivered once to any thread that is configured to receive it. The thread, which is asynchronously handling the signal, stops whatever it is doing and jumps to the configured signal handler. The flow of the execution in the remaining threads is unaffected.
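To see that only the targeted thread runs the handler, a sketch along these lines can help. It sends a thread-directed SIGUSR1 with pthread_kill and records which thread executed the handler; all names (signal_lands_in_target_thread, on_usr1, worker) are invented for this example, and it may need -lpthread when linking on older systems.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

static pthread_t handler_thread;               /* thread that ran the handler */
static volatile sig_atomic_t handler_ran = 0;

static void on_usr1(int signo) {
    (void)signo;
    handler_thread = pthread_self();
    handler_ran = 1;
}

/* The worker does nothing but wait until the handler has run in it. */
static void *worker(void *arg) {
    (void)arg;
    struct timespec ts = { 0, 1000000 };       /* 1 ms */
    while (!handler_ran)
        nanosleep(&ts, NULL);
    return NULL;
}

/* Hypothetical helper: returns 1 if the handler ran in the worker thread. */
int signal_lands_in_target_thread(void) {
    struct sigaction sa;
    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);    /* disposition is per-process */
    assert(rc == 0);

    handler_ran = 0;
    pthread_t tid;
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_kill(tid, SIGUSR1);           /* thread-directed signal */
    assert(rc == 0);
    pthread_join(tid, NULL);                   /* calling thread was not interrupted */
    return pthread_equal(handler_thread, tid) != 0;
}
```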

"If threads continue running their jobs, is there a way to freeze them while the debugging handler is working?"

There is no standard way of doing this. You need to build your own mechanism for enabling this.

To go further, it needs to be clear where the debug handler is executed: in each thread, in main(), or in a specific thread?

Edit

Assuming main() implements the logging functionality, the code below is a bare-minimum implementation of it. The comments are there to help you walk through the code and understand how it works.

#define THREAD_MAX_COUNT 100

#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/select.h>
#include <sys/signalfd.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

atomic_int debug; // Shared between main() and the worker threads, hence atomic
sigset_t debug_mask;
pthread_t main_tid;

void* thread_func(void* th_data)
{
    /* .... */

    for ( ; ; ) {

        if (debug) { // If debug procedure starts
            printf("Freezing %d\n", *((int*) th_data));

            pthread_kill(main_tid, SIGRTMIN); // Notify the main thread about the thread's freeze.
            int signo;
            sigwait(&debug_mask, &signo); // Wait till logging is done. main() will signal once it is done.

            printf("Resuming %d\n", *((int*) th_data));
        }

        /* ... */
    }

    return NULL;
}

int main() {

    /* Block SIGINT SIGRTMIN (threads created later inherit this mask) */

    sigset_t sigmask;
    sigemptyset(&sigmask);
    sigaddset(&sigmask, SIGINT);
    sigaddset(&sigmask, SIGRTMIN);

    pthread_sigmask(SIG_BLOCK, &sigmask, NULL);

    /* Set debug variables */

    debug = 0;
    sigemptyset(&debug_mask);
    sigaddset(&debug_mask, SIGRTMIN);
    main_tid = pthread_self();

    /* Get signalfd for SIGINT */

    int sigfd = signalfd(-1, &sigmask, 0);
    struct signalfd_siginfo sigbuf;

    /* Select variable initializations */

    fd_set rd_set, tr_set;
    FD_ZERO(&rd_set);
    FD_SET(sigfd, &rd_set);

    int td_count = 0;
    pthread_t tids[THREAD_MAX_COUNT];

    for ( ; ; ) {
        /* Wait for signal */
        tr_set = rd_set;

        select(sigfd + 1, &tr_set, NULL, NULL, NULL);

        if (FD_ISSET(sigfd, &tr_set)) {
            /* Read the pending signal */
            read(sigfd, &sigbuf, sizeof(sigbuf));

            /* Start logging */
            debug = 1;

            int signo;
            for (int count = 0; count < td_count; count++) {
                /* Wait for all threads to freeze */
                sigwait(&debug_mask, &signo);
            }

            printf("Logging...\n");
            sleep(3);

            /* End logging and resume threads */
            debug = 0;

            for (int count = 0; count < td_count; count++)
                pthread_kill(tids[count], SIGRTMIN);

            /* Note below code is for testing purpose; Creates new thread on each interruption */
            int* td_data = malloc(sizeof(int));
            *td_data = td_count;

            pthread_create(tids + td_count, NULL, thread_func, td_data);

            td_count++;
        }
    }

    return 0;
}

Terminal Session:

$ gcc SO.c -lpthread 
$ ./a.out
^CLogging...
^CFreezing 0
Logging...
Resuming 0
^CFreezing 0
Freezing 1
Logging...
Resuming 0
Resuming 1
^CFreezing 0
Freezing 1
Freezing 2
Logging...
Resuming 1
Resuming 0
Resuming 2
^CFreezing 2
Freezing 3
Freezing 1
Freezing 0
Logging...
Resuming 1
Resuming 3
Resuming 2
Resuming 0
^CFreezing 1
Freezing 4
Freezing 3
Freezing 0
Freezing 2
Logging...
Resuming 1
Resuming 2
Resuming 0
Resuming 4
Resuming 3
^CFreezing 3
Freezing 0
Freezing 4
Freezing 2
Freezing 5
Freezing 1
Logging...
Resuming 0
Resuming 1
Resuming 2
Resuming 5
Resuming 3
Resuming 4
^\Quit (core dumped)

Idiomatic way to handle signals in a shared library

After reviewing standards and others' implementations, I decided to do a self-answer.

Most suitable solution for libraries

Simply don't deal with the signal registration mess. Just expose a "signal handler" that must be called by the user of the library, and returns whether a signal was handled or not. Signal handlers are process-global, so they can be considered as a resource of the main executable. Libraries shouldn't deal with others' resources on their own. While this might cause some headaches to whoever is using your library, it is ultimately the most flexible solution.

I ended up with a rather simple function prototype:

LIB_EXPORT int lib_handle_signal(int signo, siginfo_t* info, void* context);

And documented that the user must call it on several signals.
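On the application side, using such an exported function could look roughly like the sketch below. The body of lib_handle_signal here is only a stand-in so the example is self-contained (it pretends the library cares about SIGUSR1 alone), and app_signal_handler, install_app_handler and the two counters are hypothetical names, not part of any real library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t lib_handled = 0;   /* signals claimed by the library */
static volatile sig_atomic_t app_handled = 0;   /* signals left to the application */

/* Stand-in for the library's exported function (assumption: it only
   claims SIGUSR1). Returns nonzero if it handled the signal. */
int lib_handle_signal(int signo, siginfo_t *info, void *context) {
    (void)info;
    (void)context;
    if (signo != SIGUSR1)
        return 0;
    lib_handled++;
    return 1;
}

/* The executable owns the handler and offers each signal to the library first. */
static void app_signal_handler(int signo, siginfo_t *info, void *context) {
    if (lib_handle_signal(signo, info, context))
        return;
    app_handled++;   /* the application's own handling would go here */
}

void install_app_handler(int signo) {
    struct sigaction sa;
    sa.sa_sigaction = app_signal_handler;
    sa.sa_flags = SA_SIGINFO;     /* needed for the three-argument form */
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(signo, &sa, NULL);
    assert(rc == 0);
}
```

The design point is that only the executable ever calls sigaction; the library just gets a chance to look at each signal.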

Actual answers to the two concerns

Since my library's primary user is a C# executable (in which you can't write signal handlers due to the restriction to signal-safe functions) I still had to deal with the issue, except in a separate library that is rather considered to be "part of" the main executable.

Default action

The default actions for POSIX signals are actually specified in POSIX. For signals whose default action is normal or abnormal termination, simply unregistering our handler and letting the process terminate is an appropriate solution, while signals that are ignored by default can simply be ignored.
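A minimal sketch of the "unregister and let it happen" approach is shown below, run in a forked child so the effect can be observed from the parent. The names fall_back_to_default and terminating_signal_after_fallback are invented for this example, and SIGTERM is used instead of a fault signal to keep it harmless.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler that gives up: restore the default disposition and re-raise.
   The signal is blocked while its handler runs, so the re-raised signal
   is delivered (with the default action) once the handler returns. */
static void fall_back_to_default(int signo) {
    struct sigaction dfl;
    dfl.sa_handler = SIG_DFL;
    dfl.sa_flags = 0;
    sigemptyset(&dfl.sa_mask);
    sigaction(signo, &dfl, NULL);
    raise(signo);
}

/* Hypothetical helper: returns the signal that terminated the child,
   0 if it was not killed by a signal, or -1 on error. */
int terminating_signal_after_fallback(int signo) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        struct sigaction sa;
        sa.sa_handler = fall_back_to_default;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        sigaction(signo, &sa, NULL);
        raise(signo);
        _exit(0);   /* not reached if the default action terminates us */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```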

Chaining and unloading

The simplest way to solve this issue is simply never unloading. While I haven't found a truly POSIX solution to this, there is a simple one that works on most Unices:

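#define _GNU_SOURCE // glibc declares dladdr() and Dl_info only when this is defined
#include <assert.h>
#include <dlfcn.h>
#include <string.h>

static void make_permanently_loaded(void)
{
    static char a_variable_in_the_module;
    // this is not POSIX but most BSDs, Linux and Mac have it
    Dl_info dl_info;
    memset(&dl_info, 0, sizeof(dl_info));
    // Look up which loaded module contains our own static variable
    int res = dladdr(&a_variable_in_the_module, &dl_info);
    assert(res && dl_info.dli_fname);
    // Leak a reference to ourselves
    void* me = dlopen(dl_info.dli_fname, RTLD_NOW | RTLD_NODELETE);
    assert(me);
    (void)me; // silence the unused-variable warning when NDEBUG is defined
}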

Other implementations

While I haven't really found similar problems here on SO, there are a few implementations that encountered the same problems as I did, and tried their best at handling them.

libsigsegv

libsigsegv simply discards the previous handlers and doesn't even attempt chaining to whatever was registered before:

sigaction (sig, &action, (struct sigaction *) NULL);

It also does not handle unloading: your process will abort if you unload it and then cause a SIGSEGV, even if you had a SIGSEGV handler registered prior to loading it.

It handles unhandled signals similarly to what I did in the question, by unregistering itself and letting the signal happen again, which will result in normal or abnormal termination.
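For contrast, a minimal sketch of what chaining could look like is given below: the old action is captured through sigaction's third argument and called from the new handler. This is not libsigsegv's code; every name in it is invented for the example, and it deliberately skips the harder parts (applying the previous handler's sa_mask and flags, and emulating SIG_DFL or SIG_IGN).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static struct sigaction previous_action;          /* what was installed before us */
static volatile sig_atomic_t chained_calls = 0;
static volatile sig_atomic_t earlier_calls = 0;

/* Stand-in for a handler that the application registered earlier. */
static void earlier_handler(int signo) {
    (void)signo;
    earlier_calls++;
}

void install_earlier_handler(int signo) {
    struct sigaction sa;
    sa.sa_handler = earlier_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(signo, &sa, NULL);
    assert(rc == 0);
}

static void chaining_handler(int signo, siginfo_t *info, void *context) {
    chained_calls++;   /* our own work would go here */

    /* Forward to the previous handler in whichever form it was registered. */
    if (previous_action.sa_flags & SA_SIGINFO) {
        previous_action.sa_sigaction(signo, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
    /* SIG_DFL and SIG_IGN are not callable and need separate treatment. */
}

void install_chaining_handler(int signo) {
    struct sigaction sa;
    sa.sa_sigaction = chaining_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Passing a non-NULL third argument saves the old action for chaining. */
    int rc = sigaction(signo, &sa, &previous_action);
    assert(rc == 0);
}
```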

OpenJDK / Hotspot

Java brings libjsig, which hooks signal and sigaction. When the JRE installs signal handlers, libjsig backs up the old ones. When someone else installs handlers for signals that the JRE has already claimed, it simply saves the new ones (and returns the previous old one). The JVM is expected to implement the actual chaining; the old handlers are only to be queried from libjsig. This approach has the advantage of being stackable: multiple different versions of libjsig may be loaded and they will work. Unfortunately, a single copy of the library can only be used by a single copy of a JRE (or similar), so as a library implementer you can't use it unless you are sure that no one will attempt to load a JRE into the same process. However, you can "fork" it and simply make a renamed copy of it for your purposes, making it safe to load next to a JRE in the same process.

The Hotspot JVM implementation contains the signal handling and the actual calling (chaining) of the handlers saved by libjsig. Unfortunately, the default-action branch is not implemented, as it instead decides to throw all unexpected signals as an UnexpectedException. However, the mask handling code is very useful for anyone else implementing chaining.

The unloading problem is not solved by libjsig - it is expected that the library will never be unloaded. You can add the anti-unloading code from the earlier part of the answer to make sure this is the case.

CLR

I did not review this in depth because it has the most complicated handling.

The CLR implements SEH exceptions (the Windows exception handling model) on top of POSIX signals, and a single level of chaining similar to the JRE. It might be possible to register your own SEH unwinding and exception handling information for your ranges of code, so if you don't mind pulling in a CLR dependency, this might be worth looking into.

Structured Exception Handling is the standard exception handling method on Windows, which specifies an unwinding information format. When a hardware exception is received, the stack is unwound based on the provided information, language specific handlers associated to the code ranges of every return address are invoked, and they may decide to handle an exception or not. This means exceptions (signals) are "resources" belonging to whatever code causes them (as long as a lower frame doesn't accidentally catch it due to a badly written filter function), unlike the *nix way where they're process-global. In my personal opinion this is a much more sensible approach.


