High Availability Computing: How to Deal with a Non-Returning System Call, Without Risking False Positives

High availability computing: How to deal with a non-returning system call, without risking false positives?

My second suggestion is to use ptrace to find the current instruction pointer. You can have a parent thread that ptraces your process and interrupts it every second to check the current RIP value. This is somewhat complex, so I've written a demonstration program: (x86_64 only, but that should be fixable by changing the register names.)

#define _GNU_SOURCE
#include <unistd.h>
#include <sched.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <linux/ptrace.h>
#include <sys/user.h>
#include <time.h>

// this number is arbitrary - find a better one.
#define STACK_SIZE (1024 * 1024)

int main_thread(void *ptr) {
    // "main" thread is now running under the monitor
    printf("Hello from main!");
    while (1) {
        int c = getchar();
        if (c == EOF) { break; }
        nanosleep(&(struct timespec) {0, 200 * 1000 * 1000}, NULL);
        putchar(c);
    }
    return 0;
}

int main(int argc, char *argv[]) {
    void *vstack = malloc(STACK_SIZE);
    pid_t v;
    if (clone(main_thread, vstack + STACK_SIZE, CLONE_PARENT_SETTID | CLONE_FILES | CLONE_FS | CLONE_IO, NULL, &v) == -1) { // you'll want to check these flags
        perror("failed to spawn child task");
        return 3;
    }
    printf("Target: %d; %d\n", v, getpid());
    long ptv = ptrace(PTRACE_SEIZE, v, NULL, NULL);
    if (ptv == -1) {
        perror("failed monitor sieze");
        exit(1);
    }
    struct user_regs_struct regs;
    fprintf(stderr, "beginning monitor...\n");
    while (1) {
        sleep(1);
        long ptv = ptrace(PTRACE_INTERRUPT, v, NULL, NULL);
        if (ptv == -1) {
            perror("failed to interrupt main thread");
            break;
        }
        int status;
        if (waitpid(v, &status, __WCLONE) == -1) {
            perror("target wait failed");
            break;
        }
        if (!WIFSTOPPED(status)) { // this section is messy. do it better.
            fputs("target wait went wrong", stderr);
            break;
        }
        if ((status >> 8) != (SIGTRAP | PTRACE_EVENT_STOP << 8)) {
            fputs("target wait went wrong (2)", stderr);
            break;
        }
        ptv = ptrace(PTRACE_GETREGS, v, NULL, ®s);
        if (ptv == -1) {
            perror("failed to peek at registers of thread");
            break;
        }
        fprintf(stderr, "%d -> RIP %x RSP %x\n", time(NULL), regs.rip, regs.rsp);
        ptv = ptrace(PTRACE_CONT, v, NULL, NULL);
        if (ptv == -1) {
            perror("failed to resume main thread");
            break;
        }
    }
    return 2;
}

Note that this is not production-quality code. You'll need to do a bunch of fixing things up.

Based on this, you should be able to figure out whether or not the program counter is advancing, and could combine this with other pieces of information (such as /proc/PID/status) to find if it's busy in a system call. You might also be able to extend the usage of ptrace to check what system calls are being used, so that you can check if it's a reasonable one to be waiting on.

This is a hacky solution, but I don't think that you'll find a non-hacky solution for this problem. Despite the hackiness, I don't think (this is untested) that it would be particularly slow; my implementation pauses the monitored thread once per second for a very short amount of time - which I would guess would be in the 100s of microseconds range. That's around 0.01% efficiency loss, theoretically.

Is it possible to read another thread's program counter?

On Linux, field 30 of /proc/self/task/%d/stat, where %d needs to be filled in with the kernel tid of the thread in question, contains the last-observed instruction pointer value for the thread. See http://man7.org/linux/man-pages/man5/proc.5.html and note that it's documented under /proc/[pid]/stat but the version in the task directory under the current process is the one you want for targeting a thread.

The hard part may be getting the kernel tid for the thread. The easiest way to do this is to call syscall(SYS_gettid) from the thread and have it store its kernel tid somewhere.

High Availability Computing: How to Deal with a Non-Returning System Call, Without Risking False Positives

High availability computing: How to deal with a non-returning system call, without risking false positives?

Is it possible to read another thread's program counter?

Related Topics

Leave a reply