Prevent File Descriptors Inheritance During Linux Fork

Prevent file descriptors inheritance during Linux fork

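No. fork() always copies every open descriptor into the child, so close the unwanted ones yourself in the child right after fork(); you know which ones need to be closed.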
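As a rough sketch of closing them yourself, the child can close a known list of descriptors immediately after fork(). The helper names fork_and_close, fd_is_open and wait_exit_status are made up for this illustration and are not part of any library.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fd is an open descriptor in the calling process, else 0. */
static int fd_is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}

/* Hypothetical helper: fork(), then close fds[0..n-1] in the child only.
   Returns like fork(): the child's pid in the parent, 0 in the child,
   -1 on error. The parent's descriptors are left untouched. */
static pid_t fork_and_close(const int *fds, size_t n) {
    pid_t pid = fork();
    if (pid == 0) {
        for (size_t i = 0; i < n; i++)
            close(fds[i]);
    }
    return pid;
}

/* Waits for pid; returns its exit status, or -1 if it did not exit normally. */
static int wait_exit_status(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Closing a descriptor in the child does not affect the parent's copy, so the parent can keep using it.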

Linux: how to mark a file descriptor as not inheritable on fork?

No. All file descriptors are inherited across fork(). You can, however, mark an fd to be closed on exec by using fcntl(fd, F_SETFD, FD_CLOEXEC).

Are file descriptors shared when fork()ing?

From fork(2):

  *  The child inherits copies of the parent’s set of open file
     descriptors. Each file descriptor in the child refers to the same
     open file description (see open(2)) as the corresponding file
     descriptor in the parent. This means that the two descriptors share
     open file status flags, current file offset, and signal-driven I/O
     attributes (see the description of F_SETOWN and F_SETSIG in
     fcntl(2)).
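The shared file offset can be observed with a small experiment, sketched here under the assumption that /tmp is writable. The function name parent_offset_after_child_read is invented for this example. The child reads 4 bytes from an inherited descriptor, and the parent then asks for its own current offset on the same descriptor.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the parent's file offset after a forked
   child has read 4 bytes from the inherited descriptor, or -1 on error. */
static off_t parent_offset_after_child_read(void) {
    char path[] = "/tmp/fdshare-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return (off_t)-1;
    unlink(path); /* the file stays usable until fd is closed */

    off_t result = (off_t)-1;
    if (write(fd, "0123456789", 10) == 10 && lseek(fd, 0, SEEK_SET) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: advance the offset of the shared open file description. */
            char buf[4];
            _exit(read(fd, buf, sizeof buf) == (ssize_t)sizeof buf ? 0 : 1);
        }
        int status;
        if (pid > 0 && waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = lseek(fd, 0, SEEK_CUR); /* parent's view of the offset */
    }
    close(fd);
    return result;
}
```

Because both descriptors refer to the same open file description, the parent sees the offset moved by the child's read rather than the 0 it set before forking.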

Prevent process from opening new file descriptor on Linux but allow receiving file descriptors via sockets

What you have here is exactly the use case of seccomp.

Using seccomp, you can filter syscalls in different ways. In this situation, install a seccomp filter right after fork() that disallows open(2), openat(2), socket(2), and any other syscall that can create a new file descriptor.
To accomplish this, you can do the following:

  1. First, create a seccomp context using seccomp_init(3) with the default behavior of SCMP_ACT_ALLOW.
  2. Then add a rule to the context using seccomp_rule_add(3) for each syscall that you want to deny. You can use SCMP_ACT_KILL to kill the calling thread if the syscall is attempted (SCMP_ACT_KILL_PROCESS kills the whole process, which matters for multi-threaded programs), SCMP_ACT_ERRNO(val) to make the syscall fail with the specified errno value, or any other action value defined in the manual page.
  3. Load the context using seccomp_load(3) to make it effective.

Before continuing, NOTE that a blacklist approach like this one is in general weaker than a whitelist approach. It allows any syscall that is not explicitly disallowed, which could let the child bypass the filter through a syscall you forgot to list. If you believe that the child process could maliciously try to avoid the filter, or if you already know which syscalls the child will need, a whitelist approach is better, and you should do the opposite of the above: create a filter with the default action of SCMP_ACT_KILL and allow the needed syscalls with SCMP_ACT_ALLOW. In terms of code the difference is minimal (the whitelist is probably longer, but the steps are the same).

Here's an example of the above (I'm doing exit(-1) in case of error just for simplicity's sake):

#include <stdlib.h>
#include <seccomp.h>

static void secure(void) {
    int err;
    scmp_filter_ctx ctx;

    int blacklist[] = {
        SCMP_SYS(open),
        SCMP_SYS(openat),
        SCMP_SYS(creat),
        SCMP_SYS(socket),
        SCMP_SYS(open_by_handle_at),
        // ... possibly more ...
    };

    // Create a new seccomp context, allowing every syscall by default.
    ctx = seccomp_init(SCMP_ACT_ALLOW);
    if (ctx == NULL)
        exit(-1);

    /* Now add a rule for each syscall that you want to disallow.
       In this case, we'll use SCMP_ACT_KILL so that the kernel kills
       the caller if it attempts to execute the specified syscall. */
    for (size_t i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++) {
        err = seccomp_rule_add(ctx, SCMP_ACT_KILL, blacklist[i], 0);
        if (err)
            exit(-1);
    }

    // Load the context into the kernel, making it effective.
    err = seccomp_load(ctx);
    if (err)
        exit(-1);

    // The filter now lives in the kernel; free the userspace context.
    seccomp_release(ctx);
}

Now, in your program, you can call the above function to apply the seccomp filter right after the fork(), like this:

child_pid = fork();
if (child_pid == -1)
    exit(-1);

if (child_pid == 0) {
    secure();

    // Child code here...

    exit(0);
} else {
    // Parent code here...
}

A few important notes on seccomp:

  • A seccomp filter, once applied, cannot be removed or altered by the process.
  • If fork(2) or clone(2) are allowed by the filter, any child processes will be constrained by the same filter.
  • If execve(2) is allowed, the existing filter will be preserved across a call to execve(2).
  • If the prctl(2) or seccomp(2) syscall is allowed, the process is able to apply further filters. These can only add restrictions; they cannot loosen the ones already in place.
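The question also asks that the child still be able to receive descriptors over a socket. That works as long as the filter does not block sendmsg(2) and recvmsg(2): the parent can pass an already-open descriptor over a Unix domain socket (for example one created with socketpair(2) before the fork) as SCM_RIGHTS ancillary data. Here is a minimal sketch; send_fd and recv_fd are names chosen for this example, and each message carries exactly one descriptor plus one dummy byte.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h> /* read(), write(), close() for callers */

/* Control buffer sized and aligned for a single descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: send fd over the Unix socket sock.
   Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd) {
    char dummy = 'x'; /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from sock.
   Returns the new descriptor, or -1 on error. */
static int recv_fd(int sock) {
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The descriptor returned by recv_fd is a new number in the receiving process that refers to the same open file description as the one the sender passed, so the filtered child never has to call open(2) or socket(2) itself.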