How to Find Out What Linux Capabilities a Process Requires to Work

How to find out what linux capabilities a process requires to work?

Based on recent libcap2 update

1: (Short option): getpcaps

Description:

From here:

getpcaps displays the capabilities on the processes indicated by the
pid value(s) given on the command line.

Example:

$ getpcaps <PID>
PID: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i

2: (A bit longer option): /proc status and capsh

Description:

proc is a process information pseudo-filesystem or in other words - a directory where you can view information on all processes.

About capsh:

Linux capability support and use can be explored and constrained with
this tool. This tool provides a handy wrapper for certain types of
capability testing and environment creation.
It also provides
some debugging features useful for summarizing capability state.

Example:

$ cat /proc/<PID>/status | grep Cap

And you'll get (on most systems):

CapInh: 00000000a80425fb (Inherited capabilities)
CapPrm: 0000000000000000 (Permitted capabilities)
CapEff: 0000000000000000 (Effective capabilities)
CapBnd: 00000000a80425fb (Bounding set)
CapAmb: 000000000000000  (Ambient capabilities set)

Use the capsh utility to decode from hexadecimal numbers into the capabilities name:

capsh --decode=00000000a80425fb
0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap

(*) You can download capsh with: sudo apt-get install git libpcap-dev.

How to compute the minimal capabilities' set for a process?

Two possible approaches to determine required capabilities at runtime:

Subsequently run your program under strace without root privileges. Determine which system calls failed with EPERM and add corresponding capabilities to your program. Repeat this until all capabilities are gathered.
Use SystemTap, DTrace or Kprobes to log or
intercept capability checks in kernel made for your program. (e.g. use capable from BCC tools suite as described here)

Unit tests with good coverage will help a lot, I guess. Also note that capabilities(7) manual page lists system calls that may require each capability (although it is not a complete list).

Update:

The article referenced by @RodrigoBelem mentions capable_probe module, which is based on KProbes.

Original article with this module was "POSIX file capabilities: Parceling the power of root" and it's not availble now (it was hosted here). But you can find the source code and some docs in the Internet.

Request Linux Capabilities During Runtime

The most common method to do provide extra capabilities to a process is to assign filesystem capabilities to its binary.

For example, if you want the processes executing /sbin/yourprog to have the CAP_CHOWN capability, add that capability to the permitted and effective sets of that file: sudo setcap cap_chown=ep /sbin/yourprog.

The setcap utility is provided by the libcap2-bin package, and is installed by default on most Linux distributions.

It is also possible to provide the capabilities to the original process, and have that process manipulate its effective capability set as needed. For example, Wireshark's dumpcap is typically installed with CAP_NET_ADMIN and CAP_NET_RAW filesystem capabilities in the effective, permitted, and inheritable sets.

I dislike the idea of adding any filesystem capabilities to the inheritable set. When the capabilities are not in the inheritable set, executing another binary causes the kernel to drop those capabilities (assuming KEEPCAPS is zero; see prctl(PR_SET_KEEPCAPS) and man 7 capabilities for details).

As an example, if you granted /sbin/yourprog only the CAP_CHOWN capability and only in the permitted set (sudo setcap cap_chown=p /sbin/yourprog), then the CAP_CHOWN capability will not be automatically effective, and it will be dropped if the process executes some other binary. To use the CAP_CHOWN capability, a thread can add the capability to its effective set for the duration of the operations needed, then remove it from the effective set (but keep it in the permitted set), via prctl() calls. Note that the libcap cap_get_proc()/cap_set_proc() interface applies the changes to all threads in the process, which may not be what you want.

For temporarily granting a capability, a worker sub-process can be used. This makes sense for a complex process, as it allows delegating/separating the privileged operations to a separate binary. A child process is forked, connected to the parent via an Unix domain stream or datagram socket created via socketpair(), and executes the helper binary that grants it the necessary capabilities. It then uses the Unix domain stream socket to verify the identity (process ID, user ID, group ID, and via the process ID, the executable the other end of the socket is executing). The reason a pipe is not used, is that an Unix domain stream socket or datagram socketpair socket is needed to use the SO_PEERCRED socket option to query the kernel the identity of the other end of the socket.

There are known attack patterns that need to be anticipated and thwarted. The most common attack pattern is causing the parent process to immediately execute a compromised binary after forking and executing the privileged child process, timed just right so the capabled child process trusts the other end is its proper parent executing the proper binary, but in fact control has been transferred to a completely different, compromised or untrustworthy binary.

The details on exactly how to do this securely are a software engineering question much more than a programming question, but using socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, fdpair) and verifying the socket peer is the parent process still executing the expected binary more than just once at the beginning, are the key steps needed.

The simplest example I can think of is using prctl() and CAP_NET_BIND_SERVICE filesystem capability only in the permitted set, so that an otherwise unprivileged process can use a privileged port (1-1024, preferably a system-wide subset defined/listed in a root or admin-owned configuration file somewhere under /etc) to provide a network service. If the service will close and reopen its listening socket when told to do so (perhaps via SIGUSR1 signal), the listening socket cannot simply be created once at the beginning then dropped. It is a pretty good match for the "keep in permitted set, but only add to effective set of the thread that actually needs it, then drop it immediately afterwards" pattern.

For CAP_CHOWN, an example program might acquire it into its effective and permitted sets via the filesystem capability, but use a trusted configuration file (root/admin modifiable only) to list the ownership changes it is allowed to do based on the real user and group identity running the process. Consider a dedicated "sudo"-style "chown" utility, intended for say organizations to allow team leads to shift file ownership between their team members, but one that does not use sudo.)

Test for linux CAP_FOWNER capability in C?

I think that capable() is available within the kernel sources, but not for general use. If you are writing a device driver or module then it should be available.

If you are writing a user space program, then you might be able to use functions provided by libcap; see man capabilities(7) and man libcap(3). I'd suggest #include <sys/capability.h> and use cap_get_proc() and possibly CAP_IS_SUPPORTED(CAP_FOWNER).

If that's no good the obvious workaround is to attempt chmod() on the directory and handle possible failure.

Linux capabilities to launch process as root from a user mode program in C++

Try this (parent.cc):

#include <iostream>
#include <sys/capability.h>
#include <unistd.h>

int main() {
    cap_t caps = cap_get_proc();
    cap_value_t val = CAP_SETUID;
    cap_set_flag(caps, CAP_EFFECTIVE, 1, &val, CAP_SET);
    if (cap_set_proc(caps)) {
        perror("failed to raise cap_setuid");
        exit(1);
    }
    if (setuid(0)) {
        perror("unable to setuid");
        exit(1);
    }
    execl("./child.sh", "child.sh", NULL);
    std::cout << "didn't work, uid=" << getuid();
    exit(1);
}

With this (child.sh):

#!/bin/bash
id -u

Build and set things up:

$ chmod +x child.sh
$ g++ -o parent parent.cc -lcap
$ sudo setcap cap_setuid=p ./parent

If you run ./parent it should work like this:

$ ./parent 
0

This example is single threaded. If your app is single threaded, it should be sufficient. If your program is multithreaded, you might need to explore something like libpsx.

How to execve a process, retaining capabilities in spite of missing filesystem-based capabilities?

I am not saying that I recommend this for what you are doing, but here it is.

Extracted from the manual, There have been some changes. According to it: fork does not change capabilities. And now there is an ambient set added in Linux kernel 4.3, it seems that this is for what you are trying to do.

   Ambient (since Linux 4.3):
          This is a set of capabilities that are preserved across an execve(2) of a program that is not privileged.  The ambient capability set obeys the invariant that no capability can ever
          be ambient if it is not both permitted and inheritable.

          The ambient capability set can be directly modified using
          prctl(2).  Ambient capabilities are automatically lowered if
          either of the corresponding permitted or inheritable
          capabilities is lowered.

          Executing a program that changes UID or GID due to the set-
          user-ID or set-group-ID bits or executing a program that has
          any file capabilities set will clear the ambient set.  Ambient
          capabilities are added to the permitted set and assigned to
          the effective set when execve(2) is called.

   A child created via fork(2) inherits copies of its parent's
   capability sets.  See below for a discussion of the treatment of
   capabilities during execve(2).

Transformation of capabilities during execve()
   During an execve(2), the kernel calculates the new capabilities of
   the process using the following algorithm:

       P'(ambient) = (file is privileged) ? 0 : P(ambient)

       P'(permitted) = (P(inheritable) & F(inheritable)) |
                       (F(permitted) & cap_bset) | P'(ambient)

       P'(effective) = F(effective) ? P'(permitted) : P'(ambient)

       P'(inheritable) = P(inheritable)    [i.e., unchanged]

   where:

       P         denotes the value of a thread capability set before the
                 execve(2)

       P'        denotes the value of a thread capability set after the
                 execve(2)

       F         denotes a file capability set

       cap_bset  is the value of the capability bounding set (described
                 below).

   A privileged file is one that has capabilities or has the set-user-ID
   or set-group-ID bit set.

How to Find Out What Linux Capabilities a Process Requires to Work