How to set CAP_SYS_NICE capability to a Linux user?
Jan Hudec is right that a process can't just give itself a capability, and a setuid wrapper is the obvious way to get the capability. Also, keep in mind that you'll need to call prctl(PR_SET_KEEPCAPS, ...) when you drop root. (See the prctl man page for details.) Otherwise, you'll lose the capability when you transition to your non-root real user ID.
If you really just want to launch user sessions with a different allowed nice level, see the pam_limits and limits.conf man pages: the pam_limits module allows you to change the hard nice limit. It could be a line like:
yourspecialusername hard nice -10
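To check from inside a program what such a limit allows, one option is to query RLIMIT_NICE. The sketch below is only an illustration: the function names and the error sentinel of 100 are made up for this example, and it relies on the formula documented in getrlimit(2), where the lowest permitted nice value is 20 minus the soft limit.

```c
#include <sys/resource.h>

/* Convert an RLIMIT_NICE value (1..40) to the lowest nice value it permits.
 * getrlimit(2) documents this as 20 - rlim_cur. */
int lowest_nice_for_limit(rlim_t limit)
{
    if (limit > 40)
        limit = 40;             /* nice values never go below -20 */
    return 20 - (int)limit;
}

/* Lowest nice value this process may set without CAP_SYS_NICE.
 * Returns 100 (a hypothetical sentinel outside the nice range) on error. */
int lowest_allowed_nice(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NICE, &rl) == -1)
        return 100;
    return lowest_nice_for_limit(rl.rlim_cur);
}
```

Note that the kernel enforces the soft limit (rlim_cur), so a session that only has the hard limit raised may still need to raise its soft limit before a lower nice value becomes available.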
Drop root UID while retaining CAP_SYS_NICE
Edited to describe the reason for the original failure:
There are three sets of capabilities in Linux: inheritable, permitted, and effective. Inheritable defines which capabilities stay permitted across an exec(). Permitted defines which capabilities are permitted for a process. Effective defines which capabilities are currently in effect.
When changing the owner or group of a process from root to non-root, the effective capability set is always cleared.
By default, the permitted capability set is cleared as well, but calling prctl(PR_SET_KEEPCAPS, 1L) before the identity change tells the kernel to keep the permitted set intact.
After the process has changed its identity back to the unprivileged user, CAP_SYS_NICE must be added to the effective set. (It must also be in the permitted set, so if you clear your capability set, remember to set it there too. If you just modify the current capability set, you know it is already set because you inherited it.)
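One way to watch the three sets change while experimenting is to read them from /proc/self/status, which exposes them as the hexadecimal fields CapInh, CapPrm, and CapEff. This is a minimal sketch that does not need libcap; the helper names read_cap_mask() and mask_has_cap() are invented for this example.

```c
#include <stdio.h>
#include <string.h>
#include <linux/capability.h>   /* CAP_SYS_NICE */

/* Read one capability mask ("CapInh", "CapPrm" or "CapEff") of the calling
 * process from /proc/self/status. Returns 0 on success, -1 on failure. */
int read_cap_mask(const char *field, unsigned long long *mask)
{
    char line[256];
    size_t len = strlen(field);
    FILE *in = fopen("/proc/self/status", "r");
    int result = -1;

    if (!in)
        return -1;
    while (fgets(line, sizeof line, in)) {
        /* Lines look like "CapEff:\t0000000000800000". */
        if (!strncmp(line, field, len) && line[len] == ':' &&
            sscanf(line + len + 1, "%llx", mask) == 1) {
            result = 0;
            break;
        }
    }
    fclose(in);
    return result;
}

/* Nonzero if capability number cap is present in mask. */
int mask_has_cap(unsigned long long mask, int cap)
{
    return (int)((mask >> cap) & 1ULL);
}
```

Calling read_cap_mask("CapEff", &mask) followed by mask_has_cap(mask, CAP_SYS_NICE) right after the identity change should show the capability missing from the effective set, while the same check on "CapPrm" shows whether PR_SET_KEEPCAPS preserved it.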
Here is the procedure I recommend you should follow:
Save real user ID, real group ID, and supplemental group IDs:
#define _GNU_SOURCE
#define _BSD_SOURCE
#include <unistd.h>
#include <sys/types.h>
#include <sys/capability.h>
#include <sys/prctl.h>
#include <grp.h>
#include <stdlib.h> /* malloc() */
uid_t user = getuid();
gid_t group = getgid();
gid_t *gid;
int gids, n;
gids = getgroups(0, NULL);
if (gids < 0) /* error */
gid = malloc((gids + 1) * sizeof *gid);
if (!gid) /* error */
gids = getgroups(gids, gid);
if (gids < 0) /* error */
Filter out unnecessary and privileged supplementary groups (be paranoid!)
n = 0;
while (n < gids)
    if (gid[n] == 0 || gid[n] == group)
        gid[n] = gid[--gids];
    else
        n++;
Because you cannot "clear" the supplementary group IDs (that just requests the current number), make sure the list is never empty. You can always add the real group ID to the supplementary list to make it non-empty.
if (gids < 1) {
    gid[0] = group;
    gids = 1;
}
Switch real, effective, and saved user IDs to root
if (setresuid(0, 0, 0)) /* error */
Set the CAP_SYS_NICE capability in the CAP_PERMITTED set.
I prefer to clear the entire set, and only keep the four capabilities that are required for this approach to work (and later on, drop all but CAP_SYS_NICE):
cap_value_t capability[4] = { CAP_SYS_NICE, CAP_SETUID, CAP_SETGID, CAP_SETPCAP };
cap_t capabilities;
capabilities = cap_get_proc();
if (cap_clear(capabilities)) /* error */
if (cap_set_flag(capabilities, CAP_EFFECTIVE, 4, capability, CAP_SET)) /* error */
if (cap_set_flag(capabilities, CAP_PERMITTED, 4, capability, CAP_SET)) /* error */
if (cap_set_proc(capabilities)) /* error */
Tell the kernel you wish to retain the capabilities over the change from root to the unprivileged user; by default, the capabilities are cleared to zero when changing from root to non-root identity
if (prctl(PR_SET_KEEPCAPS, 1L)) /* error */
Set real, effective, and saved group IDs to the initially saved group ID
if (setresgid(group, group, group)) /* error */
Set supplemental group IDs
if (setgroups(gids, gid)) /* error */
Set real, effective and saved user IDs to the initially saved user ID
if (setresuid(user, user, user)) /* error */
At this point you have effectively dropped root privileges (without the ability to regain them), except for the CAP_SYS_NICE capability. Because of the transition from root to a non-root user, the capability is not effective at this point; the kernel always clears the effective capability set on such a transition.
Set the CAP_SYS_NICE capability in the CAP_PERMITTED and CAP_EFFECTIVE sets
if (cap_clear(capabilities)) /* error */
if (cap_set_flag(capabilities, CAP_PERMITTED, 1, capability, CAP_SET)) /* error */
if (cap_set_flag(capabilities, CAP_EFFECTIVE, 1, capability, CAP_SET)) /* error */
if (cap_set_flag(capabilities, CAP_PERMITTED, 3, capability + 1, CAP_CLEAR)) /* error */
if (cap_set_flag(capabilities, CAP_EFFECTIVE, 3, capability + 1, CAP_CLEAR)) /* error */
if (cap_set_proc(capabilities)) /* error */
Note that the latter two cap_set_flag() operations clear the three capabilities no longer needed, so that only the first one, CAP_SYS_NICE, remains.
At this point the capability descriptor is no longer needed, so it's a good idea to free it.
if (cap_free(capabilities)) /* error */
Tell the kernel you don't wish to retain the capability over any further changes from root (again, just paranoia)
if (prctl(PR_SET_KEEPCAPS, 0L)) /* error */
This works on x86-64 using GCC-4.6.3, libc6-2.15.0ubuntu10.3, and linux-3.5.0-18 kernel on Xubuntu 12.04.1 LTS, after installing the libcap-dev package.
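After running through the steps above, it can be reassuring to verify the result. The following is a rough sketch, with function names invented for this example, that checks whether any of the real, effective, or saved IDs is still root and reads back the keep-capabilities flag.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/types.h>
#include <sys/prctl.h>

/* Returns 1 if none of the real, effective, or saved user and group IDs
 * is 0 (root), 0 if at least one still is, and -1 on error. */
int identity_fully_dropped(void)
{
    uid_t ruid, euid, suid;
    gid_t rgid, egid, sgid;

    if (getresuid(&ruid, &euid, &suid) == -1 ||
        getresgid(&rgid, &egid, &sgid) == -1)
        return -1;
    return ruid != 0 && euid != 0 && suid != 0 &&
           rgid != 0 && egid != 0 && sgid != 0;
}

/* Returns the current "keep capabilities" flag (0 or 1), or -1 on error. */
int keepcaps_flag(void)
{
    return prctl(PR_GET_KEEPCAPS, 0L, 0L, 0L, 0L);
}
```

At the end of the procedure one would expect identity_fully_dropped() to return 1 and keepcaps_flag() to return 0. The supplementary groups are not inspected here; getgroups() could be used for that in the same way as in the first step.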
Edited to add:
You can simplify the process by relying only on the effective user ID being root, as the executable is setuid root. In that case, you don't need to worry about the supplementary groups either, as setuid root only affects the effective user ID and nothing else. To return to the original real user, you technically only need the one setresuid() call at the end of the procedure (and the setresgid() call if the executable also happens to be marked setgid root), to set both the saved and effective user (and group) IDs to the real user.
However, the case where you regain the original user's identity is rare, and the case where you gain the identity of a named user is common, and this procedure was originally designed for the latter. You would use initgroups() to gain the correct supplementary groups for the named user, and so on. In that case, taking care of the real, effective, and saved user and group IDs and the supplementary group IDs this carefully is important, as otherwise the process would inherit supplementary groups from the user that executed the process.
The procedure here is paranoid, but paranoia is not a bad thing when you are dealing with security-sensitive issues. For the revert-back-to-real-user case, it can be simplified.
Edited on 2013-03-17 to show a simple test program. This assumes it is installed setuid root, but it will drop all privileges and capabilities (except CAP_SYS_NICE, which is required for scheduler manipulation above the normal rules). I pared down the "excess" operations I prefer to do, in the hopes that others find this easier to read.
#define _GNU_SOURCE
#define _BSD_SOURCE
#include <unistd.h>
#include <sys/types.h>
#include <sys/capability.h>
#include <sys/prctl.h>
#include <grp.h>
#include <errno.h>
#include <string.h>
#include <sched.h>
#include <stdio.h>
void test_priority(const char *const name, const int policy)
{
    const pid_t me = getpid();
    struct sched_param param;

    param.sched_priority = sched_get_priority_max(policy);
    printf("sched_get_priority_max(%s) = %d\n", name, param.sched_priority);
    if (sched_setscheduler(me, policy, &param) == -1)
        printf("sched_setscheduler(getpid(), %s, { %d }): %s.\n", name, param.sched_priority, strerror(errno));
    else
        printf("sched_setscheduler(getpid(), %s, { %d }): Ok.\n", name, param.sched_priority);

    param.sched_priority = sched_get_priority_min(policy);
    printf("sched_get_priority_min(%s) = %d\n", name, param.sched_priority);
    if (sched_setscheduler(me, policy, &param) == -1)
        printf("sched_setscheduler(getpid(), %s, { %d }): %s.\n", name, param.sched_priority, strerror(errno));
    else
        printf("sched_setscheduler(getpid(), %s, { %d }): Ok.\n", name, param.sched_priority);
}

int main(void)
{
    uid_t user;
    cap_value_t root_caps[2] = { CAP_SYS_NICE, CAP_SETUID };
    cap_value_t user_caps[1] = { CAP_SYS_NICE };
    cap_t capabilities;

    /* Get real user ID. */
    user = getuid();

    /* Get full root privileges. Normally being effectively root
     * (see man 7 credentials, User and Group Identifiers, for explanation
     * for effective versus real identity) is enough, but some security
     * modules restrict actions by processes that are only effectively root.
     * To make sure we don't hit those problems, we switch to root fully. */
    if (setresuid(0, 0, 0)) {
        fprintf(stderr, "Cannot switch to root: %s.\n", strerror(errno));
        return 1;
    }

    /* Create an empty set of capabilities. */
    capabilities = cap_init();

    /* Capabilities have three subsets:
     *   INHERITABLE: Capabilities permitted after an execv()
     *   EFFECTIVE:   Currently effective capabilities
     *   PERMITTED:   Limiting set for the two above.
     * See man 7 capabilities for details, Thread Capability Sets.
     *
     * We need the following capabilities:
     *   CAP_SYS_NICE For nice(2), setpriority(2),
     *                sched_setscheduler(2), sched_setparam(2),
     *                sched_setaffinity(2), etc.
     *   CAP_SETUID   For setuid(), setresuid()
     * in the last two subsets. We do not need to retain any capabilities
     * over an exec().
     */
    if (cap_set_flag(capabilities, CAP_PERMITTED, sizeof root_caps / sizeof root_caps[0], root_caps, CAP_SET) ||
        cap_set_flag(capabilities, CAP_EFFECTIVE, sizeof root_caps / sizeof root_caps[0], root_caps, CAP_SET)) {
        fprintf(stderr, "Cannot manipulate capability data structure as root: %s.\n", strerror(errno));
        return 1;
    }

    /* Above, we just manipulated the data structure describing the flags,
     * not the capabilities themselves. So, set those capabilities now. */
    if (cap_set_proc(capabilities)) {
        fprintf(stderr, "Cannot set capabilities as root: %s.\n", strerror(errno));
        return 1;
    }

    /* We wish to retain the capabilities across the identity change,
     * so we need to tell the kernel. */
    if (prctl(PR_SET_KEEPCAPS, 1L)) {
        fprintf(stderr, "Cannot keep capabilities after dropping privileges: %s.\n", strerror(errno));
        return 1;
    }

    /* Drop extra privileges (aside from capabilities) by switching
     * to the original real user. */
    if (setresuid(user, user, user)) {
        fprintf(stderr, "Cannot drop root privileges: %s.\n", strerror(errno));
        return 1;
    }

    /* We can still switch to a different user due to having the CAP_SETUID
     * capability. Let's clear the capability set, except for the CAP_SYS_NICE
     * in the permitted and effective sets. */
    if (cap_clear(capabilities)) {
        fprintf(stderr, "Cannot clear capability data structure: %s.\n", strerror(errno));
        return 1;
    }
    if (cap_set_flag(capabilities, CAP_PERMITTED, sizeof user_caps / sizeof user_caps[0], user_caps, CAP_SET) ||
        cap_set_flag(capabilities, CAP_EFFECTIVE, sizeof user_caps / sizeof user_caps[0], user_caps, CAP_SET)) {
        fprintf(stderr, "Cannot manipulate capability data structure as user: %s.\n", strerror(errno));
        return 1;
    }

    /* Apply modified capabilities. */
    if (cap_set_proc(capabilities)) {
        fprintf(stderr, "Cannot set capabilities as user: %s.\n", strerror(errno));
        return 1;
    }

    /*
     * Now we have just the normal user privileges,
     * plus user_caps.
     */
    test_priority("SCHED_OTHER", SCHED_OTHER);
    test_priority("SCHED_BATCH", SCHED_BATCH);
    test_priority("SCHED_IDLE", SCHED_IDLE);
    test_priority("SCHED_FIFO", SCHED_FIFO);
    test_priority("SCHED_RR", SCHED_RR);
    return 0;
}
Note that if you know the binary is only run on relatively recent Linux kernels, you can rely on file capabilities. Then your main() needs none of the identity or capability manipulation (you can remove everything in main() except the test_priority() calls), and you just give your binary, say ./testprio, the CAP_SYS_NICE capability:
sudo setcap 'cap_sys_nice=pe' ./testprio
You can run getcap to see which capabilities are granted when a binary is executed:
getcap ./testprio
which should display
./testprio = cap_sys_nice+ep
File capabilities seem to be little used thus far. On my own system, gnome-keyring-daemon is the only one with file capabilities (CAP_IPC_LOCK, for locking memory).
How to compute the minimal capabilities' set for a process?
Two possible approaches to determine the required capabilities at runtime:
- Repeatedly run your program under strace without root privileges. Determine which system calls failed with EPERM and add the corresponding capabilities to your program. Repeat this until all capabilities are gathered.
- Use SystemTap, DTrace, or Kprobes to log or intercept the capability checks the kernel makes for your program (e.g. use capable from the BCC tools suite).
Unit tests with good coverage will help a lot, I guess. Also note that the capabilities(7) manual page lists system calls that may require each capability (although it is not a complete list).
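When working through strace output by hand, a small lookup table can speed up the mapping from a failed system call to a candidate capability. The sketch below is only a starting point: the pairs are a handful taken from capabilities(7), the table and the function name capability_hint() are invented for this example, and an EPERM can also have causes unrelated to capabilities.

```c
#include <stddef.h>
#include <string.h>

/* A few syscall-to-capability pairs, taken from capabilities(7).
 * Deliberately incomplete: extend it for the calls your program makes. */
static const struct {
    const char *syscall_name;
    const char *capability;
} cap_hints[] = {
    { "nice",               "CAP_SYS_NICE" },
    { "setpriority",        "CAP_SYS_NICE" },
    { "sched_setscheduler", "CAP_SYS_NICE" },
    { "sched_setparam",     "CAP_SYS_NICE" },
    { "sched_setaffinity",  "CAP_SYS_NICE" },
    { "setuid",             "CAP_SETUID" },
    { "setresuid",          "CAP_SETUID" },
    { "setgroups",          "CAP_SETGID" },
    { "mlock",              "CAP_IPC_LOCK" },
    { "mlockall",           "CAP_IPC_LOCK" },
    { "chroot",             "CAP_SYS_CHROOT" },
    { "settimeofday",       "CAP_SYS_TIME" },
};

/* Returns the capability that most likely covers the given system call,
 * or NULL if the table has no entry for it. */
const char *capability_hint(const char *syscall_name)
{
    size_t i;

    for (i = 0; i < sizeof cap_hints / sizeof cap_hints[0]; i++)
        if (!strcmp(cap_hints[i].syscall_name, syscall_name))
            return cap_hints[i].capability;
    return NULL;
}
```

For example, capability_hint("sched_setscheduler") returns "CAP_SYS_NICE", while a call the table does not know about returns NULL and has to be looked up in the manual page.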
Update:
The article referenced by @RodrigoBelem mentions the capable_probe module, which is based on KProbes.
The original article with this module was "POSIX file capabilities: Parceling the power of root". It is no longer available at its original location, but you can find the source code and some documentation on the Internet.
Linux capabilities (setcap) seems to disable LD_LIBRARY_PATH
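Yes, it's disabled for security reasons. A binary with file capabilities (like a setuid or setgid binary) runs in secure-execution mode, and in that mode the dynamic linker ignores LD_LIBRARY_PATH.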
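A program can check whether it was started in the mode that causes the dynamic linker to ignore LD_LIBRARY_PATH. A minimal sketch, assuming glibc's getauxval() is available (the function name in_secure_execution_mode() is made up for this example):

```c
#include <sys/auxv.h>

/* Returns 1 if the kernel flagged this process for secure execution
 * (setuid/setgid binary or file capabilities), 0 otherwise. */
int in_secure_execution_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}
```

If this returns 1, library search paths have to come from somewhere else, for example an rpath embedded at link time or an entry in the system's ld.so configuration.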
Linux: setting process priority AND dynamically loading libraries
I have two solutions which do not involve modifying libc. Both solutions require us to replace the calls to sched_setscheduler() with a call to launch another process directly.
- Install a file to /etc/sudoers.d/ with the following line:
%users ALL=NOPASSWD: /usr/bin/chrt
Then from our application launch sudo as a process with arguments chrt -f -p X Y, where X is the configured priority and Y is the result of getpid().
- Create a custom chrt with:
cp $(which chrt) $(DESTDIR)/bin/chrt
sudo setcap cap_sys_nice+ep $(DESTDIR)/bin/chrt
sudo chmod 755 $(DESTDIR)/bin/chrt
Then from our application launch chrt as a process with arguments -f -p X Y
Not sure which solution is better. Note this is effectively embedded (or at least purpose built) so I'm not too worried about the security exposure.
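Launching chrt from the application could look roughly like the following sketch of the second variant. The helper names run_and_wait() and set_fifo_via_chrt() are invented for this example, and the chrt_path parameter is an assumption: it should point at wherever the capability-enabled copy was installed.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run argv[0] (looked up in PATH if it has no slash) and wait for it.
 * Returns its exit status, 127 if it could not be executed, or -1 if
 * fork()/waitpid() failed or the child was killed by a signal. */
int run_and_wait(char *const argv[])
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Ask chrt to switch process pid to SCHED_FIFO at the given priority,
 * i.e. run "chrt -f -p X Y" as described above. */
int set_fifo_via_chrt(const char *chrt_path, int priority, pid_t pid)
{
    char prio_arg[16], pid_arg[32];
    char *argv[] = { (char *)chrt_path, "-f", "-p", prio_arg, pid_arg, NULL };

    snprintf(prio_arg, sizeof prio_arg, "%d", priority);
    snprintf(pid_arg, sizeof pid_arg, "%ld", (long)pid);
    return run_and_wait(argv);
}
```

For the sudo variant, the same run_and_wait() helper could be given an argument vector that starts with "sudo" and "chrt" instead. A return value of 0 means chrt reported success; anything else should be treated as a failure to change the scheduling policy.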