Use perf inside a docker container without --privileged
After some research, the problem is not the perf_event_paranoid setting, but the fact that the perf_event_open syscall is blocked by Docker's default seccomp profile:
https://docs.docker.com/engine/security/seccomp/ "Docker v17.06: Seccomp security profiles for Docker"
Significant syscalls blocked by the default profile
perf_event_open
Tracing/profiling syscall, which could leak a lot of information on the host.
My first workaround for this is a script that downloads the official seccomp file, https://github.com/moby/moby/blob/master/profiles/seccomp/default.json, and adds perf_event_open to the list of whitelisted syscalls.
I then start docker with --security-opt seccomp=my-seccomp.json
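A minimal sketch of what such a script could do once the profile has been downloaded. The function names and the output file name are illustrative, and it assumes the profile keeps its rules in a top-level "syscalls" list of objects with "names" and "action" keys, as the entry shown later on this page suggests. Downloading is left out.

```python
import copy
import json


def _is_unconditional_allow(rule, name):
    """True if the rule allows `name` without any argument or capability conditions."""
    return (
        rule.get("action") == "SCMP_ACT_ALLOW"
        and name in rule.get("names", [])
        and not rule.get("args")
        and not rule.get("includes")
        and not rule.get("excludes")
    )


def allow_syscall(profile, name):
    """Return a copy of the profile with an unconditional allow rule for `name`."""
    patched = copy.deepcopy(profile)  # leave the caller's data untouched
    rules = patched.setdefault("syscalls", [])
    if not any(_is_unconditional_allow(rule, name) for rule in rules):
        rules.append({"names": [name], "action": "SCMP_ACT_ALLOW"})
    return patched


def patch_file(src, dst, name="perf_event_open"):
    """Read a profile from `src`, allow `name`, and write the result to `dst`."""
    with open(src) as fh:
        profile = json.load(fh)
    with open(dst, "w") as fh:
        json.dump(allow_syscall(profile, name), fh, indent=2)
```

Calling patch_file("default.json", "my-seccomp.json") would produce the file passed to --security-opt above. Appending a new rule rather than editing existing ones keeps the rest of the downloaded profile as it was.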
Can I run Docker-in-Docker without using the --privileged flag
Unfortunately no, you must use the --privileged flag to run Docker in Docker. You can take a look at the official announcement, where they state that this is one of the many purposes of the --privileged flag.
Basically, you need more access to the host system devices to run docker than you get when running without --privileged.
Docker Alpine and perf not getting along in docker container
The problem is that Docker by default blocks a list of system calls, including perf_event_open, which perf relies heavily on.
Official docker reference: https://docs.docker.com/engine/security/seccomp/
Solution:
- Download the standard seccomp (secure computing) file for docker. It's a JSON file.
- Find "perf_event_open" (it only appears once) and delete it.
- Add a new entry in the syscalls section:
{ "names": [ "perf_event_open" ], "action": "SCMP_ACT_ALLOW" },
- Add the following to your command to run the container:
--security-opt seccomp=path/to/default.json
That did it for me.
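Before editing the file, it can help to see how the profile currently treats the syscall. The sketch below lists every rule that names it. The sample profile is a hand-written fragment for illustration, not the real default.json, and the "includes"/"caps" layout is an assumption about the profile format that should be checked against the file you downloaded.

```python
def rules_mentioning(profile, name):
    """Return (action, required_caps) for each rule that names the syscall."""
    found = []
    for rule in profile.get("syscalls", []):
        if name in rule.get("names", []):
            # A rule may only apply when the container holds certain capabilities.
            caps = tuple((rule.get("includes") or {}).get("caps", []))
            found.append((rule.get("action"), caps))
    return found


# Hypothetical fragment: the syscall is allowed only together with a capability.
sample_profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"},
        {
            "names": ["perf_event_open"],
            "action": "SCMP_ACT_ALLOW",
            "includes": {"caps": ["CAP_SYS_ADMIN"]},
        },
    ],
}

print(rules_mentioning(sample_profile, "perf_event_open"))
```

An empty tuple in the second position means the rule applies unconditionally, which is what the new entry in the steps above achieves.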
How to use perf tool with docker running stress-ng?
Carrying on from comments by @osgx,
As is mentioned here, by default the perf stat command monitors not only all the threads of the monitored process, but also its child processes and threads.
The problem in this situation is that by running perf stat on the docker run stress-ng command, you are not monitoring the actual stress-ng process. It is important to note that the processes running inside the container are not started by the docker client, but by the docker-containerd-shim process (which is a grandchild of the dockerd process).
If you run the docker command that starts stress-ng inside the container and look at the process tree, this becomes evident.
docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 100
ps -elf | grep docker
0 S ubuntu 26379 114001 0 80 0 - 119787 futex_ 12:33 pts/3 00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root 26431 118477 0 80 0 - 2227 - 12:33 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/72a8c2787390669ff4eeae6f343ab4f9f60434f39aae66b1a778e78b7e5e45d8 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
0 S ubuntu 26610 26592 0 80 0 - 3236 pipe_w 12:34 pts/6 00:00:00 grep --color=auto docker
4 S root 118453 1 3 80 0 - 283916 - May02 ? 01:01:57 /usr/bin/dockerd -H fd://
4 S root 118477 118453 4 80 0 - 457853 - May02 ? 01:14:36 docker-containerd --config /var/run/docker/containerd/containerd.toml
----------------------------------------------------------------------
ps -elf | grep stress-ng
0 S ubuntu 26379 114001 0 80 0 - 119787 futex_ 12:33 pts/3 00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root 26455 26431 0 80 0 - 16621 - 12:33 pts/0 00:00:00 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root 26517 26455 99 80 0 - 16781 - 12:33 pts/0 00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root 26518 26455 99 80 0 - 16781 - 12:33 pts/0 00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
0 S ubuntu 26645 26592 0 80 0 - 3236 pipe_w 12:35 pts/6 00:00:00 grep --color=auto stress-ng
The PPID of the first stress-ng process is 26431, which is not the docker run command, but the docker-containerd-shim process. Monitoring the docker run command will never reflect correct values, because the docker client is completely detached from the process that starts the stress-ng commands.
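The parent/child relationships in the listings above can be checked mechanically. The sketch below copies the PID and PPID columns from the ps output into a table and walks upwards from one of the stress-ng workers. The function name is illustrative.

```python
def ancestors(pid, parent_of):
    """Return the chain of parent PIDs above `pid`, nearest parent first."""
    chain = []
    seen = {pid}  # guards against a malformed table containing a cycle
    current = parent_of.get(pid)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


# PID -> PPID, taken from the ps listings above.
parent_of = {
    26379: 114001,   # docker run (client)
    26431: 118477,   # docker-containerd-shim
    26455: 26431,    # stress-ng (first process)
    26517: 26455,    # stress-ng worker
    26518: 26455,    # stress-ng worker
    118453: 1,       # dockerd
    118477: 118453,  # docker-containerd
}

chain = ancestors(26517, parent_of)
print(chain)           # [26455, 26431, 118477, 118453, 1]
print(26379 in chain)  # False: the docker client is not an ancestor
```

Because perf stat follows children of the process it is attached to, the client (26379) not appearing in this chain is the reason its counters do not include the stress-ng work.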
- One way to get around this problem is to attach the perf stat command to the PIDs of the stress-ng processes that are started by the docker runtime. For example, in the above case, once the docker run command has started, you can immediately do this:
perf stat -p 26455,26517,26518
Performance counter stats for process id '26455,26517,26518':
148171.516145 task-clock (msec) # 1.939 CPUs utilized
49 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
67 page-faults # 0.000 K/sec
You may increase the --timeout a little so that the command runs longer, since you are now starting perf stat after starting stress-ng. You also have to account for the small fraction of the initial run that is not measured.
- The other way is to run perf stat inside the docker container, with something like a docker run perf stat ..., but for that you would have to start granting privileges to your container, since by default the perf_event_open system call is blacklisted in docker. You can read this answer here.
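For the first workaround, the PID list does not have to be copied by hand. The sketch below scans /proc on the docker host for processes whose command name starts with a given prefix and builds the perf stat -p argument list. The function names are illustrative, and matching by prefix is an assumption made here so that worker processes carrying a suffixed name are still picked up.

```python
from pathlib import Path


def pids_by_command(prefix, proc_root="/proc"):
    """Return sorted PIDs whose command name (from <pid>/comm) starts with `prefix`."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            comm = (entry / "comm").read_text(errors="replace").strip()
        except OSError:
            continue  # the process exited or is not readable
        if comm.startswith(prefix):
            pids.append(int(entry.name))
    return sorted(pids)


def perf_stat_args(pids):
    """Build the argument list for `perf stat -p pid1,pid2,...`."""
    if not pids:
        raise ValueError("no matching processes found")
    return ["perf", "stat", "-p", ",".join(str(pid) for pid in pids)]
```

On the host, something like subprocess.run(perf_stat_args(pids_by_command("stress-ng"))) would then attach to the running workers. Matching on the command name rather than the full command line avoids picking up the docker run client, whose command line also contains the string stress-ng.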
Slow performance using /dev/random in docker desktop WSL2
Before applying any of these solutions, check whether a lack of entropy is your real problem. To do that, execute these commands (on your docker host and in your container):
cat /proc/sys/kernel/random/entropy_avail
It should return a number greater than 1000
...
dd if=/dev/random of=/dev/null bs=1024 count=1 iflag=fullblock
It should return fast! (Sources: haveged and rng-tools)
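The first check can also be scripted, for example to run it both on the host and inside a container. This is a minimal sketch: the function names are illustrative, and the threshold of 1000 is simply the figure quoted above, exposed as a parameter. Newer kernels may report a much smaller fixed figure in this file, so treat the result as a hint rather than a verdict.

```python
from pathlib import Path

ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def read_entropy(path=ENTROPY_PATH):
    """Return the kernel's reported available entropy as an integer."""
    return int(Path(path).read_text().strip())


def entropy_looks_low(value, threshold=1000):
    """True if the value is not greater than the threshold suggested above."""
    return value <= threshold
```

A typical use would be entropy_looks_low(read_entropy()), run once on the docker host and once inside the container, to compare the two readings.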
Solutions:
For Windows users (those of you who run Docker Desktop for Windows):
- Keep using the WSL1 engine with Docker Desktop.
- If the previous solution is not possible, execute this:
docker pull harbur/haveged
docker run --privileged -d harbur/haveged
Explanation: This runs a docker container that executes the haveged daemon as its CMD. That process, together with the --privileged flag, feeds your host's /dev/random with entropy, avoiding blocking issues.
For Linux users (those running Linux as docker host):
- Map your host's /dev/urandom as a volume/mount point onto your container's /dev/random. This tricks your container: whenever it uses /dev/random, it is actually using your host's /dev/urandom, which never blocks by design. People may argue that this is insecure, but that is outside the scope of this question.
- Install software on your docker host that replenishes the entropy pool, such as haveged or rng-tools (if you have a hardware TRNG).
Final thoughts and conclusions:
- /dev/random and /dev/urandom in a docker container point to /dev/random and /dev/urandom of the docker host. I don't have any documentation that backs this up, except these: Missing Entropy and How docker handles /dev/(u)random request ... and the experimental fact that if I access the WSL2 docker-desktop distro (using wsl -d docker-desktop) and execute the dd command described previously, I can see the entropy drop both in the host and in the container (and vice versa) ... This is why solutions such as deploying the haveged container or installing haveged on the docker host work.
- According to the haveged link, that software is deprecated because its logic is now included in Linux kernels v5.6 ... This could mean that if your docker host runs Linux kernel 5.6 or later, you won't need to do any of this, because /dev/random will never block.
- I tried to install haveged in the WSL2 docker distro (docker-desktop), but that distro does not let you execute apt-get ...