Understanding User File Ownership in Docker: How to Avoid Changing Permissions of Linked Volumes

Is that correct? Can someone point me to documentation on this? I'm only conjecturing based on the experiment above.
Perhaps this is just because both users have the same numeric ID as far as the kernel is concerned, and if I tested on a system where my home user was not ID 1000, the permissions would get changed in every case?

Have a read of info coreutils 'chown invocation'; that might give you a better idea of how file permissions and ownership work.

Basically, though, each file on your machine has a set of bits tacked on to it that defines its permissions and ownership. When you chown a file, you're just setting these bits.

When you chown a file to a particular user/group using the username or group name, chown will look in /etc/passwd for the username and /etc/group for the group to attempt to map the name to an ID. If the username / group name doesn't exist in those files, chown will fail.

root@dc3070f25a13:/test# touch test
root@dc3070f25a13:/test# ll
total 8
drwxr-xr-x  2 root root 4096 Oct 22 18:15 ./
drwxr-xr-x 22 root root 4096 Oct 22 18:15 ../
-rw-r--r--  1 root root    0 Oct 22 18:15 test
root@dc3070f25a13:/test# chown test:test test
chown: invalid user: 'test:test'

However, you can chown a file using IDs to whatever you want (within some upper positive integer bounds, of course), whether there is a user / group that exists with those IDs on your machine or not.

root@dc3070f25a13:/test# chown 5000:5000 test
root@dc3070f25a13:/test# ll
total 8
drwxr-xr-x  2 root root 4096 Oct 22 18:15 ./
drwxr-xr-x 22 root root 4096 Oct 22 18:15 ../
-rw-r--r--  1 5000 5000    0 Oct 22 18:15 test

The UID and GID are stored on the file itself, so when you mount those files inside your Docker container, the file has the same numeric owner and group IDs as it does on the host, but they are now resolved against /etc/passwd and /etc/group in the container. That will probably be a different user, unless the file is owned by root (UID 0).
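To see this for yourself, it helps to look at the numeric IDs rather than the names. Here is a minimal sketch, assuming GNU coreutils stat (the -c format option) and a throwaway file from mktemp; the helper name owner_ids is made up for illustration:

```shell
# Print the numeric owner and group of a file as "UID:GID".
# Assumes GNU coreutils stat; BSD/macOS stat uses different flags.
owner_ids() {
    stat -c '%u:%g' "$1"
}

# Demo on a scratch file: a new file is owned by the UID of whoever creates it.
scratch=$(mktemp)
echo "numeric owner: $(owner_ids "$scratch")"
# %U is only the name this machine's user database gives for that UID.
echo "name on this machine: $(stat -c '%U' "$scratch")"
rm -f "$scratch"
```

Without user-namespace remapping, running the same stat -c '%u:%g' against a bind-mounted file on the host and inside the container should print the same numbers, even when the names differ.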

The real question is, of course, 'what do I do about this?' If bob is logged in as bob on the given host machine, he should be able to run the container as bob and not have file permissions altered under his host account. As it stands, he actually needs to run the container as user docker to avoid having his account altered.

It seems like, with your current set-up, you'll need to make sure the UID-to-username mappings in /etc/passwd on your host match the UID-to-username mappings in your container's /etc/passwd if you want to interact with your mounted user directory as the same user that's logged in on the host.
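A quick way to picture the mismatch is to look up the same UID in two passwd files. This is only a sketch: the helper name_for_uid and the two sample files are invented for illustration, with bob and docker standing in for the host and container users mentioned above.

```shell
# Look up the username for a UID in a passwd-format file.
# Prints nothing if the UID is not listed.
name_for_uid() {
    awk -F: -v uid="$2" '$3 == uid { print $1; exit }' "$1"
}

# Two made-up passwd files standing in for the host and the container.
host_passwd=$(mktemp)
container_passwd=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\nbob:x:1000:1000::/home/bob:/bin/bash\n' > "$host_passwd"
printf 'root:x:0:0:root:/root:/bin/bash\ndocker:x:1000:1000::/home/docker:/bin/bash\n' > "$container_passwd"

# Same number, different name, depending on which file does the lookup.
echo "host:      UID 1000 -> $(name_for_uid "$host_passwd" 1000)"
echo "container: UID 1000 -> $(name_for_uid "$container_passwd" 1000)"
rm -f "$host_passwd" "$container_passwd"
```

On a real system you could point the same function at the host's /etc/passwd and at a copy of the container's /etc/passwd to check whether a given UID resolves to the name you expect on both sides.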

You can create a user with a specific user id with useradd -u xxxx. Buuuut, that does seem like a messy solution...

You might have to come up with a solution that doesn't mount a host users home directory.

Wrong OWNER USER on folder/file: docker run -v host_path_dir_file:docker_some_path_dir_file/ not working for user defined in Dockerfile

The numeric ID of user gigauser is not 1000; it is 21520. It works on another host because the local user there probably has the numeric ID 1000.

Because we're mounting the folder, not copying it, it gets shared into the container with exactly the same permissions/IDs as set on the host, since the files still live on the host. Containers aren't like VMs with totally separate resources, and even on a VM, if you mount something like an NFS directory, you'll get numeric IDs that may or may not match your local IDs.

Using /etc/subuid requires passing a flag to the run command, and you'd have to do maths to work out the offsets for your user.
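The maths in question is just an offset. Below is a rough sketch, assuming the usual name:start:count layout of an /etc/subuid line and a mapping where container ID 0 lands on start; the function name and the sample entry bob:100000:65536 are hypothetical.

```shell
# Map a container UID to the host UID for one subordinate range.
# start and count are the 2nd and 3rd fields of an /etc/subuid line
# (name:start:count). Assumes container ID 0 maps to start.
container_to_host_uid() {
    start=$1 count=$2 cuid=$3
    if [ "$cuid" -lt 0 ] || [ "$cuid" -ge "$count" ]; then
        echo "UID $cuid is outside the mapped range" >&2
        return 1
    fi
    echo $((start + cuid))
}

# Hypothetical entry "bob:100000:65536": look up container UID 1000.
container_to_host_uid 100000 65536 1000
```

So with that entry, files written by container UID 1000 would show up on the host as owned by UID 101000, which is the number you would have to chown host directories to.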

Shared volume/file permissions/ownership (Docker)

It looks like your chown -R nginx:nginx ... commands inside your container are changing the ownership bits on your files to be owned by libuuid on your host machine.

See Understanding user file ownership in docker: how to avoid changing permissions of linked volumes for a basic explanation on how file ownership bits work between your host and your docker containers.

What is the (best) way to manage permissions for Docker shared volumes?

UPDATE 2016-03-02: As of Docker 1.9.0, Docker has named volumes which replace data-only containers. The answer below, as well as my linked blog post, still has value in the sense of how to think about data inside docker but consider using named volumes to implement the pattern described below rather than data containers.

I believe the canonical way to solve this is by using data-only containers. With this approach, all access to the volume data is via containers that use --volumes-from the data container, so the host uid/gid doesn't matter.

For example, one use case given in the documentation is backing up a data volume. To do this, another container is used to do the backup via tar, and it too uses --volumes-from in order to mount the volume. So I think the key point to grok is: rather than thinking about how to get access to the data on the host with the proper permissions, think about how to do whatever you need -- backups, browsing, etc. -- via another container. The containers themselves need to use consistent uid/gids, but they don't need to map to anything on the host, thereby remaining portable.

This is relatively new for me as well but if you have a particular use case feel free to comment and I'll try to expand on the answer.

UPDATE: For the given use case in the comments, you might have an image some/graphite to run graphite, and an image some/graphitedata as the data container. So, ignoring ports and such, the Dockerfile of image some/graphitedata is something like:

FROM debian:jessie
# add our user and group first to make sure their IDs get assigned consistently, regardless of other deps added later
RUN groupadd -r graphite \
  && useradd -r -g graphite graphite
RUN mkdir -p /data/graphite \
  && chown -R graphite:graphite /data/graphite
VOLUME /data/graphite
USER graphite
CMD ["echo", "Data container for graphite"]

Build and create the data container:

docker build -t some/graphitedata .
docker run --name graphitedata some/graphitedata

The some/graphite Dockerfile should also get the same uid/gids, therefore it might look something like this:

FROM debian:jessie
# add our user and group first to make sure their IDs get assigned consistently, regardless of other deps added later
RUN groupadd -r graphite \
  && useradd -r -g graphite graphite
# ... graphite installation ...
VOLUME /data/graphite
USER graphite
CMD ["/bin/graphite"]

And it would be run as follows:

docker run --volumes-from=graphitedata some/graphite

OK, now that gives us our graphite container and associated data-only container with the correct user/group (note you could re-use the some/graphite container for the data container as well, overriding the entrypoint/cmd when running it, but having them as separate images is clearer IMO).

Now, let's say you want to edit something in the data folder. Rather than bind mounting the volume to the host and editing it there, create a new container to do that job. Let's call it some/graphitetools. Let's also create the appropriate user/group, just like the some/graphite image.

FROM debian:jessie
# add our user and group first to make sure their IDs get assigned consistently, regardless of other deps added later
RUN groupadd -r graphite \
  && useradd -r -g graphite graphite
VOLUME /data/graphite
USER graphite
CMD ["/bin/bash"]

You could make this DRY by inheriting from some/graphite or some/graphitedata in the Dockerfile, or instead of creating a new image just re-use one of the existing ones (overriding entrypoint/cmd as necessary).

Now, you simply run:

docker run -ti --rm --volumes-from=graphitedata some/graphitetools

and then vi /data/graphite/whatever.txt. This works perfectly because all the containers have the same graphite user with matching uid/gid.

Since you never mount /data/graphite from the host, you don't care how the host uid/gid maps to the uid/gid defined inside the graphite and graphitetools containers. Those containers can now be deployed to any host, and they will continue to work perfectly.

The neat thing about this is that graphitetools could have all sorts of useful utilities and scripts, that you can now also deploy in a portable manner.

UPDATE 2: After writing this answer, I decided to write a more complete blog post about this approach. I hope it helps.

UPDATE 3: I corrected this answer and added more specifics. It previously contained some incorrect assumptions about ownership and perms -- the ownership is usually assigned at volume creation time i.e. in the data container, because that is when the volume is created. See this blog. This is not a requirement though -- you can just use the data container as a "reference/handle" and set the ownership/perms in another container via chown in an entrypoint, which ends with gosu to run the command as the correct user. If anyone is interested in this approach, please comment and I can provide links to a sample using this approach.

Mounted Docker volume has different ownership when using Travis

Try running the container again with this command, so the uid outside the container is propagated inside:

docker run -u `id -u`

Alternatively, as pointed out by @anemyte:

docker run -u $(id -u)

This should cause new files created inside the container to be owned by "jovyan".
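If you also want the group to match, the -u / --user option accepts a UID:GID pair. Here is a small sketch that builds that value from the invoking user; the helper name current_user_spec and the some/image placeholder are made up, and the command is only printed so the snippet works without Docker installed.

```shell
# Build the value for docker run's --user option from the invoking user.
# Passing UID:GID (rather than the UID alone) also sets the primary group.
current_user_spec() {
    printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Print the command rather than running it, so this works without Docker.
# "some/image" and the mount paths are placeholders.
echo docker run --rm --user "$(current_user_spec)" \
    -v /path/on/host:/path/in/container some/image
```

Drop the echo once the printed command looks right for your image and paths.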

If you know in advance which mount points will exist, you could also pre-create them on the host so the ownership of the files inside is also correct:

docker run -v /path/on/host:/path/in/container ...

If you set the permissions of your local path (/path/on/host) to 777, that will also be propagated to the mount point: no permission error will be thrown regardless of the user that Docker uses to create those files.

After that, you'll be free to restore permissions, if needed.
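The open-up-then-restore step can be wrapped so the original mode is not forgotten. This is a sketch only, assuming GNU coreutils stat (-c '%a' prints the octal mode); the function name with_open_perms is invented, and the demo uses a temporary directory instead of a real mount path.

```shell
# Temporarily open up a directory, run a command, then restore its mode.
# Assumes GNU coreutils stat (-c '%a' prints the octal mode).
with_open_perms() {
    dir=$1
    shift
    saved_mode=$(stat -c '%a' "$dir") || return 1
    chmod 777 "$dir" || return 1
    "$@"
    status=$?
    chmod "$saved_mode" "$dir"
    return "$status"
}

# Demo on a scratch directory: show the mode during and after the command.
demo_dir=$(mktemp -d)
chmod 750 "$demo_dir"
with_open_perms "$demo_dir" stat -c 'mode while the command runs: %a' "$demo_dir"
stat -c 'mode afterwards: %a' "$demo_dir"
rmdir "$demo_dir"
```

In practice the wrapped command would be the docker run -v /path/on/host:/path/in/container ... invocation, with /path/on/host as the first argument.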
