Docker In-depth: Volumes
03 Nov 2014One of the most common roadblocks for people using Docker, and indeed easily the most common questions I see on on various Docker support channels, is regarding the use of volumes.
So let's take a closer look at how volumes work in Docker.
First, let's dispell the most common and first misconception:
Docker volumes are for persistence.
This likely comes from the idea that container's are not persitant, which is indeed not true. Container's persist until you remove them, and you can only do that by doing
docker rm my_container
If you did not type this command then your container still exists and will continue to exist, can be started, stopped, etc. If you do not see your container, you should see this:
docker ps -a
docker ps
only ever shows you running containers, but a container can be in a
stopped state, in which case the above command would show you all containers
regardless of state. docker run ...
is actually a multi-part command, it
creates a new container, then starts it.
So, again, volumes are not for persitance.
What is a volume
Volumes decouple the life of the data being stored in them from the life of the
container that created them. This makes it so you can
docker rm my_container
and your data will not be removed.
A volume can be created in two ways:
- Specifying
VOLUME /some/dir
in a Dockerfile - Specying it as part of your run command as
docker run -v /some/dir
Either way, these two things do exactly the same thing. It tells Docker to
create a directory on the host, within the docker root path
(by default /var/lib/docker), and mount it to the path you've specified
(/some/dir
above). When you remove the container using this volume, the
volume itself continues to live on.
If the path specified does not exist within the container, a directory will be automatically created.
You can tell docker to remove a volume along with the container:
docker rm -v my_container
Sometimes you've already got a directory on your host that you want to use in the container, so the CLI has an extra option for specifying this:
docker run -v /host/path:/some/path ...
This tells docker to use the specified host path specifically, instead of
creating one itself within the docker root, and mount that to the specified path
within the container (/some/path
above). Note, that this can also be a file
instead of a directory. This is commonly referred to as a bind-mount within
docker terminology (though technically speaking, all volumes are bind-mounts in
the sense of what is actually happening).
If the path on the host does not exist, a directory will be automatically be
created at the given path.
Bind-mount volumes are treated a little differently than a "normal" volume, with the preference of not modfying things on the host that Docker did not itself create:
- With a "normal" volume, docker will automatically copy data at the
specified volume path (e.g.
/some/path
, above) into the new directory that was created by docker, with a "bind-mount" volume this does not happen. - When you
docker rm -v my_container
a container with "bind-mount" volumes, the "bind-mount" volumes will not be removed.
You can share volumes with another container.
docker run --name my_container -v /some/path ...
docker run --volumes-from my_container --name my_container2 ...
The command above will tell docker to mount the same volumes from the first container into the 2nd container. This effectively allows you to share data between two containers.
If you docker rm -v my_container
, if the 2nd container above still exists, the
volumes will not be removed, and indeed will not ever be removed unless you
remove the second container with the same docker rm -v my_container2
.
VOLUME in Dockerfiles
As mentioned earlier, the VOLUME
declaration in a Dockerfile
does the same
exact thing as the -v
flag on the docker run
command (except you can't
specify a host path in a Dockerfile
). It just so happens that because of
this, there can be suprising effects when building your image.
Each command in a Dockerfile
creates a new container which runs the specified
command and commits the container back to an image, each step building off the
previous one. So ENV FOO=bar
in a dockerfile is the equivelant of:
cid=$(docker run -e FOO=bar <image>)
docker commit $cid
So let's look at what happens with this example Dockerfile
FROM debian:jessie
VOLUME /foo/bar
RUN touch /foo/bar/baz
docker build -t my_debian .
What we expect to happen here is docker to create an image called my_debian
with a volume at /foo/bar
and put an empty file at /foo/bar/baz
, but let's
look at the equivelant CLI commands actually do:
cid=$(docker run -v /foo/bar debian:jessie)
image_id=$(docker commit $cid)
cid=$(docker run $image_id touch /foo/bar/baz)
docker commit $(cid) my_debian
Now, this isn't exactly what happens, but it is a very close approximation.
So, what happened here is the volume is created before anything is actually in
/foo/bar
, and as such every time we start a container from this image we will
have an emtpy directory at /foo/bar
. This happens because as stated earlier,
each Dockerfile
command is creating a new container. This means a new volume
is also created. Since in the example Dockerfile
the volume is specified
before anything existed in that directory, when the container that was created
to run the touch /foo/bar/baz
command, it did so with a volume mounted in for
/foo/bar
, so baz
was written to the volume mounted at /foo/bar
, not the
actual container/image filesystem.
So, keep in mind the placement of your VOLUME
declarations in your Dockerfile
as it does create essentially immutable directories in your image.
(#8509),
docker cp
docker commit
, and docker export
do not support volumes (yet).
Currently, the only way to manage volumes (create/destroy) is during container creation/descruction, which is a little odd since volumes are meant to decouple the data contained within them from the life of the container. This is something being worked on but is not yet merged (#8484).
If you want this sort of functionality, checkout docker-volumes