on writing dockerfiles

December 9, 2022

I wanted to collect some of the snippets that I use for writing Dockerfiles. Obviously, there is a lot of nuance in how to structure the Dockerfile and how to build projects, so this isn't meant to cover every possibility, but instead to serve as a starting point that I can share with others.

`FROM`

A Dockerfile starts from somewhere and that starting point is the argument to the FROM instruction. There's lots of options: all the main Linux operating systems (e.g. Ubuntu, Debian, Fedora, etc) and some more niche operating systems that can optimize Docker image size and speed (e.g. Alpine).

In general, I think Ubuntu is almost always a reasonable starting point. I like to check what the latest release is and then go one version earlier. So, today (Dec 9, 2022), the Ubuntu Wikipedia article Kinetic Kudu 22.10 as the latest, so I would build the Docker image from Jammy Jellyfish 22.04 LTS.

# File: Dockerfile
FROM ubuntu:jammy

Modern Docker makes it really easy to use multiple build steps for your image. Although useful, I think it's unnecessary to use this as a beginner. To use it though, I've found the following setup to be convenient.

# File: Dockerfile
FROM ubuntu:jammy AS base

# install runtime dependencies

FROM base AS deps

# install development and software dependencies
# build and install current project

FROM base AS dist

# copy files from deps image, e.g.
COPY --from=deps /opt/libfoo /opt/libfoo

installing dependencies with `apt`

Each operating system has its own method of installing packages. For Ubuntu, the following boilerplate can be used and adapted for each package you want to install.

# File: Dockerfile
FROM ubuntu:jammy

# Tell apt that the install is automated and we aren't directly interacting.
# Only needed once.
ENV DEBIAN_FRONTEND=noninteractive

# Always update the apt index and then always remove any package caches.
RUN apt-get update && \
    apt-get install -y \
        build-essential \
        libpng-dev \
    && rm -rf /var/lib/apt/lists/*

getting source code into build

There's two main ways of getting source code into your build: using git to clone repositories during the build and using git to clone repositories before the build. Docker has some capabilities to make the “cloning during the build” step easier, namely the RUN --mount=git option. However, I think it's generally easier to just clone outside before the build step.

To do this, you can use git submodules or any other method of getting the code accessible locally. The main idea is that from your parent repository, you want something that looks like the following.

host$ tree
.
├── .git
└── libfoo
    └── .git

Then in your Dockerfile, you'd want to include a COPY command.

# File: Dockerfile
FROM ubuntu:jammy

# Make /opt directory if it doesn't exist; make /opt/src directory if it doesn't exist; then run all future commands from /opt/src
WORKDIR /opt/src

# Copy the host's entire libfoo directory into /opt/src/libfoo
COPY ./libfoo ./libfoo

building with `cmake`

I like to use the following snippet for building CMake projects.

# File: Dockerfile
FROM ubuntu:jammy

# Build the source code from "-H"ere
# Write the "-B"uild files here
# For extra paranoia points: remove the build directory afterwards to avoid bloating the image
RUN cmake \
        -H/opt/src/libfoo \
        -B/opt/src/libfoo/build \
        -DCMAKE_INSTALL_PREFIX:PATH=/opt/libfoo \
        -DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo \
    && \
    cmake \
        --build /opt/src/libfoo/build \
        --verbose \
        --parallel \
    && \
    cmake \
        --install /opt/src/libfoo/build \
        --verbose \
    && rm -rf /opt/src/libfoo/build

`ENTRYPOINT`s and `CMD`s

Docker lets you configure the entrypoint (normally something like /bin/bash -c) and the default arguments to that entrypoint (normally no arguments, empty). Although tempting and sometimes useful, I don't think a beginner will ever need to mess with these values.

There is one useful usage of ENTRYPOINT and that's when your Docker image is only meant to run a single application ever and if that application isn't your own. For example, ImageMagick could be convenient to use directly from a container (it's not a great example because ImageMagick includes multiple executables: convert, identify, etc).

# File: Dockerfile
# ...
ENTRYPOINT ["convert"]

With that setup, you could replace every normal use of convert in a shell script with docker run -it --rm --mount type=bind,src=$PWD,dst=$PWD my-convert:latest. It's still not great, admittedly.

other things

Some other things that don't fit in better elsewhere.

In most cases, try not to use mkdir when you could instead use WORKDIR.
In most cases, don't use multi-stage (multiple FROM) builds.
It can sometimes be useful to add a custom user for a Docker image. In most cases, I've found it better to leave the Docker image with the default root user, and instead use your own user when running. For example: docker run -u $(id -u):$(id -g) ...
I haven't found a good use for the VOLUME command in most of my work. In theory, it's good to document what things can and should be mounted to the host filesystem, but it's rare that that affects me.
Likewise, EXPOSE is good for documentation but I don't find it that useful.