
A few days ago, I created a Docker build for Flask with PostgreSQL (both with Alpine Linux and with Debian Linux).

Installing psycopg2-binary (the PostgreSQL driver for Python) can require building the package from source, for example when no prebuilt wheel matches the platform.
The Docker image then grows in size, because it still contains the build tools and build artifacts.

The solution? Multi-stage Docker builds.

Let’s say we have the following docker-compose.yml file. There are two services: a Flask API called users and a Postgres database called users-db.

version: '3.7'

services:

  users:
    build:
      context: .
      dockerfile: Dockerfile
    entrypoint: ['/usr/src/app/entrypoint.sh']
    volumes:
      - '.:/usr/src/app'
    ports:
      - 5001:5000
    environment:
      - FLASK_ENV=development
      - APP_SETTINGS=project.config.DevelopmentConfig
      - DATABASE_URL=postgresql://postgres:postgres@users-db:5432/users_dev
      - DATABASE_TEST_URL=postgresql://postgres:postgres@users-db:5432/users_test
    depends_on:
      - users-db

  users-db:
    build:
      context: ./project/db
      dockerfile: Dockerfile
    expose:
      - 5432
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
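
The post does not show the Flask side of these settings, so here is a minimal sketch of how project/config.py could consume the environment variables from the compose file. The class layout and the BaseConfig and TestingConfig names are assumptions for illustration; only DevelopmentConfig, DATABASE_URL and DATABASE_TEST_URL appear above.

```python
import os


class BaseConfig:
    """Settings shared by every environment (hypothetical base class)."""

    TESTING = False
    SQLALCHEMY_TRACK_MODIFICATIONS = False


class DevelopmentConfig(BaseConfig):
    """Referenced in the compose file as project.config.DevelopmentConfig."""

    # Read the connection string that docker-compose injects.
    SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_URL")


class TestingConfig(BaseConfig):
    """Hypothetical test configuration that points at the users_test database."""

    TESTING = True
    SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_TEST_URL")
```

A Flask app factory would then typically load the selected class with app.config.from_object(os.environ["APP_SETTINGS"]).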

We have a requirements.in file with the following dependencies:

Flask==1.1.1
Flask-RESTful==0.3.7
Flask-SQLAlchemy==2.4.0
psycopg2-binary==2.8.3
pytest==5.0.1

We’ll need psycopg2-binary so that the Flask app can connect to PostgreSQL.

Let’s create a multi-stage Dockerfile that

  • installs the build tools and builds the Python packages into a virtual environment in the first stage (compile image)
  • starts a clean second stage that copies the virtual environment over and installs only the run-time dependencies (run-time image)

## base image
FROM python:3.7.5-slim-buster AS compile-image

## install dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc

## virtualenv
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

## add and install requirements
RUN pip install --upgrade pip && pip install pip-tools
COPY ./requirements.in .
## pip-compile pins the versions, pip-sync installs exactly what is pinned
RUN pip-compile requirements.in --output-file requirements.txt && \
    pip-sync requirements.txt

## run-time image
FROM python:3.7.5-slim-buster AS runtime-image

## install nc
RUN apt-get update && \
    apt-get install -y --no-install-recommends netcat-openbsd

## copy Python dependencies from compile image
COPY --from=compile-image /opt/venv /opt/venv

## set working directory
WORKDIR /usr/src/app

## add user
RUN addgroup --system user && adduser --system --no-create-home --group user
RUN chown -R user:user /usr/src/app && chmod -R 755 /usr/src/app

## add entrypoint.sh
COPY ./entrypoint.sh /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh

## switch to non-root user
USER user

## add app
COPY . /usr/src/app

## set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PATH="/opt/venv/bin:$PATH"

## run server
CMD python manage.py run -h 0.0.0.0

The first stage installs gcc, which we need to build psycopg2-binary. Then we create a virtual environment, upgrade pip (the package manager for Python), and install pip-tools.
pip-tools pins dependencies, so you know exactly which package versions get installed in your container: pip-compile writes the pinned requirements.txt, and pip-sync installs exactly what is listed there.
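
To make "pinned" concrete: every entry in the compiled requirements.txt should carry an exact == version. The helper below is a rough sketch, not part of pip-tools; the function name and the simplified regular expression are assumptions for illustration.

```python
import re

# Simplified pattern: a package name (optionally with extras) followed by "==version".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+==[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments such as "# via flask"
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Running it over the contents of requirements.txt should give an empty list when every dependency is pinned.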

In the second stage, we start fresh from the same base image (a Python 3.7 Debian image).
The Flask app needs netcat, so we’ll use apt to install the package.
Now it gets interesting. We’ll copy the packages from the virtual environment into the run-time stage.
The remaining steps are standard Docker steps: set a working directory, add a non-root user, copy the application source, set environment variables, and run the app.
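
The post does not show entrypoint.sh. In setups like this one, netcat is commonly used to wait until the database port accepts connections before the app starts. As an illustration of that check, here is a minimal Python sketch of the same idea. It is a swapped-in alternative to the netcat call, not the author's script, and the function name and defaults are assumptions.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet: pause briefly, then retry.
            time.sleep(interval)
    return False
```

With the compose file above, the call would be wait_for_port("users-db", 5432), since the service name doubles as the hostname on the compose network.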

Further Reading