Advanced Traefik 2 Setup with Docker Swarm, SSL Certificates and Security Options

Traefik is an open-source router and load-balancer that sits in front of your web services. You can set it up to automatically encrypt your websites with SSL certificates. It’s also easy to add new web services to an existing Traefik cluster.

I discovered Traefik via Jakub Svehla’s post Building a Heroku-like infrastructure for $5 a month. He shows you how to use Docker to install a Traefik infrastructure on a cheap VPS like DigitalOcean. With his explanations, you can easily deploy new projects to your VPS on your domain.

Jakub uses plain Docker with docker-compose files. While that’s a feasible option, I dislike using the docker-compose command for production, as it’s a tool for local development.

You can use docker-compose files in production, but only with docker stack on a Docker swarm cluster. Further reading: The Difference Between Docker Compose And Docker Stack.

Using Docker in swarm mode adds a new layer of complexity.

In this post, I will show you a working example of Traefik 2 in Docker swarm mode with a docker socket proxy, running the Traefik dashboard on a different port, basic auth and IP whitelisting.

This post is not for beginners. It's for those who have a basic setup working but can’t figure out how to tie all the pieces together.

Getting Started

Here are some articles that you will need for basic understanding of Traefik 2 and Docker Swarm:

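Why a Docker Socket Proxy?

Security concerns. Traefik needs access to the Docker API to discover services, but mounting the Docker socket directly gives the container root-level control over the host. A socket proxy restricts Traefik to the API endpoints it needs. See: Protecting Your Docker Socket With Traefik 2.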

Why is The Dashboard on Port 8000?

Security concerns. My setup uses port 8000 for the dashboard and IP whitelisting. Only my public IP can reach the dashboard, and only on port 8000. That already minimizes the risk of exposure.

If you try to reach the dashboard from a different IP, you won't even get to the login prompt (403: Forbidden).

The dashboard also has basic auth.

Additionally, there is a rate limit in place.

Working Example

Docker Swarm Init

Follow the guide on dockerswarm.rocks to create a Docker swarm cluster.

Create Networks

I use 3 networks:

  1. cloud-edge (for Traefik 2)
  2. cloud-public (for all web services that I want to expose to the internet)
  3. cloud-socket-proxy (for the docker socket proxy)

Remove the default ingress network and re-create it with encryption:

docker network rm ingress
docker network create --ingress --driver overlay \
   --opt encrypted --subnet 10.10.0.0/16 ingress

Add the three other networks as overlay networks:

# host network for outside of docker
docker network create --subnet 10.11.0.0/16 --driver overlay \
  --scope swarm --opt encrypted --attachable cloud-edge
# network hosting the socket proxy
docker network create --subnet 10.12.0.0/16 --driver overlay \
  --scope swarm --opt encrypted --attachable cloud-socket-proxy
# network hosting the services that are routed by traefik
docker network create --subnet 10.13.0.0/16 --driver overlay \
  --scope swarm --opt encrypted --attachable cloud-public
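
Running the three commands a second time fails because the networks already exist. Here is a minimal sketch of a re-runnable variant. The names create_overlay, create_all and the DRY_RUN switch are made up for this example:

```shell
# Create one encrypted overlay network, skipping it if it already exists.
# With DRY_RUN=1 the command is only printed, not executed.
# create_overlay and create_all are hypothetical helper names.
create_overlay() {
  net_name=$1
  net_subnet=$2
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "docker network create --subnet $net_subnet --driver overlay --scope swarm --opt encrypted --attachable $net_name"
    return 0
  fi
  if docker network inspect "$net_name" >/dev/null 2>&1; then
    echo "network $net_name already exists, skipping"
    return 0
  fi
  docker network create --subnet "$net_subnet" --driver overlay \
    --scope swarm --opt encrypted --attachable "$net_name"
}

# The same three networks and subnets as above.
create_all() {
  create_overlay cloud-edge 10.11.0.0/16
  create_overlay cloud-socket-proxy 10.12.0.0/16
  create_overlay cloud-public 10.13.0.0/16
}
```

Run DRY_RUN=1 create_all to preview the commands, then create_all on a manager node to apply them.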

Add Node Labels And Environment Variables

The following steps are from the superb guide on dockerswarm.rocks. We create a node label to make sure that Traefik will deploy on the same node that has the volume for the SSL certificates.

export NODE_ID=$(docker info -f '{{.Swarm.NodeID}}')
docker node update --label-add cloud-public.traefik-certificates=true $NODE_ID

You also have to create environment variables for DOMAIN, USERNAME, EMAIL, TRAEFIK_ADMINS and WHITELIST_IP.

The Traefik admin dashboard uses a username and password for authentication. We need to hash the password, using the tool htpasswd.

On Ubuntu, you can install it via the apache2-utils:

sudo apt update && sudo apt install apache2-utils

I use the WHITELIST_IP environment variable to limit access to the admin dashboard:

[Diagram: IP whitelist flow, from the official Traefik website]

I allow my public home address IP. You can get that by going to https://icanhazip.com/ in the browser of your local computer.
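
The whitelist expects CIDR notation, so a single IPv4 address needs a /32 suffix. As a rough sketch, you could build the value from the icanhazip.com response like this. The helper name to_cidr is an assumption for illustration, and it only does a loose IPv4 check:

```shell
# Turn a single IPv4 address into CIDR notation for the whitelist.
# to_cidr is a hypothetical helper; the check is loose (digits and dots,
# at least three dots) and does not handle IPv6.
to_cidr() {
  case "$1" in
    *[!0-9.]*|"") echo "not an IPv4 address: $1" >&2; return 1 ;;
    *.*.*.*) echo "$1/32" ;;
    *) echo "not an IPv4 address: $1" >&2; return 1 ;;
  esac
}

# Usage (needs network access; -4 asks curl to connect over IPv4):
#   export WHITELIST_IP="$(to_cidr "$(curl -4 -s https://icanhazip.com)")"
```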

Here are example entries for creating the environment variables in your terminal:

export DOMAIN="<domain here>"
export EMAIL="<email for letsencrypt certificates here>"
export WHITELIST_IP="<your public ip>/32"
export USERNAME="<username here>"
export TRAEFIK_ADMINS=$(htpasswd -nBC 10 $USERNAME)

htpasswd will prompt you for a password and will create a hashed username-password-combination.
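
The compose file further down refuses to deploy when one of these variables is unset (for example ${EMAIL?Variable not set}), so it can help to check all five in one go first. A minimal sketch, where missing_vars is a made-up helper name:

```shell
# Print the name of every required variable that is unset or empty.
# missing_vars is a hypothetical helper, not part of Docker or Traefik.
missing_vars() {
  for name in DOMAIN EMAIL WHITELIST_IP USERNAME TRAEFIK_ADMINS; do
    # Indirect lookup that also works in plain POSIX sh
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Usage:
#   [ -z "$(missing_vars)" ] || echo "Set these first:" $(missing_vars)
```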

TLS

Traefik 2 is extremely versatile. It allows several options for customization: Docker labels, TOML or YAML files, and more.

So far, we’ve opted for using Docker labels.
I like this option because my docker-compose.yml is the single source of truth. It configures both my Docker setup as well as how Traefik works.

Unfortunately, not every Traefik setting can be configured via labels. TLS (Transport Layer Security) options are one example.

For security purposes I want to enable a minimum TLS version. TLS 1.0 & TLS 1.1 are known to be vulnerable, thus we want to avoid them.

Traefik 2 distinguishes between static and dynamic configuration. The startup configuration is called the static configuration and can be done via command-line options in the docker command section.
The dynamic configuration is used for routing and other options and depends on the provider (Docker, Kubernetes, etc.).

TLS options are part of the routing configuration. It follows that they are part of the dynamic configuration.

With the Docker provider, TLS options can only be configured through the file provider.
More info: https://github.com/containous/traefik/issues/5507.

Create a folder traefik_conf with a file called dynamic_conf.toml.
You need to bind mount the folder so that the Traefik container can read the file.

Here’s what the dynamic_conf.toml should look like:

[tls.options]
  [tls.options.default]
    minVersion = "VersionTLS13"
    sniStrict = true

  [tls.options.tls12]
    minVersion = "VersionTLS12"
    sniStrict = true
    cipherSuites = [
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305"
    ]
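
Once Traefik is running, you can check that the minimum version is enforced by trying a handshake pinned to a single TLS version. This is a rough sketch using curl. The names tls_pin_flags and accepts_tls are hypothetical, and whether your local curl can still speak TLS 1.1 depends on how it was built:

```shell
# Build curl flags that pin a request to exactly one TLS version.
tls_pin_flags() {
  case "$1" in
    1.0|1.1|1.2|1.3) echo "--tlsv$1 --tls-max $1" ;;
    *) echo "unsupported TLS version: $1" >&2; return 1 ;;
  esac
}

# Succeeds if the host ($1) completes a handshake with TLS version $2.
# -k skips certificate verification, which staging certificates need.
accepts_tls() {
  pin_flags=$(tls_pin_flags "$2") || return 1
  # $pin_flags is left unquoted on purpose so it splits into two flags
  curl -k -s -o /dev/null $pin_flags "https://$1/"
}

# Usage:
#   accepts_tls <your-host> 1.1 || echo "TLS 1.1 rejected"
```

For a router that uses the tls12 options, version 1.1 should be rejected and 1.2 accepted. For a router on the default options, only 1.3 should pass.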

Docker Compose

Here is the final docker-compose.yml:

version: '3.8'

# volume for the SSL certificates from Let's Encrypt
volumes:
  traefik-certificates:

networks:
  cloud-edge:
    external: true
  cloud-public:
    external: true
  cloud-socket-proxy:
    external: true

services:
  reverse-proxy:
    image: traefik:v2.2
    command:
      - --providers.docker
      # Use the secure docker socket proxy
      - --providers.docker.endpoint=tcp://socket-proxy:2375
      # Add a constraint to only use services with the label "traefik.constraint-label=cloud-public"
      - --providers.docker.constraints=Label(`traefik.constraint-label`, `cloud-public`)
      # Don't expose containers per default
      - --providers.docker.exposedByDefault=false
      - --providers.docker.swarmMode=true
      # fileprovider needed for TLS configuration
      # see https://github.com/containous/traefik/issues/5507
      - --providers.file.filename=/traefik_conf/dynamic_conf.toml
      # Entrypoints (ports) for the routers
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      # Entrypoint for the dashboard on port 8000
      - --entrypoints.api.address=:8000
      # Create the certificate resolver "letsencrypt" for Let's Encrypt, uses the environment variable EMAIL
      - --certificatesresolvers.letsencrypt.acme.email=${EMAIL?Variable not set}
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      # Only for development to avoid hitting the rate limit on certificates
      - --certificatesresolvers.letsencrypt.acme.caServer=https://acme-staging-v02.api.letsencrypt.org/directory
      # Logging
      - --accesslog
      - --log.level=debug
      # Enable the dashboard
      - --api
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.labels.cloud-public.traefik-certificates == true
          - node.role == manager
      labels:
        # traefik.enable is required because we don't expose all containers automatically
        - traefik.enable=true
        - traefik.docker.network=cloud-public
        - traefik.constraint-label=cloud-public

        # Global redirection: HTTP to HTTPS
        - traefik.http.routers.http-redirects.entrypoints=web
        - traefik.http.routers.http-redirects.rule=hostregexp(`{host:(www\.)?.+}`)
        - traefik.http.routers.http-redirects.middlewares=traefik-ratelimit,redirect-to-non-www-https

        # Global redirection: HTTPS www to HTTPS non-www
        - traefik.http.routers.www-redirects.entrypoints=websecure
        - traefik.http.routers.www-redirects.rule=hostregexp(`{host:(www\.).+}`)
        - traefik.http.routers.www-redirects.tls=true
        - traefik.http.routers.www-redirects.tls.options=default
        - traefik.http.routers.www-redirects.middlewares=traefik-ratelimit,redirect-to-non-www-https

        # Middleware to redirect to bare https
        - traefik.http.middlewares.redirect-to-non-www-https.redirectregex.regex=^https?://(?:www\.)?(.+)
        - traefik.http.middlewares.redirect-to-non-www-https.redirectregex.replacement=https://$${1}
        - traefik.http.middlewares.redirect-to-non-www-https.redirectregex.permanent=true

        # Dashboard on port 8000
        - traefik.http.routers.api.entrypoints=api
        - traefik.http.routers.api.rule=Host(`${DOMAIN?Variable not set}`)
        - traefik.http.routers.api.service=api@internal
        - traefik.http.routers.api.tls=true
        - traefik.http.routers.api.tls.options=default
        - traefik.http.routers.api.tls.certresolver=letsencrypt
        # middlewares: use IP whitelisting, ratelimit and basic authentication
        - traefik.http.routers.api.middlewares=api-ipwhitelist,traefik-ratelimit,api-auth
        - traefik.http.middlewares.api-auth.basicauth.users=${TRAEFIK_ADMINS?Variable not set}
        # whitelist your public ip, see https://icanhazip.com
        # replace with _your IP_
        - traefik.http.middlewares.api-ipwhitelist.ipwhitelist.sourcerange=${WHITELIST_IP?Variable not set}
        - traefik.http.services.api.loadbalancer.server.port=8000

        # Extra middleware (rate limit)
        - traefik.http.middlewares.traefik-ratelimit.ratelimit.average=100
        - traefik.http.middlewares.traefik-ratelimit.ratelimit.burst=50
    # use host mode for network ports for ip whitelisting
    # see https://community.containo.us/t/whitelist-swarm-cant-get-real-source-ip/3897
    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 8000
        published: 8000
        protocol: tcp
        mode: host
    volumes:
      # storage for the SSL certificates
      - traefik-certificates:/letsencrypt
      # bind mount the directory for your traefik configuration
      - /home/$USER/traefik_conf:/traefik_conf
    networks:
      - cloud-edge
      - cloud-public
      - cloud-socket-proxy

  socket-proxy:
    image: tecnativa/docker-socket-proxy:latest
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints: [node.role == manager]
    environment:
      # permissions needed
      NETWORKS: 1
      SERVICES: 1
      TASKS: 1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      cloud-socket-proxy:
        aliases:
          - socket-proxy

Please note that the labels entries are under the deploy entry. This structure is required when deploying in swarm mode.

Deploy on your VPS with docker stack deploy -c docker-compose.yml <name-of-your-stack> (for example: traefik).

Let’s say your domain is my.traefik.com. You can reach your dashboard at https://my.traefik.com:8000/dashboard/ (the trailing slash is important).
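
To see whether the IP whitelist and basic auth behave as intended, you can look at the HTTP status code the dashboard returns. A small sketch, with dashboard_url and dashboard_status as made-up helper names:

```shell
# Build the dashboard URL for a domain; the trailing slash is required.
dashboard_url() {
  echo "https://$1:8000/dashboard/"
}

# Print only the HTTP status code of a request to the dashboard.
# -k skips certificate verification, which staging certificates need.
dashboard_status() {
  curl -k -s -o /dev/null -w '%{http_code}' "$(dashboard_url "$1")"
}

# Usage:
#   dashboard_status my.traefik.com
```

Because the whitelist middleware runs before basic auth, a request from a non-whitelisted IP should return 403, and a request from the whitelisted IP without credentials should return 401.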

For production, remove the following line:

- --certificatesresolvers.letsencrypt.acme.caServer=https://acme-staging-v02.api.letsencrypt.org/directory

You need the staging server during development to avoid hitting the Let’s Encrypt rate limits.
For production, we want to use the real server.
You’ll probably have to remove the named volume (traefik-certificates) as well, because it still holds the certificates issued by the staging server.

I suggest accessing your server (via SSH) and running:

docker volume prune
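
Keep in mind that docker volume prune removes every unused volume on the node, and depending on the Docker version it may skip named volumes unless you pass --all. A more targeted sketch removes only the certificate volume. It relies on docker stack deploy prefixing named volumes with the stack name, and cert_volume_name is a hypothetical helper:

```shell
# Name of the certificate volume for a given stack name.
# `docker stack deploy` prefixes named volumes with "<stack>_".
cert_volume_name() {
  echo "${1}_traefik-certificates"
}

# Usage, assuming the stack is called "traefik". The volume can only be
# removed once no container uses it, so the stack is removed first:
#   docker stack rm traefik
#   docker volume rm "$(cert_volume_name traefik)"
```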

Deploy New Services

Let’s say you want to deploy a new container on the url whoami.mytraefik.com.

Use your DNS provider to add a new A record for the subdomain. How you do that depends on the provider. Here’s a guide for DigitalOcean.

Add a new docker compose file.

Example whoami.yaml:

version: '3.8'

services:
  whoami:
    image: containous/whoami:latest
    command:
      - --port=8082
    deploy:
      labels:
        # traefik.enable=true and constraint label are needed because
        # of the restrictions we enforced in our traefik configuration
        - traefik.enable=true
        - traefik.constraint-label=cloud-public
        - traefik.http.routers.whoami.entrypoints=websecure
        - traefik.http.routers.whoami.rule=Host(`whoami.mytraefik.com`)
        - traefik.http.routers.whoami.tls=true
        # min TLS version
        - traefik.http.routers.whoami.tls.options=tls12@file
        - traefik.http.routers.whoami.tls.certresolver=letsencrypt
        - traefik.http.routers.whoami.middlewares=traefik-ratelimit
        - traefik.http.services.whoami.loadbalancer.server.port=8082
    networks:
      - cloud-public

networks:
  cloud-public:
    external: true

Make sure to add a loadbalancer.server.port label for each service. In swarm mode, Traefik 2 can't detect ports automatically, so you have to declare the service’s port with this label.

As you can see, we can reuse the rate-limit middleware defined in the original docker-compose file and the TLS options from dynamic_conf.toml. Although the whoami service lives in a different file (whoami.yaml), Traefik 2 picks up the configuration. Neat!

Deploy:

docker stack deploy -c whoami.yaml <name-of-your-stack>

Docker stack will add the new service to the existing stack and will re-use the configuration from your main traefik installation.

Final Thoughts

Setting up Traefik 2 is complicated. The web is full of questions from people who can’t figure out how to configure it.
It doesn’t help that Traefik v1 is quite different from the current version, so a lot of the material you’ll find is hopelessly outdated.

There are a lot of moving parts: Docker, Docker swarm mode, networking, volume binding, static and dynamic configuration of Traefik 2, middlewares, routers, etc.

Traefik is a powerful piece of tech, but you will need a lot of time to understand how to configure it.

The documentation is good, so don’t forget to read it carefully.
You will also need a good understanding of Docker or a willingness to dive into its documentation.