Self-deployed FaaS with Docker Swarm

Serverless is all the rage right now. Instead of maintaining a server and its infrastructure, you can create self-contained functions that do the job.

This can be a boon for front-end developers. A bespoke front-end client written in React.js or another framework can easily be enhanced with back-end code.

It’s now trivial to add a secure integration with a payment provider like Stripe.
You don’t have to write, deploy and maintain separate back-end code (e.g., a Node.js and Express.js server).

One of the often-touted advantages of FaaS (Functions as a service) is the price. You only pay for what you’re using.

Some providers, like Netlify, even offer a generous free tier.

But what happens when you want to avoid lock-in?

That’s where OpenFaaS comes in. OpenFaaS is an open-source framework for functions-as-a-service, using Docker Swarm, Kubernetes or OpenShift.

We can deploy OpenFaaS to a VPS (Virtual Private Server) provider like Upcloud[1] or DigitalOcean[1]. A small instance, which costs around 3 to 5 US dollars a month, is enough for a self-hosted FaaS setup.

In this post I will show you how to use Docker Swarm to deploy an OpenFaaS cluster. We’ll also use Traefik 2 to provision SSL certificates for our functions for free.

[Figure: Traefik 2 architecture, image from the official Traefik website]

Prerequisites

You will need a running cloud server.

I particularly like Hetzner and Upcloud, but you can use what you like.

Upcloud[1] has top-notch performance for the 5-dollar tier and friendly customer support.

Hetzner has very cheap offerings under USD $3. Their smallest offering is not very performant, but sufficient for our use case. Keep in mind that Hetzner’s servers are located in Germany and Finland.

For alternatives, check VPSBenchmarks.

Provision an Ubuntu 20.04 server, create and connect an SSH key, and don’t forget about basic security.

You’ll also want to have a domain. Namecheap is one of the services I use for buying domains, and so far I’ve been satisfied with them.

You need to find a way to connect your domain to the cloud server you just spun up. How you do it depends on your domain registrar and your VPS provider.

You’ll need 3 subdomains for the following services: Traefik dashboard, OpenFaaS dashboard, Prometheus dashboard.

Let’s say your domain name is myfaas.xyz.
Create A records for traefik.myfaas.xyz, faas.myfaas.xyz and prometheus.myfaas.xyz.
(The names for the subdomains are up to you, of course).
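
Before moving on, it can help to confirm that the records actually point at your server. The following is a minimal sketch and not part of the original setup: the function names faas_hostnames and check_dns are made up for illustration, and dig (part of the dnsutils package on Ubuntu) is assumed to be installed.

```shell
# Hypothetical helper: print the three hostnames this guide uses for a base domain.
faas_hostnames() {
  base="$1"
  printf '%s\n' "traefik.${base}" "faas.${base}" "prometheus.${base}"
}

# Hypothetical helper: compare each hostname's A record with the server IP.
# Assumes dig is installed; DNS changes can take a while to propagate.
check_dns() {
  base="$1"
  expected_ip="$2"
  for host in $(faas_hostnames "$base"); do
    resolved="$(dig +short A "$host" | tail -n 1)"
    if [ "$resolved" = "$expected_ip" ]; then
      echo "ok      $host -> $resolved"
    else
      echo "pending $host -> ${resolved:-no answer}"
    fi
  done
}
```

Running check_dns myfaas.xyz 203.0.113.10 (with your server's real IP address instead of the placeholder) prints one line per subdomain. Let's Encrypt can only issue certificates once the names resolve, so it is worth waiting until every line says ok.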

Here’s a guide that might help you out.

Software Installation

I suggest using a non-root user with sudo access instead of the default root user.

SSH into your cloud server and install the necessary packages:

# Install Docker (https://docs.docker.com/engine/install/ubuntu/)
sudo DEBIAN_FRONTEND=noninteractive apt -y install \
  apt-transport-https \
  ca-certificates \
  curl \
  git \
  gnupg-agent \
  software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io

Give the current user the necessary permissions:

sudo groupadd -f docker
sudo usermod -aG docker "${USER}"

Log out of the session and log back in, so that the changes can take effect.
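
After logging back in, a quick check shows whether the new group membership is active. This is only a sketch; the helper names in_docker_group and report_docker_access are hypothetical.

```shell
# Hypothetical helper: succeed if a space-separated group list contains "docker".
in_docker_group() {
  case " $1 " in
    *" docker "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: report whether the current login session picked up the group.
report_docker_access() {
  if in_docker_group "$(id -nG)"; then
    echo "docker group active; docker commands should work without sudo"
  else
    echo "docker group not active yet; log out and back in"
  fi
}
```

Call report_docker_access in the new session. If the group is active, docker run --rm hello-world should run without sudo.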

Install OpenFaaS and faas-cli

We’ll be using Docker Swarm for deployment.

The default option is Kubernetes because of its richer ecosystem. But right now, I have no clue how Kubernetes works. And this guide shows you a setup for hobby projects, so Kubernetes would be overkill anyway.
I’m familiar with Docker, so Docker Swarm it is.

You can also use faasd, a simpler alternative. faasd uses containerd instead of Docker.
But although Docker uses containerd, they are not the same.
That means that automatic SSL certificate generation via Traefik 2 does not work. Traefik works with a few providers like Docker or Kubernetes, but not with bare-bones containerd.


Edit: What you can do is add Terraform (infrastructure as code) and use it to install Caddy (a server) to provision SSL certificates. Guide here.
Thanks to Alex Ellis, creator of OpenFaaS & faasd, for pointing that out.


We’ll now use the official deployment guide for Docker Swarm on the OpenFaaS docs.

1. Install faas-cli

curl -sL https://cli.openfaas.com | sudo sh

2. Docker Swarm Setup

In preparation for the Traefik 2 setup, we’ll create the necessary environment variables. The following section is more or less a copy of the guide from dockerswarm.rocks.

docker swarm init

We will use the docker socket proxy by Tecnativa to protect our Docker unix socket.
The socket proxy is a security measure that’s encouraged by Traefik.

Get the Swarm node ID of the current node and store it in an environment variable:

export NODE_ID=$(docker info -f '{{.Swarm.NodeID}}')

Now we’ll add a label to the node to ensure that we’ll always deploy Traefik to the same node. Why?
The SSL certificates are stored in a volume on one node, so they must be available to the Docker node that runs Traefik (which handles the provisioning of the certificates).

docker node update --label-add traefik-public.traefik-certificates=true $NODE_ID

Now we need to find a way to create an admin username and password for both the Traefik admin dashboard and the Prometheus monitoring service.
OpenFaaS will automatically provide basic auth for the functions dashboard.

In my example, I will use environment variables. You can also use Docker secrets, which is a more secure approach. But they are trickier to set up and harder to change, because Docker secrets are immutable.

export TRAEFIK_ADMINS=admin:$(openssl passwd -apr1)

Here, admin is your desired username for login. Change it to something else if you like. You will be prompted for a password. The command will hash the password for you.

We’ll do the same for the Prometheus dashboard:

export PROMETHEUS_ADMINS=admin:$(openssl passwd -apr1)
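
If you want to script this step instead of answering a prompt, openssl can also read the password from standard input. The sketch below is an illustration under that assumption; admin_entry is a made-up helper name, and MY_PASSWORD in the usage note is a placeholder variable.

```shell
# Hypothetical helper: build a "user:hash" entry for Traefik's basic auth.
# openssl passwd -apr1 produces an Apache-style MD5 hash that starts with $apr1$.
# Reading the password via -stdin keeps it out of the process list.
admin_entry() {
  user="$1"
  password="$2"
  hash="$(printf '%s\n' "$password" | openssl passwd -apr1 -stdin)"
  printf '%s:%s\n' "$user" "$hash"
}
```

For example, export TRAEFIK_ADMINS="$(admin_entry admin "$MY_PASSWORD")" sets the variable without a prompt. The salt is random, so the hash differs on every run even for the same password.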

We will use Let’s Encrypt to generate SSL certificates. Provide an email as unique identifier:

export EMAIL=myemail@email.com

Next up, we’ll add the domains for the services. Assuming we want to deploy the Traefik dashboard to traefik.myfaas.xyz, we create the environment variable like this:

export TRAEFIK_DOMAIN=traefik.myfaas.xyz

Don’t forget the Prometheus dashboard and also the domain for the functions:

export PROMETHEUS_DOMAIN=prometheus.myfaas.xyz
export FAAS_DOMAIN=faas.myfaas.xyz
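
The exported variables only live in the current shell session, so they are easy to lose after a reconnect. As a small safety net, a check like the following can run before deploying. It is a sketch; missing_vars is a hypothetical helper, not something OpenFaaS or Traefik provides.

```shell
# Hypothetical helper: print the names of unset or empty variables
# and return non-zero if any are missing.
missing_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all set"
}
```

Calling missing_vars TRAEFIK_ADMINS PROMETHEUS_ADMINS EMAIL TRAEFIK_DOMAIN PROMETHEUS_DOMAIN FAAS_DOMAIN lists whatever still needs to be exported.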

3. Download OpenFaaS

You should be in the home directory of your (non-root) user on your VPS.
Let’s use git to get a copy of OpenFaaS:

git clone https://github.com/openfaas/faas && cd faas

4. Prepare for Traefik and Docker Socket Proxy

The OpenFaaS stack provides you with a docker-compose.yml. We now need to tweak it for our needs.

This step is quite tricky, because it’s easy to introduce indentation errors when editing YAML files.

First, open the ready-made docker-compose.yml file and take a look around. You will find several services for the OpenFaaS gateway, Prometheus, etc.

4.1. Adjust Existing Configuration

The default network is functions. We’ll need to add another network for the services that Traefik should make public (the Traefik dashboard, the Prometheus dashboard and the OpenFaaS dashboard).

Find the networks section at the end of the file and add a new network. Also add a volume for the SSL certificates:

networks:
+    traefik-public:
+        driver: overlay
+        attachable: true
    functions:
        driver: overlay
        attachable: true
        labels:
            - "openfaas=true"

+volumes:
+    traefik-certificates:

Add the traefik-public network to all services that should be reachable via the internet: gateway, prometheus, and later traefik.

Also remove the external facing ports from the services gateway and prometheus. Traefik will handle that for us:

  gateway:
        image: openfaas/gateway:0.18.18
-        ports:
-            - 8080:8080
        networks:
             - functions
+            - traefik-public

...


  prometheus:
      image: prom/prometheus:v2.11.0
      ...
-      ports:
-          - 9090:9090
      networks:
           - functions
+          - traefik-public

Let’s now set up Traefik and the Docker socket proxy services:

services:
+  socket-proxy:
+    image: 'tecnativa/docker-socket-proxy:latest'
+    environment:
+      # required permissions
+      NETWORKS: 1
+      SERVICES: 1
+      TASKS: 1
+      POST: 1
+    deploy:
+      restart_policy:
+        condition: on-failure
+      placement:
+        constraints:
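+          - node.role == manager
+    volumes:
+      # the proxy needs read access to the Docker socket
+      - '/var/run/docker.sock:/var/run/docker.sock:ro'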
+    networks:
+      - traefik-public

Let’s use the security-enhanced socket in the faas-swarm service:

  faas-swarm:
    image: 'openfaas/faas-swarm:0.9.0'
    networks:
      - functions
+     - traefik-public
    environment:
      read_timeout: 5m5s
      write_timeout: 5m5s
      DOCKER_API_VERSION: '1.30'
+     DOCKER_HOST: 'tcp://socket-proxy:2375'
      basic_auth: '${BASIC_AUTH:-false}'

The Traefik setup is significantly more complicated. Explaining all the moving parts is beyond the scope of this post.

I suggest these two posts and the official Traefik documentation to get started.

+  traefik:
+    image: traefik:v2.2
+    command:
+      - --entrypoints.web.address=:80
+      - --entrypoints.websecure.address=:443
+      - --providers.docker=true
+      - --providers.docker.swarmMode=true
+      - --providers.docker.network=faas_traefik-public
+      - --providers.docker.exposedByDefault=false
+      - --providers.docker.endpoint=tcp://socket-proxy:2375
+      - --api.dashboard=true
+      - --log.level=ERROR
+      - --certificatesresolvers.letsencrypt.acme.email=${EMAIL?Variable not defined}
+      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
+      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
+    deploy:
+      restart_policy:
+        condition: on-failure
+      placement:
+        constraints:
+          - node.labels.traefik-public.traefik-certificates == true
+          - node.role == manager
+      labels:
+        # Needed because exposedByDefault is false
+        - traefik.enable=true
+
+        # Redirect to https
+        - traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)
+        - traefik.http.routers.http-catchall.entrypoints=web
+        - traefik.http.routers.http-catchall.middlewares=redirect-to-https
+        - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
+        - traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true
+
+        # Dashboard
+        - traefik.http.routers.api.entrypoints=websecure
+        - traefik.http.routers.api.rule=Host(`${TRAEFIK_DOMAIN?Variable not defined}`)
+        - traefik.http.routers.api.tls.certresolver=letsencrypt
+        - traefik.http.routers.api.middlewares=api-auth
+        - traefik.http.middlewares.api-auth.basicauth.users=${TRAEFIK_ADMINS?Variable not set}
+        - traefik.http.routers.api.service=api@internal
+        # Swarm mode requires a port label; the service name must not clash with gateway
+        - traefik.http.services.traefik.loadbalancer.server.port=8080
+    ports:
+      - 80:80
+      - 443:443
+    networks:
+      - traefik-public
+    volumes:
+      - traefik-certificates:/letsencrypt
+      - /home/$USER/traefik/traefik_conf:/traefik_conf
Finally, we’ll need to tell Traefik which containers it should make publicly available. For that feat, we must add labels to the services gateway and prometheus:

    gateway:
        image: openfaas/gateway:0.18.18
        ...
        deploy:
+           labels:
+               - traefik.enable=true
+               - traefik.constraint-label=traefik-public
+               - traefik.http.routers.gateway.entrypoints=websecure
+               - traefik.http.routers.gateway.rule=Host(`${FAAS_DOMAIN?Variable not set}`)
+               - traefik.http.routers.gateway.tls.certresolver=letsencrypt
+               - traefik.http.services.gateway.loadbalancer.server.port=8080

The same for the Prometheus dashboard:

    prometheus:
    ...
        deploy:
+           labels:
+               - traefik.enable=true
+               - traefik.constraint-label=traefik-public
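+               # Use router and service names of their own, so they don't clash with gateway
+               - traefik.http.routers.prometheus.entrypoints=websecure
+               - traefik.http.routers.prometheus.rule=Host(`${PROMETHEUS_DOMAIN?Variable not set}`)
+               - traefik.http.routers.prometheus.tls.certresolver=letsencrypt
+               - traefik.http.routers.prometheus.middlewares=prometheus-auth
+               - traefik.http.middlewares.prometheus-auth.basicauth.users=${PROMETHEUS_ADMINS?Variable not set}
+               - traefik.http.services.prometheus.loadbalancer.server.port=9090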
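
The label sets for the public services follow one pattern: a router with its own name, a Host rule, a certificate resolver, and the container port. The sketch below prints that pattern for any service, which can help avoid copy-and-paste mistakes. traefik_labels is a hypothetical helper, and it assumes that every Swarm service gets a unique router and service name.

```shell
# Hypothetical helper: print the Traefik labels for one public service.
# $1 = router/service name (assumed to be unique per Swarm service)
# $2 = name of the environment variable that holds the hostname
# $3 = container port Traefik should forward to
traefik_labels() {
  name="$1"
  domain_var="$2"
  port="$3"
  # Backticks and the leading "$" are escaped so they end up literally in the output.
  cat <<EOF
- traefik.enable=true
- traefik.http.routers.${name}.entrypoints=websecure
- traefik.http.routers.${name}.rule=Host(\`\${${domain_var}?Variable not set}\`)
- traefik.http.routers.${name}.tls.certresolver=letsencrypt
- traefik.http.services.${name}.loadbalancer.server.port=${port}
EOF
}
```

For instance, traefik_labels prometheus PROMETHEUS_DOMAIN 9090 prints the five lines for the Prometheus dashboard, ready to paste under deploy.labels with the right indentation.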

4.2. Final Configuration

This is the final docker-compose.yml.

I’ve updated the Compose file version to 3.8 and run the file through a YAML prettifier to avoid indentation errors.

version: '3.8'
services:
  socket-proxy:
    image: 'tecnativa/docker-socket-proxy:latest'
    environment:
      NETWORKS: 1
      SERVICES: 1
      TASKS: 1
      POST: 1
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.role == manager
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
    networks:
      - traefik-public
  traefik:
    image: 'traefik:v2.2'
    restart: unless-stopped
    command:
      - '--entrypoints.web.address=:80'
      - '--entrypoints.websecure.address=:443'
      - '--providers.docker=true'
      - '--providers.docker.swarmMode=true'
      - '--providers.docker.endpoint=tcp://socket-proxy:2375'
      - '--providers.docker.network=faas_traefik-public'
      - '--providers.docker.exposedByDefault=false'
      - '--api.dashboard=true'
      - '--log.level=ERROR'
      - >-
        --certificatesresolvers.letsencrypt.acme.email=${EMAIL?Variable not
        defined}
      - '--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json'
      - '--certificatesresolvers.letsencrypt.acme.tlschallenge=true'
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.labels.traefik-public.traefik-certificates == true
          - node.role == manager
      labels:
        - traefik.enable=true
        - 'traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)'
        - traefik.http.routers.http-catchall.entrypoints=web
        - traefik.http.routers.http-catchall.middlewares=redirect-to-https
        - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
        - >-
          traefik.http.middlewares.redirect-to-https.redirectscheme.permanent=true
        - traefik.http.routers.api.entrypoints=websecure
        - >-
          traefik.http.routers.api.rule=Host(`${TRAEFIK_DOMAIN?Variable not
          defined}`)
        - traefik.http.routers.api.tls.certresolver=letsencrypt
        - traefik.http.routers.api.middlewares=api-auth
        - >-
          traefik.http.middlewares.api-auth.basicauth.users=${TRAEFIK_ADMINS?Variable
          not defined}
        - traefik.http.routers.api.service=api@internal
        - traefik.http.services.traefik.loadbalancer.server.port=8080
    ports:
      - '80:80'
      - '443:443'
    networks:
      - traefik-public
    volumes:
      - 'traefik-certificates:/letsencrypt'
      - '/home/$USER/traefik/traefik_conf:/traefik_conf'
  gateway:
    image: 'openfaas/gateway:0.18.18'
    networks:
      - functions
      - traefik-public
    environment:
      functions_provider_url: 'http://faas-swarm:8080/'
      read_timeout: 5m5s
      write_timeout: 5m5s
      upstream_timeout: 5m
      dnsrr: 'true'
      faas_nats_address: nats
      faas_nats_port: 4222
      direct_functions: 'true'
      direct_functions_suffix: ''
      basic_auth: '${BASIC_AUTH:-false}'
      secret_mount_path: /run/secrets/
      scale_from_zero: 'true'
      max_idle_conns: 1024
      max_idle_conns_per_host: 1024
      auth_proxy_url: '${AUTH_URL:-}'
      auth_proxy_pass_body: 'false'
    deploy:
      labels:
        - traefik.enable=true
        - traefik.constraint-label=traefik-public
        - traefik.http.routers.gateway.entrypoints=websecure
        - >-
          traefik.http.routers.gateway.rule=Host(`${FAAS_DOMAIN?Variable not
          defined}`)
        - traefik.http.routers.gateway.tls.certresolver=letsencrypt
        - traefik.http.services.gateway.loadbalancer.server.port=8080
      resources:
        reservations:
          memory: 100M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
      placement:
        constraints:
          - node.platform.os == linux
    secrets:
      - basic-auth-user
      - basic-auth-password
  basic-auth-plugin:
    image: 'openfaas/basic-auth-plugin:0.18.18'
    networks:
      - functions
    environment:
      secret_mount_path: /run/secrets/
      user_filename: basic-auth-user
      pass_filename: basic-auth-password
    deploy:
      placement:
        constraints:
          - node.role == manager
          - node.platform.os == linux
      resources:
        reservations:
          memory: 50M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
    secrets:
      - basic-auth-user
      - basic-auth-password
  faas-swarm:
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock:ro'
    image: 'openfaas/faas-swarm:0.9.0'
    networks:
      - functions
    environment:
      read_timeout: 5m5s
      write_timeout: 5m5s
      DOCKER_API_VERSION: '1.30'
      basic_auth: '${BASIC_AUTH:-false}'
      secret_mount_path: /run/secrets/
    deploy:
      placement:
        constraints:
          - node.role == manager
          - node.platform.os == linux
      resources:
        reservations:
          memory: 100M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
    secrets:
      - basic-auth-user
      - basic-auth-password
  nats:
    image: 'nats-streaming:0.17.0'
    command: '--store memory --cluster_id faas-cluster'
    networks:
      - functions
    deploy:
      resources:
        limits:
          memory: 125M
        reservations:
          memory: 50M
      placement:
        constraints:
          - node.platform.os == linux
  queue-worker:
    image: 'openfaas/queue-worker:0.11.2'
    networks:
      - functions
    environment:
      max_inflight: '1'
      ack_wait: 5m5s
      basic_auth: '${BASIC_AUTH:-false}'
      secret_mount_path: /run/secrets/
      gateway_invoke: 'true'
      faas_gateway_address: gateway
    deploy:
      resources:
        limits:
          memory: 50M
        reservations:
          memory: 20M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 20
        window: 380s
      placement:
        constraints:
          - node.platform.os == linux
    secrets:
      - basic-auth-user
      - basic-auth-password
  prometheus:
    image: 'prom/prometheus:v2.11.0'
    environment:
      no_proxy: gateway
    configs:
      - source: prometheus_config
        target: /etc/prometheus/prometheus.yml
      - source: prometheus_rules
        target: /etc/prometheus/alert.rules.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    networks:
      - functions
      - traefik-public
    deploy:
      labels:
        - traefik.enable=true
        - traefik.constraint-label=traefik-public
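        - traefik.http.routers.prometheus.entrypoints=websecure
        - >-
          traefik.http.routers.prometheus.rule=Host(`${PROMETHEUS_DOMAIN?Variable
          not defined}`)
        - traefik.http.routers.prometheus.tls.certresolver=letsencrypt
        - traefik.http.routers.prometheus.middlewares=prometheus-auth
        - >-
          traefik.http.middlewares.prometheus-auth.basicauth.users=${PROMETHEUS_ADMINS?Variable
          not defined}
        - traefik.http.services.prometheus.loadbalancer.server.port=9090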
      placement:
        constraints:
          - node.role == manager
          - node.platform.os == linux
      resources:
        limits:
          memory: 500M
        reservations:
          memory: 200M
  alertmanager:
    image: 'prom/alertmanager:v0.18.0'
    environment:
      no_proxy: gateway
    command:
      - '--config.file=/alertmanager.yml'
      - '--storage.path=/alertmanager'
    networks:
      - functions
    deploy:
      resources:
        limits:
          memory: 50M
        reservations:
          memory: 20M
      placement:
        constraints:
          - node.role == manager
          - node.platform.os == linux
    configs:
      - source: alertmanager_config
        target: /alertmanager.yml
    secrets:
      - basic-auth-password
configs:
  prometheus_config:
    file: ./prometheus/prometheus.yml
  prometheus_rules:
    file: ./prometheus/alert.rules.yml
  alertmanager_config:
    file: ./prometheus/alertmanager.yml
networks:
  traefik-public:
    driver: overlay
    attachable: true
  functions:
    driver: overlay
    attachable: true
volumes:
  traefik-certificates: null
secrets:
  basic-auth-user:
    external: true
  basic-auth-password:
    external: true
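
With the file in place, the stack still has to be deployed. The Traefik flag --providers.docker.network=faas_traefik-public suggests a stack named faas, because Swarm prefixes a stack's networks with the stack name. The following is a rough sketch under that assumption, not the official procedure: stack_network and deploy_faas are hypothetical helpers, and the secrets are created by hand here because the compose file marks them as external (the deploy_stack.sh script in the repository normally takes care of that).

```shell
# Swarm prefixes stack-scoped network names with the stack name.
stack_network() {
  printf '%s_%s\n' "$1" "$2"
}

# Hypothetical helper: create the external secrets and deploy the stack.
# Assumes it runs inside the cloned faas directory, in the shell session
# where the variables from the earlier steps were exported.
deploy_faas() {
  stack="${1:-faas}"
  # Random 32-character hex password for the gateway's admin user.
  password="$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  printf '%s' admin | docker secret create basic-auth-user -
  printf '%s' "$password" | docker secret create basic-auth-password -
  echo "gateway login: admin / $password"
  BASIC_AUTH=true docker stack deploy --compose-file docker-compose.yml "$stack"
}
```

Run deploy_faas once. Creating the secrets fails if they already exist, so for later updates the docker stack deploy line alone is enough. docker stack services faas then shows whether all services have started.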

Deploy Functions

You can deploy new functions via the dashboard. It’s available at the URL stored in the FAAS_DOMAIN environment variable.

[Figure: OpenFaaS UI, image from the official OpenFaaS website]

OpenFaaS automatically creates an admin user with a password. You should have seen the credentials in your terminal. If you don’t have them, check the troubleshooting guide for forgetting the gateway password.

You can also use the faas-cli command-line utility. It offers pre-made templates and convenience wrappers for pushing to a container registry.
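
As a rough illustration of the command-line route, the sketch below logs in to the gateway, deploys the figlet sample from the function store and invokes it. The helper names gateway_url and try_store_function are hypothetical, and GATEWAY_PASSWORD is an assumed variable that holds the gateway password.

```shell
# Hypothetical helper: build the gateway URL from the functions domain.
gateway_url() {
  printf 'https://%s\n' "$1"
}

# Hypothetical helper: log in, deploy a sample function from the store, and call it.
# Assumes faas-cli is installed and FAAS_DOMAIN and GATEWAY_PASSWORD are set.
try_store_function() {
  # faas-cli reads the gateway address from OPENFAAS_URL.
  OPENFAAS_URL="$(gateway_url "$FAAS_DOMAIN")"
  export OPENFAAS_URL
  printf '%s' "$GATEWAY_PASSWORD" | faas-cli login --username admin --password-stdin
  faas-cli store deploy figlet
  faas-cli list
  echo "hello" | faas-cli invoke figlet
}
```

If everything is wired up, the last command prints the input as ASCII art, served through Traefik over HTTPS.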

More info on the official website.

Final Thoughts

Docker Swarm, Traefik 2 and OpenFaaS play very well together. OpenFaaS offers you an agnostic framework for serverless functions without vendor lock-in. The faas-cli and the dashboard are very user-friendly.

Traefik 2 shows all its power as a load-balancer and edge router with automatic provisioning of SSL certificates via Let’s Encrypt. It plays well with the OpenFaaS architecture.

You should give it a try.

Further Reading


  [1] Affiliate links: you’ll get free credit and I also get some. Thank you!