
March 8, 2022

A Vercel-like PaaS beyond Jamstack with Kubernetes and GitOps, part III

Applications and the Dockerfile


This article is the third part of the A Vercel-like PaaS beyond Jamstack with Kubernetes and GitOps series.



In part I, I set up a Kubernetes cluster with k0s. Then in part II, I configured a GitLab pipeline to build Docker images and deploy applications on this cluster.

Now I'm going to write the required Dockerfile to build those Docker images.

  Introduction
  1. Applications must listen for incoming requests
  2. The Dockerfile
  3. The docker build command
  4. Next step

Introduction

Since the first stage of my GitLab pipeline is the package stage, I'll start this third part by creating the Dockerfile needed to complete that stage and move on to the deploy stage.

Before that, I need to step aside and cover some specifics of the example applications. This should explain how each piece of the setup connects to the others and clarify the functional scope of each part.

As a reminder, for the purpose of this experiment I've created Node.js, PHP, Python and Ruby web applications. These are the applications I'll talk about in the next section.


1. Applications must listen for incoming requests

At the end of part I, I briefly described how traffic flows from the client to the application:

✓ 1. client: DNS ok and 443/TCP port open
✓ 2. host: k0s installed
✓ 3. ingress: ingress-nginx installed
  4. service
  5. pod
  6. container
  7. application

The last component in this diagram, the application, is not only the code inside the container but also the process that runs this code and listens for incoming connections.

To do so, every application must meet these two requirements:

  • The application must listen on port 3000/TCP.
  • It must listen on 0.0.0.0 instead of localhost.

The Node.js implementation is done like this in the app.js file:

const host = "0.0.0.0";
const port = 3000;
require("http")
  .createServer((req, res) => {
    ...
  })
  .listen(port, host, () => {
    ...
  });

Likewise, the Python implementation in the app.py file:

hostName = "0.0.0.0"
serverPort = 3000
...
if __name__ == "__main__":
    webServer = HTTPServer((hostName, serverPort), Server)
    ...
    try:
        webServer.serve_forever()

The Ruby implementation in the app.rb file, where omitting the host makes TCPServer listen on all interfaces:

server = TCPServer.new 3000

For the PHP application, incoming connections are handled by the PHP command line and its built-in web server, as set in the Dockerfile:

CMD ["-S", "0.0.0.0:3000", "app.php"]

In the real world, traffic is not necessarily handled this way. A dedicated web server such as nginx often stands in front as a reverse proxy and forwards requests to a separate application server that runs the application code.

For instance, requests are usually forwarded by an nginx or Apache server and handled by PHP-FPM for PHP, Unicorn for Ruby, and Gunicorn for Python.

Apart from the hardcoded port value, the Node.js implementation can stay as it is, though, because Node.js has a built-in event-driven web server and Kubernetes acts as the process manager, a task that is usually delegated to PM2.
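The hardcoded port could be lifted by reading the listening address from the environment. The sketch below is written in Python and is not taken from the example repositories: the HOST and PORT variable names, and the bind_address, Handler and serve helpers, are assumptions made for illustration. It falls back to 0.0.0.0 and 3000, the two values required earlier.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def bind_address(environ=os.environ):
    """Return the (host, port) pair to listen on.

    HOST and PORT are hypothetical variable names; the defaults are the
    values the example applications hardcode.
    """
    host = environ.get("HOST", "0.0.0.0")
    port = int(environ.get("PORT", "3000"))
    return host, port


class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(environ=os.environ):
    """Start the server on the resolved address and block until interrupted."""
    server = HTTPServer(bind_address(environ), Handler)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling serve() at the end of the file would start the application. With this approach the port could be set from the deployment configuration rather than in the code, while the image keeps working unchanged when no variable is provided.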


2. The Dockerfile

Docker images are built in the package stage of the pipeline with the following command:

$ docker build -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA} \
--build-arg COMMIT_SHORT_HASH=${CI_COMMIT_SHORT_SHA} .

Dockerfiles are almost identical across repositories. For the Node.js example, the Dockerfile contains the following instructions:

 1  # define the base system
 2  FROM node:16-slim
 3
 4  # read value of COMMIT_SHORT_HASH passed with --build-arg
 5  ARG COMMIT_SHORT_HASH
 6
 7  # copy COMMIT_SHORT_HASH value to COMMIT variable
 8  ENV COMMIT $COMMIT_SHORT_HASH
 9
10  # copy the GitLab repository into the image
11  COPY . /src
12
13  # move the current working directory to repository root
14  WORKDIR /src
15
16  # define the default program executed when running the image
17  ENTRYPOINT [ "node" ]
18
19  # define arguments passed to the default program
20  CMD [ "app.js" ]

There are a lot of things to explain here, but first:

  • The term build-time refers to the moment the docker build command is executed.

  • The term run-time refers to the moment the docker run command is executed, or when a container has been deployed to Kubernetes.

About build-time variables

  • Variables are passed at build-time with the --build-arg flag.

  • They can be read with the ARG instruction, as in the Dockerfile at line 5.

  • They are not persisted at run-time.

  • To persist a build-time variable at run-time, its value must be copied to another variable, as is done with the ENV instruction in the Dockerfile at line 8.

The following command illustrates this behaviour by dumping all variables with printenv. COMMIT_SHORT_HASH doesn't exist, but COMMIT does and contains the copied value:

$ docker run --entrypoint printenv \
${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=e63a3f3ff400
COMMIT=7c77eb36
NODE_VERSION=16.13.0
YARN_VERSION=1.22.15
HOME=/root
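Once persisted this way, the value is available to the application like any other environment variable. Here is a minimal sketch of how an application could read it, written in Python; the commit_label helper and the "unknown" fallback are assumptions made for illustration, not code from the example repositories.

```python
import os


def commit_label(environ=os.environ):
    """Return the commit hash baked into the image, or a placeholder.

    COMMIT is the variable set by the ENV instruction in the Dockerfile.
    The "unknown" fallback is an assumption for runs outside a container.
    """
    return environ.get("COMMIT", "unknown")


def build_arg_visible(environ=os.environ):
    """Tell whether the build-time variable leaked into the run-time environment."""
    return "COMMIT_SHORT_HASH" in environ
```

Given the environment dumped above, commit_label would return 7c77eb36 and build_arg_visible would return False.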

About run-time variables

  • Variables are passed at run-time with the -e flag.

  • If the value was already set at build-time with an ENV instruction in the Dockerfile, it is overwritten.

In the following example, I use the -e flag at run-time to overwrite the COMMIT variable that was set at build-time with the ENV instruction at line 8 of the Dockerfile:

$ docker run --entrypoint printenv \
-e COMMIT="A different value" \
${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=e63a3f3ff400
COMMIT=A different value
NODE_VERSION=16.13.0
YARN_VERSION=1.22.15
HOME=/root
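The two behaviours can be summed up with a small model, sketched here in Python. It does not call Docker; the image_environment and container_environment functions are hypothetical names that only mimic lines 5 and 8 of the Dockerfile and the effect of the -e flag.

```python
def image_environment(build_args):
    """Model lines 5 and 8 of the Dockerfile.

    The ARG value is read at build-time and copied into COMMIT by ENV.
    Only COMMIT is kept in the image; COMMIT_SHORT_HASH is not persisted.
    An ARG that was not passed expands to an empty string.
    """
    commit = build_args.get("COMMIT_SHORT_HASH", "")
    return {"COMMIT": commit}


def container_environment(image_env, run_overrides):
    """Model `docker run -e`: run-time values win over values set by ENV."""
    merged = dict(image_env)
    merged.update(run_overrides)
    return merged
```

With a build argument of 7c77eb36, the image environment holds COMMIT=7c77eb36 only, and passing COMMIT="A different value" at run-time replaces it, as in the output above.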

Regarding the FROM, COPY and RUN instructions

  • FROM, COPY and RUN instructions cannot be overwritten at run-time, since they exist only at build-time to construct the image's file system that will be mounted in the container at run-time.

  • The content of an image can still be changed with the docker commit command, which saves a container's changes as a new image. I definitely don't recommend using this command or building a workflow around it. Images should be reproducible, meaning they should be built from a Dockerfile only.

Regarding the WORKDIR, ENTRYPOINT and CMD instructions

  • WORKDIR and ENTRYPOINT can also be overwritten at run-time, with the --workdir (or -w) and --entrypoint flags respectively.

    This is unlikely to happen, though, since an image is usually built to run the command set in its entrypoint.

    A valid case would be switching the entrypoint from node to npm, for instance.

  • CMD can also be overwritten at run-time, and is more likely to be, in order to pass arguments to the application when environment variables cannot be used.

    Another valid case for overwriting the CMD instruction is when the ENTRYPOINT instruction is also overwritten.

For instance, the following command will overwrite most instructions of the Dockerfile:

# -e overwrites line 8, --workdir line 14, --entrypoint line 17,
# and the arguments after the image name overwrite line 20
$ docker run --rm \
  -e COMMIT="a different value" \
  --workdir /home \
  --entrypoint sh \
  ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA} \
  -c "echo \$COMMIT"
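How ENTRYPOINT and CMD combine into the command a container actually starts can be sketched as a small Python model. It is a simplification that assumes exec-form instructions, as in the Dockerfile above, and container_argv is a hypothetical helper, not part of Docker. It also assumes that passing --entrypoint discards the image's default CMD, which is consistent with the printenv examples earlier printing the whole environment rather than looking up app.js.

```python
def container_argv(image_entrypoint, image_cmd, entrypoint=None, args=None):
    """Return the argument vector a container starts with.

    image_entrypoint and image_cmd are the lists from the Dockerfile.
    entrypoint models the --entrypoint flag.
    args models the words placed after the image name on `docker run`.
    """
    if entrypoint is not None:
        # --entrypoint replaces ENTRYPOINT and drops the image's default CMD
        return [entrypoint] + list(args or [])
    if args:
        # run-time arguments replace CMD but keep ENTRYPOINT
        return list(image_entrypoint) + list(args)
    # nothing overwritten: ENTRYPOINT followed by CMD
    return list(image_entrypoint) + list(image_cmd)
```

For the image built above, the default is node app.js, while the command shown earlier resolves to sh -c followed by the echo statement.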

3. The docker build command

I use the shortened Git commit hash as the image tag to identify what code an image contains:

$ docker build -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA} .

I also pass this value to the docker build command with the --build-arg flag so that I can copy it to an environment variable, as explained in the previous section:

$ docker build -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA} \
--build-arg COMMIT_SHORT_HASH=${CI_COMMIT_SHORT_SHA} .
$ docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHORT_SHA}
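For readers who prefer to see how the image reference and both commands are assembled from the GitLab variables, here is a minimal Python sketch. The image_reference and package_commands helpers are hypothetical; the pipeline itself runs the shell commands shown above. The functions only build argument lists and do not invoke Docker.

```python
import os


def image_reference(environ=os.environ):
    """Build the image name and tag from the GitLab CI variables."""
    return f'{environ["CI_REGISTRY_IMAGE"]}:{environ["CI_COMMIT_SHORT_SHA"]}'


def package_commands(environ=os.environ, context="."):
    """Return the build and push commands of the package stage as argument lists."""
    reference = image_reference(environ)
    short_sha = environ["CI_COMMIT_SHORT_SHA"]
    build = [
        "docker", "build",
        "-t", reference,
        "--build-arg", f"COMMIT_SHORT_HASH={short_sha}",
        context,
    ]
    push = ["docker", "push", reference]
    return [build, push]
```

Each list could then be passed to subprocess.run(command, check=True). Keeping the tag and the build argument derived from the same variable ensures the COMMIT value inside the image always matches the image tag.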

The video below shows the whole package stage running:


4. Next step

Images have been built and pushed to the Container Registry. The last missing pieces are the Kubernetes manifests that allow the deploy stage to deploy applications to the Kubernetes cluster with kubectl.

A Vercel-like PaaS beyond Jamstack with Kubernetes and GitOps, part IV: Kubernetes manifests


About me


Hi, I'm Jonathan Experton.

I help companies start, plan, execute and deliver software development projects on time, on scope and on budget.

Montreal, Canada · GMT -4