Containers Digression: How to Build and Push Images in a CI Pipeline?¶
Gems and Jewels to Collect¶
In this episode we will demonstrate how to build a custom container image given an arbitrary Dockerfile and how to push it to a container image registry during a CI pipeline.
Introduction¶
There are a couple of good reasons to build your own custom container images for your CI jobs and to push them to a container image registry like the Helmholtz Codebase container registry. The most important reason with regard to GitLab CI is pipeline performance. You could also customize your containers during the CI job run and install dependencies at run-time, but that means installing these dependencies from scratch on every pipeline run. For performance reasons it can be beneficial to build your own custom container images that already contain all your dependencies. By doing so, you will most probably experience a great boost in pipeline performance, because setting up the container then takes much less time.
It needs to be mentioned that this also comes with a drawback: the maintenance of the custom container images. Dockerfiles, the blueprints of container images, need to be maintained and kept up-to-date, which involves checking your dependencies for new releases, testing these new releases in your use-case, updating the Dockerfiles, and building new container images and pushing them to a container image registry. With the help of bots like Dependabot and Renovate the checks for new releases can be automated, but depending on the degree of automation, manual maintenance steps might remain.
To increase the degree of automation even further, you might want to build and push your custom container images in a CI pipeline. If these custom container images are then used in your CI jobs, you will benefit from faster CI jobs. In general, it is advisable to optimize your CI pipeline with regard to speed. This practice adheres to the good practices in Continuous Integration mentioned in the beginning, because it results in much faster feedback loops, which accelerates your whole development process in the end. We will explain this automation process and performance optimization in this episode.
Demo Project - Astronaut Container¶
In order to exemplify the steps needed to build and push a container image, we prepared a demo project “Astronaut Container”. It contains a Dockerfile and a .gitlab-ci.yml file.
The Dockerfile specifies the base image as a Python image, with the version defined as a variable that can be passed as an argument during the build process. By default, the Python image version is set to 3.12 using the ARG keyword. This variable is then used to specify the image tag of the base image. Additionally, the Dockerfile installs extra dependencies from the Python Package Index (PyPI) using pip.
The CI pipeline then builds the images for various Python versions and pushes them to the Container Image Registry.
The Dockerfile¶
While the primary focus is on GitLab CI in this course, understanding Dockerfile basics is still essential. We’ll cover its fundamentals for completeness.
The whole Dockerfile looks like this:
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}
LABEL maintainer="HIFIS <support@hifis.net>"
RUN pip install --upgrade pip \
&& pip install poetry
We will explain it line by line now:
The line ARG PYTHON_VERSION=3.12 defines a variable named PYTHON_VERSION and sets its default value to 3.12. You can think of it as a placeholder that can be changed when you build the Docker image. This placeholder is then replaced by a value that we give in the build command, which passes it on to the build process that builds a container image from the Dockerfile.
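For example, if you were to build the image locally outside the CI pipeline (an illustrative command with a made-up image name), you could override the default Python version like this:
docker build --build-arg PYTHON_VERSION=3.11 --tag astronaut-python:3.11 .
We will see the same mechanism used in the CI job’s build command later in this episode.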
The line FROM python:${PYTHON_VERSION} specifies the base image for your custom container image. We start with the official Python image and use the version we set earlier.
The LABEL line adds metadata to your Docker image. It’s like putting a name tag on your container, saying who maintains it. This is considered a good practice.
The RUN line runs commands during the image build to install some software. Here, we upgrade pip and install poetry, which is another tool for managing Python projects.
The CI Pipeline¶
The CI pipeline contains two jobs:
- The build job builds container images for different Python versions and runs in the test stage, but only for non-default branches.
- The build_and_push job builds container images for different Python versions and pushes them to the container registry. It runs in the deploy stage, but only for the default branch.
This setup ensures that Docker images are built and tested for different Python versions, and only the images from the default branch are pushed to the container registry.
Build Container Images in GitLab CI - Docker-in-Docker (DinD)¶
Options to build container images in a CI pipeline
Various options exist to build container images in GitLab CI:
- Docker-in-Docker: Run a Docker daemon inside a Docker container to build images. This requires that your CI runners use privileged mode.
- Kaniko, buildah: Useful to build container images without a Docker daemon.
We chose Docker-in-Docker since the Helmholtz Codebase runners with the docker tag run in privileged mode.
Let’s break the pipeline definition apart.
For this particular CI pipeline we chose the latest Docker image, version 28, as the default image. Our goal is to build a new Docker image inside this Docker container. In order to build Docker images inside of other Docker containers we need a concept called “Docker-in-Docker (DinD)”. To do so we introduce a new keyword called services. Services are other Docker containers that run in addition to the usual Docker container and that are linked to the container specified in the image keyword. In this particular case we would like to have a service container that is used for Docker-in-Docker applications and that is tagged with dind, e.g. 28-dind. In total, the default section looks like this:
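image: docker:28.0
services:
  - docker:28.0-dind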
With this setup we gain the ability to build Docker images, because the job container can talk to the Docker daemon running in the dind service container.
The Build Job¶
Now we will explain the build job in more detail. First of all, let us extract parts of the build job into a reusable hidden job called .base_job.
Here, we are doing three things:
- Select a GitLab Runner that is capable of building Docker images in a Docker container.
- Run three jobs in parallel with three different Python base image versions.
- Build the Docker image as part of the script section.
Select the GitLab Runner¶
Selecting GitLab Runners can be done per job with the tags keyword. The respective runner tag for the GitLab Runner that is capable of doing Docker-in-Docker builds is called docker. This particular runner is a privileged runner, which means it runs with specific permissions to be able to build Docker images.
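In the .base_job definition shown below, selecting this runner corresponds to the following lines:
tags:
  - docker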
Run three jobs in parallel¶
As we already know, running jobs in parallel from the same parameterized job description can be done with the parallel:matrix keyword and by specifying a variable called PYTHON_VERSION. For this example we chose three different Python versions, 3.11, 3.12 and 3.13, to be passed on to the build command to build three different images based on different Python base images.
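The corresponding part of the .base_job definition reads:
parallel:
  matrix:
    - PYTHON_VERSION: ["3.11", "3.12", "3.13"]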
Building the Docker image¶
Within the script section we are building the Docker image via the docker build command. The command looks like this:
docker build --network=host --build-arg="PYTHON_VERSION=${PYTHON_VERSION}" --tag $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION .
We will now take its pieces apart and explain their meaning:
--network=host: The --network option of the Docker command lets you specify the network driver for the container application. In this case we chose host instead of the default driver bridge. A bridge network driver lets containers communicate with each other on the same host, while a host network driver removes the network isolation between the container and the host and uses the host’s network. This is done for performance reasons.
--build-arg="PYTHON_VERSION=${PYTHON_VERSION}": Build arguments are variables inside the Dockerfile that can be changed via the --build-arg option of the build command. We are passing the variable PYTHON_VERSION with specific values to the build command and the Dockerfile, respectively. This then replaces the default placeholder value in the Dockerfile given by the ARG keyword.
--tag $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION: With the --tag option we give the resulting Docker image a name and a tag. For the name we use an environment variable called $CI_REGISTRY_IMAGE, which is formatted as <host>[:<port>]/<project_full_path>, or more specifically: hcr.helmholtz.cloud/<project_full_path>. Additionally, we specify the name of the container image repository as part of your Codebase GitLab project, which is astronaut-python. By doing so we generally adhere to the naming convention for container images (an example of the resulting image reference follows after these explanations).
.: The Dockerfile is located in the root directory of the project, which is why we write a . as the last argument of the build command to reference the current directory.
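For example, for PYTHON_VERSION set to 3.12 the resulting image reference would look like this, with <project_full_path> standing for the actual path of your GitLab project:
hcr.helmholtz.cloud/<project_full_path>/astronaut-python:3.12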
Extending the build Job¶
Now, we can reuse the hidden job .base_job in the build job with the extends keyword and also specify the stage to run the job in as well as the rule when it should run, in this case on each commit that is not pushed to the default branch, main: if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH.
The Job Definition¶
.base_job:
parallel:
matrix:
- PYTHON_VERSION: ["3.11", "3.12", "3.13"]
tags:
- docker
script:
- docker build --network=host --build-arg="PYTHON_VERSION=${PYTHON_VERSION}" --tag $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION .
build:
extends:
- .base_job
stage: test
rules:
- if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH
The Build-and-Push Job¶
The build_and_push job performs nearly the same tasks as before, but now it triggers when a commit is pushed to the default branch, main, as determined by the rule if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH. Besides running in a later stage called deploy, the primary differences lie in the additional before_script and after_script sections.
Steps to Push Images to the Container Image Registry¶
- Log into the Container Image Registry (before_script)
- Build the Container Image (script)
- Push the Container Image to the Container Image Registry (after_script)
Log into the Container Image Registry¶
Logging into the Container Image Registry is done with the docker login command:
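docker login --username $CI_REGISTRY_USER --password $CI_REGISTRY_PASSWORD $CI_REGISTRY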
Again, we can use predefined variables such as $CI_REGISTRY_USER, $CI_REGISTRY_PASSWORD, and $CI_REGISTRY. No user-specific credentials are needed here.
Do not commit secrets into your repository
Caution: You must not commit your credentials or access tokens to the repository!
The last variable, $CI_REGISTRY, contains the domain of the container registry, which in the case of the Helmholtz Codebase is hcr.helmholtz.cloud.
Build the Container Image¶
Building the image is part of the hidden job .base_job that we can reuse here again.
Push the Container Image to the Container Image Registry¶
Pushing the container image to the Container Image Registry is done with the docker push command and by choosing which image to push:
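docker push $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION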
The Full Pipeline Definition¶
Putting all this together, we get the full pipeline definition in .gitlab-ci.yml:
image: docker:28.0
services:
- docker:28.0-dind
.base_job:
parallel:
matrix:
- PYTHON_VERSION: ["3.11", "3.12", "3.13"]
tags:
- docker
script:
- docker build --network=host --build-arg="PYTHON_VERSION=${PYTHON_VERSION}" --tag $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION .
build:
extends:
- .base_job
stage: test
rules:
- if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH
build_and_push:
extends:
- .base_job
stage: deploy
before_script:
- docker login --username $CI_REGISTRY_USER --password $CI_REGISTRY_PASSWORD $CI_REGISTRY
after_script:
- docker push $CI_REGISTRY_IMAGE/astronaut-python:$PYTHON_VERSION
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
Take Home Messages
In this digression on containers we learned how to build
container images with Docker-in-Docker using GitLab CI.
We also learned how to push the resulting image to the
Helmholtz Codebase container image registry.
All this is done in separate CI jobs of your CI pipeline.
The most important keywords are services, which creates additional containers linked to the job container so that container images can be built inside other containers with Docker-in-Docker, and tags, which selects the particular privileged runners with the elevated permissions required for these Docker-in-Docker builds.
Next Episodes¶
In the final episode of this workshop, we will discuss removing duplications and reusing parts of the CI pipeline.