Extending the Pipeline¶
Gems and Jewels to Collect¶
At the end of this episode you will have a CI pipeline that covers a few common CI use cases which you can also apply to the CI pipelines of your own projects. Additional GitLab CI keywords will be explained, such as:
- conditional execution of CI jobs with `rules`,
- creating, storing and accessing artifacts with `artifacts`,
- reusing artifacts created in previous CI jobs with `dependencies`.
Introduction¶
In this episode you will extend the CI pipeline we developed in the last episode and cover the following CI use cases that we introduced previously:
- Checking the license compliance,
- checking the code style of the project,
- testing against multiple Python versions.
We also dive deeper into the keyword `stages` and introduce new keywords
like `rules`, `artifacts` and `dependencies`, along with a list of selected
predefined GitLab CI variables.
Additional CI Use Cases to Extend the CI Pipeline¶
Before we turn to optimizing the CI pipeline, we add a few very common CI use cases that are still missing from it.
Checking the License Compliance¶
We will develop a CI job that checks that all files contain license and
copyright information and that all license texts of the licenses used are
contained in the project.
First, we need to tell GitLab CI to run the CI job in a particular stage, for
example `lint`, which you need to declare at the beginning of your YAML file.
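Such a declaration might look like the following minimal sketch. It assumes that `lint` is the only stage declared so far; any stages you kept from the previous episode would be listed alongside it.

```yaml
# Stages are declared once at the top level of .gitlab-ci.yml.
stages:
  - lint
```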
To check the license compliance we use the CLI tool Reuse; the corresponding
command is `reuse lint`.
Since we are working with Python’s virtual environments, we need to prefix
the command with `poetry run` so that `reuse` is executed in that virtual
environment.
Now we are ready to write down the corresponding CI job.
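Reduced to its essentials, such a job might be sketched as follows. The job name `my-ci-job` is only a placeholder, and the sketch assumes that Poetry and the project dependencies are already installed in the job environment.

```yaml
# Minimal sketch of a lint job; the job name is a placeholder.
my-ci-job:
  stage: lint
  script:
    - poetry run reuse lint
```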
In our final `.gitlab-ci.yml` file the complete job may look like this:

    license-compliance:
      image: python:3.11
      stage: lint
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run reuse lint

Checking the Code Style of the Project¶
Code style checking (or linting) should also always be part of your coding
projects and can be done automatically in CI pipelines.
Black and Isort are recommended tools for this in the Python universe.
The respective commands are `black --check --diff .` and
`isort --check --diff .`.
The first approach would be to copy and paste the previous lint job and
exchange the commands in the `script` keyword:

    my-ci-job:
      stage: lint
      script:
        - poetry run black --check --diff .
        - poetry run isort --check --diff .

Our second lint job can then be added to the CI pipeline:

    codestyle:
      image: python:3.11
      stage: lint
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run black --check --diff .
        - poetry run isort --check --diff .

As you can see, our copy-and-paste approach introduces quite a bit of duplication. We will adapt the CI pipeline and reduce this duplication in later episodes.
Testing Against Multiple Python Versions¶
Testing is the most important task that needs to be automated in CI pipelines.
Your test suite helps to ensure that you do not break anything when you push
your changes to the repository.
This safety net is essential for coding projects to reduce the risk of having
defects in your code.
Pytest is a unit-test framework for Python projects.
You may execute your test suite with the command `pytest tests/`.
On top of that, you can create several CI jobs, each testing your application
with a different version of the Python interpreter.
But first, we need an additional stage called `test` to run the test suite.
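The extended stage declaration might then look like this sketch, which assumes that the `lint` stage introduced earlier is the only other stage declared so far. Stages run in the order in which they are listed.

```yaml
# The new stage is appended after the existing one.
stages:
  - lint
  - test
```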
Now, you can duplicate a previous job, assign the jobs to stage `test` and
adapt the `image` keyword accordingly:

    my-ci-job-1:
      image: python:3.10
      stage: test
      script:
        - poetry run pytest tests/

    my-ci-job-2:
      image: python:3.11
      stage: test
      script:
        - poetry run pytest tests/

    my-ci-job-3:
      image: python:3.12
      stage: test
      script:
        - poetry run pytest tests/

The full jobs in all detail look like this in our example:

    test:python:3.10:
      image: python:3.10
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/

    test:python:3.11:
      image: python:3.11
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/

    test:python:3.12:
      image: python:3.12
      stage: test
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run pytest tests/

Again, this introduces quite a bit of repetition, which we will tackle in follow-up episodes.
Additional Concepts and GitLab CI Keywords¶
In this section we would like to discuss more concepts and keywords that you may want to use in your projects.
More About Stages and Jobs¶
Now that we have created our first complete CI pipeline covering all of our CI
use cases, let us inspect the three stages and six CI jobs we defined.
We observed that the stages are executed in sequence, i.e. jobs of a later
stage run only if the previous stage completed successfully.
The testing jobs in the `test` stage run in parallel, though.
This is possible because all jobs in stage `test` are independent of each other.
We recommend running jobs in parallel within a stage if this independence
criterion holds, because parallelization can speed up the pipeline
significantly.
In later episodes we will learn how to change this default behaviour and the
running order of CI jobs with the `needs` keyword.
Also, we will further speed up the CI pipeline with some additional concepts.
Predefined Variables in GitLab CI¶
Predefined variables are variables that GitLab CI sets automatically and that carry useful values, for example about the current commit, branch or project. You can use them in your CI jobs without defining them yourself.
Predefined Variables Reference¶
This is a compilation of a few selected CI variables:

Variable Name | Description |
---|---|
`CI_COMMIT_BRANCH` | The commit branch name. Available in branch pipelines, including pipelines for the default branch. Not available in merge request pipelines or tag pipelines. |
`CI_COMMIT_REF_NAME` | The branch or tag name for which the project is built. |
`CI_COMMIT_REF_SLUG` | `CI_COMMIT_REF_NAME` in lowercase, shortened to 63 bytes, and with everything except `0-9` and `a-z` replaced with `-`. No leading / trailing `-`. Use in URLs, host names and domain names. |
`CI_COMMIT_SHA` | The commit revision the project is built for. |
`CI_COMMIT_TAG` | The commit tag name. Available only in pipelines for tags. |
`CI_DEFAULT_BRANCH` | The name of the project’s default branch. |
`CI_DEPLOY_PASSWORD` | The authentication password of the GitLab Deploy Token, if the project has one. |
`CI_DEPLOY_USER` | The authentication username of the GitLab Deploy Token, if the project has one. |
`CI_JOB_TOKEN` | A token to authenticate with certain API endpoints. The token is valid as long as the job is running. |
`CI_PROJECT_DIR` | The full path the repository is cloned to, and where the job runs from. |
`CI_REGISTRY_IMAGE` | The address of the project’s Container Registry. Only available if the Container Registry is enabled for the project. |
`CI_REGISTRY_PASSWORD` | The password to push containers to the project’s GitLab Container Registry. Only available if the Container Registry is enabled for the project. This password value is the same as the `CI_JOB_TOKEN` and is valid only as long as the job is running. Use the `CI_DEPLOY_PASSWORD` for long-lived access to the registry. |
`CI_REGISTRY_USER` | The username to push containers to the project’s GitLab Container Registry. Only available if the Container Registry is enabled for the project. |
`CI_REGISTRY` | The address of the GitLab Container Registry. Only available if the Container Registry is enabled for the project. This variable includes a `:port` value if one is specified in the registry configuration. |
`CI_REPOSITORY_URL` | The URL to clone the Git repository. |

Predefined Variables for Merge Request Pipelines¶
In addition, this is a compilation of a few selected CI variables that are present in merge request pipelines only:

Variable Name | Description |
---|---|
`CI_MERGE_REQUEST_SOURCE_BRANCH_NAME` | The source branch name of the merge request. |
`CI_MERGE_REQUEST_SOURCE_BRANCH_SHA` | The HEAD SHA of the source branch of the merge request. The variable is empty in merge request pipelines. The SHA is present only in merged results pipelines. |
`CI_MERGE_REQUEST_TARGET_BRANCH_NAME` | The target branch name of the merge request. |
`CI_MERGE_REQUEST_TARGET_BRANCH_SHA` | The HEAD SHA of the target branch of the merge request. The variable is empty in merge request pipelines. The SHA is present only in merged results pipelines. |

Example¶
To show how these predefined variables can be used inside your CI pipeline, the following example simply prints the values of two predefined CI variables that we need in the next section of this episode:

    stages:
      - echo

    echo-job:
      stage: echo
      script:
        - echo "CI_COMMIT_BRANCH = '$CI_COMMIT_BRANCH'"
        - echo "CI_DEFAULT_BRANCH = '$CI_DEFAULT_BRANCH'"

This is the output appearing in the CI job log of job `echo-job`:

    [...]
    $ echo "CI_COMMIT_BRANCH = '$CI_COMMIT_BRANCH'"
    CI_COMMIT_BRANCH = 'main'
    $ echo "CI_DEFAULT_BRANCH = '$CI_DEFAULT_BRANCH'"
    CI_DEFAULT_BRANCH = 'main'
    [...]

Conditional Execution of CI Jobs With `rules`¶
You may not need to execute a CI job in every pipeline run, but only in
pipelines that fulfil certain conditions.
The `rules` keyword is useful when it comes to executing CI jobs conditionally.
The keyword is quite powerful but, in our opinion, also a bit harder to
understand.
Here we introduce the most common rule, i.e. execute a job only if the pipeline
has been triggered by a merge into branch `main`.
Taking the `running` job of our pipeline as an example, this looks as follows:

    running:
      image: python:3.11
      stage: run
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run python -m astronaut_analysis
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

As a consequence, the `running` job, which creates a new set of plots, is only
executed if the branch we commit into during a merge is the default branch,
i.e. branch `main` in our case.
Variable `$CI_COMMIT_BRANCH` holds the name of the branch the pipeline runs
for, i.e. the branch we commit into during a merge.
Variable `$CI_DEFAULT_BRANCH` holds the name of the default branch of this
project, i.e. `main`.
Running this job only conditionally is reasonable because we only want to
generate plots originating from the default branch `main`.
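A job may list several rules. GitLab evaluates them in order and adds the job to the pipeline as soon as one rule matches; if none matches, the job is not added to the pipeline. The following sketch would run on the default branch and additionally in pipelines for tags, using the `CI_COMMIT_TAG` variable from the table of predefined variables. The job name `conditional-job` and its command are hypothetical and only serve as illustration.

```yaml
# Hypothetical job illustrating several rules; the job name and the
# command are placeholders.
conditional-job:
  image: python:3.11
  stage: run
  script:
    - echo "Running for '$CI_COMMIT_REF_NAME'"
  rules:
    # First rule: pipelines for the default branch.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    # Second rule: pipelines for tags (CI_COMMIT_TAG is only set there).
    - if: $CI_COMMIT_TAG
```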
Create, Store and Access Artifacts With `artifacts`¶
You might have asked yourself whether we can access the files generated
during a CI job.
Fortunately, this is possible with the `artifacts` keyword.
We need to specify the artifacts to be retained from a CI job as a list of
files and directories like this:

    running:
      image: python:3.11
      stage: run
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run python -m astronaut_analysis
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      artifacts:
        paths:
          - results/

After the job has completed, the plots are stored as job artifacts for a period of 30 days. So-called latest artifacts are not deleted until newer artifacts arrive. You can access the artifacts and, for example, download them by navigating to the CI job log of your CI job and clicking Download in the job artifacts section of the right sidebar.
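If the retention period does not fit your needs, the `artifacts` keyword accepts further options. The following fragment is a sketch of how the `artifacts` section of a job such as `running` could be extended: `expire_in` sets the retention period and `name` sets the name of the downloadable archive. The values `1 week` and `results-$CI_COMMIT_REF_SLUG` are arbitrary examples, not settings of our project.

```yaml
# Fragment that belongs inside a job definition such as `running`;
# the values are illustrative assumptions.
artifacts:
  # Archive name built from a predefined variable.
  name: "results-$CI_COMMIT_REF_SLUG"
  # Delete the artifacts one week after the job has finished.
  expire_in: 1 week
  paths:
    - results/
```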
Reuse Artifacts Created in Previous CI Jobs With `dependencies`¶
If we have generated some artifacts in a previous CI job, do we need to
re-generate them in a later CI job that needs them?
No, it is possible to pass artifacts from one job on to a later CI job.
The respective keyword is the `dependencies` keyword.
With it you can tell a CI job to fetch the job artifacts of a previous CI job,
which is referenced by its job name:

    stages:
      - run
      - deploy

    running:
      image: python:3.11
      stage: run
      before_script:
        - pip install --upgrade pip
        - pip install poetry
        - poetry install
      script:
        - poetry run python -m astronaut_analysis
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      artifacts:
        paths:
          - results/

    pages:
      stage: deploy
      script:
        - mkdir public/
        - cp results/age_histogram.png public/age_histogram.png
      rules:
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      artifacts:
        paths:
          - public/
      # dependencies lists job names (here: job running), not stage names
      dependencies:
        - running

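By default, a job fetches the artifacts of all jobs in earlier stages; `dependencies` narrows this down to the listed jobs. As a sketch of the opposite case, an empty list tells GitLab not to download any artifacts at all, which can save time for jobs that do not need them. The job `notify` and its command are hypothetical and only serve as illustration.

```yaml
# Hypothetical job that does not need any artifacts from earlier stages.
notify:
  stage: deploy
  script:
    - echo "Pipeline for commit $CI_COMMIT_SHA finished"
  # Empty list: skip downloading artifacts from previous jobs.
  dependencies: []
```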
Note
The special `pages` CI job, which runs on changes on branch `main`, needs some
explanation.
In GitLab you can host internal static web pages containing files such as
HTML, JavaScript or CSS files.
There is a special CI job called `pages` that deploys your static web page to
GitLab.
During a pipeline run you need to copy your generated page into the `public`
folder and name this folder in the `artifacts` section of the CI job `pages`.
The `pages` job will then take all contained files and host them as a static
web page, if this feature is activated in the settings of your GitLab project.
To activate GitLab Pages, navigate to
Settings > General > Visibility, Project Features, Permissions
and enable the Pages feature.
After the first pipeline run you can find the URL of your static web page
in the settings of the project: Settings > Pages.
All logged-in GitLab users can then access these Pages.
It is also possible to make these Pages private and accessible by project
members only.
Exercise
Exercise 1: Create a Complete CI Pipeline for the Exercise Project¶
By now we have introduced some keywords and concepts that are useful in covering all CI use cases discussed so far. In the following exercise you should try to develop a CI pipeline for the exercise project which includes all CI use cases from the previous exercise. These were:
- Check the license compliance.
- Lint the source code.
- Build the executable.
- Run the existing test cases.
- Run the executable.
The pipeline might contain jobs like `licence_compliance`, `lint`, `build`,
`test` and `run`.
To get you started, these are the relevant commands for the `script` section
of the CI jobs:
- License compliance can be checked by the aforementioned `reuse` tool: `reuse lint`
- Linting can be done by a tool called cpplint: `cpplint --recursive src/ tests/`
- The build of the application is done with CMake: `cmake -S . -B build` and `cmake --build build`
- The test suite can be run by GoogleTest: `cd build && ctest`
- Finally, we want to run the application on the command line without any arguments: `./build/bin/helloWorld`
Take Home Messages
In this episode we explored some additional common CI use cases like linting
and testing, introduced new GitLab CI keywords like `rules`, `artifacts` and
`dependencies`, and listed a few predefined GitLab CI variables.
Next Episodes¶
Next, we will take the CI pipeline we wrote so far and optimize and polish it a bit so that it is easier to read, much easier to maintain and runs more efficiently and faster.