Creating a Container

When you start working with containers, you will soon notice that existing images do not always satisfy your needs. In these situations, you can create your own custom image.

Images are defined by a text file called Dockerfile. A Dockerfile contains the instructions that tell Docker / Podman how to create a custom image, which then serves as the basis for containers.

Let's build and run our first image

We start by creating a text file called Dockerfile in the folder ~/using-containers-in-science/.

cd ~
mkdir using-containers-in-science
cd using-containers-in-science
nano Dockerfile

Now, we add the content below into the Dockerfile:

FROM python:3.12
LABEL maintainer="support@hifis.net"

RUN pip install --upgrade pip
RUN pip install ipython numpy

ENTRYPOINT ["ipython"]

After that, we can save the file and leave the editor (in the case of nano: Ctrl+O, then Ctrl+X). Congratulations, it is that simple. The image can now be built using the podman build command as shown below.

Run the build from the folder containing the Dockerfile. The trailing dot sets the current folder as the build context, in which Podman looks for the Dockerfile by default. The -t option specifies the name of the image to be built.

podman build -t my-ipython-image .
Output
STEP 1/5: FROM python:3.12
Resolved "python" as an alias (/home/christianhueser/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/library/python:3.12...
Getting image source signatures
Copying blob 63941d09e532 skipped: already exists  
Copying blob 567db630df8d skipped: already exists  
Copying blob 5f899db30843 skipped: already exists  
Copying blob d68cd2123173 skipped: already exists  
Copying blob 3cb8f9c23302 skipped: already exists  
Copying blob 097431623722 skipped: already exists  
Copying blob 09527fa4de8d skipped: already exists  
Copying blob 71215d55680c skipped: already exists  
Copying config 6cbe1053f2 done  
Writing manifest to image destination
Storing signatures
STEP 2/5: LABEL maintainer="support@hifis.net"
--> 073707692c4
STEP 3/5: RUN pip install --upgrade pip
Requirement already satisfied: pip in /usr/local/lib/python3.12/site-packages (24.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 910ba09337d
STEP 4/5: RUN pip install ipython numpy
Collecting ipython
  Downloading ipython-8.23.0-py3-none-any.whl.metadata (4.9 kB)
Collecting numpy
  Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 636.2 kB/s eta 0:00:00
Collecting decorator (from ipython)
  Downloading decorator-5.1.1-py3-none-any.whl.metadata (4.0 kB)
Collecting jedi>=0.16 (from ipython)
  Downloading jedi-0.19.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting matplotlib-inline (from ipython)
  Downloading matplotlib_inline-0.1.6-py3-none-any.whl.metadata (2.8 kB)
Collecting prompt-toolkit<3.1.0,>=3.0.41 (from ipython)
  Downloading prompt_toolkit-3.0.43-py3-none-any.whl.metadata (6.5 kB)
Collecting pygments>=2.4.0 (from ipython)
  Downloading pygments-2.17.2-py3-none-any.whl.metadata (2.6 kB)
Collecting stack-data (from ipython)
  Downloading stack_data-0.6.3-py3-none-any.whl.metadata (18 kB)
Collecting traitlets>=5.13.0 (from ipython)
  Downloading traitlets-5.14.2-py3-none-any.whl.metadata (10 kB)
Collecting pexpect>4.3 (from ipython)
  Downloading pexpect-4.9.0-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython)
  Downloading parso-0.8.3-py2.py3-none-any.whl.metadata (7.5 kB)
Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython)
  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting wcwidth (from prompt-toolkit<3.1.0,>=3.0.41->ipython)
  Downloading wcwidth-0.2.13-py2.py3-none-any.whl.metadata (14 kB)
Collecting executing>=1.2.0 (from stack-data->ipython)
  Downloading executing-2.0.1-py2.py3-none-any.whl.metadata (9.0 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython)
  Downloading asttokens-2.4.1-py2.py3-none-any.whl.metadata (5.2 kB)
Collecting pure-eval (from stack-data->ipython)
  Downloading pure_eval-0.2.2-py3-none-any.whl.metadata (6.2 kB)
Collecting six>=1.12.0 (from asttokens>=2.1.0->stack-data->ipython)
  Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading ipython-8.23.0-py3-none-any.whl (814 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 814.2/814.2 kB 1.2 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 1.6 MB/s eta 0:00:00
Downloading jedi-0.19.1-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 3.5 MB/s eta 0:00:00
Downloading pexpect-4.9.0-py2.py3-none-any.whl (63 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 9.3 MB/s eta 0:00:00
Downloading prompt_toolkit-3.0.43-py3-none-any.whl (386 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.1/386.1 kB 3.7 MB/s eta 0:00:00
Downloading pygments-2.17.2-py3-none-any.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 2.4 MB/s eta 0:00:00
Downloading traitlets-5.14.2-py3-none-any.whl (85 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 2.1 MB/s eta 0:00:00
Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Downloading matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)
Downloading stack_data-0.6.3-py3-none-any.whl (24 kB)
Downloading asttokens-2.4.1-py2.py3-none-any.whl (27 kB)
Downloading executing-2.0.1-py2.py3-none-any.whl (24 kB)
Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.8/100.8 kB 10.0 MB/s eta 0:00:00
Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Downloading wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: wcwidth, pure-eval, ptyprocess, traitlets, six, pygments, prompt-toolkit, pexpect, parso, numpy, executing, decorator, matplotlib-inline, jedi, asttokens, stack-data, ipython
Successfully installed asttokens-2.4.1 decorator-5.1.1 executing-2.0.1 ipython-8.23.0 jedi-0.19.1 matplotlib-inline-0.1.6 numpy-1.26.4 parso-0.8.3 pexpect-4.9.0 prompt-toolkit-3.0.43 ptyprocess-0.7.0 pure-eval-0.2.2 pygments-2.17.2 six-1.16.0 stack-data-0.6.3 traitlets-5.14.2 wcwidth-0.2.13
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 64348d5728e
STEP 5/5: ENTRYPOINT ["ipython"]
COMMIT my-ipython-image
--> 0b68cd1cf29
Successfully tagged localhost/my-ipython-image:latest
0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b

Let's try out the newly created image by running it.

podman run --rm -it my-ipython-image

Output

Python 3.12.2 (main, Mar 12 2024, 11:02:14) [GCC 12.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.23.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:

We end up in an IPython shell that behaves just like one installed in the usual manner. Once we exit the shell, the container also stops running. Let's see how this works by disassembling the Dockerfile.

Disassembling the Dockerfile

The Dockerfile used above contains four different types of instructions:

  • FROM <image>
      • Sets the base image for the instructions below.
      • Each valid Dockerfile must start with a FROM instruction (only comments, parser directives and ARG instructions may precede it).
      • The image can be any valid image, e.g. from public registries.
      • Please note: Choose a trusted base image for your images. We'll cover that topic in more detail in lesson 6 of this course.
  • LABEL <key>=<value> <key>=<value> <key>=<value> ...
      • The LABEL instruction adds metadata to the image.
      • A LABEL is a key-value pair.
      • This is typically used to provide information about e.g. the maintainer of an image.
  • RUN <command>
      • The RUN instruction executes a command on top of the current image. (We will cover this in a minute.)
      • The resulting image will be used as the base for the next step in the Dockerfile.
  • ENTRYPOINT ["executable", "param1", "param2"]
      • An ENTRYPOINT allows you to configure a container that runs as an executable.
      • Command line arguments to podman run <image> will be appended after all elements of the exec form ENTRYPOINT.

Example

podman run --rm -it my-ipython-image --version

This prints the version number of IPython. It is equivalent to executing ipython --version locally.

8.23.0
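
To make the appending rule more tangible, here is a minimal Python sketch of how the start command of a container is assembled. It is only a model for illustration, not Podman's actual code. The function name effective_command is made up, and the cmd parameter models the CMD instruction, which is not used in this lesson's Dockerfile: CMD supplies default arguments that are replaced as soon as arguments are given on the command line.

```python
def effective_command(entrypoint, run_args=(), cmd=()):
    """Return the argument list a container would be started with.

    entrypoint: exec form ENTRYPOINT from the Dockerfile, e.g. ["ipython"]
    run_args:   arguments given after the image name on `podman run`
    cmd:        optional exec form CMD defaults (assumption: not part of
                this lesson's Dockerfile, shown only for completeness)
    """
    # Command line arguments replace the CMD defaults; without them CMD is used.
    extra = list(run_args) if run_args else list(cmd)
    # The ENTRYPOINT elements always come first, the extra arguments follow.
    return list(entrypoint) + extra


print(effective_command(["ipython"]))                 # ['ipython']
print(effective_command(["ipython"], ["--version"]))  # ['ipython', '--version']
```

The second call corresponds to podman run --rm -it my-ipython-image --version from the example above.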

Let's build the image again and see what happens.

podman build -t my-ipython-image .

Output

STEP 1/5: FROM python:3.12
STEP 2/5: LABEL maintainer="support@hifis.net"
--> Using cache 073707692c48409d01057b24ee89ec07bbcb08c530948e1e972389b525d1dbf9
--> 073707692c4
STEP 3/5: RUN pip install --upgrade pip
--> Using cache 910ba09337d65dd4537dd02d9049d5265466ae44a55f596a7a32bcf41d18da5c
--> 910ba09337d
STEP 4/5: RUN pip install ipython numpy
--> Using cache 64348d5728e9c55850dc6aa7311757e2b34ab5441e544ebfae011f8bd6240028
--> 64348d5728e
STEP 5/5: ENTRYPOINT ["ipython"]
--> Using cache 0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b
COMMIT my-ipython-image
--> 0b68cd1cf29
Successfully tagged localhost/my-ipython-image:latest
0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b

This time, the output is much shorter than in our initial run of the podman build command, and each step after FROM reports that it used the cache. As each instruction is executed, Podman looks in its cache for an existing image that was created from the same parent image with the same instruction. If there is such an image, Podman re-uses it instead of creating a duplicate. If you do not want Podman to use its cache, pass the --no-cache option to the podman build command.
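
The caching behaviour can be illustrated with a small Python toy model. This is a deliberately simplified sketch and not how Podman is implemented: the build function and the shortened identifiers are invented for this example, and a real build takes more into account, for example the contents of files copied into the image. The model assumes that a cached result is identified by its parent image together with the instruction text.

```python
import hashlib


def build(instructions, cache):
    """Simulate a build and return (instruction, layer_id, cache_hit) per step."""
    steps = []
    parent_id = ""  # the first instruction has no parent
    for instruction in instructions:
        key = (parent_id, instruction)
        hit = key in cache
        if not hit:
            # "Execute" the instruction: derive a new id from parent and instruction.
            digest = hashlib.sha256(f"{parent_id}\n{instruction}".encode()).hexdigest()
            cache[key] = digest[:11]
        layer_id = cache[key]
        steps.append((instruction, layer_id, hit))
        parent_id = layer_id  # the result is the base for the next step
    return steps


dockerfile = [
    'FROM python:3.12',
    'LABEL maintainer="support@hifis.net"',
    'RUN pip install --upgrade pip',
    'RUN pip install ipython numpy',
    'ENTRYPOINT ["ipython"]',
]

cache = {}
first = build(dockerfile, cache)
second = build(dockerfile, cache)
print([hit for _, _, hit in first])   # [False, False, False, False, False]
print([hit for _, _, hit in second])  # [True, True, True, True, True]

# Changing one instruction invalidates that step and every step after it.
changed = dockerfile[:3] + ['RUN pip install ipython numpy scipy'] + dockerfile[4:]
third = build(changed, cache)
print([hit for _, _, hit in third])   # [True, True, True, False, False]
```

The last run shows why the order of instructions matters: steps before the changed instruction are still taken from the cache, while everything from the change onwards is rebuilt.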

Task: Create and Run a Data Science Image

Task Description

Your goal in this exercise is to create your own custom data science image as follows:

  1. Build your image on top of the latest Python image of release series 3.12.
  2. Mark yourself as the maintainer of the image.
  3. Install numpy, scipy, pandas, scikit-learn and jupyterlab using pip install.
  4. Create a custom user using the command useradd -ms /bin/bash jupyter.
  5. Tell the image to automatically start as the jupyter user and to use the working directory /home/jupyter.
  6. Make sure the image starts with the command jupyter lab --ip=0.0.0.0 by default.

Hint: Use the instructions USER and WORKDIR for step 5.

Once you have built the image, make sure to test it by running it and opening JupyterLab in your browser. You should now be able to execute any command in a notebook, e.g.

import numpy as np
np.__config__.show()
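
To check all required packages at once, a notebook cell along the following lines may help. It is only a sketch: the helper name check_packages is made up for this example, and it assumes the usual import names of the packages (scikit-learn is imported as sklearn).

```python
import importlib


def check_packages(names):
    """Map each module name to its version string, or None if it is missing."""
    result = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            result[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            result[name] = getattr(module, "__version__", "unknown")
    return result


if __name__ == "__main__":
    for name, version in check_packages(
        ["numpy", "scipy", "pandas", "sklearn", "jupyterlab"]
    ).items():
        print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED is missing from the pip install line of the Dockerfile.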
Solution
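  • Create a Dockerfile with the content below.
FROM python:3.12
# Mark yourself as the maintainer (replace the placeholder with your own address)
LABEL maintainer="<your-email-address>"

RUN pip install ipython jupyterlab numpy scipy pandas scikit-learn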

# Create a custom user under which the application runs
RUN useradd -ms /bin/bash jupyter

# Use this user by default for all subsequent operations
USER jupyter
# Default to start the container in the home directory of the jupyter user
WORKDIR /home/jupyter

# Document that the application listens on port 8888.
# EXPOSE does not publish the port; that is done with -p when running the container.
EXPOSE 8888

ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0"]
  • Build the image.
podman build -t my-datascience-image .
  • Run the image and bind port 8888.
podman run -p 8888:8888 -it --rm my-datascience-image

This yields output like the following (details may vary).

Output
[I 2024-04-01 14:19:55.406 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-04-01 14:19:55.409 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-04-01 14:19:55.413 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-04-01 14:19:55.414 ServerApp] Writing Jupyter server cookie secret to /home/jupyter/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2024-04-01 14:19:55.663 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-04-01 14:19:55.678 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-04-01 14:19:55.680 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-04-01 14:19:55.681 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-04-01 14:19:55.682 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.12/site-packages/jupyterlab
[I 2024-04-01 14:19:55.682 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2024-04-01 14:19:55.682 LabApp] Extension Manager is 'pypi'.
[I 2024-04-01 14:19:55.709 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-04-01 14:19:55.710 ServerApp] Serving notebooks from local directory: /home/jupyter
[I 2024-04-01 14:19:55.710 ServerApp] Jupyter Server 2.13.0 is running at:
[I 2024-04-01 14:19:55.710 ServerApp] http://b9d43e460483:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.710 ServerApp]     http://127.0.0.1:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.710 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2024-04-01 14:19:55.717 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2024-04-01 14:19:55.717 ServerApp] 

    To access the server, open this file in a browser:
        file:///home/jupyter/.local/share/jupyter/runtime/jpserver-1-open.html
    Or copy and paste one of these URLs:
        http://b9d43e460483:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
        http://127.0.0.1:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.737 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server
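
If the browser cannot reach JupyterLab, it can help to check from the host whether the port published with -p 8888:8888 accepts connections at all. The following Python sketch does this with a plain TCP connection attempt. The function name port_is_open is made up for this example, and the host and port are taken from the run command above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print(port_is_open("127.0.0.1", 8888))
```

A result of True only shows that something is listening on the port. To actually log in, use the URL including the token from the server output.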