Creating a Container
When starting to work with containers you will soon notice that existing images may not always satisfy your needs. In these situations you want to create your own custom image.
Images are defined by a text file called Dockerfile.
A Dockerfile contains the instructions that tell Docker or Podman how to build a custom image, which then serves as the basis for containers.
Let's build and run our first image
We start by creating a text file called Dockerfile in the folder ~/using-containers-in-science/.
cd ~
mkdir using-containers-in-science
cd using-containers-in-science
nano Dockerfile
Now, we add the content below into the Dockerfile:
FROM python:3.12
LABEL maintainer="support@hifis.net"
RUN pip install --upgrade pip
RUN pip install ipython numpy
ENTRYPOINT ["ipython"]
After that we can save and leave the editor (in the case of nano: Ctrl+O, then Ctrl+X).
Congratulations, it is that simple.
The image can be built using the podman build command as shown below.
The trailing . tells Podman to use the current folder as the build context, so run the command from the folder containing the Dockerfile.
The Dockerfile found there is implicitly used as the input for the build.
The -t option sets the name of the image to be built.
podman build -t my-ipython-image .
Output
STEP 1/5: FROM python:3.12
Resolved "python" as an alias (/home/christianhueser/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/library/python:3.12...
Getting image source signatures
Copying blob 63941d09e532 skipped: already exists
Copying blob 567db630df8d skipped: already exists
Copying blob 5f899db30843 skipped: already exists
Copying blob d68cd2123173 skipped: already exists
Copying blob 3cb8f9c23302 skipped: already exists
Copying blob 097431623722 skipped: already exists
Copying blob 09527fa4de8d skipped: already exists
Copying blob 71215d55680c skipped: already exists
Copying config 6cbe1053f2 done
Writing manifest to image destination
Storing signatures
STEP 2/5: LABEL maintainer="support@hifis.net"
--> 073707692c4
STEP 3/5: RUN pip install --upgrade pip
Requirement already satisfied: pip in /usr/local/lib/python3.12/site-packages (24.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 910ba09337d
STEP 4/5: RUN pip install ipython numpy
Collecting ipython
Downloading ipython-8.23.0-py3-none-any.whl.metadata (4.9 kB)
Collecting numpy
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 636.2 kB/s eta 0:00:00
Collecting decorator (from ipython)
Downloading decorator-5.1.1-py3-none-any.whl.metadata (4.0 kB)
Collecting jedi>=0.16 (from ipython)
Downloading jedi-0.19.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting matplotlib-inline (from ipython)
Downloading matplotlib_inline-0.1.6-py3-none-any.whl.metadata (2.8 kB)
Collecting prompt-toolkit<3.1.0,>=3.0.41 (from ipython)
Downloading prompt_toolkit-3.0.43-py3-none-any.whl.metadata (6.5 kB)
Collecting pygments>=2.4.0 (from ipython)
Downloading pygments-2.17.2-py3-none-any.whl.metadata (2.6 kB)
Collecting stack-data (from ipython)
Downloading stack_data-0.6.3-py3-none-any.whl.metadata (18 kB)
Collecting traitlets>=5.13.0 (from ipython)
Downloading traitlets-5.14.2-py3-none-any.whl.metadata (10 kB)
Collecting pexpect>4.3 (from ipython)
Downloading pexpect-4.9.0-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting parso<0.9.0,>=0.8.3 (from jedi>=0.16->ipython)
Downloading parso-0.8.3-py2.py3-none-any.whl.metadata (7.5 kB)
Collecting ptyprocess>=0.5 (from pexpect>4.3->ipython)
Downloading ptyprocess-0.7.0-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting wcwidth (from prompt-toolkit<3.1.0,>=3.0.41->ipython)
Downloading wcwidth-0.2.13-py2.py3-none-any.whl.metadata (14 kB)
Collecting executing>=1.2.0 (from stack-data->ipython)
Downloading executing-2.0.1-py2.py3-none-any.whl.metadata (9.0 kB)
Collecting asttokens>=2.1.0 (from stack-data->ipython)
Downloading asttokens-2.4.1-py2.py3-none-any.whl.metadata (5.2 kB)
Collecting pure-eval (from stack-data->ipython)
Downloading pure_eval-0.2.2-py3-none-any.whl.metadata (6.2 kB)
Collecting six>=1.12.0 (from asttokens>=2.1.0->stack-data->ipython)
Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading ipython-8.23.0-py3-none-any.whl (814 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 814.2/814.2 kB 1.2 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 1.6 MB/s eta 0:00:00
Downloading jedi-0.19.1-py2.py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 3.5 MB/s eta 0:00:00
Downloading pexpect-4.9.0-py2.py3-none-any.whl (63 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 9.3 MB/s eta 0:00:00
Downloading prompt_toolkit-3.0.43-py3-none-any.whl (386 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 386.1/386.1 kB 3.7 MB/s eta 0:00:00
Downloading pygments-2.17.2-py3-none-any.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 2.4 MB/s eta 0:00:00
Downloading traitlets-5.14.2-py3-none-any.whl (85 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 85.4/85.4 kB 2.1 MB/s eta 0:00:00
Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Downloading matplotlib_inline-0.1.6-py3-none-any.whl (9.4 kB)
Downloading stack_data-0.6.3-py3-none-any.whl (24 kB)
Downloading asttokens-2.4.1-py2.py3-none-any.whl (27 kB)
Downloading executing-2.0.1-py2.py3-none-any.whl (24 kB)
Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.8/100.8 kB 10.0 MB/s eta 0:00:00
Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Downloading pure_eval-0.2.2-py3-none-any.whl (11 kB)
Downloading wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: wcwidth, pure-eval, ptyprocess, traitlets, six, pygments, prompt-toolkit, pexpect, parso, numpy, executing, decorator, matplotlib-inline, jedi, asttokens, stack-data, ipython
Successfully installed asttokens-2.4.1 decorator-5.1.1 executing-2.0.1 ipython-8.23.0 jedi-0.19.1 matplotlib-inline-0.1.6 numpy-1.26.4 parso-0.8.3 pexpect-4.9.0 prompt-toolkit-3.0.43 ptyprocess-0.7.0 pure-eval-0.2.2 pygments-2.17.2 six-1.16.0 stack-data-0.6.3 traitlets-5.14.2 wcwidth-0.2.13
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
--> 64348d5728e
STEP 5/5: ENTRYPOINT ["ipython"]
COMMIT my-ipython-image
--> 0b68cd1cf29
Successfully tagged localhost/my-ipython-image:latest
0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b
Let's try out the newly created image by running it.
podman run --rm -it my-ipython-image
Output
Python 3.12.2 (main, Mar 12 2024, 11:02:14) [GCC 12.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.23.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]:
We end up in an IPython shell that behaves just like one installed locally in the usual manner.
Once we exit the shell, the container also stops running.
Let's see how this works by disassembling the Dockerfile.
Disassembling the Dockerfile
The Dockerfile used above contains four different types of instructions:
FROM <image>
- Sets the base image for the instructions below.
- Each valid Dockerfile must start with a FROM instruction (only comments, parser directives, and ARG instructions may precede it).
- The image can be any valid image, e.g. from public registries.
> Please note: Choose a trusted base image for your images.
> We'll cover that topic in more detail in lesson 6 of this course.
LABEL <key>=<value> <key>=<value> <key>=<value> ...
- The LABEL instruction adds metadata to the image.
- A LABEL is a key-value pair.
- This is typically used to provide information about e.g. the maintainer of an image.
RUN <command>
- The RUN instruction executes any command on top of the current image. (We will cover this in a minute.)
- The resulting image will be used as the base for the next step in the Dockerfile.
ENTRYPOINT ["executable", "param1", "param2"]
- An ENTRYPOINT allows you to configure a container that runs as an executable.
- Command line arguments to podman run <image> will be appended after all elements in the exec form ENTRYPOINT.
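To make the structure of the file more tangible, here is a minimal Python sketch that lists the instructions of our Dockerfile. It is only an illustration and not how Podman parses the file: the helper parse_instructions is hypothetical, and it assumes one instruction per line without line continuations.

```python
# Minimal sketch: list the instructions of a Dockerfile.
# Simplifying assumption: one instruction per line, no line continuations.

DOCKERFILE = """\
FROM python:3.12
LABEL maintainer="support@hifis.net"
RUN pip install --upgrade pip
RUN pip install ipython numpy
ENTRYPOINT ["ipython"]
"""


def parse_instructions(text):
    """Return a list of (INSTRUCTION, arguments) tuples (hypothetical helper)."""
    instructions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        keyword, _, arguments = line.partition(" ")
        instructions.append((keyword.upper(), arguments.strip()))
    return instructions


instructions = parse_instructions(DOCKERFILE)
for keyword, arguments in instructions:
    print(f"{keyword:<10} {arguments}")

# The distinct instruction types, in order of first appearance.
kinds = list(dict.fromkeys(keyword for keyword, _ in instructions))
print(kinds)
```

The last line prints the four instruction types discussed above, with FROM coming first.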
Example
podman run --rm -it my-ipython-image --version
This gives us the version number of IPython, which is equivalent to executing ipython --version locally.
8.23.0
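The appending behaviour can be pictured with a few lines of Python. This is a simplified model under the assumption that the ENTRYPOINT is given in exec form (a JSON array); the function container_command is a hypothetical name used for illustration only and is not part of Podman.

```python
import json
import shlex


def container_command(entrypoint_json, run_args):
    """Model how the command executed inside the container is assembled.

    entrypoint_json: the exec form ENTRYPOINT as written in the Dockerfile.
    run_args: the arguments given after the image name in `podman run`.
    """
    entrypoint = json.loads(entrypoint_json)
    # Arguments from the command line are appended after the ENTRYPOINT elements.
    return entrypoint + list(run_args)


# podman run --rm -it my-ipython-image --version
with_args = container_command('["ipython"]', ["--version"])
print(shlex.join(with_args))  # ipython --version

# podman run --rm -it my-ipython-image
without_args = container_command('["ipython"]', [])
print(shlex.join(without_args))  # ipython
```

In this model the image name marks the boundary: everything before it is an option for podman run, everything after it ends up as an argument to the entrypoint.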
Let's build the image again and see what happens.
podman build -t my-ipython-image .
Output
STEP 1/5: FROM python:3.12
STEP 2/5: LABEL maintainer="support@hifis.net"
--> Using cache 073707692c48409d01057b24ee89ec07bbcb08c530948e1e972389b525d1dbf9
--> 073707692c4
STEP 3/5: RUN pip install --upgrade pip
--> Using cache 910ba09337d65dd4537dd02d9049d5265466ae44a55f596a7a32bcf41d18da5c
--> 910ba09337d
STEP 4/5: RUN pip install ipython numpy
--> Using cache 64348d5728e9c55850dc6aa7311757e2b34ab5441e544ebfae011f8bd6240028
--> 64348d5728e
STEP 5/5: ENTRYPOINT ["ipython"]
--> Using cache 0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b
COMMIT my-ipython-image
--> 0b68cd1cf29
Successfully tagged localhost/my-ipython-image:latest
0b68cd1cf29fc16a196e1a80f24282adeb6b3c86d31e03a36ae6799553b0922b
This time, the output is much shorter than in our initial run of the podman build command.
Every step after FROM reports that it used the cache.
As each instruction is executed, Podman looks for an existing image in its cache that has already been created in the same manner.
If there is such an image, Podman will re-use that image instead of creating a duplicate.
If you do not want Podman to use its cache, provide the --no-cache=true option to the podman build command.
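The idea behind the cache can be sketched as a lookup table keyed on the parent image and the instruction text. The following Python model is a deliberate simplification and not Podman's actual implementation: the function build, the dictionary cache, and the hash-based layer IDs are assumptions made for illustration.

```python
import hashlib


def build(base_image, instructions, cache):
    """Simulate a build and return (final layer ID, log lines)."""
    parent = base_image
    log = []
    for instruction in instructions:
        key = (parent, instruction)
        if key in cache:
            # Same parent and same instruction: re-use the existing layer.
            layer = cache[key]
            log.append(f"{instruction} --> Using cache {layer}")
        else:
            # Hypothetical layer ID: a hash over parent and instruction.
            digest = hashlib.sha256(f"{parent}\n{instruction}".encode())
            layer = digest.hexdigest()[:11]
            cache[key] = layer
            log.append(f"{instruction} --> {layer}")
        parent = layer
    return parent, log


steps = [
    'LABEL maintainer="support@hifis.net"',
    "RUN pip install --upgrade pip",
    "RUN pip install ipython numpy",
    'ENTRYPOINT ["ipython"]',
]
cache = {}

first_id, first_log = build("python:3.12", steps, cache)
second_id, second_log = build("python:3.12", steps, cache)

# Changing the third instruction also changes the parent of the fourth one.
changed = steps[:2] + ["RUN pip install ipython numpy scipy"] + steps[3:]
third_id, third_log = build("python:3.12", changed, cache)

for line in third_log:
    print(line)
```

In this model, the second build re-uses every layer, while the third build re-uses only the first two: once an instruction changes, all later steps see a different parent and have to be executed again. This is also why it tends to pay off to place instructions that change rarely near the top of a Dockerfile.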
Task: Create and Run a Data Science Image
Task Description
Your goal in this exercise is to create your own custom data science image as follows:
- Build your image on top of the latest Python image of release series 3.12.
- Mark yourself as the maintainer of the image.
- Install numpy, scipy, pandas, scikit-learn and jupyterlab using pip install.
- Create a custom user using the command useradd -ms /bin/bash jupyter.
- Tell the image to automatically start as the jupyter user and to use the working directory /home/jupyter.
- Make sure the image starts with the command jupyter lab --ip=0.0.0.0 by default.
Solution
- Create a Dockerfile with the content below.
FROM python:3.12
# Mark yourself as the maintainer (placeholder address, replace it with your own)
LABEL maintainer="you@example.org"
RUN pip install ipython jupyterlab numpy scipy pandas scikit-learn
# Create a custom user under which the application runs
RUN useradd -ms /bin/bash jupyter
# Use this user by default for all subsequent operations
USER jupyter
# Default to start the container in the home directory of the jupyter user
WORKDIR /home/jupyter
# Document that the application listens on port 8888.
# EXPOSE does not publish the port; that happens with -p at run time.
EXPOSE 8888
ENTRYPOINT ["jupyter", "lab", "--ip=0.0.0.0"]
- Build the image.
podman build -t my-datascience-image .
- Run the image and bind port 8888.
podman run -p 8888:8888 -it --rm my-datascience-image
This yields output similar to that shown below (details may vary).
Output
[I 2024-04-01 14:19:55.406 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2024-04-01 14:19:55.409 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2024-04-01 14:19:55.413 ServerApp] jupyterlab | extension was successfully linked.
[I 2024-04-01 14:19:55.414 ServerApp] Writing Jupyter server cookie secret to /home/jupyter/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2024-04-01 14:19:55.663 ServerApp] notebook_shim | extension was successfully linked.
[I 2024-04-01 14:19:55.678 ServerApp] notebook_shim | extension was successfully loaded.
[I 2024-04-01 14:19:55.680 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2024-04-01 14:19:55.681 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2024-04-01 14:19:55.682 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.12/site-packages/jupyterlab
[I 2024-04-01 14:19:55.682 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2024-04-01 14:19:55.682 LabApp] Extension Manager is 'pypi'.
[I 2024-04-01 14:19:55.709 ServerApp] jupyterlab | extension was successfully loaded.
[I 2024-04-01 14:19:55.710 ServerApp] Serving notebooks from local directory: /home/jupyter
[I 2024-04-01 14:19:55.710 ServerApp] Jupyter Server 2.13.0 is running at:
[I 2024-04-01 14:19:55.710 ServerApp] http://b9d43e460483:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.710 ServerApp] http://127.0.0.1:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.710 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2024-04-01 14:19:55.717 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2024-04-01 14:19:55.717 ServerApp]
To access the server, open this file in a browser:
file:///home/jupyter/.local/share/jupyter/runtime/jpserver-1-open.html
Or copy and paste one of these URLs:
http://b9d43e460483:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
http://127.0.0.1:8888/lab?token=54269fbf8af372736a456bc27e53f3de8cb168031152b294
[I 2024-04-01 14:19:55.737 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server