PolarSPARC

How-to Enable NVidia GPU for Docker


Bhaskar S 06/24/2023


Overview


With all the buzz and spotlight around AI/ML these days, it is inevitable that developers in an Enterprise will start integrating their business application(s) with AI/ML products. The majority of AI/ML products depend on GPU-enabled platforms to run efficiently, a space currently dominated by NVidia.

Most Enterprise business application(s) run in Docker containers these days. Hence, for AI/ML-enabled business application(s) to run efficiently in a container environment, one needs to give the Docker container access to the GPU.

Enter the NVidia Container Toolkit, which enables Enterprise developers to build and run GPU-enabled Docker containers.

The following diagram illustrates the high-level architecture of the Docker and NVidia integration:


[Figure: Docker NVidia]

The NVidia Container Toolkit includes a container runtime and supporting libraries, which enable Docker containers to access the underlying NVidia GPUs. Under the hood, the toolkit leverages the Compute Unified Device Architecture (or CUDA) software framework to access the parallel computing power of the NVidia GPUs for faster data processing.


Installation and Setup


The installation and setup will be performed on a Linux desktop with an NVidia graphics card installed, running the Ubuntu 22.04 LTS operating system.

Open a Terminal window to perform the various steps.

To perform a system update and install the prerequisite software, execute the following command:

$ sudo apt update && sudo apt install apt-transport-https ca-certificates curl software-properties-common -y

The following would be a typical trimmed output:

Output.1

...[ SNIP ]...
ca-certificates is already the newest version (20211016ubuntu0.22.04.1).
ca-certificates set to manually installed.
The following additional packages will be installed:
  python3-software-properties software-properties-gtk
The following NEW packages will be installed:
  apt-transport-https curl
The following packages will be upgraded:
  python3-software-properties software-properties-common software-properties-gtk
3 upgraded, 2 newly installed, 0 to remove and 14 not upgraded.
...[ SNIP ]...

To add the Docker package repository, execute the following commands:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

$ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu jammy stable" | sudo tee /etc/apt/sources.list.d/docker.list

The following would be a typical output:

Output.2

deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu jammy stable
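
The repository line above hard-codes the architecture (amd64) and the Ubuntu codename (jammy). As a minimal sketch, assuming a Debian/Ubuntu host where dpkg and /etc/os-release are available, the same line could be derived from the running system. The helper name docker_repo_line is hypothetical and used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: build the Docker apt repository line for a given
# architecture and Ubuntu codename (same format as the line shown above).
docker_repo_line() {
  arch="$1"
  codename="$2"
  keyring="/usr/share/keyrings/docker-archive-keyring.gpg"
  echo "deb [arch=${arch} signed-by=${keyring}] https://download.docker.com/linux/ubuntu ${codename} stable"
}

# Detect the values on the current host, falling back to the ones used in this article.
host_arch="$(dpkg --print-architecture 2>/dev/null || echo amd64)"
host_codename="$( [ -r /etc/os-release ] && . /etc/os-release; echo "${VERSION_CODENAME:-jammy}" )"

# Only print the line; piping it to 'sudo tee /etc/apt/sources.list.d/docker.list'
# (as in the command above) would install it.
docker_repo_line "${host_arch}" "${host_codename}"
```

On the Ubuntu 22.04 desktop used in this article, the printed line should match Output.2.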

To install docker, execute the following command:

$ sudo apt update && sudo apt install docker-ce containerd.io docker-compose-plugin -y

The following would be a typical trimmed output:

Output.3

...[ SNIP ]...
Get:5 https://download.docker.com/linux/ubuntu jammy InRelease [48.9 kB]  
Get:6 https://download.docker.com/linux/ubuntu jammy/stable amd64 Packages [13.6 kB]
...[ SNIP ]...

To add the logged-in user (alice in this example) to the group docker, execute the following command:

$ sudo usermod -aG docker ${USER}

REBOOT the system (or log out and log back in) for the changes to take effect.
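
A group change only applies to new login sessions. As a small sketch, the following checks whether the current session already sees the docker group. The helper name has_group is hypothetical; it only inspects the group list printed by id -nG:

```shell
#!/bin/sh
# Hypothetical helper: succeed if group "$2" appears in the
# space-separated group list "$1" (the format printed by 'id -nG').
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Check the groups of the current login session.
if has_group "$(id -nG)" docker; then
  echo "current session is in the docker group"
else
  echo "docker group not active yet in this session"
fi
```

Padding both the list and the pattern with spaces ensures that only whole group names match, so a group such as dockerx would not be mistaken for docker.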

To verify that the docker installation was successful, execute the following command:

$ docker version

The following would be a typical output:

Output.4

Client: Docker Engine - Community
Version:           24.0.2
API version:       1.43
Go version:        go1.20.4
Git commit:        cb74dfc
Built:             Thu May 25 21:51:00 2023
OS/Arch:           linux/amd64
Context:           default

Server: Docker Engine - Community
Engine:
  Version:          24.0.2
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.4
  Git commit:       659604f
  Built:            Thu May 25 21:51:00 2023
  OS/Arch:          linux/amd64
  Experimental:     false
containerd:
  Version:          1.6.21
  GitCommit:        3dce8eb055cbb6872793272b4f20ed16117344f8
runc:
  Version:          1.1.7
  GitCommit:        v1.1.7-0-g860f061
docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
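
The --gpus option used later in this article was introduced in Docker 19.03, so the version 24.0.2 shown in Output.4 is recent enough. The following is a minimal sketch of a scripted check, assuming GNU sort (for its -V version ordering) is available; the helper name version_ge is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if version "$1" >= version "$2".
# Assumes GNU sort, whose -V option orders version strings.
version_ge() {
  lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
  [ "${lowest}" = "$2" ]
}

# Query the server version only when the docker CLI is present.
if command -v docker >/dev/null 2>&1; then
  server_version="$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)"
  if [ -n "${server_version}" ] && version_ge "${server_version}" "19.03"; then
    echo "docker ${server_version} supports the --gpus option"
  else
    echo "docker server version could not be confirmed as 19.03 or newer"
  fi
fi
```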

To verify that the appropriate NVidia drivers have been installed on the Linux desktop, execute the following command:

$ nvidia-smi

The following would be a typical output:

Output.5

Sat Jun 24 09:23:28 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   48C    P8    24W / 220W |    369MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                                
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1442      G   /usr/lib/xorg/Xorg                308MiB |
|    0   N/A  N/A      3508      G   ...RendererForSitePerProcess       57MiB |
+-----------------------------------------------------------------------------+
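
For scripted checks, nvidia-smi can also print selected fields in CSV form instead of the full table. The following is a minimal sketch; the helper name check_nvidia_driver is hypothetical, and the choice of fields (name, driver_version, memory.total) is just one reasonable selection:

```shell
#!/bin/sh
# Hypothetical helper: print one CSV line per GPU (name, driver version,
# total memory), or an explanatory message if nvidia-smi is not on the PATH.
check_nvidia_driver() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the NVidia driver does not appear to be installed"
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

check_nvidia_driver || echo "GPU check failed"
```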

To test access to the NVidia GPU from within docker, we need a docker image that ships the CUDA tooling. For this test, we will use the docker image nvidia/cuda:12.1.0-base-ubuntu22.04, which was the latest at the time of this article.

To download the above-mentioned docker image, execute the following command:

$ docker pull nvidia/cuda:12.1.0-base-ubuntu22.04

The following would be a typical output:

Output.6

12.1.0-base-ubuntu22.04: Pulling from nvidia/cuda
6b851dcae6ca: Pull complete 
532bc0192ccd: Pull complete 
f9bcd94e513a: Pull complete 
971bd89a1a36: Pull complete 
a2855a2ef2e0: Pull complete 
Digest: sha256:937bda11a3146c55374c9a201bef12945f2ba98394ff0c46bd04807dc949ab51
Status: Downloaded newer image for nvidia/cuda:12.1.0-base-ubuntu22.04
docker.io/nvidia/cuda:12.1.0-base-ubuntu22.04

To test the access of the NVidia GPU from docker, execute the following command:

$ docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

The following would be a typical output:

Output.7

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

From Output.7 above, it is evident that docker does not yet have access to the underlying NVidia GPU in the system.
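
One rough way to confirm this from the docker side is to list the runtimes known to the docker daemon. The sketch below assumes the NVidia integration registers a runtime named nvidia (as the nvidia-docker2 package installed later in this article does); the helper name has_nvidia_runtime is hypothetical, and the check is a simple text match rather than a full JSON parse:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the runtimes JSON in "$1" has an "nvidia" entry.
has_nvidia_runtime() {
  printf '%s' "$1" | grep -q '"nvidia"'
}

# Ask the docker daemon which runtimes it knows about (needs a running daemon).
if command -v docker >/dev/null 2>&1; then
  runtimes="$(docker info --format '{{json .Runtimes}}' 2>/dev/null || true)"
  if has_nvidia_runtime "${runtimes}"; then
    echo "nvidia runtime is registered with docker"
  else
    echo "nvidia runtime is NOT registered with docker"
  fi
fi
```

At this point in the setup the second message is expected; after the toolkit installation the first one should appear.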

To add the NVidia toolkit repository, execute the following commands:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

$ echo "deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 /" | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

The following would be a typical output:

Output.8

deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 /

To perform a system update and to install the NVidia-Docker runtime integration, execute the following command:

$ sudo apt update && sudo apt install -y nvidia-docker2

The following would be a typical trimmed output:

Output.9

...[ SNIP ]...
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 582595 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.13.2-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.13.2-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.13.2-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.13.2-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.13.2-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.13.2-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.13.2-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.13.2-1) ...
Selecting previously unselected package nvidia-docker2.
Preparing to unpack .../nvidia-docker2_2.13.0-1_all.deb ...
Unpacking nvidia-docker2 (2.13.0-1) ...
Setting up nvidia-container-toolkit-base (1.13.2-1) ...
Setting up libnvidia-container1:amd64 (1.13.2-1) ...
Setting up libnvidia-container-tools (1.13.2-1) ...
Setting up nvidia-container-toolkit (1.13.2-1) ...
Setting up nvidia-docker2 (2.13.0-1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.1) ...
...[ SNIP ]...

Once again, REBOOT the system for the changes to take effect (restarting the docker daemon with sudo systemctl restart docker should also suffice).

Finally, to test the access of the NVidia GPU from docker, execute the following command:

$ docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

The following would be a typical output:

Output.10

Sat Jun 24 09:31:19 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   43C    P8    24W / 220W |    515MiB /  8192MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                                
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

VOILA !!! - we have successfully integrated the NVidia GPU runtime with the docker environment.
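
The test above exposes all GPUs to the container. The --gpus option also accepts a device selector such as device=0 to expose a single GPU. As a small sketch, the following prints (without running) the test command for a given selector; the helper name gpu_test_cmd is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the docker command that tests GPU
# access for a given --gpus selector, defaulting to "all".
gpu_test_cmd() {
  selector="${1:-all}"
  image="nvidia/cuda:12.1.0-base-ubuntu22.04"
  echo "docker run --rm --gpus ${selector} ${image} nvidia-smi"
}

gpu_test_cmd all        # the command used in this article
gpu_test_cmd device=0   # only the first GPU
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without docker or a GPU; the printed line can be copied into the Terminal to run the actual test.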





© PolarSPARC