Deploy your containerized AI applications with nvidia-docker

More and more products and companies are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence) software bricks into a microservice architecture. The main benefit explored here is the use of the host system's GPU (Graphics Processing Unit) resources to accelerate multiple containerized AI applications.

To understand the usefulness of nvidia-docker, we will start by describing what kind of AI can benefit from GPU acceleration. Then we will present how to set up the nvidia-docker tool. Finally, we will describe what tools are available to use GPU acceleration in your applications and how to use them.

Why use GPUs in AI applications?

In the field of artificial intelligence, two main subfields are used: machine learning and deep learning. The latter is part of a larger family of machine learning methods based on artificial neural networks.

In the context of deep learning, where operations are essentially matrix multiplications, GPUs are more efficient than CPUs (Central Processing Units). This is why the use of GPUs has grown in recent years. Indeed, GPUs are considered the heart of deep learning because of their massively parallel architecture.

However, GPUs cannot execute just any program. Indeed, they use a specific language (CUDA for NVIDIA) to take advantage of their architecture. So, how do you use and communicate with GPUs from your applications?

The NVIDIA CUDA technology

NVIDIA CUDA (Compute Unified Device Architecture) is a parallel computing architecture combined with an API for programming GPUs. CUDA translates application code into an instruction set that GPUs can execute.

A CUDA SDK and libraries such as cuBLAS (Basic Linear Algebra Subroutines) and cuDNN (Deep Neural Network) have been developed to communicate easily and efficiently with a GPU. CUDA is available in C, C++ and Fortran. There are wrappers for other languages including Java, Python and R. For example, deep learning libraries like TensorFlow and Keras are based on these technologies.

Why use nvidia-docker?

Nvidia-docker addresses the needs of developers who want to add AI functionality to their applications, containerize them and deploy them on servers powered by NVIDIA GPUs.

The objective is to set up an architecture that allows the development and deployment of deep learning models in services available through an API. Thus, the utilization rate of GPU resources is optimized by making them available to multiple application instances.

In addition, we benefit from the advantages of containerized environments:

  • Isolation of instances of each AI model.
  • Colocation of several models with their specific dependencies.
  • Colocation of the same model under several versions.
  • Consistent deployment of models.
  • Model performance monitoring.

Natively, using a GPU in a container requires installing CUDA in the container and giving privileges to access the device. With this in mind, the nvidia-docker tool has been developed, allowing NVIDIA GPU devices to be exposed in containers in an isolated and secure manner.

At the time of writing this article, the latest version of nvidia-docker is v2. This version differs greatly from v1 in the following ways:

  • Version 1: Nvidia-docker is implemented as an overlay to Docker. That is, to create the container, you had to use nvidia-docker (e.g. nvidia-docker run ...), which performs the actions (among others, the creation of volumes) allowing the GPU devices to be seen in the container.
  • Version 2: The deployment is simplified with the replacement of Docker volumes by the use of Docker runtimes. Indeed, to launch a container, it is now necessary to use the NVIDIA runtime via Docker (e.g. docker run --runtime nvidia ...).
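In practice, the difference between the two versions shows in how a container is launched. The commands below are a sketch; the CUDA image name is illustrative, and both require an NVIDIA GPU and driver on the host:

```shell
# Version 1 (deprecated): the nvidia-docker wrapper binary creates the
# driver volumes and starts the container itself.
nvidia-docker run --rm nvidia/cuda nvidia-smi

# Version 2: plain Docker is used, with the NVIDIA runtime selected at
# launch time via the --runtime flag.
docker run --rm --runtime nvidia nvidia/cuda nvidia-smi
```

In both cases, nvidia-smi executed inside the container should list the host's GPUs if the devices are correctly exposed.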

Note that due to their different architecture, the two versions are not compatible. An application written for v1 must be rewritten for v2.

Setting up nvidia-docker

The required elements to use nvidia-docker are:

  • A container runtime.
  • An available GPU.
  • The NVIDIA Container Toolkit (the main part of nvidia-docker).



A container runtime is required to run the NVIDIA Container Toolkit. Docker is the recommended runtime, but Podman and containerd are also supported.

The official documentation gives the installation procedure for Docker.


Drivers are required to use a GPU device. In the case of NVIDIA GPUs, the drivers corresponding to a given OS can be obtained from the NVIDIA driver download page, by filling in the information on the GPU model.

The installation of the drivers is done via the executable. For Linux, use the following commands, replacing the name of the downloaded file:

chmod +x NVIDIA-Linux-x86_64-470.94.run
sudo ./NVIDIA-Linux-x86_64-470.94.run

Reboot the host machine at the end of the installation to take the installed drivers into account.
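After the reboot, the driver installation can be checked directly on the host (this only produces output if an NVIDIA GPU and driver are present):

```shell
# Lists the detected GPUs, the driver version and the highest CUDA
# version supported by the installed driver.
nvidia-smi
```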

Installing nvidia-docker

Nvidia-docker is available on the GitHub project page. To install it, follow the installation manual depending on your server and architecture specifics.
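As an illustration, on a Debian/Ubuntu server the installation described in the manual boils down to the following commands (the repository setup step is distribution-specific and omitted here, and the CUDA image tag in the last command is an example):

```shell
# Install the nvidia-docker2 package, which pulls in the NVIDIA
# Container Toolkit and registers the "nvidia" runtime with Docker.
sudo apt-get update
sudo apt-get install -y nvidia-docker2

# Restart the Docker daemon so the new runtime is taken into account.
sudo systemctl restart docker

# Sanity check: nvidia-smi executed inside a CUDA base container should
# list the host's GPUs.
docker run --rm --runtime nvidia nvidia/cuda:11.4.2-base-ubuntu20.04 nvidia-smi
```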

We now have an infrastructure that allows us to have isolated environments giving access to GPU resources. To use GPU acceleration in applications, several tools have been developed by NVIDIA (non-exhaustive list):

  • CUDA Toolkit: a set of tools for developing software that can perform computations using both CPU, RAM, and GPU. It can be used on x86, Arm and POWER platforms.
  • NVIDIA cuDNN: a library of primitives to accelerate deep learning networks and optimize GPU performance for major frameworks such as TensorFlow and Keras.
  • NVIDIA cuBLAS: a library of GPU-accelerated linear algebra subroutines.

By using these tools in application code, AI and linear algebra tasks are accelerated. With the GPUs now visible, the application is able to send the data and operations to be processed on the GPU.

The CUDA Toolkit is the lowest level option. It offers the most control (memory and instructions) to build custom applications. Libraries provide an abstraction of CUDA functionality. They allow you to focus on the application development rather than the CUDA implementation.
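As a minimal sketch of this lowest level option, a CUDA source file (here a hypothetical saxpy.cu) is compiled with the toolkit's nvcc compiler and run like any other binary:

```shell
# nvcc compiles host (CPU) and device (GPU) code into a single binary.
nvcc -O2 -o saxpy saxpy.cu
./saxpy
```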

Once all these elements are in place, the architecture using the nvidia-docker service is ready to use.

Here is a diagram to summarize everything we have seen:



We have set up an architecture allowing the use of GPU resources from our applications in isolated environments. To summarize, the architecture is composed of the following bricks:

  • Operating system: Linux, Windows …
  • Docker: isolation of the environment using Linux containers
  • NVIDIA driver: installation of the driver for the hardware in question
  • NVIDIA container runtime: orchestration of the previous three
  • Applications in Docker containers:
    • CUDA
    • cuDNN
    • cuBLAS
    • TensorFlow/Keras
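As a sketch, the application layer of this stack can be packaged as a Docker image; the base image tag, the framework and the file names below are examples, not prescriptions:

```dockerfile
# CUDA and cuDNN come from the base image; the NVIDIA driver stays on the host.
FROM nvidia/cuda:11.4.2-cudnn8-runtime-ubuntu20.04

# Python and the deep learning framework (TensorFlow bundles Keras).
RUN apt-get update && apt-get install -y --no-install-recommends python3-pip \
    && rm -rf /var/lib/apt/lists/*
RUN pip3 install tensorflow

# The containerized AI application (app.py is a placeholder name).
COPY app.py /app/app.py
CMD ["python3", "/app/app.py"]
```

Such an image would then be started with the NVIDIA runtime, e.g. docker run --runtime nvidia, so that the framework inside the container can see the host's GPUs.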

NVIDIA continues to develop tools and libraries around AI technologies, with the goal of establishing itself as a leader. Other technologies may complement nvidia-docker or may be more suitable than nvidia-docker depending on the use case.
