
rapidsai/rapidsai

Description: The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs.
Publisher: Open Source
Latest Tag: cuda11.4-runtime-ubuntu20.04-py3.10
Modified: March 3, 2024
Compressed Size: 8.08 GB
Multinode Support: No
Multi-Arch Support: No

RAPIDS - Open GPU Data Science

Deprecation Notice

Starting with the RAPIDS v23.08 release, this docker repository is deprecated.

For a complete list of changes, please see this GitHub issue.

What is RAPIDS?

Visit rapids.ai for more information.

The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

NOTE: Review our prerequisites section below to ensure your system meets the minimum requirements for RAPIDS.

Current Version - RAPIDS v23.06

Versions of libraries included in the 23.06 images:

Image Types

The RAPIDS images are based on nvidia/cuda, and are intended to be drop-in replacements for the corresponding CUDA images in order to make it easy to add RAPIDS libraries while maintaining support for existing CUDA applications.

RAPIDS images come in three types, distributed in two different repos:

This repo (rapidsai) contains the following:

  • base - contains a RAPIDS environment ready for use.
    • TIP: Use this image if you want to use RAPIDS as a part of your pipeline.
  • runtime - extends the base image by adding a notebook server and example notebooks.
    • TIP: Use this image if you want to explore RAPIDS through notebooks and examples.

The rapidsai/rapidsai-dev repo contains the following:

  • devel - contains the full RAPIDS source tree, pre-built with all artifacts in place, as well as the compiler toolchain, debugging tools, headers, and static libraries for RAPIDS development.
    • TIP: Use this image to develop RAPIDS from source.

Image Tag Naming Scheme

The tag naming scheme for RAPIDS images incorporates key platform details into the tag as shown below:

23.06-cuda11.8-runtime-ubuntu22.04-py3.10
  ^   ^        ^       ^           ^
  |   |        type    |           python version
  |   |                |
  |   cuda version     |
  |                    |
  RAPIDS version       linux version

To get the latest RAPIDS version of a specific platform combination, simply exclude the RAPIDS version. For example, to pull the latest version of RAPIDS for the runtime image with support for CUDA 11.8, Python 3.10, and Ubuntu 22.04, use the following tag:

cuda11.8-runtime-ubuntu22.04-py3.10

Many users do not need a specific platform combination but would like to ensure they're getting the latest version of RAPIDS, so as an additional convenience, a tag named simply latest is also provided which is equivalent to cuda11.8-runtime-ubuntu22.04-py3.10.
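
To illustrate the scheme, the following pull commands (assuming access to the NGC registry and that these exact tags are published) correspond to the tag forms described above:

# Pin an exact RAPIDS release and platform
$ docker pull nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

# Track the latest RAPIDS release for a fixed platform
$ docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.8-runtime-ubuntu22.04-py3.10

# Track the latest RAPIDS release on the default platform
$ docker pull nvcr.io/nvidia/rapidsai/rapidsai:latest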

Prerequisites

Usage

Start Container and Notebook Server

Preferred - Docker CE v19+ and nvidia-container-toolkit
$ docker pull nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10
$ docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
         nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10
Legacy - Docker CE v18 and nvidia-docker2
$ docker pull nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10
$ docker run --runtime=nvidia --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
         nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

Container Ports

The following ports are used by the runtime containers only (not base containers):

  • 8888 - exposes the JupyterLab notebook server
  • 8787 - exposes the Dask diagnostics dashboard
  • 8786 - exposes the Dask scheduler port

Environment Variables

The following environment variables can be passed to the docker run commands:

  • DISABLE_JUPYTER - set to true to prevent the default Jupyter server from starting (not applicable for base images)
  • JUPYTER_FG - set to true to start the Jupyter server in the foreground instead of the background (not applicable for base images)
  • EXTRA_APT_PACKAGES - (Ubuntu images only) used to install additional apt packages in the container. Use a space-separated list of values
  • APT_TIMEOUT - (Ubuntu images only) how long (in seconds) the apt command should wait before exiting
  • EXTRA_YUM_PACKAGES - (CentOS images only) used to install additional yum packages in the container. Use a space-separated list of values
  • YUM_TIMEOUT - (CentOS images only) how long (in seconds) the yum command should wait before exiting
  • EXTRA_CONDA_PACKAGES - used to install additional conda packages in the container. Use a space-separated list of values
  • CONDA_TIMEOUT - how long (in seconds) the conda command should wait before exiting
  • EXTRA_PIP_PACKAGES - used to install additional pip packages in the container. Use a space-separated list of values
  • PIP_TIMEOUT - how long (in seconds) the pip command should wait before exiting

Example:

$ docker run \
    --rm \
    -it \
    --gpus all \
    -e EXTRA_APT_PACKAGES="vim nano" \
    -e EXTRA_CONDA_PACKAGES="jq" \
    -e EXTRA_PIP_PACKAGES="beautifulsoup4" \
    -p 8888:8888 \
    -p 8787:8787 \
    -p 8786:8786 \
    nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

Bind Mounts

Mounting files/folders to the locations specified below provides additional functionality for the images.

  • /opt/rapids/environment.yml - a YAML file that contains a list of dependencies that will be installed by conda. The file should look like:

    dependencies:
      - beautifulsoup4
      - jq

Example:

$ docker run \
    --rm \
    -it \
    --gpus all \
    -v $(pwd)/environment.yml:/opt/rapids/environment.yml \
    nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

Use JupyterLab to Explore the Notebooks

Notebooks can be found in the following directories within the 23.06 container (not applicable for base images):

  • /rapids/notebooks/cugraph - cuGraph demo notebooks
  • /rapids/notebooks/cuml - cuML demo notebooks
  • /rapids/notebooks/cusignal - cuSignal demo notebooks
  • /rapids/notebooks/cuxfilter - cuXfilter demo notebooks
  • /rapids/notebooks/cuspatial - cuSpatial demo notebooks
  • /rapids/notebooks/xgboost - XGBoost demo notebooks

For a full description of each notebook, see the README in the notebooks repository.

Extending RAPIDS Images

All RAPIDS images use conda as their package manager, and all RAPIDS packages (including those built from source) are available in the rapids conda environment. If you want to extend a RAPIDS image (for example, by using it in a FROM instruction), it is important to begin every RUN command in your Dockerfile with source activate rapids. Without this, the build steps run in the base conda environment, which does not have access to the RAPIDS libraries. Examples of this can be found in our own Dockerfiles, which can be found in the RAPIDS Docker Repository on GitHub.
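
As an illustrative sketch only (the extra pip package below is just a placeholder), a Dockerfile extending the runtime image could look like this:

# Extend the RAPIDS runtime image (placeholder example)
FROM nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

# Use bash for RUN commands so that `source` is available
SHELL ["/bin/bash", "-c"]

# Activate the rapids conda environment first; otherwise packages are
# installed into the base environment, which does not contain RAPIDS
RUN source activate rapids && \
    pip install beautifulsoup4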

Custom Data and Advanced Usage

You are free to modify the above steps. For example, you can launch an interactive session with your own data:

Preferred - Docker CE v19+ and nvidia-container-toolkit
$ docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
         -v /path/to/host/data:/rapids/my_data \
         nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10
Legacy - Docker CE v18 and nvidia-docker2
$ docker run --runtime=nvidia --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
         -v /path/to/host/data:/rapids/my_data \
         nvcr.io/nvidia/rapidsai/rapidsai:23.06-cuda11.8-runtime-ubuntu22.04-py3.10

This will map data from your host operating system to the container OS in the /rapids/my_data directory. You may need to modify the provided notebooks for the new data paths.
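
For instance, a notebook cell reading data from the mounted directory with cuDF might look like the following (the file name is hypothetical):

[1] import cudf
    df = cudf.read_csv("/rapids/my_data/my_file.csv")
    df.head()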

Access Documentation within Notebooks

You can check the documentation for RAPIDS APIs inside the JupyterLab notebook using a ? command, like this:

[1] ?cudf.read_csv

This prints the function signature and its usage documentation. If this is not enough, you can see the full code for the function using ??:

[1] ??cudf.read_csv

Check out the RAPIDS documentation for more detailed information and a RAPIDS cheat sheet.

More Information

Check out the RAPIDS and XGBoost API docs.

Learn how to set up a multi-node cuDF and XGBoost data preparation and distributed training environment by following the mortgage data example notebook and scripts.

Where can I get help or file bugs/requests?

Please submit issues with the container to this GitHub repository: https://github.com/rapidsai/docker

For issues with RAPIDS libraries like cuDF, cuML, RMM, or others, file an issue in the related GitHub project.

Additional help can be found on Stack Overflow or Google Groups.

License

By pulling and using the container, you accept the terms and conditions of this End User License Agreement.