Description: Kaldi is an open-source software framework for speech processing.
Publisher: NVIDIA
Latest Tag: 23.11-py3
Modified: February 5, 2024
Compressed Size: 4.28 GB
Multinode Support: No
Multi-Arch Support: No

What Is Kaldi?

The Kaldi Speech Recognition Toolkit project began in 2009 at Johns Hopkins University with the intent of developing techniques to reduce both the cost and time required to build speech recognition systems. While originally focused on ASR support for new languages and domains, the Kaldi project has steadily grown in size and capabilities, enabling hundreds of researchers to participate in advancing the field. Now the de facto speech recognition toolkit in the community, Kaldi helps enable speech services used by millions of people every day.

Building Kaldi

This container has Kaldi pre-built and ready to use in /opt/kaldi. If you want to rebuild it, run:

> make -j -C /opt/kaldi/src/

In addition, the source can be found in /opt/kaldi/src.
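
Note that make -j without a number places no limit on the number of parallel jobs, which can exhaust memory on smaller machines. The following is a minimal sketch, not part of the container, that caps the job count at the number of CPU cores. It assumes the nproc utility is available inside the container (not confirmed by this page), and the function name kaldi_rebuild_cmd is hypothetical. The function only prints the command so that it can be inspected before running.

```shell
# Print a rebuild command whose job count is capped at the number of CPU cores.
# Assumption: nproc is installed; if it is not, fall back to a single job.
kaldi_rebuild_cmd() (
  n="$(nproc 2>/dev/null || echo 1)"
  echo "make -j ${n} -C /opt/kaldi/src/"
)
```

Calling kaldi_rebuild_cmd inside the container prints a command of the form make -j N -C /opt/kaldi/src/, which can then be pasted at the prompt.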

Running Kaldi

Before you can run an NGC deep learning framework container, your Docker environment must support NVIDIA GPUs. To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers And Frameworks User Guide and specify the registry, repository, and tags. For more information about using NGC, refer to the NGC Container User Guide.

The method implemented in your system depends on the DGX OS version installed (for DGX systems), the specific NGC Cloud Image provided by a Cloud Service Provider, or the software that you have installed in preparation for running NGC containers on TITAN PCs, Quadro PCs, or vGPUs.

Procedure

  1. Select the Tags tab and locate the container image release that you want to run.
  2. In the Pull Tag column, click the icon to copy the docker pull command.
  3. Open a command prompt and paste the pull command. Wait for the pull to complete successfully before proceeding to the next step.
  4. Run the container image.
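
The copied pull command in steps 2 and 3 is an ordinary docker pull against the image path used on this page. As a rough sketch, the hypothetical helper below prints that command for a given tag. The tag check assumes every release tag follows the xx.xx-py3 pattern shown on this page, which may not hold for all tags.

```shell
# Print the docker pull command for a Kaldi image tag such as 23.11-py3.
# Assumption: tags follow the xx.xx-py3 pattern; anything else is rejected.
kaldi_pull_cmd() (
  tag="$1"
  case "$tag" in
    [0-9][0-9].[0-9][0-9]-py3)
      echo "docker pull nvcr.io/nvidia/kaldi:${tag}"
      ;;
    *)
      echo "unexpected tag format: ${tag}" >&2
      exit 1
      ;;
  esac
)
```

For example, kaldi_pull_cmd 23.11-py3 prints the pull command for the latest tag listed above.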

If you have Docker 19.03 or later, a typical command to launch the container is:

 docker run --gpus all -it --rm -v local_dir:container_dir nvcr.io/nvidia/kaldi:xx.xx-py3

If you have a Docker version earlier than 19.03, a typical command to launch the container is:

 nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/kaldi:xx.xx-py3

Where:

  • --gpus all makes all GPUs on the host available to the container (Docker 19.03 or later)
  • -it means run in interactive mode
  • --rm will delete the container when finished
  • -v mounts a directory or file from the host into the container
  • local_dir is the directory or file from your host system (absolute path) that you want to access from inside your container. For example, the local_dir in the following option is /home/jsmith/data/mnist:
    -v /home/jsmith/data/mnist:/data/mnist
    Running ls /data/mnist inside the container lists the same files as running ls /home/jsmith/data/mnist on the host.
  • container_dir is the target directory when you are inside your container. For example, /data/mnist is the target directory in the example:
    -v /home/jsmith/data/mnist:/data/mnist
  • xx.xx is the container version. For example, 23.11.
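
The choice between the two launch commands depends only on the Docker version, so it can be scripted. The sketch below is one possible way to do that, under the assumption that the version string has the usual major.minor.patch form (for example 19.03.5). The function names docker_has_gpus_flag and kaldi_run_cmd are hypothetical, and the second function only prints the command rather than starting a container.

```shell
# Succeed if the Docker version in $1 (for example 19.03.5) is 19.03 or later.
docker_has_gpus_flag() (
  major="${1%%.*}"
  rest="${1#*.}"
  minor="${rest%%.*}"
  # Drop one leading zero so that "03" is compared as 3.
  minor="${minor#0}"
  [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }
)

# Print the launch command.
# Arguments: Docker version, local_dir, container_dir, image tag.
kaldi_run_cmd() (
  version="$1"; local_dir="$2"; container_dir="$3"; tag="$4"
  if docker_has_gpus_flag "$version"; then
    runner="docker run --gpus all"
  else
    runner="nvidia-docker run"
  fi
  echo "${runner} -it --rm -v ${local_dir}:${container_dir} nvcr.io/nvidia/kaldi:${tag}"
)
```

The version argument could be taken from docker version --format '{{.Server.Version}}'. For example, kaldi_run_cmd 19.03.5 /home/jsmith/data/mnist /data/mnist 23.11-py3 prints the docker run --gpus all form of the command with the mount from the example above.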

See /workspace/README.md inside the container for information on customizing your Kaldi image.

LibriSpeech Example

An example is provided in /workspace/nvidia-examples/librispeech/.

To run the example, first prepare the model:

cd /workspace/nvidia-examples/librispeech/
./prepare_data.sh

Once the model is prepared, you can run a speech-to-text benchmark as follows:

cd /workspace/nvidia-examples/librispeech/
./run_benchmark.sh
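
The two steps above can also be chained so that the benchmark starts only if preparation succeeds. This is a minimal sketch rather than a script shipped in the container. The function name run_librispeech is hypothetical, and the optional directory argument exists only so that the function can be tried against a different location.

```shell
# Run preparation, then the benchmark, stopping at the first failure.
# The directory defaults to the example path given in this section.
run_librispeech() (
  example_dir="${1:-/workspace/nvidia-examples/librispeech}"
  cd "$example_dir" || exit 1
  # && ensures run_benchmark.sh is skipped if prepare_data.sh fails.
  ./prepare_data.sh && ./run_benchmark.sh
)
```

Inside the container, calling run_librispeech with no arguments runs both scripts from the example directory. The body runs in a subshell, so the caller's working directory is left unchanged.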

Suggested Reading

For the latest Release Notes, see the Kaldi Release Notes Documentation website.

Link to Open Source Code

For a full list of the supported software and specific versions that come packaged with this framework based on the container image, see the Frameworks Support Matrix.

For more information about Kaldi, including tutorials, documentation, and examples, see the Kaldi Speech Recognition Toolkit. The open-source project can be found here.

Security CVEs

To review known CVEs on the 21.07 image, please refer to the Known Issues section of the Product Release Notes.

License

By pulling and using the container, you accept the terms and conditions of this End User License Agreement.