I’m new to Docker, so forgive me if this is a dumb question. Recently I’ve been trying out faster-whisper, a reimplementation of OpenAI’s Whisper, inside a Docker container based on the nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04 image from Docker Hub, running on Ubuntu under WSL2. The image builds successfully, but when I run the container, this error shows up:
==========
== CUDA ==
==========
CUDA Version 11.7.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
Traceback (most recent call last):
  File "/home/user/Documents/experiment/./main.py", line 8, in <module>
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 128, in __init__
    self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error no CUDA-capable device is detected
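From what I can tell, this means ctranslate2 itself can’t see any CUDA device inside the container. As a sanity check, a minimal snippet like this (my own, not from the faster-whisper docs) should print how many CUDA devices ctranslate2 can see:

import ctranslate2

# Prints the number of CUDA devices visible to ctranslate2; 0 means none detected
print(ctranslate2.get_cuda_device_count())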
I’ve googled and tried a lot of possible solutions, and even fully reinstalled the NVIDIA driver, nvidia-container-toolkit, and Docker, but the error persists.
This is the Dockerfile I used to build the image:
FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04

# Install Python and pip
RUN apt -y update && apt -y install python3.11 python3-pip

# Expose all GPUs and the compute/utility driver capabilities to the container
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES utility,compute

WORKDIR /home/user/Documents/experiment

# Install Python dependencies first so they are cached across rebuilds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the project
COPY . .

CMD [ "python3", "./main.py" ]
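For completeness, I built the image with something along these lines (experiment is just the tag I’m using in this post):

docker build -t experiment .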
and this is the Python script, which I mostly copy-pasted from the official GitHub page:
from faster_whisper import WhisperModel
import os

# Make sure only the first GPU is visible to CUDA
os.environ['CUDA_VISIBLE_DEVICES'] = "0"

model_size = "large-v2"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("./audio.wav", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
This is the command I used to run the container:
docker run --gpus all --runtime=nvidia -t nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04
and here is the nvidia-smi output from docker run --gpus all --runtime=nvidia -t nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04 nvidia-smi:
Sun Oct  8 22:08:24 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.112                Driver Version: 537.42       CUDA Version: 11.7     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     On  | 00000000:01:00.0  On |                  N/A |
|  0%   39C    P8              21W / 400W |    522MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        32      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+
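So nvidia-smi inside the container sees the RTX 3080 Ti just fine; only the Python/ctranslate2 side claims there is no CUDA-capable device. If it helps, the ctranslate2 device-count check from above can be run through my built image like this (again, experiment is just my image tag):

docker run --gpus all --runtime=nvidia -t experiment python3 -c "import ctranslate2; print(ctranslate2.get_cuda_device_count())"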
Any ideas?