I used the GPU driver initialization action to install the NVIDIA drivers on my Dataproc 2.1 cluster:
gcloud dataproc clusters create my-cluster \
--image-version 2.1-ubuntu20 \
--master-machine-type n1-standard-32 \
--master-accelerator type=nvidia-tesla-t4,count=1 \
--worker-machine-type n1-standard-32 \
--worker-accelerator type=nvidia-tesla-t4,count=1 \
--initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/gpu/install_gpu_driver.sh
The package installation appears to succeed, but the driver does not work:
$ nvidia-smi -c DEFAULT
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
$ lsmod | grep nvidia
(empty)
$ sudo modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': Operation not permitted
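
To help narrow this down, here is a minimal diagnostic sketch that could be run on one of the cluster nodes. It rests on an assumption the output above does not confirm: that the kernel is refusing to load the module (for example because of Secure Boot or kernel lockdown), rather than the module being missing. The `check` helper is a hypothetical name introduced only for this example. The commands it wraps (`mokutil`, `modinfo`, `dkms`, `dmesg`, `uname`) are standard tools, but some may not be installed on the image, so each one is skipped if absent.

```shell
#!/bin/sh
# Diagnostic sketch: collect facts about why `modprobe nvidia` is refused.
# Assumption: Secure Boot / kernel lockdown is only one possible cause.

# check LABEL CMD [ARGS...]
# Hypothetical helper: prints a header, runs CMD if its binary exists,
# and never aborts the script when CMD is missing or fails.
check() {
  label=$1
  shift
  echo "== $label =="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(command exited non-zero)"
  else
    echo "(skipped: $1 not installed)"
  fi
}

# Is Secure Boot enabled? Unsigned modules are rejected when it is.
check "Secure Boot state" mokutil --sb-state

# Was the module built and installed for this kernel, and is it signed?
check "Module metadata" modinfo nvidia
check "DKMS build status" dkms status
check "Running kernel" uname -r

# Recent kernel messages mentioning the driver or lockdown (may need root).
check "Kernel log" sh -c 'dmesg | grep -i -E "nvidia|lockdown" | tail -n 20'
```

If `mokutil --sb-state` reports that Secure Boot is enabled and `modinfo nvidia` shows no signer, that would be consistent with the "Operation not permitted" error, though other causes are possible.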