TensorFlow 2.10.0 does not detect GPU

Question

I created a conda environment and installed TensorFlow as follows:

conda create -n foo python=3.10
conda activate foo
conda install mamba
mamba install tensorflow -c conda-forge
mamba install cudnn cudatoolkit

This installed TensorFlow 2.10.0. I already had CUDA 11.2 and cuDNN 8.1 installed, and then tried running the following:

import tensorflow as tf

print(f"GPUs available: {tf.config.list_physical_devices('GPU')}")

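But it only returns an empty list. I want to use my 3060 Ti for ML projects, but TensorFlow does not detect it. I found several questions similar to mine, but they use older versions of TensorFlow that install the tensorflow-gpu package, which is no longer supported. How can I fix this, or at least start troubleshooting it?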

I am on a Windows 10 machine.

Output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.24       Driver Version: 528.24       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:09:00.0  On |                  N/A |
| 30%   43C    P8    16W / 200W |    809MiB /  8192MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      7176    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A      9240    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     12936    C+G   ...cw5n1h2txyewy\LockApp.exe    N/A      |
|    0   N/A  N/A     13652    C+G   ...e\PhoneExperienceHost.exe    N/A      |
|    0   N/A  N/A     14020    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     14888    C+G   ...ser\Application\brave.exe    N/A      |
|    0   N/A  N/A     15112    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     16516    C+G   ...oft OneDrive\OneDrive.exe    N/A      |
|    0   N/A  N/A     18296    C+G   ...aming\Spotify\Spotify.exe    N/A      |
|    0   N/A  N/A     18624    C+G   ...in7x64\steamwebhelper.exe    N/A      |
|    0   N/A  N/A     18672    C+G   ...\app-1.0.9010\Discord.exe    N/A      |
|    0   N/A  N/A     18828    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     19284    C+G   ...Central\Razer Central.exe    N/A      |
|    0   N/A  N/A     20020    C+G   ...arp.BrowserSubprocess.exe    N/A      |
|    0   N/A  N/A     22912    C+G   ...8wekyb3d8bbwe\Cortana.exe    N/A      |
|    0   N/A  N/A     24848    C+G   ...ontend\Docker Desktop.exe    N/A      |
|    0   N/A  N/A     25804    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     27064    C+G   ...8bbwe\WindowsTerminal.exe    N/A      |
+-----------------------------------------------------------------------------+

Output of nvcc -V:

Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_22:08:44_Pacific_Standard_Time_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

I ran some dummy code like this:

import tensorflow as tf
import numpy as np


def make_nn():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(1, input_shape=(1,)))
    model.compile(loss='mean_squared_error', optimizer='sgd')
    return model

def dataset():
    x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
    return tf.data.Dataset.from_tensor_slices((x, y)).batch(1)



def main():
    model = make_nn()
    model.fit(dataset(), epochs=1, steps_per_epoch=9)

if __name__ == '__main__':
    print(f"GPUs available: {tf.config.list_physical_devices('GPU')}")
    print(f"Built with cuda: {tf.test.is_built_with_cuda()}")

    main()

It gave me the following log:

GPUs available: []
Built with cuda: False
2023-02-06 09:47:32.744450: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-06 09:47:32.779280: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.

It looks like it is using a CPU-only build.
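The two values printed above already narrow the problem down: an empty GPU list together with "Built with cuda: False" points at the installed package rather than at the driver. A minimal sketch of that reasoning follows; the helper names diagnose and collect are hypothetical and only illustrate the check, they are not part of TensorFlow.

```python
def diagnose(gpus, built_with_cuda):
    """Return a short diagnosis from the two values printed in the question."""
    if gpus:
        return "GPU visible"
    if not built_with_cuda:
        # The TensorFlow binary itself has no CUDA support compiled in.
        return "CPU-only build: install a CUDA-enabled TensorFlow package"
    return "CUDA build, but no GPU found: check driver, CUDA, and cuDNN libraries"


def collect():
    """Query the installed TensorFlow; returns None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU"), tf.test.is_built_with_cuda()


if __name__ == "__main__":
    state = collect()
    print("TensorFlow not installed" if state is None else diagnose(*state))
```

With the values from the log above (an empty list and False), diagnose reports a CPU-only build, so reinstalling CUDA or cuDNN alone would probably not change the result.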

Answer 1

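This may not be the best solution, but I downgraded TensorFlow to 2.6.0, the version I had installed previously, and it works. It is a shame, because I wanted to try some of the newer features, but this will do for now. In case anyone faces the same problem, this is the conda environment I am currently using.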

Answer 2

I solved it by creating another environment with python=3.9 and tensorflow=2.8.1. Also, I have CUDA 11.4.

Answer 3

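Thanks to @Corralien. I had the same problem as Areias, but I solved it by downloading the correct version of cuDNN from the NVIDIA website. I did not know before that cuDNN needs to be installed both on Win11 and in the conda virtual environment.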
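To check whether the CUDA and cuDNN libraries are actually visible from the activated environment, something like the following rough sketch may help. It assumes the libraries are located through the directories on PATH, and the DLL names cudart64_110.dll and cudnn64_8.dll are assumptions for CUDA 11.x with cuDNN 8.x on Windows; adjust them for other versions. The helper names are hypothetical.

```python
import os
from pathlib import Path

# Assumed DLL names for CUDA 11.x + cuDNN 8.x on Windows; adjust as needed.
EXPECTED_DLLS = ("cudart64_110.dll", "cudnn64_8.dll")


def find_libraries(names, search_dirs):
    """Map each file name to the first directory that contains it, or None."""
    found = {}
    for name in names:
        found[name] = None
        for directory in search_dirs:
            if directory and (Path(directory) / name).is_file():
                found[name] = str(directory)
                break
    return found


def path_dirs():
    """Return the non-empty directories listed in the PATH variable."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]


if __name__ == "__main__":
    for name, location in find_libraries(EXPECTED_DLLS, path_dirs()).items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

Run it inside the activated conda environment, since activation changes PATH. A missing entry only suggests where to look next; it does not prove that TensorFlow cannot load the library.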

Answer 4

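If you are using conda-forge, you may need to set the environment variable CONDA_OVERRIDE_CUDA to force installation of a GPU-enabled TensorFlow build, as described at https://conda-forge.org/docs/user/tipsandtricks.html#installing-cuda-enabled-packages-like-tensorflow-and-pytorch. Under bash, it would look like this: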

CONDA_OVERRIDE_CUDA="11.2" conda install "tensorflow==2.8" -c conda-forge
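The one-liner above relies on bash syntax for setting a variable for a single command, which does not work in the Windows command prompt the question mentions. One shell-independent way to express the same idea is to set the variable from Python and pass it to the conda process. This is only a sketch: the function name is hypothetical, and the version numbers are simply the ones from the command above.

```python
import os
import subprocess


def conda_install_with_cuda_override(cuda_version="11.2", spec="tensorflow==2.8"):
    """Build the environment and command equivalent to the bash one-liner."""
    # Copy the current environment and add the override variable to it.
    env = dict(os.environ, CONDA_OVERRIDE_CUDA=cuda_version)
    cmd = ["conda", "install", spec, "-c", "conda-forge"]
    return env, cmd


if __name__ == "__main__":
    env, cmd = conda_install_with_cuda_override()
    print(f"CONDA_OVERRIDE_CUDA={env['CONDA_OVERRIDE_CUDA']}", " ".join(cmd))
    # Uncomment to actually run the install (conda must be on PATH; on
    # Windows this may need shell=True or the full path to conda):
    # subprocess.run(cmd, env=env, check=True)
```

The install call is left commented out so the script only prints what it would run; review the printed command before enabling it.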
