`Check failed: cudnnSetTensorNdDescriptor when doing transfer learning with a pretrained Keras model`

Question  Votes: 3  Answers: 2

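I am trying to transfer one of the ImageNet-pretrained architectures from keras.applications to CIFAR-10, but I run into a CUDA error: the Jupyter notebook kernel crashes immediately on the last line, when I try to fit my model. What could be going wrong?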

Output:

2019-01-10 00:39:40.165264: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-01-10 00:39:40.495421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.2405
pciBusID: 0000:01:00.0
totalMemory: 11.93GiB freeMemory: 11.63GiB
2019-01-10 00:39:40.495476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-01-10 00:39:40.819773: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-10 00:39:40.819812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2019-01-10 00:39:40.819819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2019-01-10 00:39:40.820066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/device:GPU:0 with 11256 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0, compute capability: 5.2)
2019-01-10 00:39:40.844280: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-01-10 00:39:40.844307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-10 00:39:40.844313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2019-01-10 00:39:40.844317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2019-01-10 00:39:40.844520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11256 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0, compute capability: 5.2)
[I 00:40:58.262 NotebookApp] Saving file at /Untitled.ipynb
2019-01-10 00:42:56.543392: F tensorflow/stream_executor/cuda/cuda_dnn.cc:542] Check failed: cudnnSetTensorNdDescriptor(handle_.get(), elem_type, nd, dims.data(), strides.data()) == CUDNN_STATUS_SUCCESS (3 vs. 0)batch_descriptor: {count: 32 feature_map_count: 320 spatial: 0 0  value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}

Code:

from keras.applications.inception_resnet_v2 import InceptionResNetV2
from keras.preprocessing import image
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
import keras.utils
import numpy as np
from keras.datasets import cifar10

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Define model
base_model = InceptionResNetV2(weights='imagenet', include_top=False)
x = base_model.output
print(x.shape)
x = GlobalAveragePooling2D()(x)
x = Dense(1024,activation='relu')(x)
preds = Dense(10,activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=preds)
# Only fine-tune last layer
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
machine-learning keras cudnn
2 Answers
1
votes

Check the input requirements of the InceptionResNetV2 network:
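To illustrate why the input size matters here, below is a rough sketch that traces the spatial size of a feature map through the layers of InceptionResNetV2 that shrink it. The layer list is an assumption based on the stem and the two reduction blocks of the Keras implementation (unpadded 3x3 convolutions and stride-2 pooling); the helper names `conv_out` and `trace` are made up for this example. It is plain arithmetic, not a call into Keras.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length along one spatial axis of a conv or pooling layer."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    # 'valid' padding: no zero padding, so the map shrinks; clamp at 0
    return max(0, (size - kernel) // stride + 1)


# Assumed size-changing layers: (name, kernel, stride, padding)
SHRINKING_LAYERS = [
    ("stem conv 3x3 /2", 3, 2, "valid"),
    ("stem conv 3x3", 3, 1, "valid"),
    ("stem conv 3x3 same", 3, 1, "same"),
    ("stem maxpool 3x3 /2", 3, 2, "valid"),
    ("stem conv 1x1", 1, 1, "valid"),
    ("stem conv 3x3", 3, 1, "valid"),
    ("stem maxpool 3x3 /2", 3, 2, "valid"),
    ("mixed_6a reduction /2", 3, 2, "valid"),
    ("mixed_7a reduction /2", 3, 2, "valid"),
]


def trace(size):
    """Return the spatial size after each size-changing layer."""
    sizes = []
    for _name, kernel, stride, padding in SHRINKING_LAYERS:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


for side in (299, 75, 32):
    print(side, "->", trace(side))
```

Under these assumptions a 32x32 CIFAR-10 image shrinks to 1x1 by the end of the stem and to 0x0 at the first reduction block, which would match the `spatial: 0 0` with `feature_map_count: 320` in the error above. A 75x75 input is the smallest that still leaves a 1x1 map at the end, so upsampling the CIFAR-10 images to at least that size before feeding them to the network may avoid the crash.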


0
votes

Did you solve this problem? I ran into the same issue, but strangely I can run my network up to the N-th iteration (N is not constant), and then the error is raised: "2019-09-24 21:03:16.109522: F tensorflow/stream_executor/cuda/cuda_dnn.cc:521] Check failed: cudnnSetTensorNdDescriptor(handle_.get(), elem_type, nd, dims.data(), strides.data()) == CUDNN_STATUS_SUCCESS (3 vs. 0)batch_descriptor: {count: 0 feature_map_count: 32 spatial: 16 112 128  value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}" I checked my code and there seem to be no errors. What causes this? GPU memory, or the versions of CUDA and cuDNN? If you get an answer, please let me know. Thanks a lot.
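One thing that can be read directly from these messages: status 3 in the cuDNN status enum is `CUDNN_STATUS_BAD_PARAM`, and the `batch_descriptor` shows the tensor shape that was rejected. The small helper below is only a sketch for pulling the dimensions out of such a log line and listing any that are zero; the function names `parse_batch_descriptor` and `zero_dims` are invented for this example and are not part of TensorFlow or cuDNN.

```python
import re

# Matches the shape fields of a batch_descriptor as printed in the log
DESCRIPTOR_RE = re.compile(
    r"count:\s*(\d+)\s+feature_map_count:\s*(\d+)\s+spatial:\s*([\d ]+)"
)


def parse_batch_descriptor(text):
    """Extract count, feature_map_count and spatial dims from a log line."""
    match = DESCRIPTOR_RE.search(text)
    if match is None:
        return None
    return {
        "count": int(match.group(1)),
        "feature_map_count": int(match.group(2)),
        "spatial": [int(s) for s in match.group(3).split()],
    }


def zero_dims(descriptor):
    """Names of the dimensions in a parsed descriptor that are zero."""
    names = []
    if descriptor["count"] == 0:
        names.append("count")
    if descriptor["feature_map_count"] == 0:
        names.append("feature_map_count")
    for i, dim in enumerate(descriptor["spatial"]):
        if dim == 0:
            names.append("spatial[%d]" % i)
    return names


question_log = "batch_descriptor: {count: 32 feature_map_count: 320 spatial: 0 0  value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}"
answer_log = "batch_descriptor: {count: 0 feature_map_count: 32 spatial: 16 112 128  value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}"

print(zero_dims(parse_batch_descriptor(question_log)))
print(zero_dims(parse_batch_descriptor(answer_log)))
```

For the log in the question, the spatial dimensions are zero; for the log quoted here, it is `count` (the batch dimension) that is zero. That hints at an empty batch reaching the layer at some iteration rather than a memory or version problem, though this is a guess from the log alone.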
