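I trained my own model in Keras on MNIST. I only use Conv2D layers, because I want to train the network on small images (MNIST: 28x28 px) and then run inference on large 1920x1080 images.
My shapes (for training):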
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv2D) (None, 28, 28, 64) 640
_________________________________________________________________
batch_normalization_117 (Bat (None, 28, 28, 64) 256
_________________________________________________________________
leaky_re_lu_117 (LeakyReLU) (None, 28, 28, 64) 0
_________________________________________________________________
max_pooling2d_119 (MaxPoolin (None, 14, 14, 64) 0
_________________________________________________________________
conv2 (Conv2D) (None, 14, 14, 128) 73856
_________________________________________________________________
batch_normalization_118 (Bat (None, 14, 14, 128) 512
_________________________________________________________________
leaky_re_lu_118 (LeakyReLU) (None, 14, 14, 128) 0
_________________________________________________________________
max_pooling2d_120 (MaxPoolin (None, 7, 7, 128) 0
_________________________________________________________________
conv3 (Conv2D) (None, 7, 7, 256) 295168
_________________________________________________________________
batch_normalization_119 (Bat (None, 7, 7, 256) 1024
_________________________________________________________________
leaky_re_lu_119 (LeakyReLU) (None, 7, 7, 256) 0
_________________________________________________________________
max_pooling2d_121 (MaxPoolin (None, 4, 4, 256) 0
_________________________________________________________________
conv4 (Conv2D) (None, 4, 4, 128) 295040
_________________________________________________________________
batch_normalization_120 (Bat (None, 4, 4, 128) 512
_________________________________________________________________
leaky_re_lu_120 (LeakyReLU) (None, 4, 4, 128) 0
_________________________________________________________________
max_pooling2d_122 (MaxPoolin (None, 2, 2, 128) 0
_________________________________________________________________
conv5 (Conv2D) (None, 1, 1, 10) 5130
=================================================================
Total params: 672,138
Trainable params: 670,986
Non-trainable params: 1,152
Shapes for inference:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1 (Conv2D) (None, 1920, 1080, 64) 640
_________________________________________________________________
batch_normalization_113 (Bat (None, 1920, 1080, 64) 256
_________________________________________________________________
leaky_re_lu_113 (LeakyReLU) (None, 1920, 1080, 64) 0
_________________________________________________________________
max_pooling2d_115 (MaxPoolin (None, 960, 540, 64) 0
_________________________________________________________________
conv2 (Conv2D) (None, 960, 540, 128) 73856
_________________________________________________________________
batch_normalization_114 (Bat (None, 960, 540, 128) 512
_________________________________________________________________
leaky_re_lu_114 (LeakyReLU) (None, 960, 540, 128) 0
_________________________________________________________________
max_pooling2d_116 (MaxPoolin (None, 480, 270, 128) 0
_________________________________________________________________
conv3 (Conv2D) (None, 480, 270, 256) 295168
_________________________________________________________________
batch_normalization_115 (Bat (None, 480, 270, 256) 1024
_________________________________________________________________
leaky_re_lu_115 (LeakyReLU) (None, 480, 270, 256) 0
_________________________________________________________________
max_pooling2d_117 (MaxPoolin (None, 240, 135, 256) 0
_________________________________________________________________
conv4 (Conv2D) (None, 240, 135, 128) 295040
_________________________________________________________________
batch_normalization_116 (Bat (None, 240, 135, 128) 512
_________________________________________________________________
leaky_re_lu_116 (LeakyReLU) (None, 240, 135, 128) 0
_________________________________________________________________
max_pooling2d_118 (MaxPoolin (None, 120, 68, 128) 0
_________________________________________________________________
conv5 (Conv2D) (None, 119, 67, 10) 5130
=================================================================
Total params: 672,138
Trainable params: 670,986
Non-trainable params: 1,152
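The spatial sizes in both summaries follow from simple per-layer arithmetic. The sketch below reproduces them in plain Python. The kernel sizes and padding modes are not shown in the post; they are assumptions inferred from the parameter counts (for example, 5130 = 2*2*128*10 + 10 suggests a 2x2 kernel for conv5) and from the 7 -> 4 pooling step, which rounds up. `LAYERS` and `trace_size` are hypothetical names used only for illustration.

```python
import math

# Layer list inferred from the two summaries (an assumption, the post
# does not show the model code): each entry is (name, kind, size).
LAYERS = [
    ("conv1", "conv_same", 3),
    ("pool1", "pool_same", 2),
    ("conv2", "conv_same", 3),
    ("pool2", "pool_same", 2),
    ("conv3", "conv_same", 3),
    ("pool3", "pool_same", 2),
    ("conv4", "conv_same", 3),
    ("pool4", "pool_same", 2),
    ("conv5", "conv_valid", 2),
]


def trace_size(n):
    """Return the size of one spatial axis after each layer, input first."""
    sizes = [n]
    for _name, kind, k in LAYERS:
        if kind == "conv_same":      # stride 1, padding='same': size unchanged
            pass
        elif kind == "pool_same":    # stride k, padding='same': round up
            n = math.ceil(n / k)
        elif kind == "conv_valid":   # stride 1, padding='valid': lose k - 1
            n = n - k + 1
        sizes.append(n)
    return sizes


for size in (28, 1920, 1080):
    print(size, "->", trace_size(size))
```

Under these assumptions a 28 px axis ends at 1, while the 1920 and 1080 px axes end at 119 and 67, matching the conv5 rows in the two summaries.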
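The goal is to produce a convolved image whose channel dimension equals the number of output classes, so that it represents a sliding window over the large image at inference time.
But Keras won't let me train, because the last layer outputs a 4D tensor (batch, x, y, channels) while my labels are 2D (batch, channels):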
ValueError: Error when checking target: expected conv5 to have 4 dimensions, but got array with shape (48000, 10)
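The shape would have to be (48000, 1, 1, 10)! What can I do to prevent this? If I introduce Flatten and Dense layers, I won't be able to use the model for inference on large images afterwards, will I?
Thank you for your time and help.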
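For reference, one way to get exactly the (48000, 1, 1, 10) target shape while keeping the all-convolutional model is to add two singleton spatial axes to the labels. This is only a sketch: it assumes the labels are a one-hot NumPy array, and `y_train` and `y_train_4d` are hypothetical names with synthetic contents standing in for the real data.

```python
import numpy as np

num_samples, num_classes = 48000, 10

# Stand-in for the real one-hot labels; `y_train` is a hypothetical name.
class_ids = np.arange(num_samples) % num_classes
y_train = np.eye(num_classes, dtype="float32")[class_ids]   # (48000, 10)

# Insert two singleton spatial axes so the target matches conv5's
# (None, 1, 1, 10) output during training.
y_train_4d = y_train.reshape(-1, 1, 1, num_classes)          # (48000, 1, 1, 10)

print(y_train.shape, "->", y_train_4d.shape)
```

Passing `y_train_4d` instead of the 2D labels to `model.fit` should satisfy the shape check, since the target then has the same rank as the conv5 output.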
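To be able to train and test on different input sizes, you should do two things:
1. Use None as the input dimensions (height and width).
2. Use GlobalAveragePooling2D after a Conv2D layer whose number of filters equals the number of classes.
The sample code below creates a model that can be trained and run for inference on images of any input size (assuming that max pooling and strides don't result in a negative dimension).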
from keras import layers, Model

# Height and width are None so the same weights accept any image size.
my_input = layers.Input(shape=(None, None, 1))
x = layers.Conv2D(filters=32, kernel_size=3, strides=1)(my_input)
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(filters=64, kernel_size=3, strides=1)(x)
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D()(x)
# 1x1 convolution with one filter per class: a per-location class score map.
out = layers.Conv2D(filters=10, kernel_size=1, strides=1)(x)
# Average the score map over height and width, giving shape (batch, 10).
out = layers.GlobalAveragePooling2D()(out)
out = layers.Activation('softmax')(out)
model = Model(my_input, out)
model.summary()
The model summary prints this:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, None, None, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, None, None, 32) 320
_________________________________________________________________
batch_normalization_1 (Batch (None, None, None, 32) 128
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, None, None, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, None, None, 64) 18496
_________________________________________________________________
batch_normalization_2 (Batch (None, None, None, 64) 256
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, None, None, 64) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, None, None, 10) 650
_________________________________________________________________
global_average_pooling2d_1 ( (None, 10) 0
_________________________________________________________________
activation_1 (Activation) (None, 10) 0
=================================================================
Total params: 19,850
Trainable params: 19,658
Non-trainable params: 192
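To see why the pooled output matches labels of shape (batch, 10), and how a per-location map can still be recovered for large images, here is a NumPy sketch of what the last layers compute. The array `scores` is a random stand-in for the output of the final Conv2D, and the 14 x 22 grid size is arbitrary; `softmax`, `image_probs`, `window_probs` and `window_classes` are names chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the output of the last Conv2D on one large image:
# (batch, height, width, classes). The 14 x 22 grid is arbitrary.
scores = rng.normal(size=(1, 14, 22, 10))


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# What global average pooling followed by softmax computes: one
# distribution per image, shape (batch, classes).
image_probs = softmax(scores.mean(axis=(1, 2)))

# Sliding-window view: skip the pooling and apply softmax at every
# spatial position, shape (batch, height, width, classes).
window_probs = softmax(scores)
window_classes = window_probs.argmax(axis=-1)   # (batch, height, width)

print(image_probs.shape, window_probs.shape, window_classes.shape)
```

In Keras, the unpooled map could be read from a second model that shares the trained layers, for example `Model(model.input, model.layers[-3].output)`. Treating `model.layers[-3]` as the last Conv2D is an assumption tied to the layer order in the summary above.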