I'm new to Caffe and trying to adapt someone else's code to my specific use case.
I've narrowed it down to this minimal example:
import caffe
import numpy as np

model_path = "mnist_model_rgb/caffe_model/mnist_model_rgb.caffemodel"
model_definition = 'mnist_model_rgb/caffe_model/mnist_model_rgb.prototxt'
net = caffe.Classifier(model_definition, model_path,
                       mean=np.float32([0.131, 0.131, 0.131]),
                       channel_swap=(2, 1, 0))

w = h = 28
# Random near-gray image in HWC layout.
start_image = np.random.normal(np.float32([175.0, 175.0, 175.0]), 8, (w, h, 3))
# HWC -> CHW, reverse channels (RGB -> BGR), subtract the mean.
start_image = np.float32(np.rollaxis(start_image, 2)[::-1]) - net.transformer.mean['data']

src = net.blobs['data']
src.reshape(1, 3, h, w)
src.data[0] = start_image

layer = 'flat'
print("before forward: ", net.blobs[layer].data)
net.forward()
print("after forward: ", net.blobs[layer].data)
Here is my model definition:
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 3
      dim: 28
      dim: 28
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 3
  }
}
layer {
  name: "max1"
  type: "Pooling"
  bottom: "conv1"
  top: "mp1"
  pooling_param {
    pool: MAX
    kernel_size: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "mp1"
  top: "conv2"
  convolution_param {
    num_output: 64
    kernel_size: 3
  }
}
layer {
  name: "max2"
  type: "Pooling"
  bottom: "conv2"
  top: "mp2"
  pooling_param {
    pool: MAX
    kernel_size: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "mp2"
  top: "conv3"
  convolution_param {
    num_output: 64
    kernel_size: 3
  }
}
layer {
  name: "flat"
  type: "Flatten"
  bottom: "conv3"
  top: "flat"
}
layer {
  name: "dense1"
  type: "InnerProduct"
  bottom: "flat"
  top: "dense1"
  inner_product_param {
    num_output: 64
  }
}
layer {
  name: "dense2"
  type: "InnerProduct"
  bottom: "dense1"
  top: "dense2"
  inner_product_param {
    num_output: 10
  }
}
And here is the output I get:
before forward: [[0. 0. 0. ... 0. 0. 0.]]
after forward: [[-350974.78 -351307.06 -350903.56 ... -296046.62 -295620.03 -297597.94]]
The values after net.forward() don't look right to me.
When I run the same script with the model its author intended (https://www.robots.ox.ac.uk/~vgg/software/vgg_face/), I get this instead, which seems far more reasonable:
before forward: [[0. 0. 0. ... 0. 0. 0.]]
after forward: [[0.04778102 0. 0.18989444 ... 2.3401992 0. 0. ]]
Can someone explain why my model's output differs so drastically from the example model's? Is this the result I should expect, or am I doing something wrong, and how can I do it better?
Any explanation or help is appreciated!