import numpy as np
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Model # TF 2.2.0
#%%#######################################################
ipt = Input(batch_shape=(128, 28, 28, 1))
x = Flatten()(ipt)
out = Dense(10, activation='softmax')(x)
model = Model(ipt, out)
model.compile('adam', 'categorical_crossentropy')
#%%#######################################################
x = np.random.uniform(0, 1, model.input_shape)
pred = model(x, training=True) # =False also works
loss = model.compiled_loss(pred, pred)
print(loss)
Output:
tf.Tensor(1.9904033, shape=(), dtype=float32)
What's the deal?
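This is simply how the categorical_crossentropy loss works. If you pass a one-hot vector such as [0,0,0,1,0,0,0,0,0,0] as both label and prediction, the loss is zero. If you change categorical_crossentropy to mse in your original code, you will also get zero.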
import numpy as np
import tensorflow as tf # TF 2.2.0
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Model
ipt = Input(shape=(28, 28, 1))
x = Flatten()(ipt)
out = Dense(10, activation='softmax')(x)
model = Model(ipt, out)
model.compile('adam', 'categorical_crossentropy')
label = tf.one_hot([5,3,2], depth=10)
# tf.Tensor(
# [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
# [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
# [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]], shape=(3, 10), dtype=float32)
loss = model.compiled_loss(label, label)
print(loss) # tf.Tensor(1.1920929e-07, shape=(), dtype=float32)
Edit:
A NumPy implementation of the categorical crossentropy loss is:
import numpy as np
def cce(y_label, y_pred):
    # sum over classes of -label * log(prediction)
    return np.sum(-y_label * np.log(y_pred))
x = np.random.uniform(0, 1, (10,))
print(cce(x,x)) # which yields values like 1.9904033
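This shows why the loss is not zero: you take the log of the prediction, multiply it by the label, and negate the sum. When the label equals the prediction, that quantity is the entropy of the prediction, which is zero only for a one-hot vector. So the answer to "what's the deal" is: this is working as intended.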
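To make the point above concrete, here is a minimal NumPy-only sketch (not the Keras implementation) comparing the loss of a distribution against itself for three cases: softmax outputs, one-hot labels, and a uniform distribution. The helper names `cce`, `mse`, and `softmax`, and the clipping constant `EPS = 1e-7`, are assumptions made for illustration; the clipping only mimics the general idea of avoiding log(0).

```python
import numpy as np

EPS = 1e-7  # assumed clipping constant, chosen only to avoid log(0)

def cce(y_true, y_pred, eps=EPS):
    """Categorical crossentropy per row, averaged over the batch."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

def mse(y_true, y_pred):
    """Mean squared error over all elements."""
    return float(np.mean((y_true - y_pred) ** 2))

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
soft = softmax(rng.normal(size=(4, 10)))   # stand-in for softmax model outputs
one_hot = np.eye(10)[[5, 3, 2]]            # same labels as in the answer above
uniform = np.full((1, 10), 0.1)            # maximally uncertain prediction

loss_soft = cce(soft, soft)            # entropy of the predictions, > 0
loss_one_hot = cce(one_hot, one_hot)   # close to 0, limited only by clipping
loss_uniform = cce(uniform, uniform)   # equals log(10)
loss_mse = mse(soft, soft)             # exactly 0

print(loss_soft, loss_one_hot, loss_uniform, loss_mse)
```

The crossentropy of a distribution with itself is its entropy, so it vanishes only when all the probability mass sits on one class; mse, by contrast, measures the difference between its arguments and is zero whenever they are identical.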