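I am trying to use a custom loss function (the Critical Success Index, CSI) with my simple CNN (for 64x64 pixel images) in TensorFlow, and I get a list of Nones for the gradients.
Here is the custom loss function: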
import tensorflow as tf
from keras import backend as K

@tf.function
def custom_csi_loss(y_true, y_pred):
    # Define the target class
    target_class = 1
    # Calculate the true positives, false positives, and false negatives
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    false_positives = K.sum(K.round(K.clip(y_pred - y_true, 0, 1)))
    false_negatives = K.sum(K.round(K.clip(y_true - y_pred, 0, 1)))
    # Calculate the CSI
    csi = true_positives / (true_positives + false_negatives + false_positives)
    # Return the negative of the CSI as the loss (since we want to minimize the loss)
    return -csi
Here is the model:
from tensorflow.keras.layers import Input, BatchNormalization, Conv2D, Reshape
from tensorflow.keras.models import Model

def build_scnn(shape=(128, 128, 3), k_init="he_normal", dilation_rate=(1, 1), dtype=tf.float32):
    inputs = Input(shape=shape)
    normalized = BatchNormalization(axis=3)(inputs)
    x = Conv2D(64, 3, padding="same", activation="relu", kernel_initializer=k_init)(normalized)
    x = Conv2D(128, 3, padding="same", activation="relu", dilation_rate=dilation_rate, kernel_initializer=k_init)(x)
    x = Conv2D(128, 3, padding="same", activation="relu", kernel_initializer=k_init)(x)
    outputs = Conv2D(1, 1, padding="same", activation="sigmoid", dtype=dtype)(x)
    outputs = Reshape((64 * 64, 1))(outputs)
    scnn = Model(inputs, outputs, name="SCNN")
    return scnn

scnn = build_scnn(shape=(64, 64, len(gdf[features].columns)),
                  k_init=k_init,
                  dilation_rate=dilation_rate)
Here is the training step:
@tf.function
def train_step(x, y):
    with tf.GradientTape(watch_accessed_variables=True) as tape:
        tape.watch(scnn.trainable_variables)
        y_pred = scnn(x, training=True)
        loss = loss_fn(y, y_pred)
    gradients = tape.gradient(loss, scnn.trainable_variables)  # differentiate loss wrt scnn weights
    print(f"gradients: {gradients}")
    optimizer.apply_gradients(zip(gradients, scnn.trainable_variables))
    return loss, y_pred
Here is the main body of the code:
for epoch in range(epochs):
    epoch_loss = 0
    epoch_csi = 0
    num_batches = 0
    for x, y, w in train.map(weight_func):
        y = tf.cast(y, dtype=tf.float32)
        loss, y_pred = train_step(x, y)
        epoch_loss += loss
        epoch_csi += metrics[0](y, y_pred)
        num_batches += 1
    epoch_loss /= num_batches
    epoch_csi /= num_batches
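The gradients variable is always [None, None, None, ...], and the rest of the code fails. The code works with keras.losses.BinaryCrossentropy and other binary loss functions, so as far as I can tell the problem can only be in the custom_csi_loss function. I have checked the shapes and data types of y and y_pred, and they are consistent.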
This question is similar to the following questions, but they did not solve my problem:
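The likely cause is the K.round calls in custom_csi_loss. Rounding is piecewise constant, so its derivative is zero wherever it is defined, and TensorFlow does not propagate a gradient through it. Since every term of the loss passes through K.round, tape.gradient returns None for all weights. One way around this is to define the gradient explicitly with the tf.custom_gradient decorator. Here is a skeleton that demonstrates the idea: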
import tensorflow as tf
from keras import backend as K

@tf.custom_gradient
def custom_csi_loss(y_true, y_pred):
    # some custom loss calculation (must assign `loss`)
    # ...
    def grad_fn(dy):
        # define the gradient explicitly (must assign `grad`, the gradient wrt y_pred)
        # ...
        # grad_fn returns one gradient per input, in the order (y_true, y_pred)
        return tf.zeros_like(y_true), grad
    return loss, grad_fn

# create a model and compile it with the custom loss function
model = tf.keras.models.Sequential([...])
model.compile(loss=custom_csi_loss, optimizer='adam')
# train the model
history = model.fit([...])
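
To make the skeleton above concrete, here is a minimal sketch of one way to fill it in. It assumes a "straight-through" gradient for the rounding step: round in the forward pass and pass the incoming gradient through unchanged in the backward pass. The names round_ste, csi_loss_ste and hard_csi_loss, the eps term, and the toy tensors are all made up for this illustration and are not part of the original code. Treat it as a starting point, not a definitive implementation.

```python
import tensorflow as tf


@tf.custom_gradient
def round_ste(x):
    """Round in the forward pass; pass the gradient through unchanged."""
    def grad_fn(dy):
        # Straight-through assumption: treat d(round(x))/dx as 1.
        return dy
    return tf.round(x), grad_fn


def csi_loss_ste(y_true, y_pred, eps=1e-7):
    """Negative CSI, same structure as custom_csi_loss but using round_ste."""
    y_true = tf.cast(y_true, y_pred.dtype)
    tp = tf.reduce_sum(round_ste(tf.clip_by_value(y_true * y_pred, 0.0, 1.0)))
    fp = tf.reduce_sum(round_ste(tf.clip_by_value(y_pred - y_true, 0.0, 1.0)))
    fn = tf.reduce_sum(round_ste(tf.clip_by_value(y_true - y_pred, 0.0, 1.0)))
    # eps is an assumption added here to avoid 0/0 when all three counts are zero.
    return -tp / (tp + fp + fn + eps)


def hard_csi_loss(y_true, y_pred, eps=1e-7):
    """Same formula with plain tf.round, for comparison only."""
    y_true = tf.cast(y_true, y_pred.dtype)
    tp = tf.reduce_sum(tf.round(tf.clip_by_value(y_true * y_pred, 0.0, 1.0)))
    fp = tf.reduce_sum(tf.round(tf.clip_by_value(y_pred - y_true, 0.0, 1.0)))
    fn = tf.reduce_sum(tf.round(tf.clip_by_value(y_true - y_pred, 0.0, 1.0)))
    return -tp / (tp + fp + fn + eps)


# Toy data: four "pixels" with one trainable logit each (hypothetical values).
y_true = tf.constant([[1.0], [0.0], [1.0], [0.0]])
logits = tf.Variable([[2.0], [1.0], [-1.0], [-2.0]])


def grad_of(loss_fn):
    """Gradient of loss_fn with respect to the toy logits."""
    with tf.GradientTape() as tape:
        y_pred = tf.sigmoid(logits)
        loss = loss_fn(y_true, y_pred)
    return tape.gradient(loss, logits)


grad_hard = grad_of(hard_csi_loss)  # expected to be None, as reported in the question
grad_ste = grad_of(csi_loss_ste)    # a tensor with the same shape as logits
print("plain round:", grad_hard)
print("straight-through round:", grad_ste)
```

With this in place, loss_fn = csi_loss_ste could be used in train_step unchanged. The forward value is the same as with plain rounding; only the backward pass differs, so the gradients are an approximation. Another option worth trying is to drop the rounding entirely and sum the raw probabilities, which gives a smooth surrogate of the CSI.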