GradientTape returns None for a custom CSI loss function


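I am trying to use a custom loss function (Critical Success Index, CSI) in TensorFlow with my simple CNN (for 64x64 pixel images), and I get a list of Nones for the gradients.

Here is the custom loss function: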

import tensorflow as tf
from keras import backend as K

@tf.function
def custom_csi_loss(y_true, y_pred):
    # Define the target class
    target_class = 1

    # Calculate the true positives, false positives, and false negatives
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    false_positives = K.sum(K.round(K.clip(y_pred - y_true, 0, 1)))
    false_negatives = K.sum(K.round(K.clip(y_true - y_pred, 0, 1)))

    # Calculate the CSI
    csi = true_positives / (true_positives + false_negatives + false_positives)

    # Return the negative of the CSI as the loss (since we want to minimize the loss)
    return -csi

Here is the model:

def build_scnn(shape=(128, 128, 3), k_init="he_normal", dilation_rate=(1, 1), dtype=tf.float32):
    inputs = Input(shape=shape)
    normalized = BatchNormalization(axis=3)(inputs)

    x = Conv2D(64, 3, padding="same", activation="relu", kernel_initializer=k_init)(normalized)
    x = Conv2D(128, 3, padding="same", activation="relu", dilation_rate=dilation_rate, kernel_initializer=k_init)(x)
    x = Conv2D(128, 3, padding="same", activation="relu", kernel_initializer=k_init)(x)

    outputs = Conv2D(1, 1, padding="same", activation="sigmoid", dtype=dtype)(x)
    outputs = Reshape((64 * 64, 1))(outputs)
    scnn = Model(inputs, outputs, name="SCNN")
    return scnn

scnn = build_scnn(shape=(64, 64, len(gdf[features].columns)),
                  k_init=k_init,
                  dilation_rate=dilation_rate)

Here is the training step:

@tf.function
def train_step(x, y):
    with tf.GradientTape(watch_accessed_variables=True) as tape:
        tape.watch(scnn.trainable_variables)
        y_pred = scnn(x, training=True)
        loss = loss_fn(y, y_pred)
    gradients = tape.gradient(loss, scnn.trainable_variables)  # differentiate loss wrt scnn weights
    print(f"gradients: {gradients}")
    optimizer.apply_gradients(zip(gradients, scnn.trainable_variables))
    return loss, y_pred

Here is the main body of the code:

for epoch in range(epochs):
    epoch_loss = 0
    epoch_csi = 0
    num_batches = 0
    for x, y, w in train.map(weight_func):
        y = tf.cast(y, dtype=tf.float32)
        loss, y_pred = train_step(x, y)
        epoch_loss += loss
        epoch_csi += metrics[0](y, y_pred)
        num_batches += 1
    
    epoch_loss /= num_batches
    epoch_csi /= num_batches

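The gradients variable is always [None, None, None, ...], and the rest of the code fails. The code works with keras.losses.BinaryCrossentropy and other binary loss functions, so as far as I can tell the problem can only be in the custom_csi_loss function. I have checked the shapes and dtypes of y and y_pred, and they are consistent.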
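One way to see why this particular loss gives the optimizer nothing to work with, independent of the model: the counts in custom_csi_loss are built from rounded values, so the loss is piecewise constant in y_pred. The following is a minimal pure-Python sketch of that observation, not part of the original code; the helpers `clip01`, `hard_csi`, and `finite_difference` and the sample values are hypothetical and only mirror the arithmetic of the loss on plain lists.

```python
def clip01(v):
    """Clamp a scalar to [0, 1], mirroring K.clip(v, 0, 1)."""
    return max(0.0, min(1.0, v))


def hard_csi(y_true, y_pred):
    """CSI computed from rounded counts, following custom_csi_loss."""
    tp = sum(round(clip01(t * p)) for t, p in zip(y_true, y_pred))
    fp = sum(round(clip01(p - t)) for t, p in zip(y_true, y_pred))
    fn = sum(round(clip01(t - p)) for t, p in zip(y_true, y_pred))
    return tp / (tp + fp + fn)


def finite_difference(f, y_true, y_pred, index, eps=1e-3):
    """Central-difference estimate of d f / d y_pred[index]."""
    up = list(y_pred)
    down = list(y_pred)
    up[index] += eps
    down[index] -= eps
    return (f(y_true, up) - f(y_true, down)) / (2 * eps)


# Hypothetical sample: two positives, two negatives.
y_true = [1.0, 0.0, 1.0, 0.0]
y_pred = [0.9, 0.2, 0.3, 0.7]

# Slope of the rounded CSI with respect to each prediction.
slopes = [finite_difference(hard_csi, y_true, y_pred, i) for i in range(4)]
print(hard_csi(y_true, y_pred), slopes)
```

Nudging any prediction by a small amount leaves the rounded counts unchanged, so every slope is zero. A loss with no slope carries no training signal, which is consistent with a working binary cross-entropy and a failing rounded CSI on the same model.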

This question is similar to the following questions, but they did not solve my problem:


(Please help!)
python tensorflow keras conv-neural-network gradienttape
1 Answer

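One likely cause is that the custom CSI loss is not differentiable. K.round is piecewise constant, so there is no useful gradient to propagate through it, and tape.gradient returns None for every variable. To work around this, you can try using the tf.custom_gradient decorator to define the gradient explicitly. Here is sample code demonstrating this: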

import tensorflow as tf
from keras import backend as K

@tf.custom_gradient
def custom_csi_loss(y_true, y_pred):
    # some custom loss calculation
    # ...
    def grad_fn(dy):
        # define the gradient explicitly; tf.custom_gradient expects
        # one gradient per input of the decorated function
        # ...
        return grad

    return loss, grad_fn

# create a model and compile it with the custom loss function
model = tf.keras.models.Sequential([...])
model.compile(loss=custom_csi_loss, optimizer='adam')

# train the model
history = model.fit([...])
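
An alternative to hand-writing a gradient is to drop the rounding and compute "soft" counts directly from the predicted probabilities, which keeps every operation differentiable. The following is a minimal sketch under that assumption, not the asker's or the library's method; the name `soft_csi_loss`, the `eps` stabilizer, and the toy tensors are hypothetical and chosen only for illustration.

```python
import tensorflow as tf


def soft_csi_loss(y_true, y_pred, eps=1e-7):
    """Differentiable CSI surrogate: counts use probabilities, not rounded values."""
    y_true = tf.cast(y_true, y_pred.dtype)
    true_positives = tf.reduce_sum(y_true * y_pred)
    false_positives = tf.reduce_sum((1.0 - y_true) * y_pred)
    false_negatives = tf.reduce_sum(y_true * (1.0 - y_pred))
    # eps (an assumption) guards against division by zero on empty batches.
    csi = true_positives / (
        true_positives + false_positives + false_negatives + eps
    )
    # Negative CSI, as in the original loss, so that minimizing maximizes CSI.
    return -csi


# Toy check with a single trainable weight matrix instead of the CNN.
w = tf.Variable([[0.5], [-0.5]])
x = tf.constant([[1.0, 2.0], [3.0, 1.0], [0.5, 0.5]])
y = tf.constant([[1.0], [0.0], [1.0]])

with tf.GradientTape() as tape:
    y_pred = tf.sigmoid(tf.matmul(x, w))
    loss = soft_csi_loss(y, y_pred)

grad = tape.gradient(loss, w)
print(loss.numpy(), grad.numpy())
```

With a loss of this shape, the train_step from the question could be used unchanged by setting loss_fn = soft_csi_loss, while the rounded CSI can still be reported as a metric, since metrics do not need gradients.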