Tensorflow/Keras: retraining a model while updating only the non-zero weights

Question

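I have a pruned TF model that I need to retrain on streaming data. The model consists of an embedding layer (10 nodes), a GRU layer (16 units), and a classification layer (85 nodes). I only want to retrain the non-zero weights of the 80%-pruned model. I would like to avoid creating masks and performing extra computation, because I want to keep retraining time to a minimum. This is the training function I am currently using.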

@tf.function
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        loss_value = loss_fn(targets, predictions)
    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    return loss_value

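Is there any way to compute gradients only for the non-zero weights, or to use only the non-zero weights in model.trainable_weights? If not, is there a way to efficiently update only the non-zero weights using tf.IndexedSlices?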

I also tried the code below (in place of grads in the code above), but it does not work.

grads_and_vars = [
    (tf.IndexedSlices(g.values * tf.cast(tf.math.greater(v, 0), tf.float32), g.indices), v)
    for g, v in zip(grads, model_retrained.trainable_weights)
]
optimizer.apply_gradients(grads_and_vars)

Any help with this would be greatly appreciated.
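
For reference, a few things in the attempt above look like probable causes of the failure: `g.values` and `g.indices` exist only on `tf.IndexedSlices` gradients (typically the embedding layer), while the GRU and dense gradients are ordinary tensors; `tf.math.greater(v, 0)` also discards negative weights, which are non-zero; and the rows in `g.values` do not have the same shape as the full variable `v`. The following is only a minimal sketch of one possible workaround, not a confirmed solution. It does use masks, but builds them once before training, so the per-step overhead is one elementwise multiply per variable. The stand-in model, the vocabulary size of 100, the random 80% "pruning", the plain SGD optimizer, and the names `masks`, `x`, `y`, and `before` are all assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the pruned model in the question
# (vocabulary size 100 and sequence length 5 are assumptions).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=100, output_dim=10),
    tf.keras.layers.GRU(16),
    tf.keras.layers.Dense(85, activation="softmax"),
])
model(tf.zeros((1, 5), dtype=tf.int32))  # call once so the weights exist

# Simulate pruning by zeroing roughly 80% of every weight tensor at random.
rng = np.random.default_rng(0)
for v in model.trainable_weights:
    w = v.numpy().copy()
    w[rng.random(w.shape) < 0.8] = 0.0
    v.assign(w)

# Build the masks once, outside the training step: 1.0 where a weight is
# non-zero (positive or negative), 0.0 where it was pruned.
masks = [tf.constant((v.numpy() != 0).astype("float32"))
         for v in model.trainable_weights]

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)  # assumed optimizer

@tf.function
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        loss_value = loss_fn(targets, predictions)
    grads = tape.gradient(loss_value, model.trainable_weights)
    # The embedding gradient may arrive as tf.IndexedSlices; convert_to_tensor
    # densifies it so it has the same shape as its mask.
    masked = [tf.convert_to_tensor(g) * m for g, m in zip(grads, masks)]
    optimizer.apply_gradients(zip(masked, model.trainable_weights))
    return loss_value

# One step on random data: pruned positions should remain exactly zero.
x = tf.constant(rng.integers(0, 100, size=(8, 5)).astype("int32"))
y = tf.constant(rng.integers(0, 85, size=(8,)).astype("int32"))
before = [v.numpy().copy() for v in model.trainable_weights]
loss = train_step(x, y)
```

With plain SGD, a zero gradient leaves a weight unchanged, so the pruned entries stay at zero. Note that densifying the embedding gradient gives up its sparsity, which may matter for a large vocabulary.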
