I am trying to build a custom 1D convolution layer in TensorFlow. I have already checked that the layer does what it is supposed to do. However, when I insert it into a sequential Keras model, I get a warning that gradients do not exist for the variables of the custom layer.
Could you explain why this happens and how I can fix it?
Here is the code:
```python
import tensorflow as tf
import numpy as np


class customC1DLayer(tf.keras.layers.Layer):
    def __init__(self, filter_size=1, activation=None, **kwargs):
        super(customC1DLayer, self).__init__(**kwargs)
        self.filter_size = filter_size
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.filter = self.add_weight('filter', shape=[self.filter_size],
                                      trainable=True, dtype=tf.float32)
        self.padding = tf.Variable(
            initial_value=tf.zeros(shape=[input_shape[-1] - self.filter_size],
                                   dtype=tf.float32),
            trainable=False)
        padded_filter = tf.concat([self.filter, self.padding], axis=0)
        col = tf.concat([padded_filter[:1], tf.zeros_like(padded_filter[1:])], axis=0)
        self.augmented_filter = tf.linalg.LinearOperatorToeplitz(padded_filter, col).to_dense()

    def call(self, inputs):
        outputs = tf.transpose(tf.matmul(self.augmented_filter, inputs, transpose_b=True))
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs
```
To explain the code: in the `build` method I initialize some weights, e.g. `[a b c]`, and `augmented_filter` is then just the Toeplitz matrix `[[a b c 0 0], [0 a b c 0], [0 0 a b c]]`.
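This construction can be checked in isolation. A small sketch with made-up example values (a filter `[1, 2, 3]` padded to an input width of 5; `padded` and `col` mirror the `padded_filter` and `col` tensors in `build`):

```python
import tensorflow as tf

# Example filter [a, b, c] = [1., 2., 3.], zero-padded to the input width 5.
padded = tf.constant([1., 2., 3., 0., 0.])
# First row of the operator: only the first entry is non-zero.
col = tf.concat([padded[:1], tf.zeros_like(padded[1:])], axis=0)
# LinearOperatorToeplitz takes (first column, first row); to_dense()
# materializes a 5x5 banded lower-triangular matrix whose first column
# is the padded filter.
dense = tf.linalg.LinearOperatorToeplitz(padded, col).to_dense()
print(dense.numpy())
```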
I know this kind of error can occur when using non-differentiable functions. However, in this case, as far as I can tell, I am only using matrix operations that should be differentiable.
The problem is that, in the `call` function, there is no path from `self.filter` to `augmented_filter` — and it is presumably the former you want gradients for. The transformation happens once in `build`, outside any gradient tape, so as it stands the trainable variable is never actually used in the forward pass and no gradient can be computed for it. You need to perform this transformation inside `call`:
```python
class customC1DLayer(tf.keras.layers.Layer):
    def __init__(self, filter_size=1, activation=None, **kwargs):
        super(customC1DLayer, self).__init__(**kwargs)
        self.filter_size = filter_size
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.filter = self.add_weight('filter', shape=[self.filter_size],
                                      trainable=True, dtype=tf.float32)
        self.padding = tf.Variable(
            initial_value=tf.zeros(shape=[input_shape[-1] - self.filter_size],
                                   dtype=tf.float32),
            trainable=False)

    def call(self, inputs):
        # Build the Toeplitz matrix here, so the computation is recorded
        # by the gradient tape on every forward pass.
        padded_filter = tf.concat([self.filter, self.padding], axis=0)
        col = tf.concat([padded_filter[:1], tf.zeros_like(padded_filter[1:])], axis=0)
        augmented_filter = tf.linalg.LinearOperatorToeplitz(padded_filter, col).to_dense()
        outputs = tf.transpose(tf.matmul(augmented_filter, inputs, transpose_b=True))
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs
```
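The failure mode can be reproduced without the layer at all. A minimal sketch (assuming eager execution): a tensor derived from a variable *before* the tape is recording behaves like a constant, while the same computation done *under* the tape yields a gradient — which is exactly the difference between doing the concatenation in `build` versus `call`:

```python
import tensorflow as tf

v = tf.Variable([1., 2., 3.])

# Analogue of building augmented_filter in build(): the result is a
# plain tensor, computed before any tape is watching.
frozen = tf.concat([v, tf.zeros(2)], axis=0)

with tf.GradientTape(persistent=True) as tape:
    y_frozen = tf.reduce_sum(frozen)            # no path back to v
    live = tf.concat([v, tf.zeros(2)], axis=0)  # analogue of doing it in call()
    y_live = tf.reduce_sum(live)

print(tape.gradient(y_frozen, v))  # None -> "gradients do not exist" warning
print(tape.gradient(y_live, v))    # gradient of ones w.r.t. v
```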