No gradients exist for variables in a custom layer in TensorFlow

Problem description (votes: 0, answers: 1)

I am trying to build a custom one-dimensional convolution layer in TensorFlow. I have checked that the layer does what it is supposed to do. However, when I insert it into a sequential Keras model, I get a warning that no gradients exist for the variables in the custom layer.

Could you explain why this happens and how I can fix it?

Here is the code:

import tensorflow as tf
import numpy as np


class customC1DLayer(tf.keras.layers.Layer):
    def __init__(self, filter_size=1, activation=None, **kwargs):
        super(customC1DLayer, self).__init__(**kwargs)
        self.filter_size = filter_size
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Trainable filter coefficients, e.g. [a, b, c]
        self.filter = self.add_weight('filter', shape=[self.filter_size], trainable=True, dtype=tf.float32)

        # Non-trainable zeros used to pad the filter up to the input length
        self.padding = tf.Variable(initial_value=tf.zeros(shape=[input_shape[-1] - self.filter_size], dtype=tf.float32), trainable=False)
        padded_filter = tf.concat([self.filter, self.padding], axis=0)
        col = tf.concat([padded_filter[:1], tf.zeros_like(padded_filter[1:])], axis=0)
        # Dense Toeplitz matrix, built once when the layer is built
        self.augmented_filter = tf.linalg.LinearOperatorToeplitz(padded_filter, col).to_dense()

    def call(self, inputs):
        outputs = tf.transpose(tf.matmul(self.augmented_filter, inputs, transpose_b=True))
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs

To explain the code: in the build method I initialize some weights, say [a, b, c], and augmented_filter is then just the banded Toeplitz matrix [[a b c 0 0], [0 a b c 0], [0 0 a b c]].
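
For concreteness, the construction can be checked on its own. Here is a minimal standalone sketch (the 3-tap filter values and the input length of 5 are made-up example values) that prints the dense matrix produced by LinearOperatorToeplitz:

import tensorflow as tf

# Mirrors the construction in build(), with example values
filt = tf.constant([1.0, 2.0, 3.0])                          # plays the role of [a, b, c]
padding = tf.zeros([5 - 3], dtype=tf.float32)                # zeros up to the input length of 5
padded_filter = tf.concat([filt, padding], axis=0)           # [a, b, c, 0, 0]
col = tf.concat([padded_filter[:1],
                 tf.zeros_like(padded_filter[1:])], axis=0)  # [a, 0, 0, 0, 0]
augmented = tf.linalg.LinearOperatorToeplitz(padded_filter, col).to_dense()
print(augmented.numpy())
# Toeplitz matrix whose first column is [a, b, c, 0, 0] and first row is [a, 0, 0, 0, 0];
# call applies it with a transpose, so each input row is effectively filtered with [a, b, c]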

I know that this kind of error can occur when non-differentiable functions are used. However, in this case, as far as I can tell, I am only using matrix operations that should be differentiable.
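
One way to reproduce the symptom outside of model.fit, assuming the layer class above and an arbitrary dummy input of shape (4, 10), is to take gradients with a plain GradientTape:

import numpy as np
import tensorflow as tf

layer = customC1DLayer(filter_size=3)
x = tf.constant(np.random.rand(4, 10), dtype=tf.float32)
_ = layer(x)  # the first call triggers build()

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(layer(x) ** 2)  # arbitrary dummy loss

print(tape.gradient(loss, layer.trainable_variables))
# prints [None] -- the same situation that triggers the Keras warning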

tensorflow conv-neural-network keras-layer
1 Answer

0 votes

The problem is that, in the call function, there is no path connecting augmented_filter to the variables filter and padding, and it is the gradient for filter that you presumably want. As the code stands, that variable is never actually used in the forward computation (augmented_filter is a fixed tensor produced once in build), so no gradient can be computed for it. You need to do this transformation inside call:
class customC1DLayer(tf.keras.layers.Layer):
    def __init__(self, filter_size=1, activation=None, **kwargs):
        super(customC1DLayer, self).__init__(**kwargs)
        self.filter_size = filter_size
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.filter = self.add_weight('filter', shape=[self.filter_size], trainable=True, dtype=tf.float32)

        self.padding = tf.Variable(initial_value=tf.zeros(shape=[input_shape[-1] - self.filter_size], dtype=tf.float32), trainable=False)

    def call(self, inputs):
        # Rebuild the Toeplitz matrix from the current filter weights on every
        # forward pass, so the gradient can flow back to self.filter
        padded_filter = tf.concat([self.filter, self.padding], axis=0)
        col = tf.concat([padded_filter[:1], tf.zeros_like(padded_filter[1:])], axis=0)
        augmented_filter = tf.linalg.LinearOperatorToeplitz(padded_filter, col).to_dense()
        outputs = tf.transpose(tf.matmul(augmented_filter, inputs, transpose_b=True))
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs
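
With the construction moved into call, a quick check with a GradientTape (arbitrary dummy data of shape (4, 10) assumed) shows that a gradient for the filter now exists:

import numpy as np
import tensorflow as tf

layer = customC1DLayer(filter_size=3)  # the corrected layer above
x = tf.constant(np.random.rand(4, 10), dtype=tf.float32)
_ = layer(x)  # build the layer

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(layer(x) ** 2)

print(tape.gradient(loss, layer.trainable_variables))
# the list now contains a (3,)-shaped tensor instead of None, because the
# Toeplitz matrix is rebuilt from `filter` inside the tape on every call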