How do I implement the backward pass for the softmax activation function?

Question

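I am studying neural networks and am trying to implement the backward pass for the softmax activation function. The setup and my unfinished attempt are here: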

def backward(self, Z: np.ndarray, dY: np.ndarray) -> np.ndarray:
        """Backward pass for softmax activation.

        Z   input to `forward` method (any shape)
        dY  derivative of loss w.r.t. the output of this layer
            same shape as `Z`

        Returns: derivative of loss w.r.t. input of this layer
        """

        lst = []
        sigmas = self.forward(Z)
        for i in range(sigmas.shape[0]):
            sample_pt = sigmas[i]
            J = np.diagflat(sample_pt) - (sample_pt @ sample_pt.T)

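This is a batched method. So what I need to do is take the argument Z, which has shape (number of sample points, number of features), and dY, which is $\frac{\partial L}{\partial Y}$ where $Y$ is the output of the softmax, and return $\frac{\partial L}{\partial Z}$, where $Z$ is the input to the softmax.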

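In my code, I call self.forward(Z) to get the output of the softmax. Then, since this is a batched method, I compute Z.shape[0] Jacobian matrices, which I eventually need to combine with dY and then aggregate to return a matrix with the same shape as Z.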

For example, say

# We have two input points with 3 features each

# Outputs of softmax for two data points
sigmas = np.array([[0.81761761, 0.08738232, 0.09500007],
                   [0.12135669, 0.84312089, 0.03552242]])

# Partial of L w.r.t. Y, the output of the softmax
dLdY = np.array([[ 1.74481176, -0.7612069 ,  0.3190391 ],
                 [-0.24937038,  1.46210794, -2.06014071]])
Then running this part of my code,

for i in range(sigmas.shape[0]):
    sample_pt = sigmas[i]
    J = np.diagflat(sample_pt) - (sample_pt @ sample_pt.T)
    print(J)
gives

[[ 0.13245837 -0.68515924 -0.68515924]
 [-0.68515924 -0.59777692 -0.68515924]
 [-0.68515924 -0.68515924 -0.59015917]]
[[-0.60548542 -0.72684212 -0.72684212]
 [-0.72684212  0.11627877 -0.72684212]
 [-0.72684212 -0.72684212 -0.6913197 ]]
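That is, we have two 3x3 Jacobian matrices, one for each of the two data points. I am confused about 1) how to combine these with dY (some sort of matrix multiplication, I guess) and 2) how to aggregate the results into $\frac{\partial L}{\partial Z}$, which in this mini-example would have shape 2x3.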

Thanks for your help!


Answer 1

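Usually in ML we follow the shape convention: for example, dL/dy has the same shape as y. To begin with, dLdy should have shape (3,1). The standard for batched backpropagation is to take the mean of the gradient contributions from all data points. So step 1 is to make sure dLdy is in the appropriate shape, i.e. (3,1).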
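For one data point with softmax output $\sigma$, the Jacobian is $J_{jk} = \sigma_j(\delta_{jk} - \sigma_k)$, and the chain rule gives $\frac{\partial L}{\partial Z} = J^\top \frac{\partial L}{\partial Y}$ for that row. Since $J$ is symmetric, this is just `J @ dY[i]`. Below is a minimal sketch of that step, assuming the shapes from the question: `sigmas` and `dY` are both (n_samples, n_features) and each row is handled independently, with any averaging over the batch assumed to happen in the loss rather than in this layer. The function names are hypothetical. Note that for a 1-D array `sample_pt @ sample_pt.T` is a scalar dot product, so the sketch uses `np.outer` to build the matrix.

```python
import numpy as np

def softmax_backward_loop(sigmas: np.ndarray, dY: np.ndarray) -> np.ndarray:
    """Per-sample Jacobian version (hypothetical helper, for clarity)."""
    dZ = np.empty_like(sigmas)
    for i in range(sigmas.shape[0]):
        s = sigmas[i]
        # Softmax Jacobian for one sample: diag(s) - s s^T (symmetric).
        J = np.diag(s) - np.outer(s, s)
        # Chain rule for this row; J.T == J, so J @ dY[i] is enough.
        dZ[i] = J @ dY[i]
    return dZ

def softmax_backward_vectorized(sigmas: np.ndarray, dY: np.ndarray) -> np.ndarray:
    """Same result without building the Jacobians (hypothetical helper)."""
    # Row-wise dot product <dY_i, sigma_i>, kept as a column for broadcasting.
    inner = np.sum(dY * sigmas, axis=1, keepdims=True)
    return sigmas * (dY - inner)

# Values taken from the question.
sigmas = np.array([[0.81761761, 0.08738232, 0.09500007],
                   [0.12135669, 0.84312089, 0.03552242]])
dLdY = np.array([[ 1.74481176, -0.7612069 ,  0.3190391 ],
                 [-0.24937038,  1.46210794, -2.06014071]])

dZ_loop = softmax_backward_loop(sigmas, dLdY)
dZ_vec = softmax_backward_vectorized(sigmas, dLdY)
print(dZ_vec.shape)
```

Both functions return an array with the same shape as Z, so no separate aggregation step is needed under these assumptions. The vectorized form follows from expanding `J @ dY[i]` and avoids allocating one Jacobian per sample.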


Votes: 2, Answers: 1
