Why are none of the model's weights being updated?

Question

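I am building a model that predicts grades. It takes values computed by a mathematical model and feeds them into an RNN. However, nothing is updated, from the weights used in the mathematical model to the weights of the RNN. When I check, the gradient of every weight is None.

I would like to know why the weights are not being updated.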

class Expector(nn.Module):
  def __init__(self, input_size, hidden_size, num_layers):
    super(Expector, self).__init__()

    self.weight_0 = nn.Parameter(torch.tensor([0.2]))
    self.weight_1 = nn.Parameter(torch.tensor([0.8, 0.7, 0.6, 0.5])) #homeownership weight
    self.weight_2 = nn.Parameter(torch.tensor([0.5 for i in range(0, 13)])) #loan_purpose weight

    #interesst_rate_expector
    self.num_features = 13
    self.linear_1 = torch.nn.Linear(self.num_features, self.num_features*2)
    self.linear_2 = torch.nn.Linear(self.num_features*2, self.num_features*4)
    self.linear_3 = torch.nn.Linear(self.num_features*4, self.num_features*8)
    self.linear_out = torch.nn.Linear(self.num_features*8, 1)

    #loan_rating_expector
    self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)
    self.fc = nn.Linear(hidden_size, input_size)

  def interest_rate_expector(self, x):
    x = [x[i] for i in range(len(x)-1)]
    for i in [0,4,9,10,11]:
      if x[i] > 0:
        x[i] = math.log(x[i])
    input = torch.tensor(x)
    out1 = self.linear_1(input)
    out1 = torch.nn.functional.softplus(out1)
    out2 = self.linear_2(out1)
    out2 = torch.nn.functional.softplus(out2)
    out3 = self.linear_3(out2)
    out3 = torch.nn.functional.softplus(out3)
    logits = self.linear_out(out3)
    interest_rate = torch.sigmoid(logits)

    return interest_rate

  def mathematical_modeling(self, x):
    salary = x[4] / 12 if x[4] != 0 else 300000
    debt = x[4] * (1 / x[5]) if x[5] != 0 else 0

    interest_rate = self.interest_rate_expector(x).item()
    if x[2] > 4:
        repayment = (x[0] * interest_rate * ((1 + interest_rate) ** x[1])) / ((1 + interest_rate) ** x[1]) - 1
        result = (debt * self.weight_0.item() + repayment) / (salary * x[3])
    else:
        result = (debt * self.weight_0.item() + x[0] * x[1]) / (salary * x[3])

    return torch.tensor(result, requires_grad=True)

  def forward(self, x):
    x[3] = torch.matmul(one_hot_encoding1(x[3].item()).to(torch.float32), self.weight_1.view(-1, 1))
    x[7] = torch.matmul(one_hot_encoding2(x[7].item()).to(torch.float32), self.weight_2.view(-1, 1))

    x = torch.tensor(x)
    tensor = torch.tensor([x[6], x[7], x[8], x[10], x[11], x[12], self.mathematical_modeling(x).item()])
    tensor = tensor.unsqueeze(0)
    tensor.requires_grad_(True)
    print("Gradient of tensor:", tensor.grad)
    output, _ = self.rnn(tensor)
    final_output = self.fc(output)
    output_probabilities = F.softmax(final_output, dim=1)
    return output_probabilities

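I explicitly set requires_grad=True.
I tried making the mathematical_modeling function return a tensor.
I adjusted the learning rate.

Despite these efforts, the weights are still not updated, and when I check I only see the output "Gradient of tensor: None". Please help.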

Answer 1

You are doing a number of unusual things here that break the gradient chain.

Calling .item() on a tensor converts it to a plain Python scalar, which removes all gradient tracking.
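The effect can be reproduced in isolation. The following is a minimal sketch, not the asker's model: the names w, x, loss_a, and loss_b are made up for illustration. It compares a computation that stays in tensors with one that passes through .item().

```python
import torch

w = torch.tensor([0.2], requires_grad=True)
x = torch.tensor([3.0])

# Path A: everything stays a tensor, so autograd records the multiplication.
loss_a = (w * x).sum()
loss_a.backward()
grad_tracked = w.grad.clone()  # d(w * x)/dw = x

# Path B: .item() returns a Python float that carries no graph.
w.grad = None
scalar = w.item()
# Wrapping the float in a new tensor creates an unrelated leaf tensor.
loss_b = torch.tensor(scalar * 3.0, requires_grad=True)
loss_b.backward()
grad_after_item = w.grad  # still None: nothing connects loss_b to w

print(grad_tracked, grad_after_item)
```

In the question's code, self.weight_0.item(), self.interest_rate_expector(x).item(), and self.mathematical_modeling(x).item() all follow Path B.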

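Adding tensor.requires_grad_(True) only tracks gradients from that point onward. Because the tensor is built from values obtained by calling .item() at various points, gradients cannot propagate back any further.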
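One possible way around this is to assemble the input with torch.stack instead of torch.tensor, so the pieces stay connected to the parameters that produced them. The sketch below uses a made-up parameter w and a made-up value feature; it is not a drop-in fix for the model in the question.

```python
import torch

w = torch.nn.Parameter(torch.tensor([0.5]))
feature = torch.tensor(2.0)

# Rebuilding a tensor from Python floats creates a fresh leaf tensor.
rebuilt = torch.tensor([feature.item(), (w * feature).item()])
rebuilt.requires_grad_(True)  # tracks gradients from here on only
rebuilt.sum().backward()
grad_via_rebuilt = w.grad     # None: the graph never reached w

# torch.stack combines existing tensors and keeps their history.
stacked = torch.stack([feature, (w * feature).squeeze(0)])
stacked.sum().backward()
grad_via_stack = w.grad       # d(w * feature)/dw = feature

print(grad_via_rebuilt, grad_via_stack)
```

torch.cat works the same way for tensors that already have a shared dimension.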

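In addition, gradients are not computed until you call .backward() on the loss, so the statement print("Gradient of tensor:", tensor.grad) inside your forward pass will always print None, whatever else happens.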
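The timing can be checked with a single layer. This is a small sketch using an arbitrary nn.Linear layer and an input of ones, both chosen only for illustration: the .grad attribute of a parameter is None until backward() has run.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)
x = torch.ones(1, 3)

out = layer(x)
grad_before = layer.weight.grad  # None: backward() has not run yet

loss = out.sum()
loss.backward()
grad_after = layer.weight.grad   # now a tensor with the weight's shape

print(grad_before, grad_after)
```

Gradient checks therefore belong in the training loop, after loss.backward(), rather than in forward.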

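More generally, your model appears to do feature engineering (the salary and debt calculations) inside the model itself, in addition to the actual modeling. You want to separate the two.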
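One way to separate them is to compute the derived features in a preprocessing function before the data reaches the model. The sketch below is an assumption about how that could look: build_features and raw_row are hypothetical names, and the column indices and fallback values are copied from mathematical_modeling in the question.

```python
import torch

def build_features(raw):
    """Hypothetical preprocessing step, run before the model sees the data."""
    income, ratio = raw[4], raw[5]  # same columns the question reads
    salary = income / 12 if income != 0 else 300000
    debt = income / ratio if ratio != 0 else 0.0
    # No learnable weights are involved, so no gradients are needed here.
    return torch.tensor([salary, debt], dtype=torch.float32)

raw_row = [0.0, 0.0, 0.0, 0.0, 60000.0, 4.0]
features = build_features(raw_row)
print(features)
```

Anything that involves learnable weights, such as weight_0, stays inside the module and operates on tensors only.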

One more small point: your model appears to build a fixed-length vector (tensor = torch.tensor([x[6], x[7], x[8], x[10], x[11], x[12], self.mathematical_modeling(x).item()])). If that is the case, there is no need for an RNN; a plain MLP is enough.
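As a rough sketch of that suggestion, an MLP over a 7-element input could look like the code below. The class name RatingMLP, the hidden size, and the number of output classes are assumptions, since the question does not state them. The model returns raw logits because cross_entropy applies the softmax itself.

```python
import torch
from torch import nn

class RatingMLP(nn.Module):
    """Hypothetical MLP for a fixed-length feature vector."""
    def __init__(self, num_features=7, hidden_size=32, num_classes=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # logits, no softmax here

torch.manual_seed(0)
model = RatingMLP()
batch = torch.randn(4, 7)              # 4 made-up samples
targets = torch.tensor([0, 1, 2, 3])   # made-up class labels

loss = nn.functional.cross_entropy(model(batch), targets)
loss.backward()

# Every parameter receives a gradient because nothing left the graph.
all_have_grads = all(p.grad is not None for p in model.parameters())
print(all_have_grads)
```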
