I am trying to train a neural network until the L2 norm of its gradient is within 10e-3 of 0, so my code defines methods that compute the parameters and the gradient during fitting. I keep hitting roadblocks that make me think I am not retrieving the parameters or the gradients correctly.
Here is my code:
def get_theta(self):
    theta = self.parameters().detach().cpu
    return theta

def J_loss(self, xb, yb):
    # forward returns x, so here it will return x on GPU
    # return cross_entropy result of xb and yb on GPU
    return F.cross_entropy(self.forward(xb.to(device)), yb.to(device))

def fit(self, loader, epochs=1999):
    norm2Gradient = 1
    while norm2Gradient > 10e-3 and epochs < 2000:
        #grad = []
        for _, batch in enumerate(loader):
            x, y = batch['x'], batch['y']
            # computes F.cross_entropy loss of (xb, yb) on GPU
            loss = self.J_loss(x, y)
            #print("loss:", loss)
            # computes new gradients
            grad = loss.backward()
            #print("grad:", grad)
            print("grad?", grad)
            # takes one step along new gradients to decrease the loss; updates parameters
            self.optimizer.step()
            # captures new parameters
            theta = self.parameters()
            print("theta:", theta)
            # collects gradient along new parameters
            for param in theta:
                grad.append(param.grad)
            # computes gradient norm
            norm2Gradient = torch.linalg.norm(grad)
            sumNorm2Gradient += norm2Gradient.detach().cpu
            # clears out old gradients
            self.optimizer.zero_grad()
    return sumNorm2Gradient
The current error message, "AttributeError: 'NoneType' object has no attribute 'append'", occurs on this line:
grad.append(param.grad)
Also, printing the variable "grad" shows "None". I have combed through the documentation trying to work out what each line of my code does and how to extract the gradients and parameters. How do I get the gradients correctly?
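From the docs, my current understanding is that loss.backward() returns None and instead stores the gradients on each parameter's .grad attribute, so I expected something like the following standalone sketch to be the right shape (the tiny nn.Linear model and random data here are just placeholders, not my real network):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder model and batch, standing in for my real setup.
model = nn.Linear(4, 2)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))

loss = F.cross_entropy(model(x), y)
loss.backward()  # returns None; fills p.grad for every parameter

# Flatten every parameter's gradient into one vector, then take
# a single global L2 norm over all of them.
grad_vec = torch.cat([p.grad.detach().flatten() for p in model.parameters()])
norm2_gradient = torch.linalg.norm(grad_vec).item()
print("gradient L2 norm:", norm2_gradient)
```

Is this the intended way to collect the gradients, and if so, where exactly should it go relative to optimizer.step() and optimizer.zero_grad() in my fit loop?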