I am trying to add a perturbation to a model and optimize the perturbation itself rather than the model parameters. I set it up as follows, which is very simple:
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x, perturbation, clean):
        self.linear.weight.data = perturbation + clean
        return self.linear(x)

# Generate a clean dataset
X = torch.tensor([[1.0], [2.0], [3.0], [4.0]], dtype=torch.float32)
Y = 2 * X

# Create the model
model = SimpleModel()

# Define the perturbation as a separate parameter
perturbation = torch.tensor(0.1, requires_grad=True)
clean = model.linear.weight.data

# Train the model, optimizing only the perturbation
criterion = nn.MSELoss()
optimizer_perturbation = optim.SGD([perturbation], lr=0.01)  # Optimize the perturbation

for epoch in range(100):
    optimizer_perturbation.zero_grad()
    outputs = model(X, perturbation, clean)
    loss = criterion(outputs, Y)
    loss.backward()
    # Update the perturbation
    optimizer_perturbation.step()
However, after loss.backward() executes, the gradient of the perturbation is still None, and I don't understand why. What is causing this, and what should I do to achieve the result I need?
I expected perturbation.grad not to be None after loss.backward(), but it is:
#before executing loss.backward()
model.linear.weight.grad
None
perturbation.grad
None
#after executing loss.backward()
perturbation.grad
None
model.linear.weight.grad
tensor([[-26.8524]])
You call loss.backward(), but you never call perturbation.backward(). If you add perturbation.backward() to the loop, perturbation.grad should return something.
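For comparison, note that assigning through self.linear.weight.data copies raw values without recording any autograd history, so the forward pass is never connected to perturbation in the graph. A minimal sketch (using torch.nn.functional.linear, with freshly chosen tensor shapes, as an assumption about the intended setup) where the perturbed weight is built inside the graph and perturbation does receive a gradient:

```python
import torch
import torch.nn.functional as F

X = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
Y = 2 * X

clean = torch.randn(1, 1)                          # frozen "clean" weight
perturbation = torch.zeros(1, 1, requires_grad=True)

optimizer = torch.optim.SGD([perturbation], lr=0.01)
for epoch in range(100):
    optimizer.zero_grad()
    # Build the effective weight inside the graph instead of writing to .data
    outputs = F.linear(X, clean + perturbation)
    loss = F.mse_loss(outputs, Y)
    loss.backward()
    optimizer.step()

print(perturbation.grad)  # a tensor now, not None
```

Here the addition clean + perturbation is itself an autograd operation, so loss.backward() can propagate gradients back to perturbation without any extra backward call.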