I now have a use case like the following. A is a torch.Tensor of the form

A =
[[1, x, y],
 [1, 2, 3],
 [1, z, 3]]

Only the elements x, y, z of A are differentiable; the other entries are just constants. For example, with

cost = tr(A·A) = 14 + 2x + 2y + 6z

when I backpropagate, I want to differentiate with respect to x, y, z only and update only those entries. This is of course just a toy example, not the real, more complex case.
How can I implement such a use case?
I think the only way to do it is to use a mask tensor that zeroes out the gradients of the elements you don't want modified, like this:

import torch

# A must be a leaf tensor with requires_grad=True for the optimizer to update it
A = torch.tensor([[1.0, 100, 100], [1, 2, 3], [1, 100, 3]], requires_grad=True)
# I have initialized your x, y, z to 100, so the mask is 1 exactly at those positions
mask = (A == 100).to(dtype=torch.float32)
# optimizer to backpropagate
opt = torch.optim.SGD([A], lr=1e-2)
# optimization loop
for i in range(10):
    # backprop the loss gradient
    (A.matmul(A)).diag().pow(2).sum().backward()
    # mask the gradients so the constant entries stay fixed
    with torch.no_grad():
        A.grad *= mask
    # optimization step
    opt.step()
    A.grad.zero_()
print(A)
This way only the x, y, z coordinates get modified.
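As a variant, the same masking can be done automatically with a gradient hook, so the loop no longer needs the manual `no_grad` block. This is a sketch using the question's cost `tr(A·A)`; `Tensor.register_hook` multiplies the incoming gradient by the mask on every backward pass:

```python
import torch

# Same setup: x, y, z are initialized to 100 so the mask picks them out
A = torch.tensor([[1.0, 100, 100], [1, 2, 3], [1, 100, 3]], requires_grad=True)
mask = (A == 100).to(dtype=torch.float32)

# The hook fires on every backward() and zeroes the gradients of the constants
A.register_hook(lambda grad: grad * mask)

opt = torch.optim.SGD([A], lr=1e-2)
for i in range(10):
    opt.zero_grad()
    torch.trace(A @ A).backward()
    opt.step()

# Constant entries are untouched; only the x, y, z positions have moved
print(A)
```

The hook runs before the gradient is accumulated into `A.grad`, so `opt.step()` never sees a nonzero gradient at the constant positions.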
I came up with a way to do it:
import torch
# Define variables x, y, z as differentiable tensors
x = torch.tensor([2.0], requires_grad=True)
y = torch.tensor([2.0], requires_grad=True)
z = torch.tensor([2.0], requires_grad=True)
print(x.grad, y.grad, z.grad) # None, None, None
# Create tensor A with placeholder values for x, y, z (nothing is differentiable yet)
A = torch.tensor([[1, x.item(), y.item()],
                  [1, 2, 3],
                  [1, z.item(), 3]], requires_grad=False)
# Put x, y, z back into A, this time they are differentiable
A[0, 1] = x
A[0, 2] = y
A[2, 1] = z
# Compute the loss function
cost = torch.trace(A @ A)
# Backpropagate to compute the gradients
cost.backward()
# Output the gradients
print(x.grad, y.grad, z.grad) # 2, 2, 6
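Building on this, a full optimization loop would rebuild A from x, y, z on every iteration so the graph stays connected. This is a sketch under the same toy cost; using `torch.stack` avoids the `.item()` round-trip and the in-place writes:

```python
import torch

# Only x, y, z are trainable
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(2.0, requires_grad=True)
z = torch.tensor(2.0, requires_grad=True)
one = torch.tensor(1.0)  # constant entry

opt = torch.optim.SGD([x, y, z], lr=1e-2)
for i in range(10):
    opt.zero_grad()
    # Rebuild A each step so the graph connects cost back to x, y, z
    A = torch.stack([
        torch.stack([one, x, y]),
        torch.tensor([1.0, 2.0, 3.0]),
        torch.stack([one, z, torch.tensor(3.0)]),
    ])
    cost = torch.trace(A @ A)  # = 14 + 2x + 2y + 6z for this A
    cost.backward()
    opt.step()

print(x, y, z)  # only these were updated; the constants never change
```

Since the cost is linear in x, y, z here, the gradients are constant (2, 2, 6), so after 10 SGD steps with lr=1e-2 the values are 2 − 0.2 = 1.8, 1.8, and 2 − 0.6 = 1.4.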