How to manually add a bias to model parameters and make that bias trainable for gradient backpropagation and updates


I have tasks from several different domains and a base model called MetaModel that performs regression. I want to add a task-specific bias to the parameters of the base MetaModel when training on different tasks. I am currently using the approach shown in the code below, but during gradient backpropagation the gradients of MetaModelWithBias.biases are always None. How can I make the MetaModelWithBias.biases parameters learnable so that they actually get updated?

The key code for adding the task-specific bias is:

for param in self.meta_model.parameters():
    param.data += self.biases[task_id]

Full code:

import torch
import torch.nn as nn
import torch.optim as optim

class MetaModelWithBias(nn.Module):
    def __init__(self, meta_model, num_tasks):
        super(MetaModelWithBias, self).__init__()
        self.meta_model = meta_model
        self.biases = nn.ParameterList([nn.Parameter(torch.randn(1)) for _ in range(num_tasks)])

    def forward(self, x, task_id):
        with torch.no_grad():
            for param in self.meta_model.parameters():
                param.data += self.biases[task_id]
        output = self.meta_model(x)
        return output

class MetaModel(nn.Module):
    def __init__(self):
        super(MetaModel, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)



# generate some random data
num_tasks = 5
input_size = 10
num_samples = 100
X = torch.randn(num_samples, input_size)
task_ids = torch.randint(0, num_tasks, (num_samples,))

# create model
meta_model = MetaModel()
meta_model_with_bias = MetaModelWithBias(meta_model, num_tasks)

# training
num_epochs = 10
optimizer = optim.SGD([
    {'params': meta_model_with_bias.meta_model.parameters()},
    {'params': meta_model_with_bias.biases.parameters() }
], lr=0.01)
for epoch in range(num_epochs):
    outputs = 0
    for i in range(num_samples):
        task_id = task_ids[i]
        output = meta_model_with_bias(X[i].unsqueeze(0), task_id)
        outputs += output

    criterion = nn.MSELoss()
    targets = torch.randn(1, 1)
    loss = criterion(outputs, targets)

    print('\tMetaModelWithBias biases before backward',[parms for parms in meta_model_with_bias.biases],[parms.grad for parms in meta_model_with_bias.biases])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print('\tMetaModelWithBias biases after backward',[parms for parms in meta_model_with_bias.biases],[parms.grad for parms in meta_model_with_bias.biases])

    print('meta_model_with_bias params')
    for name, parms in meta_model_with_bias.named_parameters():
        print('-->name:', name, '-->grad_requirs:', parms.requires_grad,
              ' -->parameter:', parms,
              ' -->grad_value:', parms.grad)

Then I get:

    MetaModelWithBias biases before backward [Parameter containing:
tensor([-0.2255], requires_grad=True), Parameter containing:
tensor([-2.2697], requires_grad=True), Parameter containing:
tensor([-0.7426], requires_grad=True), Parameter containing:
tensor([0.0925], requires_grad=True), Parameter containing:
tensor([-0.0564], requires_grad=True)] [None, None, None, None, None]
    MetaModelWithBias biases after backward [Parameter containing:
tensor([-0.2255], requires_grad=True), Parameter containing:
tensor([-2.2697], requires_grad=True), Parameter containing:
tensor([-0.7426], requires_grad=True), Parameter containing:
tensor([0.0925], requires_grad=True), Parameter containing:
tensor([-0.0564], requires_grad=True)] [None, None, None, None, None]

Thank you for your time!

python pytorch parameters backpropagation autograd
1 Answer

There are two issues in the version you propose that explain why your biases are not receiving any gradients. First, you are using torch.no_grad, which disables gradient computation: within that scope, all requires_grad flags are set to False by design. And even if you removed it, you would still be bypassing gradient computation on purpose, because the parameters are overwritten through their data attribute, which means the operation is not tracked by autograd!
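You can reproduce this in isolation in a few lines (a minimal sketch mirroring the question's pattern; the names lin and bias are placeholders):

import torch
import torch.nn as nn

lin = nn.Linear(10, 1)
bias = nn.Parameter(torch.randn(1))

with torch.no_grad():
    for p in lin.parameters():
        p.data += bias  # the write goes through .data, invisible to autograd

out = lin(torch.randn(1, 10)).sum()
out.backward()
print(bias.grad)  # None: nothing connects bias to the computation graph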

Not to mention that you are accumulating the biases, i.e. after 10 iterations you end up with

params += 10*biases

which seems incorrect. In other words, nothing in your implementation ever "resets" the parameters...
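A quick standalone check of that drift (hypothetical names again): after ten forward-style applications the weight has moved by ten times the bias, since nothing restores the original values:

import torch
import torch.nn as nn

fc = nn.Linear(10, 1)
w0 = fc.weight.detach().clone()
bias = torch.ones(1)

# Same pattern as the question's forward(): the offset is never undone.
for _ in range(10):
    for p in fc.parameters():
        p.data += bias

print(torch.allclose(fc.weight.detach(), w0 + 10 * bias))  # True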

Now, of course, you may have done this to work around the dreaded "a leaf Variable that requires grad is being used in an in-place operation" error. It is raised when you try to operate on a parameter in place, so this fails:

param += self.biases[task_id]

Doing it out of place makes no sense either, since param is just a scoped loop variable: the underlying parameter would remain unchanged.
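Both failure modes in one short sketch (illustrative names only):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)
bias = nn.Parameter(torch.randn(1))

for param in model.parameters():
    try:
        param += bias  # in-place op on a leaf that requires grad
    except RuntimeError as e:
        print(e)  # "a leaf Variable that requires grad is being used in an in-place operation"
    param = param + bias  # out of place: only rebinds the local name;
                          # the model's actual parameter is left untouched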

On reflection, it seems the only way to bring gradient tracking back to the added bias is to apply the layer manually, in a "functional" style. Instead of modifying the stored parameters at each iteration, you add the bias right at the point of application (which also solves the problems highlighted above). Here is a possible implementation that ensures gradients are computed for the biases:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaModelWithBias(nn.Module):
    def __init__(self, meta_model, num_tasks):
        super(MetaModelWithBias, self).__init__()
        self.meta_model = meta_model
        self.biases = nn.ParameterList([
            nn.Parameter(torch.randn(1)) for _ in range(num_tasks)])

    def forward(self, x, task_id):
        output = self.meta_model(x, self.biases[task_id])
        return output

class MetaModel(nn.Module):
    def __init__(self):
        super(MetaModel, self).__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x, bias):
        return F.linear(x, self.fc.weight + bias, self.fc.bias + bias)
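As a quick sanity check (assuming the two classes above), a single forward/backward pass now produces a real gradient on the selected bias:

import torch

model = MetaModelWithBias(MetaModel(), num_tasks=5)
x = torch.randn(4, 10)
loss = model(x, task_id=2).sum()
loss.backward()
print(model.biases[2].grad)  # a tensor now, no longer None

Note that F.linear adds the same scalar offset to every weight entry here, exactly as the original params += bias loop did, but the addition is now part of the graph, so autograd can reach the bias.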
