L-BFGS optimizer does not change the loss, but Adam does


I'm building a physics-informed neural network (PINN) to approximate a partial differential equation. I get decent results with the Adam optimizer alone, but I'd like to do better, so I tried running Adam for 10,000 iterations and then the L-BFGS optimizer (PyTorch) for the final 1,000 iterations.

However, when I use the L-BFGS optimizer, the network's loss never changes; it just stays constant. This is the closure function my PINN uses for L-BFGS:

def closure(self):
    lbfgs_optim.zero_grad()  
    train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
    train_loss.backward()      
    return train_loss
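
For context, L-BFGS re-invokes this closure several times per step (once per function evaluation of the strong-Wolfe line search), so a quick way to see whether the optimizer is exploring at all is to log the loss inside the closure. A minimal sketch of that check, reusing the names above (the print is purely illustrative):

def closure(self):
    lbfgs_optim.zero_grad()
    train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
    train_loss.backward()
    # illustrative logging: each line-search evaluation prints the loss it computed
    print(train_loss.item())
    return train_loss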

And here are my Adam and L-BFGS optimizer settings:

epochs_adam, epochs_lbfgs  = 10000, 1000
adam_optim  = torch.optim.Adam(PINN.parameters(), lr=lr, weight_decay=1e-5)
lbfgs_optim = torch.optim.LBFGS(PINN.parameters(), lr=lr, history_size = 20,
                                max_iter = 50, line_search_fn = "strong_wolfe")

I use one for loop for Adam and a separate one for L-BFGS. This is how Adam is used in my code, and it works:

train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
... # print loss's, append to lists
adam_optim.zero_grad()
train_loss.backward()
adam_optim.step()

Then for L-BFGS, which doesn't seem to work, what I call inside the epoch loop is:

lbfgs_optim.step(PINN.closure)

I don't see any change in the loss at all. Why is that?
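
One sanity check that could narrow this down (a hypothetical snippet, not part of the code above) is to snapshot the parameters, run a single L-BFGS step, and compare:

# hypothetical check: do the weights move at all after one L-BFGS step?
before = [p.detach().clone() for p in PINN.parameters()]
step_loss = lbfgs_optim.step(PINN.closure)
moved = any(not torch.equal(b, p.detach()) for b, p in zip(before, PINN.parameters()))
print(step_loss.item(), moved)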

Versions in use: Python 3.9.12, PyTorch 1.11.0, and NumPy 1.21.5.

Edit: the PINN code and the optimizer/training code:

class NN(nn.Module):
    # Heat Equation PDE    
    def __init__(self, layers):
        super().__init__()
        self.activation = nn.Sigmoid()
        self.loss_function = nn.MSELoss(reduction='mean')
        self.linears = nn.ModuleList([nn.Linear(layers[i], layers[i+1]) for i in range(len(layers)-1)])

        for i in range(len(layers)-1):
            nn.init.xavier_normal_(self.linears[i].weight.data, gain=1.0)
            nn.init.zeros_(self.linears[i].bias.data)  
    
    def forward(self, x):        
        a = x.float()
        for i in range(len(self.linears) - 1):
            z = self.linears[i](a)
            a = self.activation(z)
        a = self.linears[-1](a)
        return a
    
    def lossICBC(self, x_ICBC, u_ICBC):
        """MSE losses for oundary and initial conditions"""
        loss_ICBC = self.loss_function(self.forward(x_ICBC), u_ICBC)
        return loss_ICBC
    
    def lossPDE(self, xt_residual, f_hat):
        """Residual loss for collocation points"""
        g = xt_residual.clone().float()
        g.requires_grad=True

        f = self.forward(g)
        
        f_xt = autograd.grad(f, g, torch.ones(g.shape[0], 1).to(device), create_graph=True)[0]
        f_xx_tt = autograd.grad(f_xt, g, torch.ones(g.shape).to(device), create_graph=True)[0]
        
        f_t = f_xt[:,[1]] # extract just the t values
        f_xx = f_xx_tt[:,[0]] # extract just the x values
        
        f = f_t - k*f_xx 
        return self.loss_function(f, f_hat)

    def closure(self):
        lbfgs_optim.zero_grad()  
        train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
        train_loss.backward()      
        return train_loss
    
    def loss(self, x_ICBC, u_ICBC, xt_residual, f_hat):
        """Total loss"""
        loss_ICBC = self.lossICBC(x_ICBC, u_ICBC)
        loss_PDE  = self.lossPDE(xt_residual, f_hat) #f_hat=torch.zeros()
        return loss_PDE + loss_ICBC
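
The network is built from a list of layer sizes, and lossPDE relies on device and k (the heat-equation diffusivity) being defined in the script; a sketch of that setup with illustrative values (the actual sizes and constants are defined elsewhere in my code):

layers = [2, 20, 20, 20, 1]                               # illustrative: (x, t) in, u out
device = "cuda" if torch.cuda.is_available() else "cpu"   # illustrative device selection
k      = 1.0                                              # illustrative diffusivity
PINN   = NN(layers).to(device)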

lr_adam = 0.001
lr_lbfgs = 1
epochs_adam  = 20000
adam_optim   = torch.optim.Adam(PINN.parameters(), lr=lr)

epochs_lbfgs = 100
lbfgs_optim  = torch.optim.LBFGS(PINN.parameters(), lr=15, history_size = 20,
                                 max_iter = 50, line_search_fn = "strong_wolfe")

Training loop:

for i in range(0, epochs_adam+1):

    train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
    adam_optim.zero_grad()
    train_loss.backward()
    adam_optim.step()

for i in range(0, epochs_lbfgs+1):
    train_loss = lbfgs_optim.step(PINN.closure)
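
Since lbfgs_optim.step returns the loss computed by the closure, the returned value can be printed each epoch to check for movement; a sketch of that monitoring (the print format is illustrative):

for i in range(0, epochs_lbfgs+1):
    train_loss = lbfgs_optim.step(PINN.closure)   # one call runs up to max_iter internal iterations
    print(f"L-BFGS epoch {i}: loss = {train_loss.item():.6e}")   # illustrative logging
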
Tags: python, machine-learning, optimization, neural-network, pytorch
1 Answer

How did you end up solving your problem? I'm facing the same issue. Could you contact me at this email: "[email protected]"?
