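I am building a physics-informed neural network (PINN) to approximate the solution of a partial differential equation. Training with the Adam optimizer alone gives decent results, but I would like to do better. I tried running Adam for 10,000 iterations and then switching to the L-BFGS optimizer (PyTorch) for the final 1,000 iterations.
However, once I switch to L-BFGS, the network's loss never changes; it stays constant. Here is the closure function my PINN uses for L-BFGS: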
    def closure(self):
        lbfgs_optim.zero_grad()
        train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
        train_loss.backward()
        return train_loss
And here are my Adam and L-BFGS optimizer settings:
    epochs_adam, epochs_lbfgs = 10000, 1000
    adam_optim = torch.optim.Adam(PINN.parameters(), lr=lr, weight_decay=1e-5)
    lbfgs_optim = torch.optim.LBFGS(PINN.parameters(), lr=lr, history_size=20,
                                    max_iter=50, line_search_fn="strong_wolfe")
I use one for loop for Adam and a second for loop for L-BFGS. This is how Adam is used in my code, and it works:
    train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
    ... # print losses, append to lists
    adam_optim.zero_grad()
    train_loss.backward()
    adam_optim.step()
For L-BFGS, which does not seem to work, this is what I call inside the epoch loop:
    lbfgs_optim.step(PINN.closure)
I do not see any change in the loss. Why is that?
Versions in use: Python 3.9.12, PyTorch 1.11.0, and NumPy 1.21.5
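One check that may help narrow this down, offered only as a sketch: `torch.optim.LBFGS.step` returns right after the first closure evaluation when the largest absolute gradient entry is at or below `tolerance_grad` (default `1e-7`). Whether that is what happens here is not established by the post. The helper name `max_abs_grad` and the small stand-in model below are hypothetical and only exercise the check; in the real code the helper would be called on `PINN` after `train_loss.backward()`.

```python
import torch
import torch.nn as nn


def max_abs_grad(model: nn.Module) -> float:
    """Largest absolute gradient entry over all parameters (0.0 if no grads exist)."""
    grads = [p.grad.abs().max() for p in model.parameters() if p.grad is not None]
    return float(torch.stack(grads).max()) if grads else 0.0


# Hypothetical stand-in for the PINN, used only to exercise the helper.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(2, 8), nn.Sigmoid(), nn.Linear(8, 1))
x = torch.rand(16, 2)
y = torch.rand(16, 1)

toy_model.zero_grad()
toy_loss = nn.functional.mse_loss(toy_model(x), y)
toy_loss.backward()

grad_inf_norm = max_abs_grad(toy_model)
tolerance_grad = 1e-7  # default value of torch.optim.LBFGS(tolerance_grad=...)
print(f"loss={toy_loss.item():.3e}  max|grad|={grad_inf_norm:.3e}")
print("L-BFGS would return immediately:", grad_inf_norm <= tolerance_grad)
```

If the printed gradient magnitude for the real model is far above the tolerance, this early exit is not the explanation and the cause lies elsewhere.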
Edit: PINN code and optimization/training code
    import torch
    import torch.nn as nn
    from torch import autograd

    class NN(nn.Module):
        # Heat Equation PDE
        def __init__(self, layers):
            super().__init__()
            self.activation = nn.Sigmoid()
            self.loss_function = nn.MSELoss(reduction='mean')
            self.linears = nn.ModuleList([nn.Linear(layers[i], layers[i+1]) for i in range(len(layers)-1)])
            for i in range(len(layers)-1):
                nn.init.xavier_normal_(self.linears[i].weight.data, gain=1.0)
                nn.init.zeros_(self.linears[i].bias.data)

        def forward(self, x):
            a = x.float()
            for i in range(0, len(layers)-2):
                z = self.linears[i](a)
                a = self.activation(z)
            a = self.linears[-1](a)
            return a

        def lossICBC(self, x_ICBC, u_ICBC):
            """MSE losses for boundary and initial conditions"""
            loss_ICBC = self.loss_function(self.forward(x_ICBC), u_ICBC)
            return loss_ICBC

        def lossPDE(self, xt_residual, f_hat):
            """Residual loss for collocation points"""
            g = xt_residual.clone().float()
            g.requires_grad = True
            f = self.forward(g)
            f_xt = autograd.grad(f, g, torch.ones(g.shape[0], 1).to(device), create_graph=True)[0]
            f_xx_tt = autograd.grad(f_xt, g, torch.ones(g.shape).to(device), create_graph=True)[0]
            f_t = f_xt[:,[1]] # extract just the t values
            f_xx = f_xx_tt[:,[0]] # extract just the x values
            f = f_t - k*f_xx
            return self.loss_function(f, f_hat)

        def closure(self):
            lbfgs_optim.zero_grad()
            train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
            train_loss.backward()
            return train_loss

        def loss(self, x_ICBC, u_ICBC, xt_residual, f_hat):
            """Total loss"""
            loss_ICBC = self.lossICBC(x_ICBC, u_ICBC)
            loss_PDE = self.lossPDE(xt_residual, f_hat) # f_hat=torch.zeros()
            return loss_PDE + loss_ICBC
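Separately from the L-BFGS question, the second-derivative step in `lossPDE` can be checked against a function with known derivatives. The sketch below replaces the network with u(x, t) = sin(x) exp(-t), an assumption made purely for testing; it satisfies u_t = u_xx. It illustrates that calling `autograd.grad` on both columns of the first gradient with all-ones `grad_outputs` (as done for `f_xx_tt`) puts u_xx + u_tx in column 0, whereas differentiating only the x column yields u_xx.

```python
import torch
from torch import autograd

torch.manual_seed(0)
# Collocation points with columns (x, t), mirroring g in lossPDE.
g = torch.rand(5, 2, dtype=torch.float64, requires_grad=True)
x, t = g[:, [0]], g[:, [1]]
u = torch.sin(x) * torch.exp(-t)  # known test function standing in for the network

# First derivatives: row n holds (u_x, u_t) at point n.
u_xt = autograd.grad(u, g, torch.ones_like(u), create_graph=True)[0]
u_x, u_t = u_xt[:, [0]], u_xt[:, [1]]

# Differentiate u_x alone; column 0 of the result is u_xx.
u_xx = autograd.grad(u_x, g, torch.ones_like(u_x), create_graph=True)[0][:, [0]]

# Differentiating both columns at once sums them: column 0 is u_xx + u_tx.
mixed = autograd.grad(u_xt, g, torch.ones_like(u_xt), create_graph=True)[0][:, [0]]

exact_u_xx = -torch.sin(x) * torch.exp(-t)
exact_u_tx = -torch.cos(x) * torch.exp(-t)
print("u_xx matches analytic u_xx:", torch.allclose(u_xx, exact_u_xx))
print("mixed matches u_xx + u_tx:", torch.allclose(mixed, exact_u_xx + exact_u_tx))
print("heat residual u_t - u_xx is zero:", torch.allclose(u_t - u_xx, torch.zeros_like(u_t)))
```

This check does not by itself explain a constant L-BFGS loss; it only concerns which PDE residual is being minimized.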
    lr_adam = 0.001
    lr_lbfgs = 1
    epochs_adam = 20000
    adam_optim = torch.optim.Adam(PINN.parameters(), lr=lr)
    epochs_lbfgs = 100
    lbfgs_optim = torch.optim.LBFGS(PINN.parameters(), lr=15, history_size=20,
                                    max_iter=50, line_search_fn="strong_wolfe")
Training loop
    for i in range(0, epochs_adam+1):
        train_loss = PINN.loss(xt_train_ICBC, u_train_ICBC, xt_resid, f_hat_train)
        adam_optim.zero_grad()
        train_loss.backward()
        adam_optim.step()
    for i in range(0, epochs_lbfgs+1):
        train_loss = lbfgs_optim.step(PINN.closure)
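For comparison, here is a minimal self-contained sketch of the same Adam-then-L-BFGS pattern on a toy regression problem (fitting sin(x) with a small MLP). The model, data, and hyperparameters are assumptions chosen for illustration and are not the PINN above. Note that `LBFGS.step(closure)` returns the loss from its first closure evaluation, that is, the loss before that step's updates, so the sketch re-evaluates the loss afterwards to measure progress.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data and model (hypothetical stand-ins for the PINN and its training set).
x = torch.linspace(-3.0, 3.0, 64, dtype=torch.float64).unsqueeze(1)
y = torch.sin(x)
model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1)).double()
mse = nn.MSELoss()

# Phase 1: Adam.
adam = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    adam.zero_grad()
    loss = mse(model(x), y)
    loss.backward()
    adam.step()
loss_after_adam = mse(model(x), y).item()

# Phase 2: L-BFGS with a strong-Wolfe line search.
lbfgs = torch.optim.LBFGS(model.parameters(), lr=1.0, max_iter=20,
                          history_size=20, line_search_fn="strong_wolfe")


def closure():
    """Re-evaluate the loss and its gradients; L-BFGS may call this several times per step."""
    lbfgs.zero_grad()
    loss = mse(model(x), y)
    loss.backward()
    return loss


for _ in range(10):
    lbfgs.step(closure)
loss_after_lbfgs = mse(model(x), y).item()

print(f"after Adam:   {loss_after_adam:.3e}")
print(f"after L-BFGS: {loss_after_lbfgs:.3e}")
```

If this toy version reduces the loss in the same environment while the PINN version does not, the difference is more likely in the PINN loss, its inputs, or the optimizer settings than in the closure pattern itself.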
How did you solve your problem? I am facing the same issue. Could you contact me at this email address: "[email protected]"?