PyTorch CNN model parameters are not registered correctly


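My task at work is to build a CNN model that performs some image classification. In addition, after classifying an image I should be able to view the feature maps, i.e. the images obtained after applying a convolution or pooling operation. Here is how I defined the CNN class: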

class ConvNet(nn.Module):
  def __init__(self, input_channels, output_dim):
    super().__init__()
    # input 48x48
    self.architecture = {
        "conv1": self.convblock(input_channels, 128, (3,3)), # 46x46
        "conv2" : self.convblock(128, 64, (3, 3), bnorm=True), # 44x44
        "pool1" : self.poolblock((2,2)), # 22x22
        "conv3" : self.convblock(64, 16, (3,3), stride=2), #10x10
        "conv4" : self.convblock(16, 10, (3,3)), # 8x8
        "pool2" : self.poolblock((2,2), bnorm=10), # 4x4
        "feedforward" : nn.Sequential(
          nn.Flatten(), # 4x4x10 = 160
          nn.Linear(160, 128), # 128
          nn.ReLU(inplace=True),
          nn.Dropout(0.3),
          nn.Linear(128, output_dim), # 3
          nn.Softmax(dim=1)
        )                  
    }
    self.maps = {}

  def forward(self, x):
    image = x
    for name, layer in self.architecture.items():
      out = layer(image)
      self.maps[name] = out
      image = out
    return image
     
  def convblock(self, inp, out, kernel, stride=1, bnorm=False):
    if bnorm:
      return nn.Sequential(
        nn.Conv2d(inp, out, kernel, stride=stride),
        nn.ReLU(inplace=True),
        nn.BatchNorm2d(out)
      )
    else:
      return nn.Sequential(
        nn.Conv2d(inp, out, kernel, stride=stride),
        nn.ReLU(inplace=True)
      )

  def poolblock(self, kernel, bnorm=None):
    if bnorm is None:
      return nn.MaxPool2d(kernel)
    else:
      return nn.Sequential(
          nn.MaxPool2d(kernel),
          nn.BatchNorm2d(bnorm)
      )

  def get_map(self, im, layer):
    fig, ax = plt.subplots(1,2, figsize=(20,10), gridspec_kw={'width_ratios': [1,3]})
    ax[0].set_xticks([])
    ax[0].set_yticks([])
    ax[0].imshow(im.reshape(im.shape[-2],im.shape[-1],1), cmap="gray") # Shows Input image
    self(im)
    map = self.maps[layer]
    map=map.reshape(map.shape[1],1,map.shape[-2],map.shape[-1])
    ax[1].set_xticks([])
    ax[1].set_yticks([])
    rows = max(int(map.shape[0]/8), 8)
    ax[1].imshow(make_grid(map,nrow=rows).permute(1, 2, 0)) # Shows all the channels after an operation.

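The idea is to store the convolution and pooling blocks in the self.architecture dictionary under names such as 'conv1', 'conv2', 'pool1', and so on. Then, in the forward method, I run the input image through each block and store each block's output in the self.maps dictionary so it can be retrieved later (which is what self.get_map does).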

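The problem is that the model's parameters are not being set up correctly. Here is the code I use to instantiate the model and the optimizer: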

model = ConvNet(1, 3).to(device)
adam_opt = torch.optim.Adam(model.parameters(), lr=learning_rate)

But I get the following error:

/usr/local/lib/python3.10/dist-packages/torch/optim/optimizer.py in __init__(self, params, defaults)
    271         param_groups = list(params)
    272         if len(param_groups) == 0:
--> 273             raise ValueError("optimizer got an empty parameter list")
    274         if not isinstance(param_groups[0], dict):
    275             param_groups = [{'params': param_groups}]

ValueError: optimizer got an empty parameter list

I printed the parameter list and it is empty. I don't understand why.

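Is there something wrong with how I define the architecture or the forward method? Does PyTorch expect some specific behavior when subclassing nn.Module? If so, what is it, and how should I change my class? Any additional information about how PyTorch actually stores a module's parameters is most welcome.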

python pytorch
1 Answer

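The correct way to define a dictionary of layers in PyTorch is to wrap the dict in nn.ModuleDict. nn.Module only registers submodules that are assigned directly as attributes or held in a container such as nn.ModuleDict. Layers stored in a plain Python dict are invisible to model.parameters(), which is why the optimizer received an empty parameter list: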

self.architecture = nn.ModuleDict({
    "conv1": self.convblock(input_channels, 128, (3,3)), # 46x46
    "conv2" : self.convblock(128, 64, (3, 3), bnorm=True), # 44x44
    "pool1" : self.poolblock((2,2)), # 22x22
    "conv3" : self.convblock(64, 16, (3,3), stride=2), #10x10
    "conv4" : self.convblock(16, 10, (3,3)), # 8x8
    "pool2" : self.poolblock((2,2), bnorm=10), # 4x4
    "feedforward" : nn.Sequential(
        nn.Flatten(), # 4x4x10 = 160
        nn.Linear(160, 128), # 128
        nn.ReLU(inplace=True),
        nn.Dropout(0.3),
        nn.Linear(128, output_dim), # 3
        nn.Softmax(dim=1)
    )                  
})
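
To see the difference in isolation, here is a minimal sketch. The class names PlainDictNet and ModuleDictNet are hypothetical and used only for illustration; each holds a single nn.Linear layer, once in a plain dict and once in nn.ModuleDict:

```python
from torch import nn


class PlainDictNet(nn.Module):
    """Hypothetical example: the layer sits in a plain dict, so it is not registered."""

    def __init__(self):
        super().__init__()
        self.layers = {"fc": nn.Linear(4, 2)}


class ModuleDictNet(nn.Module):
    """Hypothetical example: the same layer wrapped in nn.ModuleDict is registered."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleDict({"fc": nn.Linear(4, 2)})


# Number of parameter tensors each module reports.
plain_count = len(list(PlainDictNet().parameters()))
registered_count = len(list(ModuleDictNet().parameters()))
print(plain_count, registered_count)  # 0 2

# Registered parameters are named after the attribute path that holds them.
registered_names = [name for name, _ in ModuleDictNet().named_parameters()]
print(registered_names)  # ['layers.fc.weight', 'layers.fc.bias']
```

The same registration is what lets .to(device) and state_dict() reach the layers, so with a plain dict those calls would skip them as well.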

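You can check that the parameters are registered by counting them (here with model = ConvNet(1, 3), as in the question):

>>> nn.utils.parameters_to_vector(model.parameters()).numel()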
106897
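
As a cross-check, the total can be broken down per block. This is a sketch that rebuilds the same layer stack outside the class, assuming input_channels=1 and output_dim=3; count_parameters is a hypothetical helper, not a PyTorch function:

```python
from torch import nn


def count_parameters(module: nn.Module) -> int:
    """Hypothetical helper: total number of scalar parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# Same layers as in the question, assuming input_channels=1 and output_dim=3.
architecture = nn.ModuleDict({
    "conv1": nn.Sequential(nn.Conv2d(1, 128, (3, 3)), nn.ReLU(inplace=True)),
    "conv2": nn.Sequential(
        nn.Conv2d(128, 64, (3, 3)), nn.ReLU(inplace=True), nn.BatchNorm2d(64)
    ),
    "pool1": nn.MaxPool2d((2, 2)),
    "conv3": nn.Sequential(nn.Conv2d(64, 16, (3, 3), stride=2), nn.ReLU(inplace=True)),
    "conv4": nn.Sequential(nn.Conv2d(16, 10, (3, 3)), nn.ReLU(inplace=True)),
    "pool2": nn.Sequential(nn.MaxPool2d((2, 2)), nn.BatchNorm2d(10)),
    "feedforward": nn.Sequential(
        nn.Flatten(),
        nn.Linear(160, 128),
        nn.ReLU(inplace=True),
        nn.Dropout(0.3),
        nn.Linear(128, 3),
        nn.Softmax(dim=1),
    ),
})

# Parameter count of each named block, in insertion order.
per_block = {name: count_parameters(block) for name, block in architecture.items()}
total = count_parameters(architecture)

for name, count in per_block.items():
    print(f"{name}: {count}")
print("total:", total)  # total: 106897
```

Pooling blocks without batch norm contribute no parameters, so pool1 reports 0.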