Fixing incorrect dimensions in a PyTorch neural network


I am trying to train my neural network, written in PyTorch, but I get the following traceback because of a dimension error:

Traceback (most recent call last):
  File "plot_parametric_pytorch.py", line 139, in <module>
    ops = opfun(X_train[smpl])
  File "plot_parametric_pytorch.py", line 92, in <lambda>
    opfun = lambda X: model.forward(Variable(torch.from_numpy(X)))
  File "/mnt_home/klee/LBSBGenGapSharpnessResearch/deepnet.py", line 77, in forward
    x = self.features(x)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/nn/modules/pooling.py", line 141, in forward
    self.return_indices)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/_jit_internal.py", line 209, in fn
    return if_false(*args, **kwargs)
  File "/home/klee/anaconda3/envs/sharpenv/lib/python3.7/site-packages/torch/nn/functional.py", line 539, in _max_pool2d
    input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: Given input size: (512x1x1). Calculated output size: (512x0x0). Output size is too small

This happens as soon as the forward pass runs. I'm fairly sure it's a small mistake, but I'm new to writing PyTorch code myself, so I'm not sure where it is. For reference, when I checked the dimensions of the Keras version of the model with model.summary(), the final size before flattening and adding the dense layers (which I believe corresponds to self.classifier in the PyTorch version, though I'm not sure about that either) is 512 x 1 x 1.
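A minimal shape-tracing sketch like the one below (my own debugging snippet, assuming the VGG class shown further down and a dummy CIFAR-10-sized input) prints the output shape after every layer in model.features, so the printout stops right at the layer whose output would shrink to 0x0:

# Not part of the original script: push a dummy batch through model.features
# one layer at a time and print each output shape.
import torch

dbg_model = VGG(num_classes=10).eval()   # eval() so BatchNorm works on a tiny batch
x = torch.zeros(2, 3, 32, 32)            # CIFAR-10 images are 3 x 32 x 32
with torch.no_grad():
    for i, layer in enumerate(dbg_model.features):
        x = layer(x)                     # raises at the layer that fails
        print(i, layer.__class__.__name__, tuple(x.shape))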

Here is my model in PyTorch:

class VGG(nn.Module):
    def __init__(self, num_classes=10):
        super(VGG, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Dropout(0.3),
            nn.Conv2d(64, 64, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(64, 128, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(128, 128, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(128, 256, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(256, 256, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(256, 256, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(256, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(512, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.4),
            nn.Conv2d(512, 512, kernel_size=3, padding = 1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(512, 512, bias=False),
            nn.Dropout(0.5),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(-1, 512)
        x = self.classifier(x)
        return F.log_softmax(x)

def cifar10_deep(**kwargs):
    num_classes = kwargs.get('num_classes', 10)
    return VGG(num_classes)


def cifar100_deep(**kwargs):
    num_classes = kwargs.get('num_classes', 100)
    return VGG(num_classes)

And here is the code I am trying to run:

cudnn.benchmark = True
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.astype('float32')
X_train = np.transpose(X_train, axes=(0, 3, 1, 2))
X_test = X_test.astype('float32')
X_test = np.transpose(X_test, axes=(0, 3, 1, 2))
X_train /= 255
X_test /= 255
device = torch.device('cuda:0')

# This is where you can load any model of your choice.
# I stole PyTorch Vision's VGG network and modified it to work on CIFAR-10.
# You can take this line out and add any other network and the code
# should run just fine.
model = cifar_shallow.cifar10_shallow()
#model.to(device)

# Forward pass
opfun = lambda X: model.forward(Variable(torch.from_numpy(X)))

# Forward pass through the network given the input
predsfun = lambda op: np.argmax(op.data.numpy(), 1)

# Do the forward pass, then compute the accuracy
accfun = lambda op, y: np.mean(np.equal(predsfun(op), y.squeeze()))*100

# Initial point
x0 = deepcopy(model.state_dict())

# Number of epochs to train for
# Choose a large value since LB training needs higher values
# Changed from 150 to 30
nb_epochs = 30 
batch_range = [25, 40, 50, 64, 80, 128, 256, 512, 625, 1024, 1250, 1750, 2048, 2500, 3125, 4096, 4500, 5000]

# parametric plot (i.e., don't train the network if set to True)
hotstart = False

if not hotstart:
    for batch_size in batch_range:
        optimizer = torch.optim.Adam(model.parameters())
        model.load_state_dict(x0)
        #model.to(device)
        average_loss_over_epoch = '-'
        print('Optimizing the network with batch size %d' % batch_size)
        np.random.seed(1337) #So that both networks see same sequence of batches
        for e in range(nb_epochs):
            model.eval()
            print('Epoch:', e, ' of ', nb_epochs, 'Average loss:', average_loss_over_epoch)
            average_loss_over_epoch = 0

            # Checkpoint the model every epoch
            torch.save(model.state_dict(), "./models/ShallowNetCIFAR10BatchSize" + str(batch_size) + ".pth")
            array = np.random.permutation(range(X_train.shape[0]))
            slices = X_train.shape[0] // batch_size
            beginning = 0
            end = 1

            # Training loop!
            for _ in range(slices):
                start_index = batch_size * beginning 
                end_index = batch_size * end
                smpl = array[start_index:end_index]
                model.train()
                optimizer.zero_grad()
                ops = opfun(X_train[smpl])  # <<----- error in this line
                tgts = Variable(torch.from_numpy(y_train[smpl]).long().squeeze())
                loss_fn = F.nll_loss(ops, tgts)
                average_loss_over_epoch += loss_fn.data.numpy() / (X_train.shape[0] // batch_size)
                loss_fn.backward()
                optimizer.step()
                beginning += 1
                end += 1

I would like to know where my model went wrong. I am writing a PyTorch version of the following Keras model. Any help in fixing the small mistake would be greatly appreciated!


def deepnet(nb_classes):
    global img_size
    model = Sequential()
    model.add(Conv2D(64, (3, 3), input_shape=img_size))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(Dropout(0.3))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))




    model.add(Conv2D(128, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(128, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))



    model.add(Conv2D(256, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(256, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(256, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))



    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))



    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu')); model.add(Dropout(0.4))
    model.add(Conv2D(512, (3, 3), padding='same'))
    model.add(BatchNormalization(axis=1))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))


    model.add(Flatten()); model.add(Dropout(0.5))
    model.add(Dense(512))
    model.add(BatchNormalization())
    model.add(Activation('relu')); model.add(Dropout(0.5))
    model.add(Dense(nb_classes, activation='softmax'))
    return model

Please let me know if there is a problem with the way I converted the neural network model from Keras to PyTorch. As I understand it, because padding='same' is set in Keras, the padding in PyTorch should always be equal to 1.


python tensorflow keras neural-network pytorch
1 Answer

The first convolution does not use padding.
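Spelling that out (a sketch of one possible fix, not necessarily the only way to reconcile the two models): with a 32x32 CIFAR-10 input, the unpadded first convolution shrinks the feature map to 30x30, and the five stride-2 max-pools floor it down as 30 → 15 → 7 → 3 → 1, so the last pool receives a 1x1 map and computes the 512x0x0 output in the traceback. Giving the first nn.Conv2d the same padding=1 as the other 3x3 convolutions keeps the map at 32x32, and the poolings then go 32 → 16 → 8 → 4 → 2 → 1, matching the 512 x 1 x 1 reported by the Keras model.summary().

import torch
import torch.nn as nn

# Hypothetical one-line fix: pad the first convolution like the rest
# of the 3x3 convolutions so the spatial size is preserved.
first_conv = nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False)
out = first_conv(torch.zeros(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32]) -- stays 32x32 instead of 30x30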
