How do I feed a single image into a PyTorch CNN?


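For some reason, I can't feed a single image into my CNN in PyTorch.

I trained the network and evaluated it on a test set, but when I try to feed it a new image, the dimensions inside the network no longer match.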

tf = transforms.Compose([transforms.ToTensor(),
                         transforms.Resize((32,32)),
                         transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5))
                         ])

dataset = ImageFolder(path, transform=tf)

train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [train_size, test_size])

batch_size = 4
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size)

This is what I use to load the data. I then trained a model with the following architecture:

class CNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential(
            #Input = 3 x 32 x 32, Output = 32 x 32 x 32
            torch.nn.Conv2d(in_channels = 3, out_channels = 32, kernel_size = 3, padding = 1), 
            torch.nn.ReLU(),
            #Input = 32 x 32 x 32, Output = 32 x 16 x 16
            torch.nn.MaxPool2d(kernel_size=2),
  
            #Input = 32 x 16 x 16, Output = 64 x 16 x 16
            torch.nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, padding = 1),
            torch.nn.ReLU(),
            #Input = 64 x 16 x 16, Output = 64 x 8 x 8
            torch.nn.MaxPool2d(kernel_size=2),
              
            #Input = 64 x 8 x 8, Output = 64 x 8 x 8
            torch.nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, padding = 1),
            torch.nn.ReLU(),
            #Input = 64 x 8 x 8, Output = 64 x 4 x 4
            torch.nn.MaxPool2d(kernel_size=2),
  
            torch.nn.Flatten(),
            torch.nn.Linear(64*4*4, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, 10)
        )
  
    def forward(self, x):
        return self.model(x)

I evaluate the model on the test set with the following code:

test_acc=0
model.eval()
  
with torch.no_grad():
    #Iterating over the test dataset in batches
    for i, (images, labels) in enumerate(test_loader):
          
        images = images.to(device)
        y_true = labels.to(device)
          
        #Calculating outputs for the batch being iterated
        outputs = model(images)
          
        #Calculated prediction labels from models
        _, y_pred = torch.max(outputs.data, 1)
          
        #Comparing predicted and true labels
        test_acc += (y_pred == y_true).sum().item()
      
    print(f"Test set accuracy = {100 * test_acc / len(test_dataset)} %")

That works fine. But when I try to feed in a single image with the following code:

path = "C:/Users/nyden/new_image.jpg"

tf = transforms.Compose([transforms.ToTensor(),
                         transforms.Resize((32,32)),
                         transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5))
                         ])

img = Image.open(path)
img_tf = tf(img).float()
model.eval()
with torch.no_grad():
    out = model.forward(img_tf)
    _, y_pred = torch.max(out.data, 1)
    print(y_pred)

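I just get the error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x16 and 1024x512)

I don't understand why the dimensions are wrong when I feed in a single image instead of a batch. Any help would be greatly appreciated.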

python pytorch conv-neural-network torch torchvision
1 Answer

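You need to feed the image in with the shape (batch_size, C, H, W). For a single image, add an extra leading dimension so that its shape is (1, C, H, W), for example with either of the following: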

# Index with None to insert a batch dimension at position 0
img_tf = tf(img).float()[None, ...]

# or, equivalently, use unsqueeze(0)
img_tf = tf(img).float().unsqueeze(0)
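
To see where the 64x16 in the error message comes from, it can help to trace the shapes. The sketch below rebuilds the layer stack from the question with random weights and uses a random tensor as a stand-in for the transformed image. The names `features` and `classifier` are introduced here for illustration only, and the sketch assumes a recent PyTorch release in which Conv2d and MaxPool2d accept unbatched (C, H, W) input. Without a batch dimension, the 64 output channels end up in the position where Flatten expects the batch, so the Linear layer sees 64 rows of 16 features instead of 1 row of 1024.

```python
import torch

# Same layers as the CNN in the question, split into two parts so the
# shape after Flatten can be inspected. Weights are random (untrained).
features = torch.nn.Sequential(
    torch.nn.Conv2d(3, 32, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(kernel_size=2),
    torch.nn.Conv2d(32, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(kernel_size=2),
    torch.nn.Conv2d(64, 64, kernel_size=3, padding=1),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(kernel_size=2),
    torch.nn.Flatten(),  # flattens everything after dimension 0
)
classifier = torch.nn.Sequential(
    torch.nn.Linear(64 * 4 * 4, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)

# Stand-in for tf(img): one 3-channel 32x32 image without a batch dimension.
img = torch.rand(3, 32, 32)

with torch.no_grad():
    # Unbatched: the conv stack yields (64, 4, 4); Flatten keeps dimension 0,
    # so the result is (64, 16), which Linear(1024, 512) cannot consume.
    unbatched_shape = tuple(features(img).shape)

    # Batched: add a leading dimension of size 1 before calling the model.
    batched = img.unsqueeze(0)
    batched_shape = tuple(features(batched).shape)

    logits = classifier(features(batched))
    y_pred = logits.argmax(dim=1)

print(unbatched_shape)  # (64, 16)
print(batched_shape)    # (1, 1024)
print(tuple(logits.shape))  # (1, 10)
```

With the batch dimension in place, `y_pred` holds one predicted class index for the single image, in the same way as in the test loop from the question.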