I'm trying to build a data loader, and this is what it looks like:

    import os

    import numpy as np
    import pandas as pd
    import torch
    from PIL import Image
    from torch.utils.data import Dataset
    from torchvision import transforms


    class WhaleData(Dataset):
        def __init__(self, data_file, root_dir, transform=None):
            self.csv_file = pd.read_csv(data_file)
            self.root_dir = root_dir
            self.transform = transforms.Resize(224)

        def __len__(self):
            return len(os.listdir(self.root_dir))

        def __getitem__(self, index):
            image = os.path.join(self.root_dir, self.csv_file['Image'][index])
            image = Image.open(image)
            image = self.transform(image)
            image = np.array(image)
            label = self.csv_file['Image'][index]
            sample = {'image': image, 'label': label}
            return sample


    trainset = WhaleData(data_file='/mnt/55-91e8-b2383e89165f/Ryan/1234/train.csv',
                         root_dir='/mnt/4d55-91e8-b2383e89165f/Ryan/1234/train')
    train_loader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
    for i, batch in enumerate(train_loader):
        (i, batch)

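When I try to run this code, I get the error below. I understand the gist of it: my images are probably not all the same shape, and in fact they are not. But unless I'm mistaken, that should only cause an error when I feed the images to the network, so why is it thrown here? Any suggestions about where I might be going wrong would be extremely helpful, and I'm happy to provide any additional information if needed.

Thanks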

    RuntimeError: Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 116, in default_collate
        return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 116, in <dictcomp>
        return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 105, in default_collate
        return torch.stack([torch.from_numpy(b) for b in batch], 0)
      File "/usr/local/lib/python3.5/dist-packages/torch/functional.py", line 64, in stack
        return torch.cat(inputs, dim)
    RuntimeError: inconsistent tensor sizes at /pytorch/torch/lib/TH/generic/THTensorMath.c:2864

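The error occurs when PyTorch tries to stack the images into a single batch tensor (see `torch.stack([torch.from_numpy(b) for b in batch], 0)` in your traceback). As you mentioned, the stacking fails because the images have different shapes: a tensor of shape `(B, H, W)` can only be created by stacking `B` tensors if all of those tensors have the same shape `(H, W)`. This happens inside the `DataLoader` itself, while it assembles each batch, which is why the error appears before anything reaches the network.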
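The stacking rule can be checked in isolation. The following is a minimal sketch, not taken from the question: the tensor sizes are made up for illustration, and the exact wording of the error message depends on the PyTorch version.

```python
import torch

# Two hypothetical single-channel "images" with the same height
# but different widths (sizes chosen only for illustration).
a = torch.zeros(224, 300)
b = torch.zeros(224, 250)

try:
    torch.stack([a, b], 0)
    stacked_ok = True
except RuntimeError as err:
    # Mismatched (H, W) shapes cannot be combined into one (B, H, W) tensor.
    stacked_ok = False
    print("stack failed:", err)

# Tensors that share the same (H, W) shape stack into a (B, H, W) batch.
c = torch.zeros(224, 300)
batch = torch.stack([a, c], 0)
print(batch.shape)
```

The first call fails for the same reason as the traceback above, while the second produces a batch whose leading dimension is the number of stacked tensors.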
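Note: I'm not entirely sure, but setting `batch_size=1` in `torch.utils.data.DataLoader(...)` might avoid this particular error, because each batch would then hold a single image and there would be no mismatched shapes to stack.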
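One way to try the `batch_size=1` idea without the original data is a small stand-in dataset. This is only a sketch under assumptions: `VariableSizeData`, its list of widths, and the `batch_shapes` helper are hypothetical names invented here, and `num_workers=0` is used so that the error surfaces in the main process.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class VariableSizeData(Dataset):
    """Hypothetical stand-in for a dataset whose images differ in width."""

    def __init__(self):
        self.widths = [300, 250, 280, 224]  # made-up widths

    def __len__(self):
        return len(self.widths)

    def __getitem__(self, index):
        image = torch.zeros(224, self.widths[index])
        return {'image': image, 'label': index}


def batch_shapes(batch_size):
    # Collect the shape of every image batch produced by the loader.
    loader = DataLoader(VariableSizeData(), batch_size=batch_size,
                        shuffle=False, num_workers=0)
    return [tuple(batch['image'].shape) for batch in loader]


# With one image per batch, every batch collates without a shape conflict.
shapes = batch_shapes(1)
print(shapes)

# With two differently sized images per batch, collation raises RuntimeError.
try:
    batch_shapes(2)
    failed = False
except RuntimeError:
    failed = True
print("batch_size=2 failed:", failed)
```

In this sketch each single-image batch still gains a leading dimension of 1, so the batches differ in shape from one another even though each one collates successfully.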