I want to add images without bounding boxes to my dataset.
When I add images that have no xml file, I get this error.
ValueError Traceback (most recent call last)
Input In [14], in <module>
4 torch.cuda.empty_cache()
6 for epoch in range(num_epochs):
7 # training for one epoch
----> 8 train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
9 # update the learning rate
10 lr_scheduler.step()
File /notebooks/ml639a/pt651m/engine.py:31, in train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq, scaler)
29 targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
30 with torch.cuda.amp.autocast(enabled=scaler is not None):
---> 31 loss_dict = model(images, targets)
32 losses = sum(loss for loss in loss_dict.values())
34 # reduce losses over all GPUs for logging purposes
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1110, in Module._call_impl(self, *input, **kwargs)
1106 # If we don't have any hooks, we want to skip the rest of the logic in
1107 # this function, and just call forward.
1108 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1109 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110 return forward_call(*input, **kwargs)
1111 # Do not call functions when jit is used
1112 full_backward_hooks, non_full_backward_hooks = [], []
...
---> 68 raise ValueError(f"Expected target boxes to be a tensor of shape [N, 4], got {boxes.shape}.")
69 else:
70 raise ValueError(f"Expected target boxes to be of type Tensor, got {type(boxes)}.")
ValueError: Expected target boxes to be a tensor of shape [N, 4], got torch.Size([0]).
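I saw this: https://github.com/pytorch/vision/releases/tag/v0.6.0
It's now possible to feed training images to Faster / Mask / Keypoint R-CNN that do not contain any positive annotations. This enables increasing the number of negative samples during training. For those images, the annotations expect a tensor with 0 in the number of objects dimension, ...
It is also mentioned here: https://github.com/pytorch/vision/issues/2144 and here: https://discuss.pytorch.org/t/can-i-feed-a-model-with-some-background-only-images/76279/6
Here is my __getitem__, based on the references above.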
def __getitem__(self, idx):
    img_name = self.imgs[idx]
    image_path = os.path.join(self.files_dir, img_name)
    ...
    # annotation file
    annot_filename = img_name[:-4] + '.xml'
    annot_file_path = os.path.join(self.files_dir, annot_filename)
    boxes = []
    labels = []
    # if there is an xml file then parse it, otherwise
    if os.path.exists(annot_file_path):
        tree = et.parse(annot_file_path)
        root = tree.getroot()
        # cv2 image gives size as height x width
        wt = img.shape[1]
        ht = img.shape[0]
        # box coordinates for xml files are extracted and corrected for image size given
        for member in root.findall('object'):
            labels.append(self.classes.index(member.find('name').text))
            # bounding box
            xmin = int(member.find('bndbox').find('xmin').text)
            xmax = int(member.find('bndbox').find('xmax').text)
            ymin = int(member.find('bndbox').find('ymin').text)
            ymax = int(member.find('bndbox').find('ymax').text)
            xmin_corr = (xmin/wt)*self.width
            xmax_corr = (xmax/wt)*self.width
            ymin_corr = (ymin/ht)*self.height
            ymax_corr = (ymax/ht)*self.height
            boxes.append([xmin_corr, ymin_corr, xmax_corr, ymax_corr])
        # convert boxes into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        # getting the areas of the boxes
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        # suppose all instances are not crowd
        iscrowd = torch.zeros((boxes.shape[0],), dtype=torch.int64)
        labels = torch.as_tensor(labels, dtype=torch.int64)
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["area"] = area
        target["iscrowd"] = iscrowd
        image_id = torch.tensor([idx])
        target["image_id"] = image_id
    else:
        image_id = torch.tensor([idx])
        target = {"boxes": torch.zeros((0, 4), dtype=torch.float32),
                  "labels": torch.zeros(0, dtype=torch.int64),
                  "image_id": torch.tensor([idx]),
                  "area": torch.zeros(0, dtype=torch.float32),
                  "iscrowd": torch.zeros((0,), dtype=torch.int64)}
    if self.transforms:
        sample = self.transforms(image = img_res,
                                 bboxes = target['boxes'],
                                 labels = labels)
        img_res = sample['image']
        target['boxes'] = torch.Tensor(sample['bboxes'])
    return img_res, target
All of the code is here: https://github.com/dgleba/r655q/blob/main/negim/pt651m_ir4f_gi-negim.ipynb
Can anyone see what I am doing wrong?
Try setting bboxes = torch.zeros(0,4) for images without boxes. It worked for me.
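To illustrate why the shape matters, here is a minimal sketch. It assumes the boxes for a background-only image arrive as an empty Python list (the variable name `empty_bboxes` is made up for this example). Converting that list directly gives a 1-D tensor, which matches the `torch.Size([0])` in the error message, while creating the tensor explicitly keeps the `[N, 4]` layout with N = 0.

```python
import torch

# An empty list of boxes, as might be produced for a background-only image.
# (Hypothetical variable name, used only for this illustration.)
empty_bboxes = []

# Converting the empty list directly loses the second dimension.
from_list = torch.Tensor(empty_bboxes)
print(from_list.shape)

# Creating the tensor explicitly keeps the [N, 4] layout with N = 0.
explicit = torch.zeros((0, 4), dtype=torch.float32)
print(explicit.shape)
```

In the question's code the else branch already builds the explicit form, so the 1-D tensor most likely appears later, when the boxes are rebuilt from the transform output.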
torch.Tensor(boxes).reshape(-1, 4) will change it to the expected dimensions.
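As a rough sketch of that reshape, the conversion can be wrapped in a small helper so the same code path handles both empty and non-empty box lists. The helper name `boxes_to_tensor` is hypothetical, and it uses `torch.as_tensor` with an explicit dtype in place of `torch.Tensor`.

```python
import torch

def boxes_to_tensor(bboxes):
    """Hypothetical helper: turn a list of [xmin, ymin, xmax, ymax] boxes
    into a float32 tensor that always has shape [N, 4], including N = 0."""
    return torch.as_tensor(bboxes, dtype=torch.float32).reshape(-1, 4)

# Background-only image: no boxes at all.
no_boxes = boxes_to_tensor([])
print(no_boxes.shape)

# Image with one box (coordinates are made-up example values).
one_box = boxes_to_tensor([(10.0, 20.0, 50.0, 80.0)])
print(one_box.shape)
```

In the question's `__getitem__`, this would presumably stand in for the `torch.Tensor(sample['bboxes'])` line after the transform.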