RuntimeError: tensor shape does not match when masking the image

Question

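I am trying to modify the code at https://github.com/Microsoft/singleshotpose. It is a CNN that uses 9 keypoints (the 8 corners of the bounding box plus the centroid) and trains the network to find those points. I want to train on only 4 keypoints instead of the 9 used in the original code. I have been changing the code step by step, but I have now hit an error that I cannot resolve.

Here is the relevant snippet from the loss function: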

def build_targets(pred_corners, target, num_keypoints, num_anchors, num_classes, nH, nW, noobject_scale, object_scale, sil_thresh, seen):
    nB = target.size(0)
    nA = num_anchors
    nC = num_classes
    conf_mask   = torch.ones(nB, nA, nH, nW) * noobject_scale
    coord_mask  = torch.zeros(nB, nA, nH, nW)
    cls_mask    = torch.zeros(nB, nA, nH, nW)
    txs = list()
    tys = list()
    for i in range(num_keypoints):
        txs.append(torch.zeros(nB, nA, nH, nW))
        tys.append(torch.zeros(nB, nA, nH, nW)) 
    tconf = torch.zeros(nB, nA, nH, nW)
    tcls  = torch.zeros(nB, nA, nH, nW) 

    num_labels = 2 * num_keypoints + 3 # +2 for width, height and +1 for class within label files
    nAnchors = nA*nH*nW
    nPixels  = nH*nW
    for b in range(nB):
        cur_pred_corners = pred_corners[b*nAnchors:(b+1)*nAnchors].t()
        cur_confs = torch.zeros(nAnchors)
        for t in range(50):
            if target[b][t*num_labels+1] == 0:
                break
            g = list()
            for i in range(num_keypoints):
                g.append(target[b][t*num_labels+2*i+1])
                g.append(target[b][t*num_labels+2*i+2])

            cur_gt_corners = torch.FloatTensor(g).repeat(nAnchors,1).t() # 16 x nAnchors
            cur_confs  = torch.max(cur_confs, corner_confidences(cur_pred_corners, cur_gt_corners)).view_as(conf_mask[b]) # some irrelevant areas are filtered, in the same grid multiple anchor boxes might exceed the threshold

        conf_mask[b][cur_confs>sil_thresh] = 0

The error I get is:

File "c:\python_work\singleshotpose\region_loss2.py", line 56, in build_targets
cur_confs  = torch.max(cur_confs, corner_confidences(cur_pred_corners, cur_gt_corners)).view_as(conf_mask[b]) # some irrelevant areas are filtered, in the same grid multiple anchor boxes might exceed the threshold
RuntimeError: The size of tensor a (13) must match the size of tensor b (169) at non-singleton dimension 2

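It tells me that the tensor sizes differ. As far as I understand, corner_confidences computes the distance between the predicted corners and the ground-truth corners, and torch.max then returns the element-wise maximum of the initialized confidences and the new ones. When I leave out .view_as, the code works on the first iteration and stops on the second, at the comparison with sil_thresh.

These are the results of printing .size() for gt_corners, pr_corners and conf_mask. I have also added the sizes of the other variables for context: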

nB = 8, nA = 1, nC = 1, nH = 13, nW = 13, num_keypoints = 4
cur_pred_corners=torch.Size([8, 169])
cur_gt_corners=torch.Size([8, 169])
conf_mask=torch.Size([8, 1, 13, 13])
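
A minimal sketch, assuming the usual trailing-dimension broadcasting rule that PyTorch follows: the sizes listed above can be checked in plain Python, without torch. `broadcast_shape` is a hypothetical helper written only for this illustration; it is not part of singleshotpose or PyTorch. It also assumes corner_confidences returns a flat tensor of 169 values, as its docstring suggests. Under those assumptions, the first pass through the `t` loop succeeds because both operands are flat, while a second pass would combine the already reshaped (1, 13, 13) cur_confs with a fresh (169,) result, which is consistent with the sizes and dimension named in the error message.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError.

    Hypothetical helper: sizes are compared from the last dimension
    backwards and must be equal, or one of them must be 1.
    """
    ndim = max(len(a), len(b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    result = []
    for dim, (x, y) in enumerate(zip(a, b)):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"size {x} must match size {y} at dimension {dim}")
        # When the sizes differ, the one that is not 1 wins.
        result.append(y if x == 1 else x)
    return tuple(result)


n_anchors = 1 * 13 * 13   # nA * nH * nW
flat = (n_anchors,)       # cur_confs as initialized by torch.zeros(nAnchors)
grid = (1, 13, 13)        # shape of conf_mask[b], i.e. cur_confs after .view_as

# First pass over t: both operands are flat, so they combine without trouble.
first = broadcast_shape(flat, flat)

# Second pass over t: cur_confs is already (1, 13, 13), the new values are (169,).
try:
    broadcast_shape(grid, flat)
    second_error = None
except ValueError as exc:
    second_error = str(exc)

print(first)
print(second_error)
```

If this reading is right, the reshape would have to be applied consistently to both operands, or only once after the loop, but that is an assumption about the intended logic rather than something the original repository confirms here.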

I am also including the corner_confidences function for clarity:

def corner_confidences(gt_corners, pr_corners, th=80, sharpness=2, im_width=640, im_height=480):
    ''' gt_corners: Ground-truth 2D projections of the 3D bounding box corners, shape: (16 x nA), type: torch.FloatTensor
        pr_corners: Prediction for the 2D projections of the 3D bounding box corners, shape: (16 x nA), type: torch.FloatTensor
        th        : distance threshold, type: int
        sharpness : sharpness of the exponential that assigns a confidence value to the distance
        -----------
        return    : a torch.FloatTensor of shape (nA,) with 9 confidence values 
    '''
    print(f"gt_corners: {gt_corners.size()}")
    print(f"pr_corners: {pr_corners.size()}")
    shape = gt_corners.size()
    torch.set_printoptions(threshold=sys.maxsize)
    print(f"test: {shape}")
    nA = shape[1]  
    dist = gt_corners - pr_corners
    num_el = dist.numel()
    num_keypoints = num_el//(nA*2)
    dist = dist.t().contiguous().view(nA, num_keypoints, 2)
    dist[:, :, 0] = dist[:, :, 0] * im_width
    dist[:, :, 1] = dist[:, :, 1] * im_height

    eps = 1e-5
    distthresh = torch.FloatTensor([th]).repeat(nA, num_keypoints) 
    dist = torch.sqrt(torch.sum((dist)**2, dim=2)).squeeze() # nA x 9
    mask = (dist < distthresh).type(torch.FloatTensor)
    conf = torch.exp(sharpness*(1 - dist/distthresh))-1  # mask * (torch.exp(math.log(2) * (1.0 - dist/rrt)) - 1)
    conf0 = torch.exp(sharpness*(1 - torch.zeros(conf.size(0),1))) - 1
    conf = conf / conf0.repeat(1, num_keypoints)
    # conf = 1 - dist/distthresh
    conf = mask * conf  # nA x 9
    mean_conf = torch.mean(conf, dim=1)
    return mean_conf

I have been stuck on this error for a few days and cannot find a solution. I hope someone can help. Thanks, everyone!
