Understanding `detection_boxes` in the output dictionary when running inference (object detection) on an image

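Here is a code snippet, taken from the TensorFlow tutorial, that feeds an image to the model and returns an output dictionary containing the model's inference results.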

    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis,...]

    # Run inference
    model_fn = self.model.signatures['serving_default']
    output_dict = model_fn(input_tensor)

    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key:value[0, :num_detections].numpy()
                    for key,value in output_dict.items()}
    output_dict['num_detections'] = num_detections

    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    # Handle models with masks:
    if 'detection_masks' in output_dict:
        # Reframe the bbox masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                           tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
    return output_dict

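I have been printing `output_dict` and trying to understand everything it contains. When I look at `detection_boxes`, I can see that it holds the minimum and maximum x and y values of each bounding box. However, I noticed that these values are always very small decimal numbers. I am not sure why that is, and I hope someone here can help me understand it. Any input on this is appreciated.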
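For context, models exported with the TensorFlow Object Detection API usually report `detection_boxes` in normalized coordinates: each row is `[ymin, xmin, ymax, xmax]`, expressed as a fraction of the image height or width, so every value falls between 0 and 1. Assuming that is what this model does, the sketch below scales such boxes back to pixel coordinates. The helper name `boxes_to_pixels` and the sample values are hypothetical and not part of the tutorial.

```python
import numpy as np


def boxes_to_pixels(boxes, image_height, image_width):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates.

    Assumes the boxes are fractions of the image size in the range [0, 1].
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    # y values scale with the height, x values with the width.
    scale = np.array(
        [image_height, image_width, image_height, image_width],
        dtype=np.float32,
    )
    return np.rint(boxes * scale).astype(np.int64)


# Hypothetical box covering the right half of the image, vertically centered.
example_boxes = np.array([[0.25, 0.5, 0.75, 1.0]])
pixel_boxes = boxes_to_pixels(example_boxes, image_height=480, image_width=640)
print(pixel_boxes)  # [[120 320 360 640]]
```

With the snippet above, the same helper could be called as `boxes_to_pixels(output_dict['detection_boxes'], image.shape[0], image.shape[1])`, provided `image` is a height-by-width-by-channels array.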

python tensorflow object-detection