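I want to propagate polygon labels from a source image to a target image. The target image is just the source image, slightly translated. I found this code snippet that lets me register the source image to the target image. Written as a function, it becomes: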
import numpy as np
import cv2

def register_images(
    align: np.ndarray,
    reference: np.ndarray,
):
    """
    Registers two RGB images with each other.
    Args:
        align: Image to be aligned.
        reference: Reference image to be used for alignment.
    Returns:
        Registered image and transformation matrix.
    """
    # Convert to grayscale if needed
    _align = align.copy()
    _reference = reference.copy()
    if _align.shape[-1] == 3:
        _align = cv2.cvtColor(_align, cv2.COLOR_RGB2GRAY)
    if _reference.shape[-1] == 3:
        _reference = cv2.cvtColor(_reference, cv2.COLOR_RGB2GRAY)
    height, width = _reference.shape
    # Create ORB detector with 500 features
    orb_detector = cv2.ORB_create(500)
    # Find the keypoints and descriptors
    # The first arg is the image, second arg is the mask (not required in this case).
    kp1, d1 = orb_detector.detectAndCompute(_align, None)
    kp2, d2 = orb_detector.detectAndCompute(_reference, None)
    # Match features between the two images
    # We create a brute-force matcher with Hamming distance as the measurement mode.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    # Match the two sets of descriptors
    matches = list(matcher.match(d1, d2))
    # Sort matches by Hamming distance and keep the best 90% of them
    matches.sort(key=lambda x: x.distance)
    matches = matches[:int(len(matches) * 0.9)]
    no_of_matches = len(matches)
    # Define empty matrices of shape no_of_matches * 2
    p1 = np.zeros((no_of_matches, 2))
    p2 = np.zeros((no_of_matches, 2))
    for i in range(len(matches)):
        p1[i, :] = kp1[matches[i].queryIdx].pt
        p2[i, :] = kp2[matches[i].trainIdx].pt
    # Find the homography matrix and use it to transform the colored image wrt the reference
    homography, mask = cv2.findHomography(p1, p2, cv2.RANSAC)
    transformed_img = cv2.warpPerspective(align, homography, (width, height))
    return transformed_img, homography
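Now I have access to the transformed image and the homography matrix used to align the two images. What I don't understand is how to apply the same transformation to the polygons and bounding boxes used to annotate the image.
In particular, the annotations are in COCO format, which means you can access the bounding box coordinates as follows: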
x0, y0, width, height = bounding_box
The segmentation annotations are lists of polygon coordinates:
segmentations = [poly1, poly2, poly3, ...] # segmentations are a list of polygons
for poly in segmentations:
    x_coords = poly[0::2] # x coordinates are integer values on the even index in the poly list
    y_coords = poly[1::2] # y coordinates are integer values on the odd index in the poly list
Once I have the x and y coordinates, how do I apply the homography matrix?
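Whatever type of box you start with, you have to compute its corner points. Now it is a polygon.
Given any polygon, pass it through perspectiveTransform() together with the homography matrix. You may or may not need to invert the homography, depending on how you computed it; np.linalg.inv() does that.
If you transform an axis-aligned bounding box this way, it may no longer be axis-aligned afterwards, because of what the homography does (perspective, shear, rotation, ...). If you need an axis-aligned box around the transformed box or polygon, call boundingRect() on the point set.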