I am running an experiment in which 3 fixed, synchronized cameras record a person moving in a bounded space that is fully covered by each camera's field of view. Using a pretrained AI model, I detect 20+ pose landmarks in every frame, which amounts to hundreds of 2D points per video. My goal is to represent these landmarks as a 3D point cloud using OpenCV (Python). I know this is feasible for 2 cameras using cv2.triangulatePoints(), and I used that approach to obtain a 3D point set for every possible camera pair (camera 1 - camera 2, camera 2 - camera 3, ...). However, these sets cannot simply be merged, because each one is expressed in a different coordinate system. How can I "triangulate" the 3D points using data from all 3 cameras rather than an arbitrary pair?
I tried merging the 3D sets using the extrinsic matrices obtained from cv2.recoverPose(), only to realize that this is not feasible. I then ran cv2.solvePnP() and obtained rotation and translation matrices, but I don't know what exactly they represent.
Here is the kind of code I use:
import numpy as np
import cv2

def getProjectionMat(pts_l, pts_r, K_l, K_r):
    # Normalize pixel coordinates with each camera's intrinsics
    pts_l_norm = cv2.undistortPoints(np.expand_dims(pts_l, axis=1), cameraMatrix=K_l, distCoeffs=None)
    pts_r_norm = cv2.undistortPoints(np.expand_dims(pts_r, axis=1), cameraMatrix=K_r, distCoeffs=None)
    # Essential matrix and relative pose of the right camera w.r.t. the left one
    E, mask = cv2.findEssentialMat(pts_l_norm, pts_r_norm, focal=1.0, pp=(0., 0.),
                                   method=cv2.RANSAC, prob=0.999, threshold=3.0)
    points, R, t, mask = cv2.recoverPose(E, pts_l_norm, pts_r_norm)
    # The left camera is taken as the origin of this pair's coordinate system
    M_r = np.hstack((R, t))
    M_l = np.hstack((np.eye(3, 3), np.zeros((3, 1))))
    P_l = np.dot(K_l, M_l)
    P_r = np.dot(K_r, M_r)
    return P_l, P_r, R, t

def get3DPoints(points1, points2, P1, P2):
    point_4d_hom = cv2.triangulatePoints(P1, P2, np.expand_dims(points1, axis=1),
                                         np.expand_dims(points2, axis=1))
    # Convert from homogeneous to Euclidean coordinates
    point_4d = point_4d_hom / np.tile(point_4d_hom[-1, :], (4, 1))
    return point_4d[:3, :].T
# ptpL1 are the 2D points of camera 1, ptpL2 are the corresponding points in camera 2, etc.
# K1, K2, etc. are the intrinsic matrices
P1_1, P2_1, xR1, xt1 = getProjectionMat(ptpL1, ptpL2, K1, K2)  # projection between cams 1 and 2
P1_2, P2_2, xR2, xt2 = getProjectionMat(ptpL2, ptpL3, K2, K3)  # ... between cams 2 and 3
P1_3, P2_3, xR3, xt3 = getProjectionMat(ptpL1, ptpL3, K1, K3)  # ... between cams 1 and 3

# get2DPointForCam(x, markOfInterest) returns the 2D coordinates of the given landmark for cam x
for markOfInterest in marksOfInterest:
    point2D1 = get2DPointForCam(1, markOfInterest)
    point2D2 = get2DPointForCam(2, markOfInterest)
    point2D3 = get2DPointForCam(3, markOfInterest)
    point3D1 = get3DPoints(point2D1, point2D2, P1_1, P2_1)[0]
    point3D2 = get3DPoints(point2D2, point2D3, P1_2, P2_2)[0]
    point3D3 = get3DPoints(point2D1, point2D3, P1_3, P2_3)[0]
# Problem: point3D1 and point3D3 are close, but point3D2 has a completely different value.
# Still, when projected onto camera 2, it is well positioned.
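For completeness, here is what I assume the N-view version should look like once all projection matrices live in one shared world frame: a direct linear transform (DLT) that stacks two linear equations per view and solves by SVD. This sketch is my own attempt, and the three camera setups in the self-check are synthetic:

```python
import numpy as np

def triangulate_nviews(proj_mats, pts_2d):
    """DLT triangulation of one landmark seen by N >= 2 cameras.

    proj_mats: list of 3x4 projection matrices, all in the SAME world frame
    pts_2d:    list of corresponding (u, v) pixel observations
    """
    A = []
    for P, (u, v) in zip(proj_mats, pts_2d):
        # From x ~ P X: u * (P[2] . X) - (P[0] . X) = 0, and likewise for v
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(A))
    X = vt[-1]                 # null vector = homogeneous 3D point
    return X[:3] / X[3]

# Quick self-check with three synthetic cameras sharing one world frame
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
poses = [np.hstack([np.eye(3), np.array([[tx], [0.], [0.]])])
         for tx in (0., -1., 1.)]
Ps = [K @ M for M in poses]
X_true = np.array([0.2, -0.1, 6.0, 1.0])
obs = [(P @ X_true)[:2] / (P @ X_true)[2] for P in Ps]
X_rec = triangulate_nviews(Ps, obs)   # close to [0.2, -0.1, 6.0]
```

With real data, I suppose the per-pair recoverPose() step would have to be replaced by poses expressed against a common reference (e.g. from solvePnP() on shared world points), since that is what makes the three projection matrices consistent.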