I am currently working on a project that involves vehicle detection and tracking, and estimating and optimizing a cuboid around the vehicle. For this I have already done the detection and tracking, and now I need to find the 3D world coordinates of the image points at the edges of the vehicle's bounding box, estimate the world coordinates of the cuboid's edges, and project it back onto the image to display it.
I am new to computer vision and OpenCV, but to my understanding I only need 4 points on the image whose world coordinates are known, and I can pass them to solvePnP in OpenCV to get the rotation and translation vectors (I already have the camera matrix and distortion coefficients). Then I use Rodrigues to convert the rotation vector into a rotation matrix, concatenate it with the translation vector to get my extrinsic matrix, and multiply the camera matrix by the extrinsic matrix to get my projection matrix. Since my world points lie on the z = 0 plane, I can drop the third column of the projection matrix, which gives the homography that maps 3D world points on that plane to 2D image points. Inverting that homography then gives the mapping from 2D image points back to 3D world points. Finally, I multiply an image point [x, y, 1]^T by the inverse homography matrix to get [wX, wY, w]^T, and divide the whole vector by the scalar w to get [X, Y, 1]^T, whose X and Y values are the world coordinates.
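To make the reasoning explicit (this is just the standard plane-induced homography, written in the same notation as above): with K the camera matrix and r1, r2, r3 the columns of the rotation matrix R, a world point on the z = 0 plane projects as

s [x, y, 1]^T = K [R | t] [X, Y, 0, 1]^T = K [r1, r2, t] [X, Y, 1]^T = H [X, Y, 1]^T

so the homography H is the projection matrix with its third column removed, and an image point maps back to the plane via

[wX, wY, w]^T = H^-1 [x, y, 1]^T, then X = wX / w and Y = wY / w.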
My code looks like this:
#include "opencv2/opencv.hpp"
#include <stdio.h>
#include <iostream>
#include <sstream>
#include <math.h>
#include <conio.h>
using namespace cv;
using namespace std;
Mat cameraMatrix, distCoeffs, rotationVector, rotationMatrix,
translationVector,extrinsicMatrix, projectionMatrix, homographyMatrix,
inverseHomographyMatrix;
Point point;
vector<Point2d> image_points;
vector<Point3d> world_points;
int main()
{
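// Load the camera matrix and distortion coefficients from the calibration file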
FileStorage fs1("intrinsics.yml", FileStorage::READ);
fs1["camera_matrix"] >> cameraMatrix;
cout << "Camera Matrix: " << cameraMatrix << endl << endl;
fs1["distortion_coefficients"] >> distCoeffs;
cout << "Distortion Coefficients: " << distCoeffs << endl << endl;
image_points.push_back(Point2d(275, 204));
image_points.push_back(Point2d(331, 204));
image_points.push_back(Point2d(331, 308));
image_points.push_back(Point2d(275, 308));
cout << "Image Points: " << image_points << endl << endl;
world_points.push_back(Point3d(0.0, 0.0, 0.0));
world_points.push_back(Point3d(1.775, 0.0, 0.0));
world_points.push_back(Point3d(1.775, 4.620, 0.0));
world_points.push_back(Point3d(0.0, 4.620, 0.0));
cout << "World Points: " << world_points << endl << endl;
solvePnP(world_points, image_points, cameraMatrix, distCoeffs, rotationVector, translationVector);
cout << "Rotation Vector: " << endl << rotationVector << endl << endl;
cout << "Translation Vector: " << endl << translationVector << endl << endl;
Rodrigues(rotationVector, rotationMatrix);
cout << "Rotation Matrix: " << endl << rotationMatrix << endl << endl;
hconcat(rotationMatrix, translationVector, extrinsicMatrix);
cout << "Extrinsic Matrix: " << endl << extrinsicMatrix << endl << endl;
projectionMatrix = cameraMatrix * extrinsicMatrix;
cout << "Projection Matrix: " << endl << projectionMatrix << endl << endl;
double p11 = projectionMatrix.at<double>(0, 0),
p12 = projectionMatrix.at<double>(0, 1),
p14 = projectionMatrix.at<double>(0, 3),
p21 = projectionMatrix.at<double>(1, 0),
p22 = projectionMatrix.at<double>(1, 1),
p24 = projectionMatrix.at<double>(1, 3),
p31 = projectionMatrix.at<double>(2, 0),
p32 = projectionMatrix.at<double>(2, 1),
p34 = projectionMatrix.at<double>(2, 3);
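// The world points lie on z = 0, so dropping the third column of P yields the 3x3 homography from world (X, Y) to image (x, y)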
homographyMatrix = (Mat_<double>(3, 3) << p11, p12, p14, p21, p22, p24, p31, p32, p34);
cout << "Homography Matrix: " << endl << homographyMatrix << endl << endl;
inverseHomographyMatrix = homographyMatrix.inv();
cout << "Inverse Homography Matrix: " << endl << inverseHomographyMatrix << endl << endl;
Mat point2D = (Mat_<double>(3, 1) << image_points[0].x, image_points[0].y, 1);
cout << "First Image Point" << point2D << endl << endl;
Mat point3Dw = inverseHomographyMatrix*point2D;
cout << "Point 3D-W : " << point3Dw << endl << endl;
double w = point3Dw.at<double>(2, 0);
cout << "W: " << w << endl << endl;
Mat matPoint3D;
divide(w, point3Dw, matPoint3D);
cout << "Point 3D: " << matPoint3D << endl << endl;
_getch();
return 0;
}
I have obtained the image coordinates of the four known world points and hard-coded them for simplicity. image_points contains the image coordinates of the four points and world_points contains their world coordinates. I take the first world point as the origin (0, 0, 0) of the world axes and compute the coordinates of the other three points from known distances. Now, after computing the inverse homography matrix, I multiply it by [image_points[0].x, image_points[0].y, 1]^T, which corresponds to the world coordinate (0, 0, 0). I then divide the result by its third component w to get [X, Y, 1]. But when I print out the values of X and Y, they are not 0 and 0 respectively. What am I doing wrong?
The output of my code is:
Camera Matrix: [517.0036881709533, 0, 320;
0, 517.0036881709533, 212;
0, 0, 1]
Distortion Coefficients: [0.1128663679798094;
-1.487790079922432;
0;
0;
2.300571896761067]
Image Points: [275, 204;
331, 204;
331, 308;
275, 308]
World Points: [0, 0, 0;
1.775, 0, 0;
1.775, 4.62, 0;
0, 4.62, 0]
Rotation Vector:
[0.661476468596541;
-0.02794460022559267;
0.01206996342819649]
Translation Vector:
[-1.394495345140898;
-0.2454153722672731;
15.47126945512652]
Rotation Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758;
0.002297501163799448, 0.7890323093017149, -0.6143474069013439;
0.02979497438726573, 0.6140222623910194, 0.7887261380159]
Extrinsic Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758, -1.394495345140898;
0.002297501163799448, 0.7890323093017149, -0.6143474069013439, -0.2454153722672731;
0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]
Projection Matrix:
[526.3071813531748, 186.086785938988, 240.9673682002232, 4229.846989065414;
7.504351145361707, 538.1053336219271, -150.4099339268854, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]
Homography Matrix:
[526.3071813531748, 186.086785938988, 4229.846989065414;
7.504351145361707, 538.1053336219271, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 15.47126945512652]
Inverse Homography Matrix:
[0.001930136511648154, -8.512427241879318e-05, -0.5103513244724983;
-6.693679705844383e-06, 0.00242178892313387, -0.4917279870709287;
-3.451449134581896e-06, -9.595179260534558e-05, 0.08513443835773901]
First Image Point[275;
204;
1]
Point 3D-W : [0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]
W: 0.0646111
Point 3D: [21.04004290792539;
135.683117651025;
1]
Your reasoning is correct, but you made a mistake in the very last step... or am I missing something?
Your result before the division by W is:
Point 3D-W :
[0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]
Now we need to normalize it by dividing all of its coordinates by W (the third element of the vector), exactly as you described in your question. So:
Point 3D-W Normalized =
[0.003070864657310213 / 0.06461112415423849;
0.0004761913292736786 / 0.06461112415423849;
0.06461112415423849 / 0.06461112415423849]
which gives:
Point 3D-W Normalized =
[0.047528420183179314;
0.007370113668614144;
1.0]
This is very close to [0, 0]. The small residual is the reprojection error of the pose estimated by solvePnP, plus the lens distortion that was not removed (see below).
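As an additional sanity check (not in your code, just a sketch reusing your variables), you can reproject the four world points with the pose recovered by solvePnP; they should land close to your hard-coded image points:

// Sanity check: reproject the known world points with the recovered pose.
// Large deviations from image_points would indicate a bad pose estimate.
std::vector<cv::Point2d> reprojected;
cv::projectPoints(world_points, rotationVector, translationVector,
                  cameraMatrix, distCoeffs, reprojected);
std::cout << "Reprojected points: " << reprojected << std::endl;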
So, as described above, you should divide point3Dw by w, not w by point3Dw:
divide(point3Dw, w, matPoint3D);
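Equivalently (a matter of taste, assuming point3Dw is CV_64F as in your code), you can use the overloaded division operator on cv::Mat:

// Same normalization without cv::divide: Mat supports division by a scalar
Mat matPoint3D = point3Dw / point3Dw.at<double>(2, 0);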
Also, if your camera lens has any kind of distortion, you should undistort the point coordinates before back-projecting them from 2D to 3D:
std::vector<cv::Point2f> point2D, point2DUndistorted; // fill point2D with your distorted pixel coordinates first
// Passing cameraMatrix as the new projection matrix P keeps the output in
// pixel coordinates; without it, undistortPoints returns normalized coordinates,
// and your pixel-based homography would no longer apply.
cv::undistortPoints(point2D, point2DUndistorted, cameraMatrix, distCoeffs,
                    cv::noArray(), cameraMatrix);
Then use point2DUndistorted to compute the back-projected point.
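Putting it together, a minimal sketch of the corrected back-projection (reusing cameraMatrix, distCoeffs and inverseHomographyMatrix from your code; the point value is just your first image point):

// Undistort the image point, keeping pixel coordinates via P = cameraMatrix
std::vector<cv::Point2f> distorted{ cv::Point2f(275.0f, 204.0f) };
std::vector<cv::Point2f> undistorted;
cv::undistortPoints(distorted, undistorted, cameraMatrix, distCoeffs,
                    cv::noArray(), cameraMatrix);
// Back-project through the inverse homography and normalize by w
cv::Mat p = (cv::Mat_<double>(3, 1) << undistorted[0].x, undistorted[0].y, 1.0);
cv::Mat q = inverseHomographyMatrix * p;
q /= q.at<double>(2, 0); // divide the vector by w, not w by the vector
std::cout << "World point (X, Y): " << q.at<double>(0, 0)
          << ", " << q.at<double>(1, 0) << std::endl;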