ROS-3DSLAM: visual estimator section III initial 3

2021@SDUSC

Sunday, November 14, 2021 - Monday, November 15, 2021

1, Background:

This time, continue to analyze the contents of the initial folder. This time, it is for the solution_ 5pts, including the header file and cpp file of the file.

Solve here_ 5pts means to use the five point method to find the essential matrix, and the code is the realization of this function.

Before code analysis, it is necessary to learn the five point method.

In the field of slam and sfm, it is an important task to recover the camera pose and 3D point coordinates. Describing the relationship between 3D points of a scene and the image coordinates of different cameras is called epipolar geometry. The matrices describing epipolar geometric relations usually include fundamental matrix, essential matrix and homography matrix. There are 7-point method and 8-point method to solve the basic matrix; There are 5-point method and 8-point method to solve the basic matrix; The homography matrix is solved by DLT algorithm.

In the epipolar geometric relation matrix, the solution of homography matrix, the 8-point method of basic matrix and essential matrix are collectively referred to as linear solution, while the 7-point method of basic matrix and the 5-point method of essential matrix are referred to as nonlinear solution. In particular, the 5-point method of essential matrix is very complex. Most textbooks calculate the basic matrix first, Then, the essential matrix is solved by using the internal parameter and the basic matrix (such as the solution of the essential matrix in orb2 SLAM). The solution of the 5-point method involves the solution of the 10th degree polynomial, so there are few relevant materials (including the photo grammetry course of Bonn University, and the process of solving the E-matrix by the 5-point method is also skipped directly).

It can be seen that the process of finding the essential matrix by the five point method here is very complex, but if you want to analyze the code, you have to learn the algorithm. Therefore, I decided to browse the basic algorithm flow first, and then conduct further analysis according to the code.

In the process of reading, I found that the original focus of this document is not to realize the calculation of the five point method, but to use

We maintain a frame sliding window to limit the computational complexity. First, we check the feature correspondence between the latest frame and all previous frames. If we can find stable feature tracking (more than 30 tracking features) and sufficient parallax (more than 20 rotation compensated pixels) between the latest frame in the sliding window and any other frame, we recover the relative rotation and the upper scale translation between the two frames using the five point algorithm.

Five point algorithm to get the scale, so the recoverPose function here is the focus.

When dealing with the relationship solveRelativeRT of related frames, the essential matrix is calculated and transmitted to recoverPose for transformation, during which decomposeE can be used for decomposition.

I mentioned the role of testtranslation in the last analysis, but I'm still not sure about the function. There are questions. I don't quite understand why to arrange the structure like this.

In short, the function of this file is to use the five point algorithm to restore the relative rotation and the scale between two frames.

10. Five point algorithm:

P set matrix expression

A extract feature vector

B add constraint equation

C Gaussian elimination method

E solving essential matrix

2, Code analysis:

1. Header file

The data structure is not defined, but the statements of three functions are given. According to the English name, they are "processing related frame relationship", "test triangulation" and "decomposing essential matrix".

decomposeE is as like as two peas in the last [initial_ex_rotation], and here is another one?

Why is it set to private here?

class MotionEstimator
{
  public:

    bool solveRelativeRT(const vector<pair<Vector3d, Vector3d>> &corres, Matrix3d &R, Vector3d &T);

  private:
    double testTriangulation(const vector<cv::Point2f> &l,
                             const vector<cv::Point2f> &r,
                             cv::Mat_<double> R, cv::Mat_<double> t);
    void decomposeE(cv::Mat E,
                    cv::Mat_<double> &R1, cv::Mat_<double> &R2,
                    cv::Mat_<double> &t1, cv::Mat_<double> &t2);
};

2,solve_5pts

#include "solve_5pts.h"


namespace cv {
    //Decomposition essential matrix
    void decomposeEssentialMat( InputArray _E, OutputArray _R1, OutputArray _R2, OutputArray _t )
    {
        Mat E = _E.getMat().reshape(1, 3);
        CV_Assert(E.cols == 3 && E.rows == 3);

        Mat D, U, Vt;
        //svd decomposition method is used
        SVD::compute(E, D, U, Vt);

        if (determinant(U) < 0) U *= -1.;
        if (determinant(Vt) < 0) Vt *= -1.;

        Mat W = (Mat_<double>(3, 3) << 0, 1, 0, -1, 0, 0, 0, 0, 1);
        W.convertTo(W, E.type());

        Mat R1, R2, t;
        R1 = U * W * Vt;
        R2 = U * W.t() * Vt;
        t = U.col(2) * 1.0;

        R1.copyTo(_R1);
        R2.copyTo(_R2);
        t.copyTo(_t);
    }
	//Recover the pose. Note that the inputs here are the essential matrix, the first image, the second image, the camera internal parameters, the rotation matrix from the first frame to the second frame, the translation vector from the first frame to the second frame, and the points marked without abandonment
    int recoverPose( InputArray E, InputArray _points1, InputArray _points2, InputArray _cameraMatrix,
                         OutputArray _R, OutputArray _t, InputOutputArray _mask)
    {
		//The incoming parameters change format
        Mat points1, points2, cameraMatrix;
        _points1.getMat().convertTo(points1, CV_64F);
        _points2.getMat().convertTo(points2, CV_64F);
        _cameraMatrix.getMat().convertTo(cameraMatrix, CV_64F);

        int npoints = points1.checkVector(2);
        CV_Assert( npoints >= 0 && points2.checkVector(2) == npoints &&
                                  points1.type() == points2.type());

        CV_Assert(cameraMatrix.rows == 3 && cameraMatrix.cols == 3 && cameraMatrix.channels() == 1);

        if (points1.channels() > 1)
        {
            points1 = points1.reshape(1, npoints);
            points2 = points2.reshape(1, npoints);
        }
		//Is this adjusted using the camera's reference system?
        double fx = cameraMatrix.at<double>(0,0);
        double fy = cameraMatrix.at<double>(1,1);
        double cx = cameraMatrix.at<double>(0,2);
        double cy = cameraMatrix.at<double>(1,2);

        points1.col(0) = (points1.col(0) - cx) / fx;
        points2.col(0) = (points2.col(0) - cx) / fx;
        points1.col(1) = (points1.col(1) - cy) / fy;
        points2.col(1) = (points2.col(1) - cy) / fy;

        points1 = points1.t();
        points2 = points2.t();

        Mat R1, R2, t;
        decomposeEssentialMat(E, R1, R2, t);
        Mat P0 = Mat::eye(3, 4, R1.type());
        Mat P1(3, 4, R1.type()), P2(3, 4, R1.type()), P3(3, 4, R1.type()), P4(3, 4, R1.type());
        P1(Range::all(), Range(0, 3)) = R1 * 1.0; P1.col(3) = t * 1.0;
        P2(Range::all(), Range(0, 3)) = R2 * 1.0; P2.col(3) = t * 1.0;
        P3(Range::all(), Range(0, 3)) = R1 * 1.0; P3.col(3) = -t * 1.0;
        P4(Range::all(), Range(0, 3)) = R2 * 1.0; P4.col(3) = -t * 1.0;

        // Do the cheirality check.
        // Notice here a threshold dist is used to filter
        // out far away points (i.e. infinite points) since
        // there depth may vary between postive and negtive.
        double dist = 50.0;
        Mat Q;
        triangulatePoints(P0, P1, points1, points2, Q);
        Mat mask1 = Q.row(2).mul(Q.row(3)) > 0;
        Q.row(0) /= Q.row(3);
        Q.row(1) /= Q.row(3);
        Q.row(2) /= Q.row(3);
        Q.row(3) /= Q.row(3);
        mask1 = (Q.row(2) < dist) & mask1;
        Q = P1 * Q;
        mask1 = (Q.row(2) > 0) & mask1;
        mask1 = (Q.row(2) < dist) & mask1;

        triangulatePoints(P0, P2, points1, points2, Q);
        Mat mask2 = Q.row(2).mul(Q.row(3)) > 0;
        Q.row(0) /= Q.row(3);
        Q.row(1) /= Q.row(3);
        Q.row(2) /= Q.row(3);
        Q.row(3) /= Q.row(3);
        mask2 = (Q.row(2) < dist) & mask2;
        Q = P2 * Q;
        mask2 = (Q.row(2) > 0) & mask2;
        mask2 = (Q.row(2) < dist) & mask2;

        triangulatePoints(P0, P3, points1, points2, Q);
        Mat mask3 = Q.row(2).mul(Q.row(3)) > 0;
        Q.row(0) /= Q.row(3);
        Q.row(1) /= Q.row(3);
        Q.row(2) /= Q.row(3);
        Q.row(3) /= Q.row(3);
        mask3 = (Q.row(2) < dist) & mask3;
        Q = P3 * Q;
        mask3 = (Q.row(2) > 0) & mask3;
        mask3 = (Q.row(2) < dist) & mask3;

        triangulatePoints(P0, P4, points1, points2, Q);
        Mat mask4 = Q.row(2).mul(Q.row(3)) > 0;
        Q.row(0) /= Q.row(3);
        Q.row(1) /= Q.row(3);
        Q.row(2) /= Q.row(3);
        Q.row(3) /= Q.row(3);
        mask4 = (Q.row(2) < dist) & mask4;
        Q = P4 * Q;
        mask4 = (Q.row(2) > 0) & mask4;
        mask4 = (Q.row(2) < dist) & mask4;

        mask1 = mask1.t();
        mask2 = mask2.t();
        mask3 = mask3.t();
        mask4 = mask4.t();
		//Filter output
        // If _mask is given, then use it to filter outliers.
        if (!_mask.empty())
        {
            Mat mask = _mask.getMat();
            CV_Assert(mask.size() == mask1.size());
            bitwise_and(mask, mask1, mask1);
            bitwise_and(mask, mask2, mask2);
            bitwise_and(mask, mask3, mask3);
            bitwise_and(mask, mask4, mask4);
        }
        if (_mask.empty() && _mask.needed())
        {
            _mask.create(mask1.size(), CV_8U);
        }

        CV_Assert(_R.needed() && _t.needed());
        _R.create(3, 3, R1.type());
        _t.create(3, 1, t.type());

        int good1 = countNonZero(mask1);
        int good2 = countNonZero(mask2);
        int good3 = countNonZero(mask3);
        int good4 = countNonZero(mask4);

        if (good1 >= good2 && good1 >= good3 && good1 >= good4)
        {
            R1.copyTo(_R);
            t.copyTo(_t);
            if (_mask.needed()) mask1.copyTo(_mask);
            return good1;
        }
        else if (good2 >= good1 && good2 >= good3 && good2 >= good4)
        {
            R2.copyTo(_R);
            t.copyTo(_t);
            if (_mask.needed()) mask2.copyTo(_mask);
            return good2;
        }
        else if (good3 >= good1 && good3 >= good2 && good3 >= good4)
        {
            t = -t;
            R1.copyTo(_R);
            t.copyTo(_t);
            if (_mask.needed()) mask3.copyTo(_mask);
            return good3;
        }
        else
        {
            t = -t;
            R2.copyTo(_R);
            t.copyTo(_t);
            if (_mask.needed()) mask4.copyTo(_mask);
            return good4;
        }
    }
	//Handling different parameters?: What happened?
    int recoverPose( InputArray E, InputArray _points1, InputArray _points2, OutputArray _R,
                         OutputArray _t, double focal, Point2d pp, InputOutputArray _mask)
    {
        Mat cameraMatrix = (Mat_<double>(3,3) << focal, 0, pp.x, 0, focal, pp.y, 0, 0, 1);
        return cv::recoverPose(E, _points1, _points2, cameraMatrix, _R, _t, _mask);
    }
}

//Bring in corresponding parameters
bool MotionEstimator::solveRelativeRT(const vector<pair<Vector3d, Vector3d>> &corres, Matrix3d &Rotation, Vector3d &Translation)
{
    if (corres.size() >= 15)
    {
        vector<cv::Point2f> ll, rr;
        for (int i = 0; i < int(corres.size()); i++)
        {
            ll.push_back(cv::Point2f(corres[i].first(0), corres[i].first(1)));
            rr.push_back(cv::Point2f(corres[i].second(0), corres[i].second(1)));
        }
        // The mask here should be a sliding window( ×) Is the output information, which points have not been discarded (√)
        cv::Mat mask;
        //Finding essential matrix
        cv::Mat E = cv::findFundamentalMat(ll, rr, cv::FM_RANSAC, 0.3 / 460, 0.99, mask);
        cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);
        cv::Mat rot, trans;
        //Reply to the pose, the????
        int inlier_cnt = cv::recoverPose(E, ll, rr, cameraMatrix, rot, trans, mask);
        //cout << "inlier_cnt " << inlier_cnt << endl;
		//R and T are the rotation and translation of the latest frame to frame l????
        Eigen::Matrix3d R;
        Eigen::Vector3d T;
        for (int i = 0; i < 3; i++)
        {   
            T(i) = trans.at<double>(i, 0);
            for (int j = 0; j < 3; j++)
                R(i, j) = rot.at<double>(i, j);
        }

        Rotation = R.transpose();
        Translation = -R.transpose() * T;
        if(inlier_cnt > 12)
            return true;
        else
            return false;
    }
    return false;
}

3, Questions and answers:

1. What does the beginning nameplace cv{} mean? What is its function?

Introduction to cv::Mat
Before using the c + + interface, include the corresponding opencv namespace
After using the #include statement to include the corresponding header file, use the following statement to include the corresponding opencv namespace
using namespace cv;

If there is no such statement, the related resources in this namespace need to be prefixed with cv, such as cv:Mat, which means that the Mat in the namespace cv is used;
With the using namespace cv statement, you can write Mat directly.

It is the namespace used to make use of the function package in open cv. It is more convenient to write in this way.

2. cv::findFundamentalMat, including the recoverPose in the back. Why did you pass in the initialized matrix? mask, rot, trans?

findFundamentalMat function is also a encapsulated algorithm in open cv, which can be used to calculate and solve the essential matrix.

status

For an output array with N elements, if there is no point discarded in the calculation process, the element is set to 1; otherwise, it is set to 0. This array can only be used in the case of methods RANSAC and LMedS; in the case of other methods, status is always set to 1. This parameter is optional.

/**

Mat CV:: findfundamentalmat (returns the essential matrix E between two images solved by RANSAC algorithm)
nputArray points1, array of points of the first image
InputArray points2, array of points of the second image
int method = FM_RANSAC, RANSAC algorithm
double param1 = 3. The maximum distance from the point to the epipolar line. Points exceeding this value will be discarded
double param2 = 0.99, reliability of correct matrix
OutputArray mask = noArray() outputs points that have not been discarded during the calculation
)
/
/*
int cv::recoverPose (get Rt through the essential matrix and return the number of interior points that have passed the chirality check)
InputArray E, essential matrix
InputArray points1, array of points of the first image
InputArray points2, array of points of the second image
InputArray cameraMatrix, camera internal parameters
OutputArray R, rotation matrix from the first frame coordinate system to the second frame coordinate system
OutputArray t, the translation vector from the first frame coordinate system to the second frame coordinate system
InputOutputArray mask = noArray() has no discarded points in findFundamentalMat()
)
*/

Combined with this part of the parameter interpretation table, we can well understand why an initialized matrix is passed in - as a container to accept the calculation results.

3. What is the CV_assert function and what is its function?

CV_ Assert (SRC. Rows + + 3 & & Src. Cols = = 3), actually CV_ The assert() function has basically the same functions as the assert() function in the C + + standard library.

CV_Assert() function: CV_Assert() returns an error message if the value of the expression in parentheses is false.

The source code in the open cv library can be read... I think it's necessary to learn the basic knowledge of opencv, otherwise it's too hard to read

4, Summary:

After two days of hard reading, I think it's really difficult to push forward, so for the time being, put aside the confusion in this article, find a way to supplement some opencv knowledge, and then have an in-depth understanding of the specific principles of this part of the code. Then deal with these problems, otherwise the efficiency is too low

Next, continue to analyze initial or learn opencv