2021SC@SDUSC
Analysis of img_callback in visual_feature
In the previous blog we analyzed one of the callback functions of LVI-SAM, lidar_callback. From that post we know that lidar_callback subscribes to the lidar messages, processes the points it reads in, and then places the point cloud with depth information into shared memory for the other callback function in visual_feature to use. That other callback is exactly what we need to examine in this blog.
From the 9th blog we know that the job of img_callback is to track the feature points of each new image, which is also the main task of this node. Let's begin analyzing this callback function; the rest of this blog studies every detail of it.
0. Initialization and correctness verification
At the beginning, we first check whether this is the first frame. If it is, we initialize the state and record the timestamp of the first image frame.
After that, we need to check whether the camera data stream is stable. If it is not, a restart message is published so that the system can be restarted. The stream is judged unstable when the current frame arrives much later than the previous one (more than one second apart) or carries a timestamp older than the previous frame. The tracker state also has to be reset before the restart flag is published.
double cur_img_time = img_msg->header.stamp.toSec();

// Process the first frame image
if (first_image_flag)
{
    first_image_flag = false;
    first_image_time = cur_img_time;
    last_image_time = cur_img_time;
    return;
}

// Handling unstable data streams
if (cur_img_time - last_image_time > 1.0 || cur_img_time < last_image_time)
{
    ROS_WARN("image discontinue! reset the feature tracker!");
    // Reset
    first_image_flag = true;
    last_image_time = 0;
    pub_count = 1;
    std_msgs::Bool restart_flag;
    restart_flag.data = true;
    pub_restart.publish(restart_flag);
    return;
}
1. Data preparation
In this preparation step, we control the publishing frequency and convert the image encoding.
a. Controlling the publishing frequency
We need to control how often feature points are published. We do not publish feature points for every incoming image frame, because image frames arrive very quickly; to avoid putting an excessive load on ROS's communication mechanism, the publishing frequency has to be throttled.
if (round(1.0 * pub_count / (cur_img_time - first_image_time)) <= FREQ)
{
    PUB_THIS_FRAME = true;
    // reset the frequency control
    if (abs(1.0 * pub_count / (cur_img_time - first_image_time) - FREQ) < 0.01 * FREQ)
    {
        first_image_time = cur_img_time;
        pub_count = 0;
    }
}
else
    PUB_THIS_FRAME = false;
Several of these parameters need to be described.
pub_count       # Number of published images
FREQ: 20        # Target frequency for image optical-flow tracking; the author sets it to 20 Hz in the parameter configuration file
PUB_THIS_FRAME  # Flag indicating whether to publish the feature points of this frame
Therefore, the meaning of the whole control block is: as long as the cumulative number of published frames divided by the time elapsed since the first image is no more than 20 Hz (adjustable), publishing this frame is allowed. In addition, once the measured average rate gets very close to FREQ (within 1% of it), the flow control is reset: the timestamp of the currently received image becomes the new first_image_time and pub_count goes back to zero.
I guess this is done to avoid large fluctuations. For example, if no images arrive for a long time (perhaps because of the network) and then a large batch suddenly arrives, the frames published afterwards could momentarily exceed 20 Hz even though the long-run average stays below it; without the periodic reset, the stale history would keep the computed average low and let such bursts through. Resetting the window keeps the measured rate responsive to recent frames, which is a rather ingenious way to avoid large fluctuations.
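To make the behaviour of this gate concrete, here is a minimal standalone sketch of my own (not LVI-SAM code) that plugs example numbers into the same two conditions; the values are made up purely for illustration.

// Standalone sketch of the publish/reset conditions used above (illustrative only).
#include <cmath>
#include <cstdio>

int main()
{
    const double FREQ = 20.0;           // target publish rate in Hz, as in the config file
    int    pub_count        = 40;       // frames published since first_image_time
    double first_image_time = 0.0;
    double cur_img_time     = 2.0;      // 40 frames over 2 s -> exactly 20 Hz on average

    double rate    = pub_count / (cur_img_time - first_image_time);
    bool   publish = std::round(rate) <= FREQ;               // gate: allow publishing this frame?
    bool   reset   = std::fabs(rate - FREQ) < 0.01 * FREQ;   // within 1% of FREQ -> reset the window

    std::printf("rate = %.2f Hz, publish = %d, reset = %d\n", rate, publish, reset);
    return 0;
}

With these numbers the frame is published and the window is reset; with cur_img_time = 2.5 the average drops to 16 Hz, so the frame is still published but the window keeps accumulating.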
b. Image format adjustment and image reading
We read the sensor_msgs::Image data, convert it to the mono8 format, and receive the result in a cv::Mat named show_img. Here 8UC1 denotes an 8-bit, single-channel grayscale image, and mono8 is simply ROS's name for that same 8UC1 layout. I think the purpose of this step is to convert the image from the ROS standard to the OpenCV standard so that subsequent processing is easier.
A quick word about cv_bridge: its toCvCopy function converts a ROS image message into an OpenCV image. ROS delivers images in its own sensor_msgs/Image message format, but many users want to work on them with OpenCV. CvBridge is a ROS library that provides the interface between ROS and OpenCV; it can be found in the cv_bridge package of the vision_opencv stack.
cv_bridge::CvImageConstPtr ptr;
if (img_msg->encoding == "8UC1")
{
    sensor_msgs::Image img;
    img.header = img_msg->header;
    img.height = img_msg->height;
    img.width = img_msg->width;
    img.is_bigendian = img_msg->is_bigendian;
    img.step = img_msg->step;
    img.data = img_msg->data;
    img.encoding = "mono8";
    ptr = cv_bridge::toCvCopy(img, sensor_msgs::image_encodings::MONO8);
}
else
    ptr = cv_bridge::toCvCopy(img_msg, sensor_msgs::image_encodings::MONO8);
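For completeness, going back in the other direction (cv::Mat to ROS message) also runs through cv_bridge, which this node uses later when publishing the visualization image. The following is a small sketch of my own, assuming only the standard cv_bridge API:

// Wrap a grayscale cv::Mat in a CvImage and serialize it back into a sensor_msgs::Image.
#include <cv_bridge/cv_bridge.h>
#include <sensor_msgs/image_encodings.h>
#include <opencv2/core.hpp>

sensor_msgs::ImagePtr matToRosImage(const cv::Mat &gray, const std_msgs::Header &header)
{
    return cv_bridge::CvImage(header, sensor_msgs::image_encodings::MONO8, gray).toImageMsg();
}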
2. Feature point extraction and optical flow tracking of the latest frame (core)
The core statement of img_callback() appears here: readImage(). This function implements the feature processing and optical flow tracking, and it calls essentially all of the functions in feature_tracker.cpp. First, the code distinguishes between the monocular and stereo cases.
For a monocular camera (and for the first camera of a stereo pair), readImage() is called; this is the core, and we will analyze it in a later blog. For the second camera of a stereo setup, adaptive histogram equalization is applied here directly. EQUALIZE is set to 1 when the lighting is too bright or too dark, and it enables the histogram equalization.
There are also a couple of lines before the loop and some small details; see the comments in the code below.
cv::Mat show_img = ptr->image; // img_msg / img are in sensor_msgs format; the bridge converts the data to cv::Mat for subsequent image processing
TicToc t_r;                    // Class for measuring elapsed time
for (int i = 0; i < NUM_OF_CAM; i++)
{
    ROS_DEBUG("processing camera %d", i);
    if (i != 1 || !STEREO_TRACK)
        trackerData[i].readImage(ptr->image.rowRange(ROW * i, ROW * (i + 1)), cur_img_time);
    else
    {
        if (EQUALIZE)
        {
            cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE();
            clahe->apply(ptr->image.rowRange(ROW * i, ROW * (i + 1)), trackerData[i].cur_img);
        }
        else
            trackerData[i].cur_img = ptr->image.rowRange(ROW * i, ROW * (i + 1));
    }

#if SHOW_UNDISTORTION
    trackerData[i].showUndistortion("undistrotion_" + std::to_string(i));
#endif
}
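Since adaptive histogram equalization shows up both here and inside readImage(), here is a minimal standalone OpenCV example of CLAHE, independent of LVI-SAM. The input file name is hypothetical, and the clip limit of 3.0 with an 8x8 tile grid is the setting I believe readImage() also uses (worth verifying against the source).

// Minimal CLAHE demo: equalize a grayscale image with OpenCV.
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat gray = cv::imread("frame.png", cv::IMREAD_GRAYSCALE); // hypothetical input image
    if (gray.empty())
        return 1;

    // Contrast Limited Adaptive Histogram Equalization: clip limit 3.0, 8x8 tiles.
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
    cv::Mat equalized;
    clahe->apply(gray, equalized);

    cv::imwrite("frame_clahe.png", equalized);
    return 0;
}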
3. Update the global id of the newly added feature points
As long as completed (the OR of the updateID() calls) is true, there may still be ids left to assign, so the loop continues; once it becomes false, every feature point has been processed and the loop exits. Note that n_id is a static variable, so it keeps accumulating across frames and guarantees that the assigned ids are globally unique.
for (unsigned int i = 0;; i++)
{
    bool completed = false;
    for (int j = 0; j < NUM_OF_CAM; j++)
        if (j != 1 || !STEREO_TRACK)
            completed |= trackerData[j].updateID(i);
    if (!completed)
        break;
}
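For reference, updateID() in feature_tracker.cpp looks roughly like the following. I am reproducing it from memory of the VINS-Mono/LVI-SAM front end, so treat it as a sketch rather than an exact quote:

// Assign a fresh global id to the i-th tracked point if it does not have one yet.
// Returning false once i runs past the end of the id list is what terminates the loop above.
bool FeatureTracker::updateID(unsigned int i)
{
    if (i < ids.size())
    {
        if (ids[i] == -1)
            ids[i] = n_id++;   // n_id is static, so ids stay unique across frames
        return true;
    }
    else
        return false;
}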
4. Undistort, encapsulate and publish feature points to pub_feature
See code comments for detailed steps here.
The feature point id, the undistorted 3D point on the normalized plane (x, y, z = 1), the 2D pixel point (u, v), and the pixel velocity (vx, vy) are packed into a sensor_msgs::PointCloudPtr instance named feature_points, the lidar depth channel is appended, and the result is published on pub_feature.
// The outermost if (PUB_THIS_FRAME) statement is omitted here; see the appendix code below for details.
pub_count++;                                      // Published count + 1

// Containers for the information to be published
sensor_msgs::PointCloudPtr feature_points(new sensor_msgs::PointCloud);
sensor_msgs::ChannelFloat32 id_of_point;
sensor_msgs::ChannelFloat32 u_of_point;           // Pixel coordinate x
sensor_msgs::ChannelFloat32 v_of_point;           // Pixel coordinate y
sensor_msgs::ChannelFloat32 velocity_x_of_point;  // Velocity x
sensor_msgs::ChannelFloat32 velocity_y_of_point;  // Velocity y

feature_points->header.stamp = img_msg->header.stamp;
feature_points->header.frame_id = "vins_body";

vector<set<int>> hash_ids(NUM_OF_CAM);            // Hash table of ids
for (int i = 0; i < NUM_OF_CAM; i++)              // Loop over the cameras
{
    auto &un_pts = trackerData[i].cur_un_pts;
    auto &cur_pts = trackerData[i].cur_pts;
    auto &ids = trackerData[i].ids;
    auto &pts_velocity = trackerData[i].pts_velocity;
    for (unsigned int j = 0; j < ids.size(); j++) // Loop over the feature points
    {
        if (trackerData[i].track_cnt[j] > 1)      // Only feature points tracked more than once are published
        {
            int p_id = ids[j];
            hash_ids[i].insert(p_id);
            geometry_msgs::Point32 p;
            p.x = un_pts[j].x;
            p.y = un_pts[j].y;
            p.z = 1;

            feature_points->points.push_back(p);
            id_of_point.values.push_back(p_id * NUM_OF_CAM + i);
            u_of_point.values.push_back(cur_pts[j].x);
            v_of_point.values.push_back(cur_pts[j].y);
            velocity_x_of_point.values.push_back(pts_velocity[j].x);
            velocity_y_of_point.values.push_back(pts_velocity[j].y);
        }
    }
}

// Attach the channels in preparation for publishing
feature_points->channels.push_back(id_of_point);
feature_points->channels.push_back(u_of_point);
feature_points->channels.push_back(v_of_point);
feature_points->channels.push_back(velocity_x_of_point);
feature_points->channels.push_back(velocity_y_of_point);

// Get the depth cloud from shared memory
pcl::PointCloud<PointType>::Ptr depth_cloud_temp(new pcl::PointCloud<PointType>());
mtx_lidar.lock();
*depth_cloud_temp = *depthCloud;
mtx_lidar.unlock();

// Get the depth of each feature point
sensor_msgs::ChannelFloat32 depth_of_points = depthRegister->get_depth(img_msg->header.stamp, show_img, depth_cloud_temp, trackerData[0].m_camera, feature_points->points);
feature_points->channels.push_back(depth_of_points);

// The first frame is not published because it has no optical flow velocity.
if (!init_pub)
{
    init_pub = 1;
}
else
    pub_feature.publish(feature_points);
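To see what this channel layout means on the receiving side, here is an illustrative sketch of my own showing how a subscriber could unpack the message; the real estimator node in LVI-SAM does considerably more bookkeeping than this.

// Decode the PointCloud message assembled above: points carry normalized coordinates,
// channels carry id, pixel position, pixel velocity and lidar depth, in that order.
#include <ros/ros.h>
#include <sensor_msgs/PointCloud.h>

static const int NUM_OF_CAM = 1;   // must match the publisher's configuration

void feature_callback(const sensor_msgs::PointCloudConstPtr &feature_msg)
{
    for (size_t j = 0; j < feature_msg->points.size(); j++)
    {
        int encoded     = feature_msg->channels[0].values[j] + 0.5; // +0.5 guards against float rounding
        int feature_id  = encoded / NUM_OF_CAM;                     // recover the global id
        int camera_id   = encoded % NUM_OF_CAM;                     // recover which camera it came from
        double x  = feature_msg->points[j].x;                       // normalized plane coordinates
        double y  = feature_msg->points[j].y;
        double u  = feature_msg->channels[1].values[j];             // pixel coordinates
        double v  = feature_msg->channels[2].values[j];
        double vx = feature_msg->channels[3].values[j];             // pixel velocity
        double vy = feature_msg->channels[4].values[j];
        double depth = feature_msg->channels[5].values[j];          // lidar depth, non-positive if unavailable
        // ... hand (feature_id, camera_id, x, y, u, v, vx, vy, depth) to the estimator
    }
}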
5. Package and publish to pub_match
The tracking image is converted to RGB8 with cv_bridge::cvtColor, the feature points are drawn on it, and the resulting cv_bridge image (ptr) is published to pub_match via toImageMsg().
// Publish the tracked features drawn on the image
if (pub_match.getNumSubscribers() != 0)
{
    ptr = cv_bridge::cvtColor(ptr, sensor_msgs::image_encodings::RGB8);
    //cv::Mat stereo_img(ROW * NUM_OF_CAM, COL, CV_8UC3);
    cv::Mat stereo_img = ptr->image;

    for (int i = 0; i < NUM_OF_CAM; i++)
    {
        cv::Mat tmp_img = stereo_img.rowRange(i * ROW, (i + 1) * ROW);
        cv::cvtColor(show_img, tmp_img, CV_GRAY2RGB); // Convert the grayscale show_img to RGB (tmp_img)

        // Visualize the tracking status; the circle colour (set via cv::Scalar) encodes how long each point has been tracked
        for (unsigned int j = 0; j < trackerData[i].cur_pts.size(); j++)
        {
            if (SHOW_TRACK)
            {
                // Colour from the number of times the feature point has been tracked
                double len = std::min(1.0, 1.0 * trackerData[i].track_cnt[j] / WINDOW_SIZE);
                cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(255 * (1 - len), 255 * len, 0), 4);
            }
            else
            {
                // Colour from the availability of lidar depth
                if (j < depth_of_points.values.size())
                {
                    if (depth_of_points.values[j] > 0)
                        cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(0, 255, 0), 4);
                    else
                        cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(0, 0, 255), 4);
                }
            }
        }
    }
    pub_match.publish(ptr->toImageMsg());
}
Appendix: Code
void img_callback(const sensor_msgs::ImageConstPtr &img_msg)
{
    double cur_img_time = img_msg->header.stamp.toSec();

    if (first_image_flag)
    {
        first_image_flag = false;
        first_image_time = cur_img_time;
        last_image_time = cur_img_time;
        return;
    }
    // detect unstable camera stream
    if (cur_img_time - last_image_time > 1.0 || cur_img_time < last_image_time)
    {
        ROS_WARN("image discontinue! reset the feature tracker!");
        first_image_flag = true;
        last_image_time = 0;
        pub_count = 1;
        std_msgs::Bool restart_flag;
        restart_flag.data = true;
        pub_restart.publish(restart_flag);
        return;
    }
    last_image_time = cur_img_time;

    // frequency control
    if (round(1.0 * pub_count / (cur_img_time - first_image_time)) <= FREQ)
    {
        PUB_THIS_FRAME = true;
        // reset the frequency control
        if (abs(1.0 * pub_count / (cur_img_time - first_image_time) - FREQ) < 0.01 * FREQ)
        {
            first_image_time = cur_img_time;
            pub_count = 0;
        }
    }
    else
        PUB_THIS_FRAME = false;

    cv_bridge::CvImageConstPtr ptr;
    if (img_msg->encoding == "8UC1")
    {
        sensor_msgs::Image img;
        img.header = img_msg->header;
        img.height = img_msg->height;
        img.width = img_msg->width;
        img.is_bigendian = img_msg->is_bigendian;
        img.step = img_msg->step;
        img.data = img_msg->data;
        img.encoding = "mono8";
        ptr = cv_bridge::toCvCopy(img, sensor_msgs::image_encodings::MONO8);
    }
    else
        ptr = cv_bridge::toCvCopy(img_msg, sensor_msgs::image_encodings::MONO8);

    cv::Mat show_img = ptr->image;
    TicToc t_r;
    for (int i = 0; i < NUM_OF_CAM; i++)
    {
        ROS_DEBUG("processing camera %d", i);
        if (i != 1 || !STEREO_TRACK)
            trackerData[i].readImage(ptr->image.rowRange(ROW * i, ROW * (i + 1)), cur_img_time);
        else
        {
            if (EQUALIZE)
            {
                cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE();
                clahe->apply(ptr->image.rowRange(ROW * i, ROW * (i + 1)), trackerData[i].cur_img);
            }
            else
                trackerData[i].cur_img = ptr->image.rowRange(ROW * i, ROW * (i + 1));
        }

#if SHOW_UNDISTORTION
        trackerData[i].showUndistortion("undistrotion_" + std::to_string(i));
#endif
    }

    for (unsigned int i = 0;; i++)
    {
        bool completed = false;
        for (int j = 0; j < NUM_OF_CAM; j++)
            if (j != 1 || !STEREO_TRACK)
                completed |= trackerData[j].updateID(i);
        if (!completed)
            break;
    }

    if (PUB_THIS_FRAME)
    {
        pub_count++;
        sensor_msgs::PointCloudPtr feature_points(new sensor_msgs::PointCloud);
        sensor_msgs::ChannelFloat32 id_of_point;
        sensor_msgs::ChannelFloat32 u_of_point;
        sensor_msgs::ChannelFloat32 v_of_point;
        sensor_msgs::ChannelFloat32 velocity_x_of_point;
        sensor_msgs::ChannelFloat32 velocity_y_of_point;

        feature_points->header.stamp = img_msg->header.stamp;
        feature_points->header.frame_id = "vins_body";

        vector<set<int>> hash_ids(NUM_OF_CAM);
        for (int i = 0; i < NUM_OF_CAM; i++)
        {
            auto &un_pts = trackerData[i].cur_un_pts;
            auto &cur_pts = trackerData[i].cur_pts;
            auto &ids = trackerData[i].ids;
            auto &pts_velocity = trackerData[i].pts_velocity;
            for (unsigned int j = 0; j < ids.size(); j++)
            {
                if (trackerData[i].track_cnt[j] > 1)
                {
                    int p_id = ids[j];
                    hash_ids[i].insert(p_id);
                    geometry_msgs::Point32 p;
                    p.x = un_pts[j].x;
                    p.y = un_pts[j].y;
                    p.z = 1;

                    feature_points->points.push_back(p);
                    id_of_point.values.push_back(p_id * NUM_OF_CAM + i);
                    u_of_point.values.push_back(cur_pts[j].x);
                    v_of_point.values.push_back(cur_pts[j].y);
                    velocity_x_of_point.values.push_back(pts_velocity[j].x);
                    velocity_y_of_point.values.push_back(pts_velocity[j].y);
                }
            }
        }

        feature_points->channels.push_back(id_of_point);
        feature_points->channels.push_back(u_of_point);
        feature_points->channels.push_back(v_of_point);
        feature_points->channels.push_back(velocity_x_of_point);
        feature_points->channels.push_back(velocity_y_of_point);

        // get feature depth from lidar point cloud
        pcl::PointCloud<PointType>::Ptr depth_cloud_temp(new pcl::PointCloud<PointType>());
        mtx_lidar.lock();
        *depth_cloud_temp = *depthCloud;
        mtx_lidar.unlock();

        sensor_msgs::ChannelFloat32 depth_of_points = depthRegister->get_depth(img_msg->header.stamp, show_img, depth_cloud_temp, trackerData[0].m_camera, feature_points->points);
        feature_points->channels.push_back(depth_of_points);

        // skip the first image; since no optical speed on frist image
        if (!init_pub)
        {
            init_pub = 1;
        }
        else
            pub_feature.publish(feature_points);

        // publish features in image
        if (pub_match.getNumSubscribers() != 0)
        {
            ptr = cv_bridge::cvtColor(ptr, sensor_msgs::image_encodings::RGB8);
            //cv::Mat stereo_img(ROW * NUM_OF_CAM, COL, CV_8UC3);
            cv::Mat stereo_img = ptr->image;

            for (int i = 0; i < NUM_OF_CAM; i++)
            {
                cv::Mat tmp_img = stereo_img.rowRange(i * ROW, (i + 1) * ROW);
                cv::cvtColor(show_img, tmp_img, CV_GRAY2RGB);

                for (unsigned int j = 0; j < trackerData[i].cur_pts.size(); j++)
                {
                    if (SHOW_TRACK)
                    {
                        // track count
                        double len = std::min(1.0, 1.0 * trackerData[i].track_cnt[j] / WINDOW_SIZE);
                        cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(255 * (1 - len), 255 * len, 0), 4);
                    }
                    else
                    {
                        // depth
                        if (j < depth_of_points.values.size())
                        {
                            if (depth_of_points.values[j] > 0)
                                cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(0, 255, 0), 4);
                            else
                                cv::circle(tmp_img, trackerData[i].cur_pts[j], 4, cv::Scalar(0, 0, 255), 4);
                        }
                    }
                }
            }
            pub_match.publish(ptr->toImageMsg());
        }
    }
}