Vision-Based Object Tracking for UAV: a National-Level Student Research Training Program (SRTP) Project

Project overview

I led a national-level Student Research Training Program (SRTP) project titled "Vision-Based Object Tracking for UAV," in which I developed a complete UAV system by integrating five components: mapping, planning, target recognition, localization, and control. The system builds on FastLab's Elastic Tracker, and my main addition was to make the recognition algorithm more robust.

Target tracking and control

Hierarchical Multi-goal Path Finding

This subsection addresses the problem of finding an efficient path for the drone that reaches the target while accounting for obstacles and occlusion. The method is hierarchical: it first identifies multiple goals (waypoints) that the drone should reach to keep the target visible, selecting them by their distance to the target and by whether the view from them is likely to be occluded. A greedy algorithm then connects these goals, which keeps the search cheap enough for real-time use while still producing a path that is safe and preserves visibility.
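The greedy goal selection described above can be illustrated with a small 2D sketch. It is not the tracker's actual planner: the occupancy set of unit grid cells, the ring of candidate viewpoints, and the names `line_of_sight` and `greedy_waypoints` are all assumptions made for illustration. For each predicted target position, the sketch keeps the candidate viewpoint that has a clear line of sight to the target and is closest to the previously chosen waypoint.

```python
import math

def line_of_sight(a, b, occupied, step=0.1):
    """True if the sampled segment a->b touches no occupied unit cell."""
    n = max(1, int(math.dist(a, b) / step))
    for i in range(n + 1):
        t = i / n
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        if (math.floor(x), math.floor(y)) in occupied:
            return False
    return True

def greedy_waypoints(start, target_path, occupied, radius=2.0, n_candidates=16):
    """Pick one viewpoint per predicted target position, greedily.

    Candidates lie on a circle of `radius` around each target position
    (a hypothetical choice of "desired tracking distance").
    """
    waypoints = []
    current = start
    for target in target_path:
        best, best_cost = None, math.inf
        for k in range(n_candidates):
            ang = 2 * math.pi * k / n_candidates
            cand = (target[0] + radius * math.cos(ang),
                    target[1] + radius * math.sin(ang))
            # Reject viewpoints inside obstacles or with an occluded view.
            if (math.floor(cand[0]), math.floor(cand[1])) in occupied:
                continue
            if not line_of_sight(cand, target, occupied):
                continue
            cost = math.dist(current, cand)  # greedy: nearest to last waypoint
            if cost < best_cost:
                best, best_cost = cand, cost
        if best is not None:  # skip targets with no visible viewpoint
            waypoints.append(best)
            current = best
    return waypoints
```

Because each goal is chosen using only the previous waypoint, the cost grows linearly with the number of predicted target positions, which is the property that makes a greedy pass attractive for real-time use.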

Safe Flight Corridor Generation

This subsection covers the generation of a safe flight corridor (SFC): a region of free space that the drone can follow without colliding with obstacles. Earlier SFC generation methods are often limiting in complex environments, where the drone must maneuver through tight spaces or avoid moving obstacles. I use an approach that produces a more flexible corridor, one that adapts to the environment and to the drone's dynamic constraints and therefore gives a larger safety margin during flight.
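To make the idea of a corridor concrete, here is a deliberately simple sketch that grows an axis-aligned box of free cells around a seed cell on a 2D occupancy grid. This is a common textbook simplification and not the corridor method used in the project; the function name `inflate_box`, the grid representation, and the `bounds` parameter are assumptions for illustration.

```python
def inflate_box(seed, occupied, bounds, max_steps=10):
    """Grow an axis-aligned box of free cells around `seed`.

    seed     -- (x, y) cell assumed to be free
    occupied -- set of (x, y) obstacle cells
    bounds   -- (xmin, ymin, xmax, ymax) map limits, inclusive
    Returns (xmin, ymin, xmax, ymax) of the box, inclusive.
    """
    xmin = xmax = seed[0]
    ymin = ymax = seed[1]
    bx0, by0, bx1, by1 = bounds

    def strip_is_free(x0, y0, x1, y1):
        # Check every cell of the strip that would be added.
        return all((x, y) not in occupied
                   for x in range(x0, x1 + 1)
                   for y in range(y0, y1 + 1))

    for _ in range(max_steps):
        grew = False
        if xmin > bx0 and strip_is_free(xmin - 1, ymin, xmin - 1, ymax):
            xmin -= 1
            grew = True
        if xmax < bx1 and strip_is_free(xmax + 1, ymin, xmax + 1, ymax):
            xmax += 1
            grew = True
        if ymin > by0 and strip_is_free(xmin, ymin - 1, xmax, ymin - 1):
            ymin -= 1
            grew = True
        if ymax < by1 and strip_is_free(xmin, ymax + 1, xmax, ymax + 1):
            ymax += 1
            grew = True
        if not grew:  # blocked on every side
            break
    return xmin, ymin, xmax, ymax
```

A corridor would then be a chain of such boxes seeded along the planned path, with consecutive boxes overlapping so that a trajectory can pass from one to the next.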

Visible Region Generation

The aim here is to ensure that the drone keeps a line of sight to the target throughout the tracking mission. The key contribution of this subsection is a sector-shaped visible region constructed for each predicted position of the target. Each sector excludes directions in which obstacles would occlude the target, so it tells the drone where it can position itself and still see the target. Constraining the drone to these regions keeps the target within its field of view during tracking.
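One way to picture a sector-shaped visible region is to cast rays outward from a predicted target position and keep the widest contiguous range of unobstructed directions. The sketch below does this on a 2D grid of unit cells; it is an illustrative assumption, not the project's implementation, and the names `ray_is_clear` and `widest_visible_sector` are hypothetical.

```python
import math

def ray_is_clear(origin, angle, radius, occupied, step=0.1):
    """True if a ray of length `radius` from `origin` hits no occupied cell."""
    n = int(radius / step)
    for i in range(1, n + 1):
        x = origin[0] + i * step * math.cos(angle)
        y = origin[1] + i * step * math.sin(angle)
        if (math.floor(x), math.floor(y)) in occupied:
            return False
    return True

def widest_visible_sector(target, occupied, radius=3.0, n_rays=36):
    """Return (start_angle, end_angle) in radians of the widest clear sector.

    end_angle may exceed 2*pi when the sector wraps past angle 0.
    Returns None if every ray is blocked.
    """
    step = 2 * math.pi / n_rays
    clear = [ray_is_clear(target, k * step, radius, occupied)
             for k in range(n_rays)]
    if all(clear):
        return 0.0, 2 * math.pi
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Scan the ray list twice so that runs wrapping past angle 0 are found.
    for k in range(2 * n_rays):
        if clear[k % n_rays]:
            if run_len == 0:
                run_start = k
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    if best_len == 0:
        return None
    start = (best_start % n_rays) * step
    return start, start + (best_len - 1) * step
```

A drone position at distance at most `radius` from the target, and at a bearing inside the returned interval, then has an unobstructed view under this simplified model.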

Target recognition and localization

The main obstacle was low image-recognition accuracy, caused by insufficient training data for the diverse scenes encountered during UAV flights.

To address this, we augmented our dataset in two ways. First, we recorded videos of the UAV in flight and extracted frames from them, which widened the range of viewpoints and scenarios in the training data. Second, we added jitter to the images to simulate the visual instability encountered in real flights, making the training data more realistic.
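The jitter step can be sketched as a small random translation of each frame, with the bounding-box labels shifted by the same amount. The source does not specify how the jitter was implemented, so everything here is an assumption: the use of NumPy, the function name `jitter_image`, the `max_shift` parameter, and the box format `(x1, y1, x2, y2)` in pixels.

```python
import numpy as np

def jitter_image(image, boxes, max_shift=8, rng=None):
    """Shift an image by a random (dx, dy) and move its boxes to match.

    image -- array of shape (H, W) or (H, W, C)
    boxes -- array-like of shape (N, 4) holding (x1, y1, x2, y2) in pixels
    Borders exposed by the shift are filled by repeating edge pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = image.shape[:2]

    # Pad, then crop a window offset by (-dx, -dy): content moves by (+dx, +dy).
    pad = [(max_shift, max_shift), (max_shift, max_shift)]
    pad += [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="edge")
    y0, x0 = max_shift - dy, max_shift - dx
    shifted = padded[y0:y0 + h, x0:x0 + w]

    # Apply the same translation to the labels and keep them inside the frame.
    new_boxes = np.asarray(boxes, dtype=float) + np.array([dx, dy, dx, dy])
    new_boxes[:, 0::2] = np.clip(new_boxes[:, 0::2], 0, w)
    new_boxes[:, 1::2] = np.clip(new_boxes[:, 1::2], 0, h)
    return shifted, new_boxes
```

Passing a seeded generator, for example `np.random.default_rng(0)`, makes the augmentation reproducible across training runs.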

Metric	Original detection	Improved detection
Precision (%)	63	98
Recall (%)	36	97

After these changes, recognition accuracy improved substantially: recall rose from 36% to 97%, and precision reached 98%. This improvement made the tracking pipeline considerably more robust. In particular, with the confidence threshold set to 0.5, the detector is both reliable and efficient.
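For reference, precision and recall at a given confidence threshold follow directly from counts of true and false positives. The sketch below assumes each detection has already been matched against the ground truth and labeled as a true or false positive; the function name `precision_recall` and the sample detections are hypothetical and are not the project's data.

```python
def precision_recall(detections, n_ground_truth, conf_threshold=0.5):
    """Compute precision and recall at a confidence threshold.

    detections     -- list of (confidence, is_true_positive) pairs
    n_ground_truth -- number of labeled objects in the evaluation set
    """
    # Keep only detections at or above the threshold.
    kept = [is_tp for conf, is_tp in detections if conf >= conf_threshold]
    tp = sum(kept)            # correct detections
    fp = len(kept) - tp       # spurious detections
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / n_ground_truth if n_ground_truth else 0.0
    return precision, recall

# Hypothetical example: four detections, four labeled objects.
sample = [(0.9, True), (0.8, True), (0.6, False), (0.4, True)]
p, r = precision_recall(sample, n_ground_truth=4, conf_threshold=0.5)
```

Raising the threshold generally trades recall for precision, because fewer detections are kept; a threshold of 0.5 is a common middle ground.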

The left side of the image shows examples of objects identified by the trained model, while the right side displays charts of various training metrics. These charts include changes in the loss function, as well as variations in evaluation metrics such as precision, recall, and mean Average Precision (mAP).

Loss Function: The train/box_loss, train/obj_loss, train/cls_loss, and corresponding validation set losses val/box_loss, val/obj_loss, val/cls_loss all show a gradual decrease with increasing training epochs, indicating that the model is progressively improving, learning to better identify and classify objects.

Precision and Recall: The metrics/precision and metrics/recall charts show that both metrics are high, close to 1.0. High recall means the model finds most of the objects that are actually present, and high precision means it rarely reports an object where there is none.

Mean Average Precision (mAP): The metrics/mAP_0.5 and metrics/mAP_0.5:0.95 charts show the mAP at an IoU (Intersection over Union) threshold of 0.5 and the mAP averaged over IoU thresholds from 0.5 to 0.95, respectively. Both metrics reach high values, and metrics/mAP_0.5 in particular is close to 1.0, indicating that the model performs well across different IoU thresholds.
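The IoU thresholds behind these two mAP variants can be illustrated with a short sketch. It computes the IoU of two axis-aligned boxes and checks the result against the ten thresholds 0.50, 0.55, ..., 0.95 conventionally used for the 0.5:0.95 average. The function name `iou`, the box format `(x1, y1, x2, y2)`, and the sample boxes are assumptions for illustration; this is not a full mAP computation.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap, zero if the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# IoU thresholds 0.50, 0.55, ..., 0.95 used for the averaged metric.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]

# Hypothetical prediction and ground-truth boxes.
predicted = (0, 0, 9, 8)
ground_truth = (0, 0, 10, 10)
overlap = iou(predicted, ground_truth)
passed = [t for t in IOU_THRESHOLDS if overlap >= t]
```

A detection that counts as correct at the 0.5 threshold may fail the stricter ones, which is why the averaged metric is typically lower than the single-threshold one.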