VINS-Mono, a visual-inertial SLAM algorithm, has difficulty extracting feature
points in regions with weak or repetitive textures and struggles to localize
accurately under unstable lighting
conditions. This paper proposes STO-VINS, a robust monocular visual-inertial
SLAM algorithm that introduces five key innovations in feature extraction:
(1) an adaptive multi-scale image
preprocessing pipeline that combines image scaling, CLAHE enhancement, and
Gaussian filtering, reducing computational cost by 64% while maintaining
feature quality; (2) bidirectional Lucas-Kanade optical flow consistency
verification with geometric constraint validation, which reduces the
false tracking rate by 30-40%; (3) a grid-based multi-feature fusion detection
strategy combining Shi-Tomasi corner detection and ORB feature extraction,
ensuring a uniform spatial distribution and diversity of features; (4) a
dynamic parameter adjustment system that tunes detection
parameters based on multi-dimensional image quality assessments (brightness
histogram analysis, Laplacian blur metric, and Canny edge density); and (5) a
feature point quality filtering mechanism that applies distance-based
deduplication and non-maximum suppression to retain the 100 best feature
points. These innovations offer three key advantages: enhanced robustness
through multi-algorithm fusion and consistency verification, improved
computational efficiency through multi-scale processing and grid-based
detection, and superior environmental adaptability through intelligent parameter
optimization. Experiments on the EuRoC dataset show that
STO-VINS improves localization accuracy by 6.5% over VINS-Mono in
scenarios without loop closure. Further outdoor experiments confirm that,
while VINS-Mono suffers from severe trajectory drift, STO-VINS produces
trajectories that closely match the experimental route with minimal drift error.
These results demonstrate that STO-VINS significantly improves feature point
extraction in challenging environments and offers an adaptive, efficient
approach to feature tracking in SLAM systems, improving real-time performance,
system stability, and environmental adaptability.
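
As an illustration of items (2) and (5), the following is a minimal sketch of a forward-backward consistency test and a distance-based, score-ordered suppression step. It is not the STO-VINS implementation: the function names, the 0.5-pixel round-trip threshold, and the 30-pixel minimum spacing are assumptions chosen for the example, and only the cap of 100 retained points comes from the text above. The sketch assumes the forward and backward Lucas-Kanade passes have already been run, so that each original point has a corresponding back-tracked position.

```python
import math

def forward_backward_ok(points, back_tracked, max_err=0.5):
    """Flag points whose forward-then-backward track returns near its origin.

    points: original (x, y) positions in the previous frame.
    back_tracked: positions obtained by tracking forward, then backward.
    max_err: allowed round-trip error in pixels (assumed value).
    """
    return [math.dist(p, q) <= max_err for p, q in zip(points, back_tracked)]

def select_features(points, scores, min_dist=30.0, max_count=100):
    """Greedy distance-based suppression, strongest response first.

    Returns indices into `points`. A candidate is kept only if it lies at
    least `min_dist` pixels (assumed value) from every point already kept,
    and selection stops once `max_count` points have been retained.
    """
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if len(kept) >= max_count:
            break
        if all(math.dist(points[i], points[j]) >= min_dist for j in kept):
            kept.append(i)
    return kept

# Toy data: the third point drifts 5 px on the round trip and is rejected.
pts = [(10.0, 10.0), (12.0, 11.0), (200.0, 50.0), (400.0, 300.0)]
back = [(10.1, 10.0), (12.0, 11.2), (205.0, 50.0), (400.0, 300.3)]
scores = [0.9, 0.8, 0.7, 0.6]

mask = forward_backward_ok(pts, back)
good_pts = [p for p, ok in zip(pts, mask) if ok]
good_scores = [s for s, ok in zip(scores, mask) if ok]
kept = select_features(good_pts, good_scores)
```

In this toy run, the second surviving point is discarded because it lies about 2.2 pixels from a stronger neighbor, so two well-separated points remain. The quadratic scan over kept points is acceptable at a cap of 100; a grid or mask lookup would be the usual choice for larger feature budgets.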