Abstract:
Underground unmanned transportation tunnels in coal mines are subject to severe dust interference, complex illumination conditions, and frequent target occlusion. As a result, collision barrels and traffic lights in images captured by vehicle-mounted cameras suffer from insufficient feature representation and feature degradation, which limits detection accuracy. To address this issue, this paper proposes an improved YOLOv11-based real-time detection algorithm for collision barrels and traffic lights, termed YOLOv11-CTDNet (Coal Tunnel Dynamic Detection Network). First, a Space-to-Depth (SPD) module is integrated with a Non-strided Partial Convolution (NsPConv) module to form an SPD-NsPConv block, which is used to improve the feature extraction backbone of YOLOv11. This design preserves more fine-grained details from the original feature maps and enhances the model’s capability for small-object detection. On this basis, a novel feature processing module, DS-C3k2_MLCA, is proposed to replace the C3k2 module in the neck network. This module combines depthwise separable convolution with a Multi-Scale Local Channel Attention (MLCA) mechanism, suppressing background noise interference and enhancing discriminative feature representation. Meanwhile, a Hypergraph-based Adaptive Enhancement (HyperACE) mechanism and a Full Pipeline Aggregation and Distribution (FullPAD) paradigm are introduced. The HyperACE mechanism exploits latent high-order relationships to achieve global feature enhancement and fusion across spatial positions and scales, while the FullPAD paradigm distributes the enhanced features to the neck network through independent channels, strengthening the ability of intermediate features to model and represent cross-position and cross-scale high-order multivariate relationships. As a result, the detection accuracy for multi-scale and partially occluded targets in complex environments is significantly improved.
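As a rough illustration of the space-to-depth rearrangement that the SPD module is named after, the following minimal sketch moves each 2 x 2 spatial block into the channel dimension, so that the spatial resolution is halved without discarding any pixel. It is not the authors' SPD-NsPConv implementation: the function name `space_to_depth`, the nested-list representation, and the output channel ordering are assumptions made for illustration only, and the subsequent non-strided partial convolution is omitted.

```python
def space_to_depth(x, block=2):
    """Rearrange a C x H x W nested list into (C*block*block) x (H/block) x (W/block).

    Hypothetical helper for illustration; the channel ordering is an arbitrary choice.
    """
    channels = len(x)
    height = len(x[0])
    width = len(x[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    out = []
    # For every offset (dy, dx) inside a block, collect the sub-sampled grid
    # of each input channel as a new output channel.
    for dy in range(block):
        for dx in range(block):
            for c in range(channels):
                out.append(
                    [[x[c][i][j] for j in range(dx, width, block)]
                     for i in range(dy, height, block)]
                )
    return out


# One channel, 4 x 4 feature map holding the values 0..15.
fmap = [[[r * 4 + c for c in range(4)] for r in range(4)]]
packed = space_to_depth(fmap, block=2)

print(len(packed), len(packed[0]), len(packed[0][0]))  # 4 2 2
print(packed[0])  # [[0, 2], [8, 10]]
```

Every input value appears exactly once in the output, which is the property that distinguishes this rearrangement from strided convolution or pooling, where part of the fine-grained information is dropped.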
In addition, an extra small-object detection layer is introduced to further enhance detection performance for small targets. Finally, a DyHead dynamic detection head is employed, which ensures sufficient feature optimization while enabling collaborative and adaptive capture of key information through a three-level attention mechanism, thereby improving multi-scale object detection capability. Although the proposed model introduces a moderate increase in parameters and computational cost (2.94 M parameters and 10.7 GFLOPs, increases of 0.35 M parameters and 4.4 GFLOPs over the baseline), it remains lightweight and meets the real-time deployment requirements of underground unmanned transportation vehicles in coal mines. Experimental results on a self-constructed coal mine underground transportation tunnel dataset demonstrate that YOLOv11-CTDNet achieves an mAP@0.5 of 92.4%, outperforming YOLOv11n by 9.1 percentage points. On low-illumination datasets and datasets containing partially occluded targets, the proposed model achieves mAP@0.5 scores of 81.2% and 80.8%, improvements of 4.9 and 8.5 percentage points over the baseline, respectively. These results indicate that the proposed method achieves a favorable balance between detection performance and practical deployability under complex underground working conditions, demonstrating strong application potential.
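The baseline figures implied by the reported numbers can be recovered with simple subtraction. The short sketch below does this arithmetic; the derived baseline values are not stated in the abstract itself and follow only from the reported scores and gains, and the variable names are illustrative.

```python
# Reported by the abstract: (proposed mAP@0.5 in %, gain in percentage points).
reported = {
    "full dataset": (92.4, 9.1),
    "low illumination": (81.2, 4.9),
    "partial occlusion": (80.8, 8.5),
}

# Implied baseline mAP@0.5 = proposed score minus the reported gain.
baseline_map = {
    name: round(score - gain, 1) for name, (score, gain) in reported.items()
}

# Implied baseline model size and cost, derived the same way.
baseline_params_m = round(2.94 - 0.35, 2)  # millions of parameters
baseline_gflops = round(10.7 - 4.4, 1)

# Relative increases of the proposed model over the implied baseline.
params_increase_pct = round(100 * 0.35 / baseline_params_m, 1)
gflops_increase_pct = round(100 * 4.4 / baseline_gflops, 1)

for name, value in baseline_map.items():
    print(f"{name}: implied baseline mAP@0.5 = {value}%")
print(f"implied baseline: {baseline_params_m} M parameters, {baseline_gflops} GFLOPs")
print(f"relative increase: {params_increase_pct}% parameters, {gflops_increase_pct}% GFLOPs")
```

Under these figures, the implied baseline scores are 83.3%, 76.3%, and 72.3% mAP@0.5, and the implied baseline model has about 2.59 M parameters and 6.3 GFLOPs.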