Real-Time Detection Algorithm for Traffic Lights and Collision Barrels in Underground Coal Mines Based on Improved YOLOv11

张传伟; 姚豪 (corresponding author); 江吴兵; 张天乐; 尚召斌

Real-Time Detection of Traffic Lights and Collision Barrels in Underground Coal Mines Based on YOLOv11-CTDNet

Abstract

In unmanned transportation tunnels of underground coal mines, heavy dust, complex illumination, and frequent occlusion degrade the features of collision barrels and traffic lights in images captured by vehicle-mounted cameras, which limits detection accuracy. To address this problem, this paper proposes YOLOv11-CTDNet (Coal Tunnel Dynamic Detection Network), a real-time detection algorithm for collision barrels and traffic lights based on an improved YOLOv11. First, a Space-to-Depth (SPD) module is combined with a Non-strided Partial Convolution (NsPConv) module to form an SPD-NsPConv block that improves the feature extraction backbone of YOLOv11; the block preserves more fine-grained detail from the original feature maps and strengthens small-object detection. On this basis, a new feature processing module, DS-C3k2_MLCA, replaces the C3k2 module in the neck network; it combines depthwise separable convolution with the Mixed Local Channel Attention (MLCA) mechanism to suppress background noise and enhance discriminative feature representation. Meanwhile, a Hypergraph-based Adaptive Enhancement (HyperACE) mechanism and a Full Pipeline Aggregation and Distribution (FullPAD) paradigm are introduced. HyperACE exploits latent high-order relationships to enhance and fuse features globally across positions and scales, and FullPAD distributes the correlation-enhanced features to the neck network through independent channels, strengthening the ability of intermediate-layer features to model and represent cross-position and cross-scale high-order relationships. Together they improve detection accuracy for multi-scale and partially occluded targets in complex environments. In addition, an extra small-object detection layer is added to improve the detection of small targets. Finally, a DyHead dynamic detection head is adopted; its three-level attention mechanism captures key information collaboratively and adaptively while keeping features fully optimized, which improves multi-scale object detection. The proposed model has 2.94 M parameters and 10.7 GFLOPs, increases of 0.35 M parameters and 4.4 GFLOPs over the baseline, yet it remains lightweight and meets the real-time deployment requirements of unmanned transport vehicles in underground coal mines. On a self-built dataset of underground coal mine transportation tunnels, YOLOv11-CTDNet achieves an mAP@0.5 of 92.4%, which is 9.1 percentage points higher than YOLOv11n. On the low-illumination dataset and the partially occluded target dataset, it achieves mAP@0.5 scores of 81.2% and 80.8%, improvements of 4.9 and 8.5 percentage points over the baseline, respectively. These results indicate that the proposed model balances detection performance and practical deployability under complex underground working conditions and has good practical application value.
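The abstract names a Space-to-Depth (SPD) step but does not give its implementation. As a rough illustration only, the generic space-to-depth rearrangement can be sketched as below: every `scale` x `scale` spatial neighbourhood is moved into the channel axis, so resolution drops without any value being discarded. This is why such a step can keep fine detail that a strided convolution would skip. The function name `space_to_depth`, the nested-list layout (channels, height, width), and the slice ordering are assumptions made for this sketch. It is not the authors' SPD-NsPConv block, and the non-strided partial convolution that would follow is omitted.

```python
def space_to_depth(x, scale=2):
    """Rearrange a C x H x W nested list into (C*scale^2) x (H/scale) x (W/scale).

    Hypothetical helper for illustration; not taken from the paper.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for dx in range(scale):        # column offset inside each scale x scale cell
        for dy in range(scale):    # row offset inside each scale x scale cell
            for c in range(channels):
                # Keep every scale-th row and column, starting at (dy, dx).
                out.append([row[dx::scale] for row in x[c][dy::scale]])
    return out


# Toy 1 x 4 x 4 feature map holding the values 0..15.
feature = [[[r * 4 + c for c in range(4)] for r in range(4)]]
packed = space_to_depth(feature)
print(len(packed), len(packed[0]), len(packed[0][0]))  # prints "4 2 2"
```

In this toy case the single 4 x 4 channel becomes four 2 x 2 channels, and all sixteen input values are still present in the output.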
