Abstract:
The operating environment of open-pit mines is complex, and mechanical targets in UAV remote-sensing images often exhibit significant scale variation, complex background interference, and a high risk of missed detections for small targets. To address these problems, a lightweight model named CDMG-YOLO was proposed for small mechanical target detection in UAV remote-sensing images of open-pit mines. The model was built on YOLOv11n. In the feature perception stage, a P2 detection head was added and combined with shallow backbone features to construct a fine-grained information pathway, enhancing the detail features of small targets. In the feature modeling stage, the D-C3k2 module incorporating Deformable Large Kernel Attention (DLKA) was introduced to expand the receptive field and improve scale generalization. In the feature fusion stage, a dual-attention mechanism consisting of the Convolution and Attention Fusion Module (CAFM) and the Multidimensional Collaborative Attention Module (MCAM) achieved multidimensional interaction and adaptive weighting of features along the channel and spatial dimensions, thereby highlighting target features and suppressing background interference. In the training optimization stage, a composite loss function incorporating the Gradient Harmonizing Mechanism Loss (GHM Loss) was adopted, which balanced the gradient contributions of hard and easy samples during training and effectively alleviated the positive–negative sample imbalance problem. Experimental results showed that on a self-built UAV remote-sensing dataset covering multiple types of small mechanical targets in open-pit mines, CDMG-YOLO achieved a precision of 0.859 and an mAP@0.5 of 0.697 with only 3.1×10⁶ parameters and 12.7 GFLOPs, balancing accuracy, lightweight design, and efficiency. In low-contrast scenes, under severe occlusion, and in densely distributed multi-target scenarios in complex open-pit mine operations, the CDMG-YOLO model achieved accurate target localization and recognition. On the public LEVIR dataset, CDMG-YOLO accurately identified different types of targets and demonstrated good generalization capability.
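
To illustrate how a gradient-harmonizing loss can balance easy and hard samples, the following is a minimal sketch of the classification variant of GHM (often called GHM-C) in plain Python. It is not the composite loss used in CDMG-YOLO, whose exact form is not given here; the function name `ghm_c_loss`, the choice of 10 bins, and the normalization by the number of non-empty bins are assumptions made for illustration.

```python
import math


def ghm_c_loss(logits, labels, bins=10):
    """Hypothetical GHM-C style loss: BCE reweighted by inverse gradient density.

    logits: list of raw scores; labels: list of 0/1 targets.
    Returns (loss, per-sample weights).
    """
    n = len(logits)
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]

    # Gradient norm of BCE with respect to the logit: g = |p - y|, in [0, 1].
    grads = [abs(p - y) for p, y in zip(probs, labels)]

    # Assign each sample to one of `bins` equal-width bins over [0, 1].
    idx = [min(int(g * bins), bins - 1) for g in grads]
    counts = [0] * bins
    for i in idx:
        counts[i] += 1
    nonempty = sum(1 for c in counts if c > 0)

    # Samples in crowded bins (typically the many easy negatives) are
    # down-weighted; samples in sparse bins are up-weighted.
    weights = [n / counts[i] / nonempty for i in idx]

    eps = 1e-12  # guards against log(0)
    bce = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(probs, labels)
    ]
    loss = sum(w * l for w, l in zip(weights, bce)) / n
    return loss, weights
```

With four easy negatives and one hard negative, for example `ghm_c_loss([-4.0] * 4 + [3.0], [0] * 5)`, the four easy samples share one bin and each receives a smaller weight than the single hard sample, while the weights still sum to the number of samples.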