A joint algorithm of multi-target detection and tracking for underground miners

ZHOU Mengran; LI Xuesong; ZHU Ziwei; HUANG Kaiwen

doi:10.13272/j.issn.1671-251x.2022060040

School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, Anhui, China

Funding: National Key Research and Development Program of China (2018YFC0604503); Natural Science Foundation of Anhui Province (2008085UD06).

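About the author:
ZHOU Mengran (born 1965), male, from Huainan, Anhui; professor, PhD, and doctoral supervisor. His research interests are monitoring of mine electromechanical systems, optoelectronic information processing, and coal mine safety monitoring and control. E-mail: mrzhou8521@163.com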

CLC number: TD67
Publication history
- Received: 2022-06-13
- Revised: 2022-09-24
- Published online: 2022-08-12

A joint algorithm of multi-target detection and tracking for underground miners

School of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China

Abstract: Existing multi-target tracking algorithms for underground coal mine miners suffer from slow detection and low recognition precision. To address these problems, a joint multi-target detection and tracking algorithm based on an improved YOLOv5s model and an improved Deep SORT algorithm is proposed. For multi-target detection, YOLOv5s is improved to obtain the YOLOv5s-GAD model. The ghost bottleneck convolution (GhostConv) module and the depthwise separable convolution (DWConv) module replace the BottleneckCSP modules in the backbone network and the path aggregation network of YOLOv5s, respectively, to speed up feature extraction. Because underground images are dark and noisy, an efficient channel attention network (ECA-Net) module is introduced at the smallest feature map to improve the overall precision of the model. For multi-target tracking, the omni-scale network (OSNet) replaces the shallow residual network in Deep SORT to learn omni-scale features, which improves pedestrian re-identification and therefore the accuracy of target tracking. The experimental results show that, on the custom dataset Miner21, the YOLOv5s-GAD model reaches an average precision (at an intersection over union of 0.5) of 97.8% and a frame rate of 140.2 frames/s, and its multi-target detection performance is better than that of the commonly used Faster RCNN, YOLOv3, and YOLOv5s models. On the public pedestrian dataset MOT17, the overall performance of the joint algorithm, taking both speed and accuracy into account, is better than that of common multi-target tracking algorithms such as IOU17 and Deep SORT; the joint algorithm has the fewest identity switches and the best pedestrian re-identification performance. The joint algorithm can detect and track underground miners in a timely manner, with good multi-target tracking results.
Keywords:
- coal mine safety
- multi-target detection and tracking
- pedestrian re-identification
- YOLOv5s
- YOLOv5s-GAD
- Deep SORT
- omni-scale network
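
The ECA-Net module named in the abstract reweights feature channels using a small 1-D convolution over globally pooled channel statistics. The following is a minimal sketch of that idea, not the authors' implementation. The kernel-size rule follows the published ECA-Net formulation with its usual defaults (gamma = 2, b = 1); the function names and the fixed kernel weights are assumptions for illustration, since real kernel weights are learned during training.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1-D kernel size: nearest odd value derived from log2(channels)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_weights(channel_means, kernel):
    """Channel attention weights from per-channel global averages.

    channel_means: one global-average-pooled value per channel.
    kernel: odd-length 1-D kernel (hypothetical fixed values here).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for i in range(len(channel_means)):
        # Local cross-channel interaction: weighted sum over neighbouring channels.
        s = sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights


# Illustrative values only: four channels and an averaging kernel.
means = [0.2, 1.5, -0.3, 0.8]
gates = eca_weights(means, [1 / 3, 1 / 3, 1 / 3])
rescaled = [g * m for g, m in zip(gates, means)]
```

Each gate lies between 0 and 1, so the module can only attenuate or preserve a channel; the number of added parameters per module equals the kernel length.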

Figure 1. Flow of joint algorithm of multi-target detection and tracking for underground miners

Figure 2. YOLOv5s-GAD model
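
According to the abstract, the YOLOv5s-GAD model replaces BottleneckCSP modules in the backbone with GhostConv modules to speed up feature extraction. As a rough illustration of why a ghost-style layer is cheaper, the sketch below counts weights for a standard convolution and for a ghost-style layer. It assumes an even split (half of the output channels from a primary convolution, half from a cheap 5×5 depthwise convolution), which is a common open-source configuration; the exact configuration used in the paper is not stated here, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a bias-free 2-D convolution."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in, c_out, k=1, cheap_k=5):
    """Ghost-style layer (assumed layout): a primary conv produces half of the
    output channels; a depthwise 'cheap' conv derives the other half from them."""
    half = c_out // 2
    primary = conv_params(c_in, half, k)
    cheap = conv_params(half, half, cheap_k, groups=half)
    return primary + cheap


standard = conv_params(128, 256, 3)     # 128 * 256 * 3 * 3
ghost = ghost_conv_params(128, 256, 3)  # 128 * 128 * 3 * 3 + 128 * 5 * 5
ratio = standard / ghost
```

For these example channel counts the ghost-style layer needs roughly half the weights of the standard convolution, because the second half of the output channels costs only a depthwise pass.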

Figure 3. Standard convolution process

Figure 4. Depthwise separable convolution process
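
The difference between the standard convolution of Figure 3 and the depthwise separable convolution of Figure 4 can be made concrete by counting weights and multiply-accumulate operations. The sketch below uses the textbook decomposition (a k×k depthwise convolution followed by a 1×1 pointwise convolution); the layer sizes are arbitrary example values, not values taken from the paper.

```python
def standard_conv_cost(c_in, c_out, k, h, w):
    """Weights and multiply-accumulates for a bias-free k x k convolution
    producing an h x w output map."""
    params = c_in * c_out * k * k
    return params, params * h * w


def depthwise_separable_cost(c_in, c_out, k, h, w):
    """Depthwise k x k conv (one filter per input channel) followed by a
    1 x 1 pointwise conv that mixes channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    params = depthwise + pointwise
    return params, params * h * w


# Example sizes only: 64 -> 128 channels, 3 x 3 kernel, 80 x 80 output map.
std_params, std_macs = standard_conv_cost(64, 128, 3, 80, 80)
dws_params, dws_macs = depthwise_separable_cost(64, 128, 3, 80, 80)
cost_ratio = dws_params / std_params  # equals 1/c_out + 1/k**2
```

The cost ratio 1/c_out + 1/k² is about 0.12 for this example, which is the usual explanation for the speed gain of depthwise separable layers.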

Figure 5. Omni-scale network structure
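
In the joint algorithm, OSNet supplies the appearance features that Deep SORT uses for re-identification. The sketch below shows how such features could be compared with cosine distance to associate detections with existing tracks. It is a simplified illustration under stated assumptions: the 3-D feature vectors, the `max_dist` threshold, and the greedy matching are hypothetical, and Deep SORT itself additionally uses Kalman-filter gating and an optimal assignment step.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_by_appearance(track_feats, det_feats, max_dist=0.2):
    """Greedy association: closest (track, detection) pairs first.

    track_feats: {track_id: feature vector}
    det_feats:   list of feature vectors, one per detection
    Returns {detection_index: track_id}; unmatched detections are omitted.
    """
    pairs = sorted(
        (cosine_distance(feat, det), tid, j)
        for tid, feat in track_feats.items()
        for j, det in enumerate(det_feats)
    )
    used_tracks, matches = set(), {}
    for dist, tid, j in pairs:
        if dist > max_dist:
            break  # remaining pairs are even less similar
        if tid in used_tracks or j in matches:
            continue
        used_tracks.add(tid)
        matches[j] = tid
    return matches


tracks = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0]}
detections = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.1], [0.0, 0.0, 1.0]]
matches = match_by_appearance(tracks, detections)
```

In this example the third detection matches no track and would start a new identity; more discriminative appearance features reduce how often an existing miner is mistaken for a new one.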

Figure 6. Dataset image

Figure 7. Training process of each model

Figure 8. Comparison of effects of various target detection models

Figure 9. Multi-target detection and tracking results of underground miners

Table 1. Ablation experiment results of different models

Model	Image size/pixels	Parameters/10⁶	Computation/byte	A_P/%	Frame rate/(frames·s⁻¹)
Baseline network	640×640	7.2	16.5	96.6	56.3
With GhostConv	640×640	5.5	9.6	95.9	98.6
With GhostConv, DWConv	640×640	0.7	3.5	94.5	165.1
With ECA-Net	640×640	7.8	18.2	98.2	47.2
With GhostConv, DWConv, ECA-Net	640×640	1.2	4.2	97.8	140.2
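
The trade-offs in Table 1 are easier to read as relative changes against the baseline network. The short script below derives them from the table rows; the dictionary keys and the helper name `compare` are chosen for this example only.

```python
# Rows copied from Table 1: (parameters/1e6, computation, A_P/%, frames per second)
ABLATION = {
    "baseline": (7.2, 16.5, 96.6, 56.3),
    "ghost": (5.5, 9.6, 95.9, 98.6),
    "ghost_dw": (0.7, 3.5, 94.5, 165.1),
    "eca": (7.8, 18.2, 98.2, 47.2),
    "ghost_dw_eca": (1.2, 4.2, 97.8, 140.2),
}


def compare(name, base="baseline"):
    """Relative change of one ablation row with respect to the baseline row."""
    p0, c0, ap0, f0 = ABLATION[base]
    p, c, ap, f = ABLATION[name]
    return {
        "param_reduction_pct": round(100 * (p0 - p) / p0, 1),
        "computation_reduction_pct": round(100 * (c0 - c) / c0, 1),
        "ap_change_points": round(ap - ap0, 1),
        "speedup": round(f / f0, 2),
    }


full_model = compare("ghost_dw_eca")
light_only = compare("ghost_dw")
```

Read this way, the full model cuts parameters by about 83% and runs about 2.5 times faster than the baseline while gaining 1.2 points of A_P, whereas the lightweight modules alone lose 2.1 points of A_P.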

Table 2. Experimental results of target detection models

Model	Image size/pixels	Parameters/10⁶	Computation/byte	A_P/%	Frame rate/(frames·s⁻¹)
Faster RCNN	600×600	84.0	200.0	98.3	8.4
YOLOv3	640×640	32.0	79.6	72.9	20.4
YOLOv5s	640×640	7.2	16.5	96.6	56.3
YOLOv5s−GAD	640×640	1.2	4.2	97.8	140.2
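
The A_P values in Table 2 are reported at an intersection over union (IoU) of 0.5, meaning that a predicted box counts as a correct detection only if it overlaps a ground-truth box sufficiently. The sketch below shows the IoU computation for axis-aligned boxes given as (x1, y1, x2, y2). Whether the threshold comparison is inclusive is an assumption here, as conventions differ between benchmarks.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Assumed convention: a prediction matches when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


ground_truth = (0, 0, 10, 10)
good_prediction = (0, 0, 10, 5)    # covers half of the ground-truth box
poor_prediction = (5, 5, 15, 15)   # overlaps only one corner
```

With these example boxes, `good_prediction` sits exactly at the 0.5 threshold and `poor_prediction` falls well below it.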

Table 3. Experimental results of joint algorithms of multi-target detection and tracking

Algorithm	A/%	R/%	I	T/%	L/%	Frame rate/(frames·s⁻¹)
IOU17	45.5	39.4	5 988	15.7	40.5	147.8
MOTDT17	50.9	52.7	2 474	17.5	35.7	20.6
Deep SORT	60.3	61.2	2 442	31.5	20.3	20.0
FairMOT	73.7	72.3	3 303	43.2	17.3	25.9
Proposed algorithm	55.2	54.2	1 523	20.0	35.5	88.0
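
The abstract's statement about identity switches can be checked directly against Table 3. The script below ranks the algorithms by the I column and by frame rate. Reading I as the number of identity switches is an inference from the abstract, since the column symbols are not defined in this excerpt.

```python
# Values copied from Table 3: (I, frames per second).
# Assumption: column I is the number of identity switches.
RESULTS = {
    "IOU17": (5988, 147.8),
    "MOTDT17": (2474, 20.6),
    "Deep SORT": (2442, 20.0),
    "FairMOT": (3303, 25.9),
    "Proposed algorithm": (1523, 88.0),
}

fewest_switches = min(RESULTS, key=lambda name: RESULTS[name][0])
by_speed = sorted(RESULTS, key=lambda name: RESULTS[name][1], reverse=True)
```

On these figures the proposed algorithm has the lowest I value and the second-highest frame rate, behind only IOU17.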
