Aggregation enhanced coal-gangue video recognition model based on long and short-term storage

YANG Jun

Citation: YANG Jun. Aggregation enhanced coal-gangue video recognition model based on long and short-term storage[J]. Journal of Mine Automation, 2023, 49(3): 39-44, 62. doi: 10.13272/j.issn.1671-251x.18058


doi: 10.13272/j.issn.1671-251x.18058
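Funding: Shaanxi Province Qinchuangyuan "Scientist + Engineer" Team Building Project (2022KXJ-38).
Details
    About the author:

    YANG Jun (born 1982), male, from Pingluo, Ningxia; engineer, mainly engaged in research on intelligent coal mine technology. E-mail: 273364857@qq.com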

  • CLC number: TD67


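  • Abstract: Recognizing coal gangue with image recognition techniques alone can miss some key targets. Video object recognition models fit the needs of coal-gangue recognition and sorting better than image object recognition models, and they can extract coal-gangue features from video data more broadly and in greater depth. However, current coal-gangue video object recognition techniques do not consider how video frame repetition, inter-frame similarity, and the chance nature of key frames affect model performance. To address these problems, an aggregation enhanced coal-gangue video recognition model based on long and short-term storage (LSS) is proposed. First, the large volume of frame information is pre-screened by dividing frames into key frames and non-key frames. Multi-frame aggregation is then applied to the coal-gangue video frame sequence: a spatiotemporal relation network (TRN) aggregates the feature information of each key frame with that of its adjacent frames to build long-term and short-term video frames, which reduces the model's computational cost without losing key feature information. Next, an attention mechanism that fuses semantic similarity weights, learnable weights, and region of interest (ROI) similarity weights redistributes the weights of the features among long-term video frames, short-term video frames, and key frames. Finally, an LSS module for storage enhancement is designed. It stores the effective features of the long-term and short-term video frames and fuses them in when a key frame is recognized, strengthening the representational power of the key-frame features to achieve coal-gangue recognition. The model was validated experimentally on a self-built coal-gangue video dataset from the Zaoquan Coal Preparation Plant. The results show that, compared with the memory enhanced global-local aggregation (MEGA) network, flow-guided feature aggregation for video object detection (FGFA), the relation distillation network (RDN), and the deep feature flow for video recognition (DFF) model, the LSS-based model achieves the highest mean average precision (mAP), 77.12%. Target motion speed in the video is negatively correlated with recognition accuracy, and the LSS-based model reaches its highest recognition accuracy, 83.82%, on slow-moving targets.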
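
    The weight redistribution step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's formulas, so everything below is an assumption made for illustration: semantic similarity is taken as cosine similarity, ROI similarity as box intersection over union, the three weights are fused by simple addition before a softmax, and the key-frame feature is enhanced by adding the weighted sum of stored features. The names `enhance_key_frame`, `stored`, and `learnable_logits` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_key_frame(key_feat, key_box, stored, learnable_logits):
    """Fuse three weights per stored frame and aggregate into the key frame.

    stored: list of (feature, box) pairs standing in for the long-term and
    short-term frame features kept by a storage module (hypothetical layout).
    learnable_logits: one scalar per stored frame; in a trained model these
    would be learned, here they are plain inputs.
    """
    scores = []
    for (feat, box), logit in zip(stored, learnable_logits):
        semantic = cosine(key_feat, feat)   # assumed semantic similarity
        roi = iou(key_box, box)             # assumed ROI similarity
        scores.append(semantic + roi + logit)  # assumed additive fusion
    weights = softmax(scores)
    aggregated = [
        sum(w * feat[i] for w, (feat, _) in zip(weights, stored))
        for i in range(len(key_feat))
    ]
    # Residual-style enhancement: key-frame feature plus aggregated memory.
    enhanced = [k + a for k, a in zip(key_feat, aggregated)]
    return weights, enhanced


# Toy example: the first stored frame matches the key frame, the second does not.
stored = [([1.0, 0.0], (0, 0, 10, 10)), ([0.0, 1.0], (20, 20, 30, 30))]
weights, enhanced = enhance_key_frame([1.0, 0.0], (0, 0, 10, 10), stored, [0.0, 0.0])
```

    In this toy input the stored frame that resembles the key frame in both feature and box position receives the larger weight, which is the behaviour the weight redistribution is meant to produce. A real model would operate on learned feature maps and region proposals rather than hand-written vectors.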

     

  • Figure 1.  Key frames and non-key frames in coal-gangue video

    Figure 2.  Key frame selection framework

    Figure 3.  Aggregation enhanced coal-gangue video recognition model based on LSS

    Figure 4.  The TRN feature fusion process

    Figure 5.  The computational principle of the attention mechanism

    Figure 6.  The LSS module design principle

    Figure 7.  Loss function curve

    Table 1.  mAP comparison of the proposed model with the MEGA, FGFA, RDN, and DFF models (%)

    Recognition accuracy is reported separately for fast-, medium-speed, and slow-moving targets.

    Model            Fast     Medium   Slow     mAP
    Proposed model   55.12    76.02    83.82    77.12
    MEGA-101         55.63    76.24    82.39    76.65
    MEGA-50          49.53    70.58    79.83    72.63
    RDN-101          51.65    71.95    82.10    74.68
    RDN-50           45.27    67.46    80.22    70.40
    FGFA-101         43.97    69.86    81.07    71.91
    FGFA-50          40.75    66.57    78.90    68.68
    DFF-101          37.47    66.87    79.32    68.42
    DFF-50           35.65    62.19    74.14    63.50
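
    The two claims in the abstract (highest mAP for the proposed model, and accuracy falling as target speed rises) can be checked directly against the Table 1 values. The short sketch below only re-reads the published numbers; the helper names `best_model` and `accuracy_rises_as_speed_falls` are hypothetical and not part of the paper.

```python
# Table 1 values in percent: (fast, medium, slow, mAP) per model.
TABLE_1 = {
    "Proposed model": (55.12, 76.02, 83.82, 77.12),
    "MEGA-101": (55.63, 76.24, 82.39, 76.65),
    "MEGA-50": (49.53, 70.58, 79.83, 72.63),
    "RDN-101": (51.65, 71.95, 82.10, 74.68),
    "RDN-50": (45.27, 67.46, 80.22, 70.40),
    "FGFA-101": (43.97, 69.86, 81.07, 71.91),
    "FGFA-50": (40.75, 66.57, 78.90, 68.68),
    "DFF-101": (37.47, 66.87, 79.32, 68.42),
    "DFF-50": (35.65, 62.19, 74.14, 63.50),
}
COLUMNS = ("fast", "medium", "slow", "mAP")


def best_model(column):
    """Return the model with the highest value in the given column."""
    idx = COLUMNS.index(column)
    return max(TABLE_1, key=lambda name: TABLE_1[name][idx])


def accuracy_rises_as_speed_falls():
    """True if every model scores fast < medium < slow."""
    return all(fast < medium < slow for fast, medium, slow, _ in TABLE_1.values())


for column in COLUMNS:
    print(column, "->", best_model(column))
print("monotonic in speed:", accuracy_rises_as_speed_falls())
```

    Read this way, the table supports both claims, and it also shows that MEGA-101 scores slightly higher than the proposed model on fast- and medium-speed targets, so the proposed model's lead comes from slow-moving targets and from overall mAP.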
Publication history
  • Received:  2022-11-22
  • Revised:  2023-02-20
  • Published online:  2023-03-27
