Unsafe action recognition in underground coal mine based on cross-attention mechanism

RAO Tianrong, PAN Tao, XU Huijun

Citation: RAO Tianrong, PAN Tao, XU Huijun. Unsafe action recognition in underground coal mine based on cross-attention mechanism[J]. Journal of Mine Automation,2022,48(10):48-54.  doi: 10.13272/j.issn.1671-251x.17949

doi: 10.13272/j.issn.1671-251x.17949
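Funding: Science and Technology Innovation Project of China Energy Investment Corporation (GJNY-20-159).
Details
    About the authors:

    RAO Tianrong (1990-), male, born in Yinchuan, Ningxia, PhD; main research interests: image processing and intelligent scheduling. E-mail: 20074909@ceic.com

    Corresponding author:

    PAN Tao (1975-), male, born in Lianyungang, Jiangsu, professor-level senior engineer, PhD; main research interest: intelligent coal mine technology. E-mail: pacumt@163.com

  • CLC number: TD67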

Unsafe action recognition in underground coal mine based on cross-attention mechanism

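  • Abstract: Real-time video monitoring of unsafe actions by underground coal mine personnel, with alarms, is an important means of improving work safety. The underground environment is complex and surveillance video quality is poor, which limits the use of conventional action recognition methods based on image features or on human key point features. A multi-feature fusion action recognition model based on a cross-attention mechanism is proposed to recognize unsafe actions of underground personnel. For segmented video, a 3D ResNet101 model extracts image features, and the openpose algorithm together with ST-GCN (spatial temporal graph convolutional network) extracts human key point features. The cross-attention mechanism fuses the image features and the human key point features, and the fused result is concatenated with the image features and human key point features processed by the self-attention mechanism to give the final action recognition feature. This feature is passed through a fully connected layer and the normalized exponential function softmax to obtain the recognition result. Action recognition experiments were carried out on the public datasets HMDB51 and UCF101 and on a self-built underground coal mine video dataset. The results show that the cross-attention mechanism lets the model fuse image features and human key point features more effectively and substantially improves recognition accuracy. Compared with SlowFast, currently the most widely used action recognition model, the proposed model improves recognition accuracy by 1.8 and 0.9 percentage points on HMDB51 and UCF101, respectively, and by 6.7 percentage points on the self-built dataset, which indicates that it is better suited to recognizing unsafe actions of personnel in the complex underground coal mine environment.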
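The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature dimensions, the random weights, the use of cross-attention in both directions, the mean pooling over video segments, and all function names (`attention`, `fuse_and_classify`, `init_params`) are assumptions made for illustration. Random matrices stand in for the 3D ResNet101 image features and the ST-GCN key point features.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query_src, context_src, w_q, w_k, w_v):
    """Scaled dot-product attention.

    Self-attention when both sources are the same modality; cross-attention
    when queries come from one modality and keys/values from the other.
    """
    q = matmul(query_src, w_q)
    k = matmul(context_src, w_k)
    v = matmul(context_src, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)


def mean_pool(rows):
    """Average a (segments x dim) matrix over its segments (assumed pooling)."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def random_matrix(n_rows, n_cols, rng):
    """Small random weights or stand-in features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_cols)] for _ in range(n_rows)]


def init_params(d_img, d_kp, d, n_classes, rng):
    """Hypothetical parameter layout: (W_q, W_k, W_v) per attention branch."""
    def proj(d_query, d_context):
        return (random_matrix(d_query, d, rng),
                random_matrix(d_context, d, rng),
                random_matrix(d_context, d, rng))
    return {
        "img_self": proj(d_img, d_img),
        "kp_self": proj(d_kp, d_kp),
        "img_cross": proj(d_img, d_kp),
        "kp_cross": proj(d_kp, d_img),
        "fc": random_matrix(4 * d, n_classes, rng),
    }


def fuse_and_classify(img_feat, kp_feat, params):
    """Concatenate self- and cross-attended features, then FC + softmax."""
    branches = [
        attention(img_feat, img_feat, *params["img_self"]),
        attention(kp_feat, kp_feat, *params["kp_self"]),
        attention(img_feat, kp_feat, *params["img_cross"]),
        attention(kp_feat, img_feat, *params["kp_cross"]),
    ]
    fused = [x for branch in branches for x in mean_pool(branch)]
    logits = matmul([fused], params["fc"])[0]
    return softmax(logits)


rng = random.Random(0)
img_feat = random_matrix(4, 16, rng)  # 4 video segments, stand-in image features
kp_feat = random_matrix(4, 8, rng)    # 4 video segments, stand-in key point features
params = init_params(d_img=16, d_kp=8, d=8, n_classes=6, rng=rng)
probs = fuse_and_classify(img_feat, kp_feat, params)
print(len(probs), round(sum(probs), 6))
```

The six output classes mirror the six action categories in the self-built dataset; a trained model would learn the projection and classifier weights rather than sample them.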

     

  • Figure 1. Structure of the multi-feature fusion action recognition model based on the cross-attention mechanism

    Figure 2. Human body key points extracted by the openpose algorithm

    Figure 3. Spatial-temporal graph of the human skeleton

    Figure 4. Self-attention mechanism

    Figure 5. Recognition results of different models for the action of removing a safety helmet

    Table 1. Comparison of different action recognition models on public datasets (accuracy, %)

    Model             HMDB51    UCF101
    C3D               56.8      82.3
    3D ResNet101      61.7      88.9
    TSN               68.5      93.4
    SlowFast          72.3      95.8
    ST-GCN            48.6      78.3
    2S-AGCN           51.8      80.2
    Proposed model    74.1      96.7

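    Table 2. Ablation experiment results (accuracy, %)

    Columns, in order: image feature extraction network; human key point feature extraction network; self-attention mechanism; cross-attention mechanism; HMDB51; UCF101

    ×××    61.7    88.9
    ××     63.3    89.7
    ×××    48.6    78.3
    ××     51.8    81.4
    ××     63.2    88.5
    ×      69.0    92.7
           74.1    96.7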

    Table 3. Comparison of different action recognition models on the self-built underground video dataset (accuracy, %)

    Model             Accuracy
    C3D               75.4
    3D ResNet101      78.7
    TSN               81.5
    SlowFast          84.6
    ST-GCN            63.4
    2S-AGCN           70.9
    Proposed model    91.3
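    The gains over SlowFast quoted in the abstract can be re-derived from the accuracy figures in Tables 1 and 3. The short check below only copies those values; the dictionary names and the "self-built" label are chosen here for illustration.

```python
# Accuracy values (%) copied from Table 1 (public datasets) and Table 3 (self-built dataset).
slowfast = {"HMDB51": 72.3, "UCF101": 95.8, "self-built": 84.6}
proposed = {"HMDB51": 74.1, "UCF101": 96.7, "self-built": 91.3}

# Differences are in percentage points; rounding removes floating-point noise.
gains = {name: round(proposed[name] - slowfast[name], 1) for name in slowfast}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points over SlowFast")
```

    The differences are absolute (percentage points), not relative improvements.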

    Table 4. Recognition results of the proposed model for different action categories (accuracy, %)

    Action category             Accuracy
    Smoking                     93.7
    Fighting                    91.5
    Loitering                   93.2
    Falling                     95.8
    Removing safety helmet      84.2
    Taking off work clothes     89.4
Publication history
  • Received: 2022-05-18
  • Revised: 2022-10-08
  • Published online: 2022-10-22
