Coal gangue audio classification method based on improved EfficientNet
Abstract: Coal gangue audio feature extraction suffers from severe interference from equipment operating noise, and relying on a single extraction method tends to lose information. To address these problems, a coal gangue audio classification method based on an improved EfficientNet is proposed. Features are extracted by combining the Mel spectrogram with Gammatone frequency cepstral coefficients (GFCC), which effectively captures the low-frequency information and fine detail in gangue sounds. EfficientNet-B0 is selected as the backbone network and improved in two ways. First, the original multi-scale channel attention module is replaced with a convolutional block attention module, yielding the convolutional attention feature fusion (CAFF) module, which lets the network learn to assign different weights to features at different spatial positions and so generate new effective features. Second, a frequency-domain channel attention (FCA) module is embedded in parallel within the original MBConv module to strengthen the representational ability of the feature maps and thereby improve overall network performance. Experimental results show that introducing the CAFF module raises accuracy by 0.61 percentage points and the F1 score by 0.52 percentage points, with faster convergence, indicating that the CAFF module effectively improves the model's ability to capture spectral features. Adding the FCA module raises accuracy by a further 0.45 percentage points and the F1 score by 0.62 percentage points, indicating that stacking the modules further improves the model's generalization ability and its handling of complex features. The improved EfficientNet model achieves an accuracy of 91.90% with a standard deviation of 0.108, significantly outperforming the comparable audio classification models.

Keywords:
- fully mechanized top-coal caving mining
- coal-gangue recognition
- audio feature extraction
- EfficientNet
- Mel spectrogram features
- Gammatone frequency cepstral coefficients
- attention mechanism
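One way to see why pairing Mel-spectrogram features with GFCC could help with low-frequency content is to compare the two frequency warpings. The sketch below is illustrative only and is not the authors' pipeline: it uses the standard Mel formula and the Glasberg and Moore ERB-rate formula (the scale on which Gammatone filters are usually spaced), and the 8000 Hz band limit and 500 Hz cutoff are assumed values that do not come from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_erb_rate(f_hz):
    """ERB-rate scale (Glasberg and Moore): 21.4 * log10(1 + 0.00437 * f)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def share_below(scale, cutoff_hz, f_max_hz):
    """Fraction of the warped axis from 0 Hz to f_max_hz that lies below cutoff_hz.

    Both scales map 0 Hz to 0, so for filters spaced evenly on the warped
    axis this is also the fraction of filters centred below the cutoff.
    """
    return scale(cutoff_hz) / scale(f_max_hz)


F_MAX_HZ = 8000.0   # assumed upper band edge (hypothetical, not from the paper)
CUTOFF_HZ = 500.0   # assumed "low-frequency" boundary (hypothetical)

mel_share = share_below(hz_to_mel, CUTOFF_HZ, F_MAX_HZ)
erb_share = share_below(hz_to_erb_rate, CUTOFF_HZ, F_MAX_HZ)

print(f"Mel-spaced filters below {CUTOFF_HZ:.0f} Hz: {mel_share:.1%}")
print(f"ERB-spaced filters below {CUTOFF_HZ:.0f} Hz: {erb_share:.1%}")
```

Under these assumptions the ERB-rate spacing places a larger share of its filters below the cutoff than the Mel spacing does, which is consistent with the claim that adding GFCC helps capture low-frequency information. The exact shares depend on the chosen band limit and cutoff.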
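Table 1 Coal gangue audio dataset 1

| No. | Class | Number of samples |
| --- | --- | --- |
| 0 | Gangue + shearer (right section) | 210 |
| 1 | Gangue + shearer (left section) | 210 |
| 2 | Gangue + rear scraper conveyor | 210 |
| 3 | Gangue + front scraper conveyor | 210 |
| 4 | Gangue + stage loader | 210 |
| 5 | Coal + shearer (right section) | 210 |
| 6 | Coal + shearer (left section) | 210 |
| 7 | Coal + front scraper conveyor | 210 |
| 8 | Coal + rear scraper conveyor | 210 |
| 9 | Coal + stage loader | 210 |

Table 2 Coal gangue audio dataset 2

| No. | Class | Number of samples |
| --- | --- | --- |
| 0 | Coal + shearer + scraper conveyor + stage loader | 500 |
| 1 | Gangue + shearer + scraper conveyor + stage loader | 500 |

Table 3 Ablation experiment metrics (%)

| Model No. | Model | Features | Accuracy | Precision | Recall | F1 score |
| --- | --- | --- | --- | --- | --- | --- |
| A | EfficientNet (backbone) | Mel spectrogram | 89.70 | 88.07 | 91.43 | 89.73 |
| B | EfficientNet (backbone) | Mel spectrogram + GFCC | 90.84 | 89.77 | 91.90 | 90.82 |
| C | EfficientNet (backbone) + CAFF | Mel spectrogram + GFCC | 91.45 | 90.47 | 92.23 | 91.34 |
| D | EfficientNet (backbone) + CAFF + FCA | Mel spectrogram + GFCC | 91.90 | 91.90 | 92.01 | 91.96 |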
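The gains quoted in the abstract can be cross-checked against Table 3 with a few lines of arithmetic. The sketch below copies the table values, computes the step-to-step differences (B to C for CAFF, C to D for FCA), and recomputes F1 as the harmonic mean of precision and recall. The helper names are hypothetical, and the harmonic-mean check assumes the reported F1 was derived from the reported precision and recall, which the source does not state.

```python
# Rows of Table 3: model -> (accuracy, precision, recall, F1), all in percent.
ABLATION = {
    "A": (89.70, 88.07, 91.43, 89.73),
    "B": (90.84, 89.77, 91.90, 90.82),
    "C": (91.45, 90.47, 92.23, 91.34),
    "D": (91.90, 91.90, 92.01, 91.96),
}
ACC, PREC, REC, F1 = range(4)  # column indices into each row


def gain(before, after, column):
    """Difference in percentage points between two models for one metric."""
    return round(ABLATION[after][column] - ABLATION[before][column], 2)


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


caff_gain = (gain("B", "C", ACC), gain("B", "C", F1))  # effect of adding CAFF
fca_gain = (gain("C", "D", ACC), gain("C", "D", F1))   # effect of adding FCA

# Largest gap between the reported F1 and the harmonic-mean recomputation.
max_f1_gap = max(
    abs(f1_from(row[PREC], row[REC]) - row[F1]) for row in ABLATION.values()
)

print("CAFF gain (accuracy, F1):", caff_gain)
print("FCA gain (accuracy, F1):", fca_gain)
print(f"Largest F1 recomputation gap: {max_f1_gap:.3f}")
```

The differences are absolute changes in percentage points, not relative improvements. The recomputed F1 values agree with the reported ones to within rounding, so the table is internally consistent under the stated assumption.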