Multi-feature fusion based encrypted malicious traffic detection method for coal mine network

HUO Yuehua, ZHAO Faqi, WU Wenhao

Citation: HUO Yuehua, ZHAO Faqi, WU Wenhao. Multi-feature fusion based encrypted malicious traffic detection method for coal mine network[J]. Journal of Mine Automation, 2022, 48(7): 142-148. doi: 10.13272/j.issn.1671-251x.17944

doi: 10.13272/j.issn.1671-251x.17944
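Funding: National Key Research and Development Program of China (2016YFC0801800).
    About the author:

    HUO Yuehua (1981- ), male, born in Jinzhong, Shanxi, senior engineer and master's supervisor. His main research interests are network security, communication, and monitoring. E-mail: huoyh@cumtb.edu.cn

  • CLC number: TD67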

Multi-feature fusion based encrypted malicious traffic detection method for coal mine network

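  • Abstract: Coal mine networks face threats from Transport Layer Security (TLS) encrypted malicious traffic generated by malware, and encrypted traffic suffers a high false-positive rate during detection. To address these problems, a multi-feature fusion based method for detecting TLS encrypted malicious traffic in coal mine networks is proposed. The diverse and heterogeneous nature of TLS encrypted malicious traffic features is analyzed. Connection features, metadata, and TLS handshake features of TLS encrypted malicious flows in the coal mine network are extracted during transmission. A flow fingerprinting method is used to construct a TLS encrypted traffic feature set for the coal mine network, and the features in this set are standardized, one-hot encoded, and reduced to obtain an efficient sample set. Five classifier sub-models, namely decision tree (DT), K-nearest neighbors (KNN), Gaussian naive Bayes (GNB), L2-regularized logistic regression (LR), and stochastic gradient descent (SGD), are used to test the feature set. To improve the robustness of the detection model, the five sub-models are combined by voting to build a multi-model voting (MVC) detection model: the five sub-models act as voters, each is trained on the sample set independently, and a majority vote yields the final prediction for each sample. Experimental results show that the constructed feature set reduces the dimensionality of the sample set and improves the efficiency of TLS encrypted traffic detection. The DT and KNN classifiers perform best on the dataset, with accuracy above 99%, but they are at risk of overfitting. The LR and SGD sub-models also achieve accuracy above 90%, but their false-positive rates are too high. The GNB sub-model performs worst, with an accuracy of only 82%, but it has the advantage of a low false-positive rate. The MVC detection model achieves accuracy and recall above 99% on the dataset with a false-positive rate of 0.13%, improves the detection rate of encrypted malicious traffic, and has a false-positive rate of 0 for encrypted traffic detection. Its overall performance is better than that of the individual classifier sub-models.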
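The majority-vote step of the MVC model can be illustrated with a minimal sketch. It assumes binary labels (1 for malicious, 0 for benign) and that each of the five sub-models has already produced one prediction per sample. The function name `majority_vote` and the toy predictions are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting.

    predictions_per_model: list of equal-length lists, one list per sub-model.
    """
    if not predictions_per_model:
        raise ValueError("at least one model is required")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    final = []
    # zip(*...) groups the votes cast by every model for the same sample.
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        # With five voters and two classes a tie cannot occur.
        label, _count = Counter(votes).most_common(1)[0]
        final.append(label)
    return final


# Toy predictions from five hypothetical sub-models (made-up values).
toy_predictions = [
    [1, 0, 1, 0],  # DT
    [1, 0, 1, 0],  # KNN
    [0, 0, 1, 0],  # GNB
    [1, 1, 1, 0],  # LR
    [1, 1, 0, 0],  # SGD
]
print(majority_vote(toy_predictions))
```

Using an odd number of voters on a two-class problem guarantees that every sample receives a strict majority, so no tie-breaking rule is needed in this sketch.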

     

  • Figure 1. TLS handshake process

    Figure 2. Flow of the multi-feature fusion based TLS encrypted malicious traffic detection method

    Figure 3. pcap traffic packet processing flow

    Figure 4. Number of TLS encrypted samples misclassified by each model

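    Table 1. Top 28 features with εi ≥ 0.01 and their feature importance weights

    | Feature | εi |
    | --- | --- |
    | Maximum backward packet payload | 0.0929 |
    | conn_state (connection state) | 0.0923 |
    | Standard deviation of forward packet payload | 0.0719 |
    | Mean backward packet payload | 0.0616 |
    | Minimum backward packet payload | 0.0586 |
    | Mean flow packet payload | 0.0486 |
    | Destination port number | 0.0426 |
    | Maximum flow packet payload | 0.0347 |
    | Maximum backward packet header bytes | 0.0275 |
    | Mean payload of forward subflows | 0.0236 |
    | Variance of inter-arrival time between two consecutive backward packets | 0.0231 |
    | Ratio of backward to forward packet counts | 0.0223 |
    | Maximum inter-arrival time between two consecutive backward packets | 0.0218 |
    | Mean number of packets per backward subflow | 0.0165 |
    | Backward packet header byte count | 0.0165 |
    | Window size of the first forward packet (bytes) | 0.0162 |
    | Number of backward packets carrying a payload | 0.0155 |
    | Minimum inter-arrival time between two consecutive packets in the flow | 0.0143 |
    | Maximum forward packet payload | 0.0139 |
    | Standard deviation of forward inter-arrival times | 0.0134 |
    | Maximum inter-arrival time between two consecutive packets in the flow | 0.0128 |
    | Total number of ACK flags in one TCP flow | 0.0128 |
    | Maximum inter-arrival time between two consecutive forward packets | 0.0113 |
    | Mean number of packets per forward subflow | 0.0112 |
    | Total backward packet header bytes | 0.0110 |
    | Window size of the first backward packet (bytes) | 0.0105 |
    | Mean inter-arrival time between two consecutive backward packets | 0.0102 |
    | Source port number | 0.0100 |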
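The εi ≥ 0.01 cut-off used for Table 1 amounts to a simple filter over feature importance weights. The sketch below shows that filter under stated assumptions: the short feature identifiers are hypothetical stand-ins for four rows of Table 1, the entry `example_low_weight_feature` and its weight are invented purely to show a feature being dropped, and how the weights εi are computed is not covered here.

```python
def select_features(importances, threshold=0.01):
    """Return (name, weight) pairs with weight >= threshold, largest first."""
    kept = [(name, w) for name, w in importances.items() if w >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Weights for four features copied from Table 1 (identifiers are hypothetical),
# plus one invented below-threshold entry for illustration only.
weights = {
    "src_port": 0.0100,                    # source port number
    "bwd_payload_max": 0.0929,             # maximum backward packet payload
    "example_low_weight_feature": 0.0040,  # hypothetical, not from the paper
    "conn_state": 0.0923,                  # connection state
    "dst_port": 0.0426,                    # destination port number
}

for name, weight in select_features(weights):
    print(f"{name}: {weight:.4f}")
```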

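    Table 2. Traffic dataset

    | Traffic | Type | Samples | Total |
    | --- | --- | --- | --- |
    | Malicious | Yakes | 209,611 | 657,198 |
    | Malicious | Conficker | 80,751 | |
    | Malicious | Cridex | 90,130 | |
    | Malicious | Dridex | 41,622 | |
    | Malicious | Sality | 146,366 | |
    | Malicious | Razy | 68,047 | |
    | Malicious | TrickBot | 20,671 | |
    | Benign | Normal | 314,733 | 314,733 |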

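    Table 3. Performance comparison of the models

    | Model | Accuracy A/% | Recall R/% | F1/% | False-positive rate W/% |
    | --- | --- | --- | --- | --- |
    | DT | 99.88 | 99.85 | 99.84 | 0.09 |
    | KNN | 99.88 | 99.86 | 99.83 | 0.10 |
    | GNB | 82.93 | 51.87 | 68.17 | 0.17 |
    | LR | 97.69 | 99.21 | 96.81 | 3.13 |
    | SGD | 94.35 | 98.73 | 92.49 | 8.00 |
    | MVC | 99.66 | 99.28 | 99.52 | 0.13 |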
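The four metrics in Table 3 can be related to confusion-matrix counts as sketched below. This uses the standard textbook definitions and assumes the paper follows them, with W taken as the false-positive rate FP / (FP + TN). The function name and the toy counts are illustrative and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall, F1, and false-positive rate from raw counts.

    tp/fn: malicious samples classified correctly / missed.
    tn/fp: benign samples classified correctly / flagged as malicious.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "A": accuracy,
        "R": recall,
        "F1": f1,
        "W": false_positive_rate,
    }


# Toy confusion-matrix counts (made up for illustration).
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name} = {100 * value:.2f}%")
```

Because accuracy weights both classes together while W looks only at benign samples, a model can combine a low W with poor recall, which is the pattern the GNB row shows.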
Publication history
  • Received: 2022-05-08
  • Revised: 2022-07-12
  • Published online: 2022-07-12
