Visible and infrared image fusion algorithm for underground personnel detection

周李兵; 陈晓晶; 贾文琪; 卫健健; 叶柏松; 邹盛

doi:10.13272/j.issn.1671-251x.2023070025

Abstract: Intelligent mining vehicles operate under complex lighting conditions. For underground personnel detection, fusing visible and infrared images brings infrared reflection information and detailed texture information into the visible image and improves target detection. In traditional visible and infrared image fusion methods, increasing the number of decomposition layers blurs image edges and textures and lengthens fusion time. Current deep-learning-based fusion methods struggle to balance the features of the visible and infrared images, which blurs detail in the fused image. To address these problems, an image fusion algorithm based on multiple attention modules (IFAM) is proposed. First, a convolutional neural network extracts features from the visible and infrared images. Next, spatial attention and channel attention modules each cross-fuse the extracted features; the gradient information in the features is used to compute fusion weights for the outputs of the two attention modules, and the outputs are fused according to these weights. Finally, a deconvolution transform restores the image features to produce the fused image. Fusion results on the RoadScene and TNO datasets show that IFAM-fused images contain both the background texture of the visible images and the personnel contours of the infrared images. Fusion results on the underground dataset show that, in low light, infrared images compensate for the shortcomings of visible light and are unaffected by other light sources in the environment, so personnel contours remain distinct in the fused image. Comparative analysis shows that the information entropy (EN), standard deviation (SD), gradient fusion metric (Q^AB/F), visual information fidelity for fusion (VIFF), and union structural similarity index measure (SSIM_u) of IFAM-fused images are 4.9013, 88.5214, 0.1693, 1.4135, and 0.8062, respectively. Overall performance is better than that of comparable algorithms such as LLF-IOI and NDM.
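
The gradient-based weighting step and two of the reported metrics (EN and SD) can be illustrated with a minimal sketch. The abstract does not give the exact weight formula, so the version below is an assumption: each map's weight is its mean absolute forward-difference gradient, normalised so the two weights sum to 1. Small 2-D lists stand in for the outputs of the spatial and channel attention modules, and the function names (`mean_gradient`, `gradient_weights`, `fuse`, `entropy`, `std_dev`) are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def mean_gradient(feat):
    """Mean absolute forward-difference gradient of a 2-D map (list of rows)."""
    h, w = len(feat), len(feat[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal difference
                total += abs(feat[y][x + 1] - feat[y][x])
                count += 1
            if y + 1 < h:  # vertical difference
                total += abs(feat[y + 1][x] - feat[y][x])
                count += 1
    return total / count if count else 0.0


def gradient_weights(feat_a, feat_b, eps=1e-8):
    """Assumed weighting rule: normalise the two mean gradients to sum to 1."""
    g_a, g_b = mean_gradient(feat_a), mean_gradient(feat_b)
    w_a = (g_a + eps) / (g_a + g_b + 2 * eps)  # eps avoids division by zero
    return w_a, 1.0 - w_a


def fuse(feat_a, feat_b):
    """Weighted sum of two same-sized maps using the gradient-derived weights."""
    w_a, w_b = gradient_weights(feat_a, feat_b)
    return [[w_a * a + w_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(feat_a, feat_b)]


def entropy(img):
    """Shannon entropy in bits of the grey-level histogram of an integer image."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def std_dev(img):
    """Population standard deviation of the pixel intensities."""
    pixels = [p for row in img for p in row]
    mu = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mu) ** 2 for p in pixels) / len(pixels))
```

Under this assumed rule, a map with stronger edges receives the larger weight, which matches the abstract's stated aim of preserving detail; a real implementation would apply the same idea to multi-channel feature tensors inside the network rather than to plain lists.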

Visible and infrared image fusion algorithm for underground personnel detection