Multi-focus image fusion with a deep convolutional neural network based on spatial pyramid pooling

Mei Liye; Guo Xiaopeng; Zhang Junhua; Guo Zhenghong; Xiao Jia

doi:10.7540/j.ynu.20170670

Spatial pyramid pooling in deep convolutional networks for multi-focus image fusion

Abstract

Traditional multi-focus image fusion methods rely on hand-crafted features and manually designed fusion rules, and therefore fail to exploit other potentially useful information in the source images. To address this shortcoming, we propose a deep learning method based on spatial pyramid pooling (SPP). First, we design a Siamese two-channel convolutional neural network in which SPP replaces max pooling, and use it to learn the features of multi-focus images. Then, to train the network effectively, we use Gaussian filtering to synthesize a large-scale multi-focus image dataset with ground truth. Given a pair of multi-focus images as input, the trained model outputs a score map that indicates the focus property of the source images. To further improve the fusion result, the score map is segmented into a binary mask, which is then refined with morphological operations. Finally, the fused image is obtained by element-wise multiplication between the refined binary mask and the source images. Experimental results show that the proposed method improves the average quantitative score on the test images by 0.78%.
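Two steps named in the abstract can be illustrated with a minimal, dependency-free sketch: pooling a feature map of arbitrary size into a fixed-length vector with SPP, and fusing two source images through a binary mask derived from a score map. This is not the authors' implementation. The function names `spp_max_pool` and `fuse_with_mask`, the pyramid levels `(1, 2, 4)`, and the threshold `0.5` are all assumptions made for illustration, and the morphological refinement of the mask is omitted.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map over an n x n grid for each n in `levels`.

    The output length is sum(n * n for n in levels), whatever the height
    and width of `feature_map`. The levels are an assumed choice.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # floor/ceil bin edges guarantee non-empty bins that cover all rows
            top, bottom = (i * height) // n, ceil_div((i + 1) * height, n)
            for j in range(n):
                left, right = (j * width) // n, ceil_div((j + 1) * width, n)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


def fuse_with_mask(img_a, img_b, score_map, threshold=0.5):
    """Fuse two same-sized grayscale images pixel by pixel.

    A score above `threshold` (an assumed value) marks img_a as the focused
    source at that pixel; otherwise the pixel is taken from img_b.
    """
    fused = []
    for row_a, row_b, row_s in zip(img_a, img_b, score_map):
        fused_row = []
        for a, b, s in zip(row_a, row_b, row_s):
            m = 1 if s > threshold else 0              # binary mask value
            fused_row.append(m * a + (1 - m) * b)      # element-wise product
        fused.append(fused_row)
    return fused
```

With the assumed levels `(1, 2, 4)`, `spp_max_pool` always returns 1 + 4 + 16 = 21 values per feature map, which is what lets a network with SPP accept inputs of different sizes before its fully connected layers.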
