Small object detection by multi-scale feature fusion and attention mechanism

Fang Yan (方岩); Yuan Guohong (袁国宏); Sun Zhengbao (孙正宝); Yue Kun (岳昆)

doi:10.7540/j.ynu.20240327

Abstract

To address missed detections caused by insufficient extraction of small-object features, interference from background noise, and difficult localization in object detection, this paper proposes YOLOv7-BAMFF, a small object detection model based on multi-scale feature fusion and an attention mechanism. First, the semantically rich Conv2 layer is added to the feature fusion process to extract finer-grained low-level features, and cross-scale skip connections and adaptive weighted fusion of contextual information are applied during multi-scale feature fusion. Then, an improved coordinate attention mechanism is introduced during feature re-extraction and optimization to suppress complex background noise and strengthen the focus on small objects. Finally, the localization loss function is optimized to improve localization precision for small objects, and a small object detection head is added to improve small object detection capability. Experimental results on the PASCAL VOC and VisDrone2019 datasets show that the proposed method raises the average detection precision from 82.1% and 43.8% for the baseline YOLOv7 to 85.4% and 50.4%, respectively, and outperforms existing mainstream detection methods.
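
The abstract mentions adaptive weighted fusion of features from different scales but does not give its formula. As an illustration only, the sketch below uses the fast normalized fusion known from BiFPN, where each input gets a non-negative learnable weight and the weights are normalized to sum to roughly one. This is a swapped-in technique, not necessarily what YOLOv7-BAMFF does; the function name `fuse_features`, the flat-list feature representation, and the `eps` value are assumptions made for this example.

```python
from typing import List, Sequence


def fuse_features(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-length feature vectors with normalized non-negative weights.

    Hypothetical helper: a BiFPN-style fast normalized fusion, used here as
    a stand-in for the adaptive weighted fusion named in the abstract.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    if len(features) != len(raw_weights):
        raise ValueError("one weight is required per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU keeps every weight non-negative, so no input is subtracted.
    weights = [max(0.0, w) for w in raw_weights]
    # eps avoids division by zero when all weights are clipped to zero.
    total = sum(weights) + eps
    normalized = [w / total for w in weights]

    # Element-wise weighted sum across the input feature vectors.
    return [
        sum(n * f[i] for n, f in zip(normalized, features))
        for i in range(length)
    ]


# Toy inputs: a low-level feature, an upsampled high-level feature,
# and a skip-connected feature, all already resized to the same shape.
low_level = [0.2, 0.8, 0.1]
high_level = [0.6, 0.4, 0.9]
skip = [0.3, 0.3, 0.3]
fused = fuse_features([low_level, high_level, skip], [1.0, 0.5, 0.5])
```

In a real detector the weights would be trainable parameters and the features would be multi-channel tensors; plain lists are used here only to keep the sketch dependency-free.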
