Small-target detection algorithm based on an attention mechanism and shallow feature fusion

    Abstract: Small-target detection from the high-altitude viewpoint of unmanned aerial vehicles (UAVs) suffers from missed detections, false detections, and interference from complex backgrounds. To address these problems, this paper proposes CSD-YOLO, a small-target detection model based on an attention mechanism and shallow feature fusion. First, an optimized C3K2 module in the original backbone network strengthens multi-scale feature extraction and fusion, and a purpose-designed AS module enriches gradient-flow information, improving the detection of multi-scale targets. Second, the neck network is restructured: a shallow feature fusion module is introduced and the tail end is redesigned to perform cross-scale feature calibration between head and tail. This strengthens attention to low-level feature maps and compensates for the loss of small-target features during propagation through deeper layers, while preserving the residual spatial information of occluded targets. Finally, scale-aware, spatial-aware, and task-aware mechanisms are embedded at the detection output, dynamically adjusting the detection strategy along multiple dimensions and markedly improving the model's adaptability to target deformation and scale variation. Experiments on two public datasets, VisDrone2019 and TinyPerson, show that CSD-YOLO achieves mAP50 scores of 41.8% and 29.8%, respectively, which are 9.4 and 2.2 percentage points higher than those of the YOLOv11 model, while its overall model complexity is lower than that of current mainstream detection algorithms.
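For readers less familiar with the metric, mAP50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal Python sketch of that matching criterion, not the authors' evaluation code; the function names `iou`, `is_true_positive`, and `implied_baseline` are illustrative assumptions. The last helper only subtracts the reported gain from the reported score, so the baseline values it yields are derived from the abstract rather than stated in it.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Matching rule behind mAP50: IoU of at least 0.5 counts as a hit."""
    return iou(pred_box, gt_box) >= threshold


def implied_baseline(score, gain):
    """Baseline mAP50 implied by a reported score and a reported gain
    (both in percentage points). Hypothetical helper, simple subtraction."""
    return round(score - gain, 1)


if __name__ == "__main__":
    gt = (0, 0, 10, 10)                      # a 10 x 10 pixel target
    for shift in (3, 4):
        pred = (shift, 0, 10 + shift, 10)    # same size, shifted sideways
        print(shift, round(iou(pred, gt), 3), is_true_positive(pred, gt))
    print(implied_baseline(41.8, 9.4), implied_baseline(29.8, 2.2))
```

For a 10 x 10 pixel target, a prediction shifted sideways by 3 pixels still matches (IoU about 0.54), whereas a 4-pixel shift does not (IoU about 0.43). This sensitivity to small localization errors is one reason small-target benchmarks are demanding.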

     
