Smart Agriculture ›› 2025, Vol. 7 ›› Issue (3): 108-119. doi: 10.12133/j.smartag.SA202410010

• Information Processing and Decision Making •

Accurate Detection of Tree Planting Locations in the Inner Mongolia Region of the "Three North" Project Based on YOLOv10-MHSA

XIE Jiyuan1,2, ZHANG Dongyan1,2, NIU Zhen1,2, CHENG Tao1,2, YUAN Feng3, LIU Yaling3

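    1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
    2. Shaanxi Key Laboratory of Agriculture Information Perception and Intelligent Service, Yangling, Shaanxi 712100, China
    3. National Grass Technology Innovation Center (Preparation), Hohhot, Inner Mongolia 010021, China
  • Received: 2024-10-16 Published: 2025-05-30
  • Funding:
    Science and Technology Program of Inner Mongolia Autonomous Region (2023JBGS000804); Hohhot Science and Technology Innovation Field Talent Program (2023RC-高层次-7); Hohhot Basic Research and Applied Basic Research Program (2024-规-基-34)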
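  • About the author:

    XIE Jiyuan, doctoral candidate; research interest: UAV remote sensing. E-mail:

  • Corresponding author:
    ZHANG Dongyan, Ph.D., professor; research interest: smart forestry and grassland science. E-mail: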

Accurate Detection of Tree Planting Locations in Inner Mongolia for the Three North Project Based on YOLOv10-MHSA

XIE Jiyuan1,2, ZHANG Dongyan1,2, NIU Zhen1,2, CHENG Tao1,2, YUAN Feng3, LIU Yaling3

    1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, China
    2. Shaanxi Key Laboratory of Agriculture Information Perception and Intelligent Service, Yangling 712100, China
    3. National Grass Technology Innovation Center (Preparation), Hohhot 010021, China
  • Received: 2024-10-16 Online: 2025-05-30
  • Foundation items: Science and Technology Program of Inner Mongolia Autonomous Region (2023JBGS000804); Hohhot Science and Technology Innovation Field Talent Program (2023RC-高层次-7); Hohhot Basic Research and Applied Basic Research Program (2024-规-基-34)
  • About author:

    XIE Jiyuan, E-mail:

  • Corresponding author:
    ZHANG Dongyan, E-mail:

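Abstract (translated from the Chinese):

[Objective/Significance] In unmanned aerial vehicle (UAV) imagery of the Inner Mongolia region of the Three North Project, tree planting locations (tree pits) are easily missed or falsely detected because of complex backgrounds (shrubs, weed clusters, bare sandy soil, undulating terrain, etc.). To address this problem, a small target detection model for this scenario, YOLOv10-MHSA (You Only Look Once version 10-Multi-head Self-Attention), was constructed.
[Methods] With YOLOv10 as the baseline model, a hierarchical feature enhancement strategy was adopted: cross-layer information compensation improves the completeness of the semantic representation of small targets and the accuracy of their feature description. A variable-kernel convolution, AKConv (Adaptive Kernel Convolution), was introduced so that the model focuses more precisely on the features of the input image. A feature-fusing multi-head self-attention (MHSA) mechanism was built to extract effective features while accounting for complex environmental factors. Focal-EIOU Loss (Focal Efficient Intersection over Union Loss) replaced the original CIOU Loss (Complete Intersection over Union Loss) as the bounding-box regression loss, forming a nonlinear optimization strategy that computes bounding-box parameters precisely while keeping training stable. Finally, the two factors with the greatest influence on recognition accuracy were selected, and comparative experiments on multi-scale spatial distribution and graded changes in illumination intensity were designed to systematically verify the generalization and robustness of the model in complex scenes.
[Results and Discussion] On the experimental dataset, the proposed YOLOv10-MHSA model reached an average recognition precision of 96.1% and a detection accuracy of 92.1%, which are 4.1% and 5.1% higher than the original model, respectively. This meets the accuracy and speed requirements for real-time UAV identification of tree planting locations (tree pits) in the Inner Mongolia region of the Three North Project.
[Conclusions] By introducing a dynamic feature enhancement module, the YOLOv10-MHSA model maintains the original detection efficiency while resolving the detection bottleneck in which the features of small tree-planting-location targets are easily submerged in complex scenes. This provides a new method for accurate and rapid remote-sensing detection of tree planting locations in the Inner Mongolia region of the Three North Project from UAV platforms.

Key words: tree planting locations, complex background, unmanned aerial vehicle, small target detection, YOLOv10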

Abstract:

[Objective] Traditional manual field surveys of tree planting locations are inefficient and error-prone, and low-altitude unmanned aerial vehicles (UAVs) have become the preferred way to address these problems. To improve the accuracy and efficiency of detecting tree planting locations (tree pits) in the Inner Mongolia section of China's Three North Project, a recognition and detection model based on YOLOv10-MHSA was proposed.
[Methods] A long-endurance, multi-purpose vertical take-off and landing (VTOL) fixed-wing UAV was used to collect images of tree planting locations. Its 26-megapixel camera provides high spatial resolution, which makes the platform well suited to high-precision field mapping. Aerial photography was conducted between 11:00 and 12:00 on August 1, 2024, with the following flight parameters: altitude of 150 m (yielding a ground resolution of approximately 2.56 cm), forward overlap of 75%, side overlap of 65%, and flight speed of 20 m/s. To prevent overfitting during network training, the original dataset was augmented. To improve the quality and efficiency of model training, different attention mechanisms and optimized loss functions were introduced. Specifically, the EIOU loss function was introduced, which comprises three components: IOU loss, distance loss, and aspect loss. It directly minimizes the width and height discrepancies between the target box and the anchor box, leading to faster convergence and more accurate localization. In addition, the Focal-EIOU loss function was adopted to address sample imbalance in bounding-box regression, further improving the model's convergence speed and localization precision.
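
The loss described in the Methods can be illustrated with a minimal sketch. It assumes the commonly published form of the loss, EIOU = (1 - IOU) + centre-distance term + width/height (aspect) term, with Focal-EIOU = IOU^gamma × EIOU. The function names, the (x1, y1, x2, y2) box format, and the default gamma = 0.5 are assumptions made for illustration and are not taken from the paper's implementation.

```python
def eiou_terms(pred, target, eps=1e-9):
    """Return (iou, eiou_loss) for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IOU term: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Distance term: squared centre distance over squared enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    distance = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Aspect term: width and height differences, each normalised separately.
    aspect = (pw - tw) ** 2 / (cw * cw + eps) + (ph - th) ** 2 / (ch * ch + eps)

    return iou, (1.0 - iou) + distance + aspect


def focal_eiou_loss(pred, target, gamma=0.5):
    """Weight the EIOU loss by IOU**gamma (gamma=0.5 is an assumed default)."""
    iou, eiou = eiou_terms(pred, target)
    return (iou ** gamma) * eiou


# Hypothetical boxes: a prediction shifted by one unit in x and y.
loss = focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

In this sketch the IOU^gamma factor shrinks the contribution of poorly overlapping boxes, which is one way to read the "sample imbalance" motivation above. Training frameworks typically compute the same quantities on batched tensors and treat the weight as a constant during backpropagation.
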
[Results and Discussions] After the multi-head self-attention (MHSA) mechanism was introduced, AP@0.5 and AP@0.5:0.95 improved by 1.4% and 1.7%, respectively, and precision and recall also improved. This indicates that MHSA helps the model extract target features and improves detection accuracy against complex backgrounds. Although processing speed decreased slightly after the attention mechanism was added, the model still met the requirements of real-time detection. Four loss functions were compared: CIOU, SIOU, EIOU, and Focal-EIOU. Focal-EIOU yielded significant increases in precision and recall, demonstrating that it can accelerate convergence and improve localization accuracy when handling the sample imbalance problem in small target detection. Finally, an improved model, YOLOv10-MHSA, was proposed, which incorporates the MHSA mechanism, a small target detection layer, and the Focal-EIOU loss function. Ablation experiments showed that adding only the small target detection layer to YOLOv10n increased AP@0.5 and AP@0.5:0.95 by 2.2% and 0.9%, respectively, with marked gains in precision and recall. Further adding MHSA and the Focal-EIOU loss function improved detection substantially: compared with the baseline YOLOv10n, AP@0.5, AP@0.5:0.95, precision (P), and recall (R) improved by 6.6%, 10.0%, 4.1%, and 5.1%, respectively. Although the frame rate (FPS) was reduced, the improved model clearly outperformed the original model in a variety of complex scenes, especially for small targets in densely distributed and occluded scenes.
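
For readers less familiar with the precision and recall figures reported above, the following sketch shows one common way such values are computed for a single image at an IOU threshold of 0.5: detections are matched greedily to ground-truth boxes in descending order of confidence. The matching rule, the function names, and the sample boxes are assumptions for illustration; the paper's evaluation code is not described in this abstract.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, truths, iou_thr=0.5):
    """detections: list of (score, box); truths: list of boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:  # true positive: claims one unmatched ground truth
            matched.add(best_j)
            tp += 1
    fp = len(detections) - tp  # detections with no matching tree pit
    fn = len(truths) - tp      # tree pits that were missed
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Hypothetical image: three tree pits, two found, one false alarm.
truths = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
detections = [(0.9, (1, 1, 10, 10)), (0.8, (21, 20, 30, 30)), (0.3, (70, 70, 80, 80))]
p, r = precision_recall(detections, truths)
```

AP@0.5 summarizes how precision varies with recall as the confidence threshold is swept at this IOU threshold, and AP@0.5:0.95 averages that result over IOU thresholds from 0.5 to 0.95.
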
[Conclusions] By introducing MHSA and the optimized loss function (Focal-EIOU) into the YOLOv10n model, this study significantly improved the accuracy and efficiency of tree planting location detection for the Three North Project in Inner Mongolia. The experimental results show that MHSA enhances the model's ability to extract local and global target information against complex backgrounds and effectively reduces missed and false detections. The Focal-EIOU loss function accelerates convergence and improves localization accuracy by mitigating sample imbalance in the bounding-box regression task. Although processing speed decreased, the proposed method still meets real-time detection requirements and provides strong technical support for scientific afforestation in the Three North Project.
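
The way MHSA lets every position draw on both local and global context can be sketched with generic scaled dot-product multi-head self-attention. This is a dependency-free illustration, not the paper's module: the placement of MHSA inside YOLOv10n, the feature dimensions, and the weight matrices (`wq`, `wk`, `wv`, `wo`) are all assumptions. In a detector, each token would typically correspond to one position of a flattened feature map.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [v / total for v in exps]


def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x: n tokens by d features; wq, wk, wv, wo: d-by-d weight matrices."""
    d = len(x[0])
    assert d % num_heads == 0, "feature size must divide evenly across heads"
    dh = d // num_heads
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)

    heads = []
    for h in range(num_heads):
        lo, hi = h * dh, (h + 1) * dh
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        # Every token scores every other token, so context is global.
        scores = matmul(qh, [list(col) for col in zip(*kh)])
        attn = [softmax([s / math.sqrt(dh) for s in row]) for row in scores]
        heads.append(matmul(attn, vh))

    # Concatenate the heads per token, then apply the output projection.
    concat = [sum((head[i] for head in heads), []) for i in range(len(x))]
    return matmul(concat, wo)


# Hypothetical input: 3 tokens with 4 features each and identity projections.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
tokens = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
out = multi_head_self_attention(tokens, identity, identity, identity, identity, num_heads=2)
```

Because the attention matrix has one entry per pair of tokens, its cost grows quadratically with the number of feature-map positions, which is consistent with the reduced processing speed noted above.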

Key words: tree planting locations, complex background, unmanned aerial vehicle, small target detection, YOLOv10

CLC number: