Accurate Detection of Tree Planting Locations in Inner Mongolia for the Three North Project Based on YOLOv10-MHSA

Smart Agriculture, 2025, Vol. 7, Issue 3: 108-119. doi: 10.12133/j.smartag.SA202410010

Section: Information Processing and Decision Making

XIE Jiyuan¹,², ZHANG Dongyan¹,², NIU Zhen¹,², CHENG Tao¹,², YUAN Feng³, LIU Yaling³

1. College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China
2. Shaanxi Key Laboratory of Agriculture Information Perception and Intelligent Service, Yangling 712100, Shaanxi, China
3. National Grass Technology Innovation Center (Preparation), Hohhot 010021, Inner Mongolia, China

Received: 2024-10-16 Online: 2025-05-30
Foundation items: Science and Technology Program of Inner Mongolia Autonomous Region (2023JBGS000804); Hohhot Science and Technology Innovation Field Talent Program (2023RC-高层次-7); Hohhot Basic Research and Applied Basic Research Program (2024-规-基-34)
About the author: XIE Jiyuan, PhD candidate, research interest: UAV remote sensing. E-mail: JiyuanXie01@163.com
Corresponding author: ZHANG Dongyan, PhD, professor, research interests: smart forestry and grassland science. E-mail: hello-lion@hotmail.com

Abstract

Abstract (Chinese edition):

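[Objective/Significance] On UAV platforms, tree planting locations (tree pits) in the Inner Mongolia section of the Three North Project are easily missed or falsely detected because of complex backgrounds (shrubs, weed clusters, bare sandy soil, undulating terrain, etc.). To address this problem, a small target detection model for this scenario, YOLOv10-MHSA (You Only Look Once version 10-Multi-head Self-Attention), was constructed. [Methods] With YOLOv10 as the baseline model, a hierarchical feature enhancement strategy was adopted, in which cross-layer information compensation improves the completeness of the semantic representation of small targets and therefore the accuracy with which their features are described. The variable convolution kernel AKConv (Adaptive Kernel Convolution) was introduced so that the model focuses more precisely on the features of the input image. A multi-head self-attention mechanism (MHSA) with fused features was built to capture effective features while accounting for complex environmental factors. Focal-EIOU Loss (Focal Efficient Intersection over Union Loss) replaced the original CIOU Loss (Complete Intersection over Union Loss) as the bounding box regression loss, forming a nonlinear optimization strategy that computes bounding box parameters precisely while keeping training stable. Finally, the two factors with the greatest influence on recognition accuracy were selected, and comparative experiments on multi-scale spatial distribution and on gradients of illumination intensity were designed to systematically verify the generalization and robustness of the model in complex scenes. [Results and Discussion] On the experimental dataset, the proposed YOLOv10-MHSA model reached an average recognition precision of 96.1% and a detection accuracy of 92.1% (the P and R values of 0.961 and 0.921 in the ablation results), which are relative improvements of 4.1% and 5.1% over the original model. This meets the accuracy and speed requirements for real-time UAV recognition of tree planting locations (tree pits) in the Inner Mongolia section of the Three North Project. [Conclusions] By introducing a dynamic feature enhancement module, the YOLOv10-MHSA model resolved the detection bottleneck in which the features of small tree planting targets are easily submerged in complex scenes, while maintaining the original detection efficiency. This provides a new method for accurate and rapid remote sensing detection of tree planting locations in the Inner Mongolia section of the Three North Project on UAV platforms.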


Abstract (English edition):

[Objective] Traditional manual field surveys of tree planting locations are inefficient and error-prone, and low-altitude unmanned aerial vehicles (UAVs) have become a preferred means of addressing these problems. To improve the accuracy and efficiency of detecting tree planting locations (tree pits) in the Inner Mongolia section of China's Three North Project, a recognition and detection model based on YOLOv10-MHSA was proposed. [Methods] A long-endurance, multi-purpose vertical take-off and landing (VTOL) fixed-wing UAV was used to collect images of tree planting locations. Equipped with a 26-megapixel camera with high spatial resolution, the UAV was well suited to high-precision field mapping. Aerial photography was conducted between 11:00 and 12:00 on August 1, 2024. The flight parameters were as follows: altitude of 150 m (yielding a ground resolution of approximately 2.56 cm), course overlap rate of 75%, side overlap rate of 65%, and flight speed of 20 m/s. To prevent overfitting during network training, the original dataset was augmented. To improve the quality and efficiency of model training, different attention mechanisms and optimized loss functions were introduced. Specifically, the EIOU loss function was introduced, which comprises three components: IOU loss, distance loss, and aspect (width and height) loss. This function directly minimizes the width and height discrepancies between the target box and the anchor box, leading to faster convergence and more accurate localization. In addition, the Focal-EIOU loss function was adopted to address sample imbalance in bounding box regression, further improving the convergence speed and localization precision of the model.
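The three EIOU components named above can be written out for a single pair of boxes. The following is a minimal sketch, not the authors' training code: it assumes boxes in `(x1, y1, x2, y2)` corner format, uses the commonly published EIoU form (IoU term, centre-distance term, width and height terms, each normalised by the smallest enclosing box), and assumes a focal exponent `gamma=0.5`, which the abstract does not state. The function names `eiou_loss` and `focal_eiou_loss` are illustrative.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Return (EIoU loss, IoU) for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU term: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Distance term: squared centre offset over squared enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    dist = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Aspect term: width gap and height gap, each normalised separately.
    aspect = (pw - tw) ** 2 / (cw * cw + eps) + (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + dist + aspect, iou


def focal_eiou_loss(pred, target, gamma=0.5):
    """Focal-EIoU: weight the EIoU loss by IoU**gamma (gamma is an assumption)."""
    loss, iou = eiou_loss(pred, target)
    return (iou ** gamma) * loss


if __name__ == "__main__":
    box_pred, box_true = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print(eiou_loss(box_pred, box_true)[0], focal_eiou_loss(box_pred, box_true))
```

In this form the focal weight scales down the contribution of poorly overlapping boxes, which is one way to read the abstract's statement that Focal-EIOU addresses sample imbalance in bounding box regression.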
[Results and Discussions] After the multi-head self-attention mechanism (MHSA) was introduced, AP@0.5 and AP@0.5:0.95 improved by 1.4% and 1.7%, respectively, relative to the baseline, and precision and recall also improved. This indicates that MHSA helps the model extract the feature information of the target and improves detection accuracy against complex backgrounds. Although the processing speed of the model decreased slightly after the attention mechanism was added (from 134 to 130 f/s), it still met the requirements of real-time detection. Four loss functions were compared: CIOU, SIOU, EIOU and Focal-EIOU. Focal-EIOU increased both precision and recall over CIOU (P from 0.923 to 0.938, R from 0.876 to 0.885), which suggests that it can accelerate convergence and improve localization accuracy when dealing with sample imbalance in small target detection. Finally, an improved model, YOLOv10-MHSA, was proposed, incorporating the MHSA attention mechanism, a small target detection layer and the Focal-EIOU loss function. The ablation experiments showed that adding only the small target detection layer to YOLOv10n increased AP@0.5 and AP@0.5:0.95 by 2.2% and 0.9%, respectively, and also improved precision and recall. When MHSA and the Focal-EIOU loss function were further added, detection performance improved markedly. Compared with the baseline model YOLOv10n, AP@0.5, AP@0.5:0.95, precision (P) and recall (R) improved by 6.6%, 10.0%, 4.1% and 5.1%, respectively (relative to the baseline values). Although the FPS was reduced (from 134 to 109 f/s), the improved model detected targets better than the original model in a range of complex scenes, especially small targets in densely distributed and occluded scenes.
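To make the attention step concrete, the sketch below runs multi-head self-attention over a short sequence of feature vectors, such as the positions of a flattened feature map. It is an illustration under stated assumptions, not the module used in YOLOv10-MHSA: the projection matrices `wq`, `wk`, `wv`, `wo` are random placeholders, the sizes are toy values, and positional encoding and normalisation layers are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x holds N tokens with D channels each; returns an N x D output."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "channels must divide evenly across heads"
    d_head = d_model // num_heads
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)

    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        # Scaled dot-product scores between every pair of tokens.
        kt = [list(col) for col in zip(*kh)]
        scores = matmul(qh, kt)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        # Each token becomes a weighted mix of all value vectors in this head.
        head_outputs.append(matmul(weights, vh))

    # Concatenate heads along the channel axis, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, wo)


def random_matrix(rows, cols, rng):
    """Placeholder weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = random_matrix(6, 8, rng)  # 6 positions, 8 channels (toy sizes)
    wq, wk, wv, wo = (random_matrix(8, 8, rng) for _ in range(4))
    out = multi_head_self_attention(tokens, wq, wk, wv, wo, num_heads=2)
    print(len(out), len(out[0]))
```

Because every position attends to every other position, a small tree pit can draw on context from the whole feature map, which is consistent with the abstract's claim that MHSA helps against complex backgrounds. The pairwise score matrix is also where the extra computation, and the reported FPS drop, would come from.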
[Conclusions] By introducing MHSA and the optimized Focal-EIOU loss function into the YOLOv10n model, this research improved the accuracy of tree planting location detection for the Three North Project in Inner Mongolia. The experimental results show that MHSA enhances the ability of the model to extract local and global information about the target against complex backgrounds and effectively reduces missed and false detections. The Focal-EIOU loss function accelerates convergence and improves localization accuracy by mitigating the sample imbalance problem in bounding box regression. Although the processing speed of the model decreased, the proposed method still meets real-time detection requirements and provides technical support for scientific afforestation in the Three North Project.
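Missed and false detections map directly onto recall and precision. The sketch below shows one common way to count them for a single image: predictions are matched greedily to ground-truth boxes in descending score order at an IoU threshold of 0.5 (the threshold behind AP@0.5). The matching rule, the box format `(x1, y1, x2, y2)`, and the example coordinates are assumptions for illustration; the abstract does not describe the evaluation code.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """preds: list of (score, box); truths: list of boxes."""
    matched = set()
    tp = 0
    # Higher-confidence predictions get first pick of the ground-truth boxes.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # false detections
    fn = len(truths) - tp  # missed tree pits
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    pits = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
    detections = [
        (0.9, (1, 1, 10, 10)),    # overlaps the first pit
        (0.8, (21, 20, 31, 30)),  # overlaps the second pit
        (0.6, (60, 60, 70, 70)),  # background, counted as a false detection
    ]
    print(precision_recall(detections, pits))
```

In the toy example two of three pits are found and one of three detections is spurious, so both precision and recall equal 2/3.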

Key words: tree planting locations, complex background, unmanned aerial vehicle, small target detection, YOLOv10



XIE Jiyuan, ZHANG Dongyan, NIU Zhen, CHENG Tao, YUAN Feng, LIU Yaling. Accurate Detection of Tree Planting Locations in Inner Mongolia for The Three North Project Based on YOLOv10-MHSA[J]. Smart Agriculture, 2025, 7(3): 108-119.

Figures/Tables (15): Figures 1-10 and Tables 1-5.


Model	Network depth	Network width	AP@0.5	AP@0.5:0.95	P	R	Parameters/M
YOLOv10n	0.33	0.25	0.921	0.761	0.923	0.876	5.3
YOLOv10s	0.33	0.50	0.933	0.796	0.938	0.881	11.2
YOLOv10m	0.67	0.75	0.939	0.846	0.947	0.886	31.3
YOLOv10b	1.00	1.00	0.951	0.854	0.956	0.894	56.8

Model	AP@0.5	AP@0.5:0.95	P	R	FPS/(f/s)
YOLOv10n	0.921	0.761	0.923	0.876	134
+SA	0.925	0.752	0.931	0.887	132
+EMSA	0.931	0.763	0.942	0.854	126
+MHSA	0.934	0.774	0.938	0.886	130

Loss function	AP@0.5	AP@0.5:0.95	P	R	FPS/(f/s)
CIOU	0.921	0.761	0.923	0.876	134
SIOU	0.921	0.757	0.932	0.892	132
EIOU	0.928	0.762	0.941	0.871	124
Focal-EIOU	0.931	0.776	0.938	0.885	128

Model	AP@0.5	AP@0.5:0.95	P	R	FPS/(f/s)
YOLOv10n	0.921	0.761	0.923	0.876	134
+ Small target detection layer	0.941	0.768	0.932	0.882	119
+ AKConv	0.946	0.784	0.937	0.897	124
+ MHSA	0.934	0.774	0.938	0.886	130
+ Focal-EIOU Loss	0.931	0.776	0.938	0.885	128
YOLOv10-MHSA	0.982	0.837	0.961	0.921	109
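The percentage gains quoted in the abstract (6.6%, 10.0%, 4.1% and 5.1%) are consistent with relative changes between the YOLOv10n row and the YOLOv10-MHSA row of the ablation table above, not with absolute differences in percentage points. The short check below recomputes both readings from the table values; the helper names are illustrative.

```python
# Values copied from the YOLOv10n and YOLOv10-MHSA rows of the ablation table.
baseline = {"AP@0.5": 0.921, "AP@0.5:0.95": 0.761, "P": 0.923, "R": 0.876, "FPS": 134}
improved = {"AP@0.5": 0.982, "AP@0.5:0.95": 0.837, "P": 0.961, "R": 0.921, "FPS": 109}


def relative_change(old, new):
    """Change expressed as a percentage of the baseline value."""
    return (new - old) / old * 100.0


def absolute_points(old, new):
    """Change in percentage points for metrics reported on a 0-1 scale."""
    return (new - old) * 100.0


changes = {m: round(relative_change(baseline[m], improved[m]), 1) for m in baseline}
points = {m: round(absolute_points(baseline[m], improved[m]), 1)
          for m in baseline if m != "FPS"}

if __name__ == "__main__":
    for metric, value in changes.items():
        print(f"{metric}: {value:+.1f}% relative")
    for metric, value in points.items():
        print(f"{metric}: {value:+.1f} percentage points")
```

The relative changes come out as +6.6%, +10.0%, +4.1% and +5.1% for AP@0.5, AP@0.5:0.95, P and R, matching the abstract, while the frame rate falls by 18.7% (134 to 109 f/s). In percentage points the same gains are 6.1, 7.6, 3.8 and 4.5.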

Model	AP@0.5	AP@0.5:0.95	P	R	FPS/(f/s)
YOLOv5s	0.897	0.698	0.841	0.812	138
YOLOv8n	0.915	0.734	0.867	0.795	121
YOLOv10n	0.921	0.761	0.923	0.876	134
SSD	0.784	0.624	0.792	0.743	67
Faster-R-CNN	0.837	0.703	0.823	0.802	58
YOLOv10-MHSA	0.982	0.837	0.961	0.921	109
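One way to read the comparison table above is as an accuracy versus speed trade-off. The sketch below lists the models that are not dominated, meaning no other model is at least as good on both AP@0.5 and FPS and strictly better on one. The values are copied from the table; choosing AP@0.5 and FPS as the two axes is an assumption made for illustration.

```python
# (AP@0.5, FPS) pairs copied from the model comparison table.
models = {
    "YOLOv5s": (0.897, 138),
    "YOLOv8n": (0.915, 121),
    "YOLOv10n": (0.921, 134),
    "SSD": (0.784, 67),
    "Faster-R-CNN": (0.837, 58),
    "YOLOv10-MHSA": (0.982, 109),
}


def pareto_front(entries):
    """Return names of models that no other model beats on both axes."""
    front = []
    for name, (ap, fps) in entries.items():
        dominated = any(
            o_ap >= ap and o_fps >= fps and (o_ap > ap or o_fps > fps)
            for other, (o_ap, o_fps) in entries.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    print(pareto_front(models))
```

With these numbers the non-dominated set is YOLOv5s (fastest), YOLOv10n, and YOLOv10-MHSA (highest AP@0.5). YOLOv8n, SSD and Faster-R-CNN are each matched or beaten on both axes by YOLOv10n.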
