
Smart Agriculture


A Lightweight Model for Detecting Small Targets of Litchi Pests Based on Improved YOLOv10n

LI Zusheng1,2(), TANG Jishen2, KUANG Yingchun1()

  1. College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, Hunan, China
    2. School of Big Data and Computer, Hechi University, Hechi 546300, Guangxi, China
  • Received: 2024-12-02 Online: 2025-01-24
  • Foundation items:
    The National Natural Science Foundation of China (61972147)
  • About author:

    LI Zusheng, research interests: agricultural informatization and object detection. E-mail:

  • Corresponding author:
    KUANG Yingchun, Ph.D., Professor, research interests: intelligent agriculture and artificial intelligence. E-mail:

A Lightweight Model for Detecting Small Targets of Litchi Pests Based on Improved YOLOv10n

LI Zusheng1,2(), TANG Jishen2, KUANG Yingchun1()   

  1. College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
    2. School of Big Data and Computer, Hechi University, Hechi 546300, China
  • Received: 2024-12-02 Online: 2025-01-24
  • Foundation items: The National Natural Science Foundation of China (61972147)
  • About author:

    LI Zusheng, E-mail:

  • Corresponding author:
    KUANG Yingchun, E-mail:

Abstract:

[Objective/Significance] Accurate identification of litchi pests helps implement effective control strategies and promotes sustainable agricultural development. To improve the efficiency of litchi pest identification, this study proposed YOLO-LP (YOLO-Litchi Pests), a lightweight object detection model based on an improved YOLOv10n. [Methods] First, the C2f module of the backbone network (Backbone) was optimized: the C2f_GLSA module was constructed with the Global-to-Local Spatial Aggregation (GLSA) module to focus efficiently on small targets and strengthen the distinction between targets and background, while reducing the number of parameters and the computational cost. Second, the Frequency-Aware Feature Fusion module (FreqFusion) was introduced into the neck network (Neck) and the Frequency-Aware Path Aggregation Network (FreqPANet) was designed, which effectively alleviated blurred and shifted target boundaries and further lightened the model. Finally, the SCYLLA-IoU (SIoU) loss function replaced the Complete-IoU (CIoU) loss function to improve target localization accuracy and accelerate training convergence. To evaluate model performance, a self-built small-target litchi pest dataset covering four scenarios in natural and laboratory environments was constructed and tested. [Results and Discussions] YOLO-LP reached 90.9%, 62.2%, and 59.5% for AP50, AP50:95, and AP-Small50:95, respectively, 1.9, 1.0, and 1.2 percentage points higher than the baseline model, while the number of parameters and the computational cost were reduced by 13% and 17%, respectively. [Conclusions] YOLO-LP is superior in both accuracy and lightweight design and provides an effective reference for the practical application of litchi pest detection.
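For context, the SIoU loss that replaces CIoU in the last improvement combines an IoU term with angle, distance, and shape costs. The sketch below gives its commonly cited formulation (Gevorgyan, 2022); the exact settings used in YOLO-LP (for example the shape-cost exponent θ) are not stated in this abstract, so the values shown are the usual defaults, not the authors' reported configuration.

```latex
% Sketch of the SIoU box-regression loss in its commonly cited form (Gevorgyan, 2022).
% (b_cx, b_cy, w, h): predicted box; superscript gt: ground truth;
% (C_w, C_h): width/height of the smallest box enclosing both boxes; theta ~ 4.
\begin{align*}
\sigma &= \sqrt{(b_{cx}^{gt}-b_{cx})^2 + (b_{cy}^{gt}-b_{cy})^2}, \qquad
\Lambda = 1 - 2\sin^2\!\Big(\arcsin\frac{|b_{cy}^{gt}-b_{cy}|}{\sigma} - \frac{\pi}{4}\Big) \\
\Delta &= \sum_{t \in \{x,\,y\}} \Big(1 - e^{-(2-\Lambda)\,\rho_t}\Big), \qquad
\rho_x = \Big(\frac{b_{cx}^{gt}-b_{cx}}{C_w}\Big)^{\!2}, \quad
\rho_y = \Big(\frac{b_{cy}^{gt}-b_{cy}}{C_h}\Big)^{\!2} \\
\Omega &= \sum_{t \in \{w,\,h\}} \big(1 - e^{-\omega_t}\big)^{\theta}, \qquad
\omega_w = \frac{|w - w^{gt}|}{\max(w,\,w^{gt})}, \quad
\omega_h = \frac{|h - h^{gt}|}{\max(h,\,h^{gt})} \\
L_{\mathrm{SIoU}} &= 1 - \mathrm{IoU} + \frac{\Delta + \Omega}{2}
\end{align*}
```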

Keywords: litchi, pest detection, multi-scenario, small targets, YOLOv10n, feature fusion, attention mechanism

Abstract:

[Objective] Accurate identification of litchi pests is crucial for implementing effective control strategies and promoting sustainable agricultural development. However, litchi pest detection involves a high proportion of small targets, which challenges object detection models in terms of both accuracy and parameter count and limits their application in real-world production environments. To improve the identification efficiency of litchi pests, this study proposed a lightweight object detection model, YOLO-LP (YOLO-Litchi Pests), based on YOLOv10n. The model aimed to enhance the detection accuracy of small litchi pest targets in multiple scenarios by optimizing the network structure and loss function, while reducing the number of parameters and the computational cost. [Methods] Images of two classes of litchi pests (cocoon and gall) were collected in natural scenarios (sunny, cloudy, and post-rain) and a laboratory environment as the modeling dataset. The original data were expanded through random scaling, random translation, random brightness adjustment, random contrast variation, and Gaussian blurring to balance the category samples and enhance the robustness of the model, producing a richer dataset named the CG (cocoon and gall) dataset. The YOLO-LP model was constructed with the following three improvements. (1) The C2f module of the backbone network (Backbone) in YOLOv10n was optimized: the C2f_GLSA module was constructed with the Global-to-Local Spatial Aggregation (GLSA) module to focus on small targets and enhance the differentiation between targets and background, while reducing the number of parameters and the computation. (2) The Frequency-Aware Feature Fusion module (FreqFusion) was introduced into the neck network (Neck) of YOLOv10n and a Frequency-Aware Path Aggregation Network (FreqPANet) was designed to reduce model complexity and address blurred and shifted target boundaries. (3) The SCYLLA-IoU (SIoU) loss function replaced the Complete-IoU (CIoU) loss function of the baseline model to improve target localization accuracy and accelerate the convergence of training. [Results and Discussions] YOLO-LP achieved 90.9%, 62.2%, and 59.5% for AP50, AP50:95, and AP-Small50:95 on the CG dataset, respectively, 1.9, 1.0, and 1.2 percentage points higher than the baseline model. The number of parameters and the computational cost were reduced by 13% and 17%, respectively. These results indicated that YOLO-LP combined high accuracy with a lightweight design. Comparison experiments with different attention mechanisms validated the effectiveness of the GLSA module: with the GLSA module added to the baseline model, AP50, AP50:95, and AP-Small50:95 reached the highest values on the CG dataset, at 90.4%, 62.0%, and 59.5%, respectively. Experiments comparing different loss functions showed that the SIoU loss function provided better fitting and faster convergence on the CG dataset. Ablation tests confirmed the validity of each improvement, and every combination of the three improvements performed significantly better than the baseline model. Performance was optimal when all three improvements were applied simultaneously.
Compared to several mainstream models, YOLO-LP exhibited the best overall performance, with a model size of only 5.1 MB, 1.97 million parameters, and a computational cost of 5.4 GFLOPs. Compared to the baseline model, the detection performance of YOLO-LP improved markedly across the four scenarios. In the sunny scenario, AP50, AP50:95, and AP-Small50:95 increased by 1.9, 1.0, and 2.0 percentage points, respectively. In the cloudy scenario, they increased by 2.5, 1.3, and 1.3 percentage points, respectively. In the post-rain scenario, they increased by 2.0, 2.4, and 2.4 percentage points, respectively. In the laboratory scenario, only AP50 increased over the baseline model, by 0.7 percentage points. These findings indicated that YOLO-LP achieved higher accuracy and stronger robustness in multi-scenario small target detection of litchi pests. [Conclusions] The proposed YOLO-LP model improved detection accuracy while effectively reducing the number of parameters and the computational cost. It performed well in small target detection of litchi pests and demonstrated strong robustness across different scenarios. These improvements made the model more suitable for deployment on resource-constrained mobile and edge devices. The model provided a valuable technical reference for small target detection of litchi pests in various scenarios.
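As an illustration of the dataset construction step described in the Methods, the sketch below shows one way to reproduce the listed augmentations (random scaling, translation, brightness and contrast jitter, and Gaussian blurring) with the Albumentations library while keeping YOLO-format boxes consistent. The parameter ranges, probabilities, file name, and class labels are illustrative assumptions, not values reported by the authors.

```python
# Minimal augmentation sketch for the CG dataset as described in the abstract.
# Parameter ranges and probabilities below are illustrative assumptions.
import albumentations as A
import cv2

augment = A.Compose(
    [
        A.RandomScale(scale_limit=0.2, p=0.5),                  # random scaling
        A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.0,
                           rotate_limit=0, p=0.5),              # random translation (panning)
        A.RandomBrightnessContrast(brightness_limit=0.2,
                                   contrast_limit=0.2, p=0.5),  # brightness/contrast jitter
        A.GaussianBlur(blur_limit=(3, 7), p=0.3),               # Gaussian blurring
    ],
    # Keep YOLO-format bounding boxes aligned with the transformed image.
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Hypothetical sample: one small "gall" target in normalized YOLO coordinates.
image = cv2.cvtColor(cv2.imread("litchi_sample.jpg"), cv2.COLOR_BGR2RGB)
boxes = [[0.52, 0.47, 0.06, 0.08]]   # (x_center, y_center, width, height)
labels = ["gall"]                    # CG dataset classes: "cocoon", "gall"

out = augment(image=image, bboxes=boxes, class_labels=labels)
aug_image, aug_boxes = out["image"], out["bboxes"]
```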

Key words: litchi, pest detection, multi-scenario, small targets, YOLOv10n, feature fusion, attention mechanism
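To make improvement (1) of the abstract more concrete, the sketch below shows how a global-to-local attention block could be inserted into a C2f-style split/bottleneck/concat module in PyTorch. The GLSA internals here (a global channel-attention branch fused with a local depthwise-convolution branch) are a simplified stand-in for the general "global-to-local spatial aggregation" idea, not the authors' C2f_GLSA implementation, and the channel widths are arbitrary.

```python
# Schematic sketch of a C2f-style block refined by a simplified GLSA-like
# attention. Illustrative only; not the authors' C2f_GLSA module.
import torch
import torch.nn as nn


class SimpleGLSA(nn.Module):
    """Fuse a global (channel-attention) branch with a local (depthwise conv) branch."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.global_branch = nn.Sequential(            # global context -> channel weights
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.local_branch = nn.Sequential(              # local texture -> spatial weights
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, 1, 1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = x * self.global_branch(x)   # channel-wise (global) reweighting
        l = x * self.local_branch(x)    # spatial (local) reweighting
        return self.fuse(g + l) + x     # aggregate both branches, keep a residual path


class C2fGLSA(nn.Module):
    """C2f-style block whose concatenated features are refined by SimpleGLSA."""

    def __init__(self, c_in: int, c_out: int, n: int = 1):
        super().__init__()
        c_hidden = c_out // 2
        self.cv1 = nn.Conv2d(c_in, 2 * c_hidden, 1)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c_hidden, c_hidden, 3, padding=1), nn.SiLU())
            for _ in range(n)
        )
        self.attn = SimpleGLSA((2 + n) * c_hidden)
        self.cv2 = nn.Conv2d((2 + n) * c_hidden, c_out, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = list(self.cv1(x).chunk(2, dim=1))
        for block in self.blocks:
            y.append(block(y[-1]))                      # cascade bottleneck outputs
        return self.cv2(self.attn(torch.cat(y, dim=1)))


if __name__ == "__main__":
    block = C2fGLSA(64, 64, n=2)
    print(block(torch.randn(1, 64, 80, 80)).shape)      # torch.Size([1, 64, 80, 80])
```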

CLC number: