
Smart Agriculture


基于YOLO与扩散模型的冠层环境灰茶尺蠖幼虫检测方法

罗学论1, GOUDA Mostafa1,2, 宋馨蓓1, 胡妍1, 张文凯1, 何勇1, 张瑾3,4, 李晓丽1

    1. 浙江大学 生物系统工程与食品科学学院,浙江 杭州 310058,中国
    2. 国家研究中心营养与食品科学部,吉萨省杜基区 12622,埃及
    3. 茶树种质创新与资源利用全国重点实验室,浙江 杭州 310008,中国
    4. 中国农业科学院茶叶研究所,浙江 杭州 310008,中国
  • 收稿日期:2025-05-23 出版日期:2025-08-13
  • 基金项目:
    国家自然科学基金(32171889); 浙江省科技计划项目“尖兵”“领雁”研发攻关计划(2023C02043;2022C02044;2023C020009)
  • 作者简介:

    罗学论,博士,研究方向为光谱数据挖掘、茶叶信息感知等。E-mail:

  • 通信作者:
    张 瑾,副研究员,研究方向为茶树抗性育种。E-mail:
    李晓丽,教授,研究方向为农业遥感、茶叶信息感知。E-mail:

Detection Method of Tea Geometrid Larvae in Canopy Environments Based on YOLO and Diffusion Models

LUO Xuelun1, GOUDA Mostafa1,2, SONG Xinbei1, HU Yan1, ZHANG Wenkai1, HE Yong1, ZHANG Jin3,4, LI Xiaoli1

    1. College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China
    2. Department of Nutrition & Food Science, National Research Centre, Dokki 12622, Egypt
    3. State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Ministry of Agriculture, Hangzhou 310008, China
    4. Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310008, China
  • Received: 2025-05-23 Online: 2025-08-13
  • Foundation items: The National Natural Science Foundation of China (32171889); The Key R&D Projects in Zhejiang Province (2023C02043; 2022C02044; 2023C020009)
  • About author:

    LUO Xuelun, E-mail:

  • Corresponding author:
    ZHANG Jin, E-mail: ;
    LI Xiaoli, E-mail:

摘要:

【目的/意义】 灰茶尺蠖(Ectropis grisescens)幼虫对茶树的危害是当前茶叶生产面临的主要生物胁迫之一,实现其早期精准检测具有重要的生产实践意义。 【方法】 本研究提出了一种融合可控扩散模型与目标检测深度学习框架的高效识别方法,用于茶树冠层中灰茶尺蠖幼虫四个龄期的实时检测。研究构建了三级检测体系:全龄期检测(1—4龄)、虫龄段检测(1—2龄与3—4龄分组),以及精准龄期检测(各龄期独立识别)。研究引入可控扩散模型,并创新性地提出数据集优化与高质量图像筛选策略,旨在提升YOLO系列模型(YOLOv8、v9、v10、v11)在灰茶尺蠖数据集上的检测性能。[结果与讨论] 全龄期检测中YOLO系列模型的最佳平均mAP50达到0.904,虫龄段检测的最佳平均mAP50为0.862,精准龄期检测的最佳平均mAP50为0.697。值得注意的是,可控扩散模型的引入使YOLO系列模型的性能获得普遍提升,其中YOLOv10在三类检测任务中的提升最为显著(配对t检验,p<0.05),平均mAP50从0.811提升至0.821。综合比较发现,YOLOv9在灰茶尺蠖检测中表现最优,其三类检测任务的平均mAP50达0.826,F1值为0.767。 【结论】 本研究证实,基于可控扩散模型与深度学习相结合的创新方法,能够有效实现田间灰茶尺蠖幼虫各龄期的精准识别,为茶园灰茶尺蠖智能监测提供了可靠的理论基础和技术支撑。

关键词: 灰茶尺蠖幼虫, 目标检测, 茶冠层环境, YOLO, 扩散模型

Abstract:

[Objective] Tea has become one of the most important economic crops globally, driven by the growing popularity of tea-based beverages. However, tea production is increasingly threatened by biotic stressors, among which Ectropis grisescens stands out as a major defoliating pest. The larvae of this moth feed on tea leaves, causing substantial damage that reduces yield and lowers crop quality. Manual detection methods are time-consuming and labor-intensive, and they suffer from low efficiency, high cost, and considerable subjectivity. Intelligent, accurate, and automated early detection of tea geometrid larvae is therefore of vital significance: it could strengthen pest management, reduce economic losses, and promote sustainable tea cultivation. [Methods] A recognition framework that combines a controllable diffusion model with YOLO object detectors was proposed to achieve real-time, fine-grained identification of E. grisescens larvae at four instar stages in complex tea canopy environments. To capture the morphological variation across developmental stages, a hierarchical three-level detection system was designed, consisting of: (1) full-instar detection covering all instars from the 1st to the 4th, (2) grouped-stage detection that classified larvae into early (1st–2nd) and late (3rd–4th) instar stages, and (3) fine-grained detection targeting each instar separately. Because field-based entomological image datasets are commonly limited, imbalanced, and noisy, a semi-automated dataset optimization strategy was introduced to enhance data quality and improve class representation.
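The three detection levels described above can be read as three labelings of the same annotated larvae. The following is a minimal sketch under that assumption; the class names (`instar_1`, `early`, `larva`, and so on) and the `relabel` helper are hypothetical and are not taken from the study.

```python
# Hypothetical class names: the source does not specify its annotation scheme.
FINE_LABELS = ("instar_1", "instar_2", "instar_3", "instar_4")

# Grouped-stage level: 1st-2nd instars are "early", 3rd-4th instars are "late".
GROUPED = {
    "instar_1": "early",
    "instar_2": "early",
    "instar_3": "late",
    "instar_4": "late",
}


def relabel(fine_label: str, level: str) -> str:
    """Map a per-instar label to the class used at a given detection level."""
    if fine_label not in GROUPED:
        raise ValueError(f"unknown instar label: {fine_label!r}")
    if level == "full":
        return "larva"  # one class covering all four instars
    if level == "grouped":
        return GROUPED[fine_label]
    if level == "fine":
        return fine_label
    raise ValueError(f"unknown detection level: {level!r}")


if __name__ == "__main__":
    for label in FINE_LABELS:
        print(label, relabel(label, "full"), relabel(label, "grouped"))
```

Deriving the coarser levels from the per-instar labels means a single annotation pass could serve all three tasks, although the study may have organized its labels differently.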
Building upon this refined dataset, a controllable diffusion model was employed to generate a large number of high-resolution, labeled synthetic images that emulated the real-world appearance of tea geometrid larvae under diverse environmental conditions. To ensure the reliability and utility of the generated data, a new high-quality image filtering strategy was developed that automatically evaluated the synthetic images and selected those containing accurate, detailed, and visually realistic larval instances. The filtered synthetic images were then integrated into the real training dataset, augmenting the data and improving the diversity and balance of training samples. This data augmentation pipeline was designed to improve the detection performance of multiple YOLO-series models (YOLOv8, YOLOv9, YOLOv10, and YOLOv11). [Results and Discussions] The YOLO-series models performed strongly and consistently across the detection tasks involving E. grisescens larvae. In the full-instar detection task, which targeted all larval stages from the 1st to the 4th instar, the best average mAP@50 was 0.904. In the grouped-stage detection task, where larvae were classified into early (1st–2nd) and late (3rd–4th) instar groups, the best average mAP@50 was 0.862. For the more challenging fine-grained detection task, which required the model to discriminate each instar independently, the best average mAP@50 was 0.697, demonstrating that stage-level classification is feasible despite subtle morphological differences between instars.
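The "50" in mAP@50 refers to an intersection-over-union (IoU) threshold of 0.5: a predicted box is counted as matching a ground-truth box only when their IoU reaches that threshold. The sketch below illustrates this matching criterion only, not the full mAP computation; the `(x_min, y_min, x_max, y_max)` box format and the function names are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the predicted box overlaps the ground truth enough for mAP@50."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    print(matches_at_50((0, 0, 10, 10), (0, 0, 10, 5)))
    print(matches_at_50((0, 0, 10, 10), (20, 20, 30, 30)))
```

Small targets such as early-instar larvae occupy few pixels, so a shift of a few pixels can change the IoU considerably, which is one plausible reason the fine-grained task scores lower.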
The proposed semi-automated data optimization strategy improved performance, particularly for YOLOv8, whose mAP@50 rose by 0.024, 0.027, and 0.022 in the full-instar, grouped-stage, and fine-grained detection tasks, respectively. These gains indicate that the dataset refinement process was effective in addressing data imbalance and noise. Furthermore, incorporating the controllable diffusion model improved performance across all YOLO variants. YOLOv10 showed the largest gain among the evaluated models, with its average mAP@50 across the three detection tasks increasing from 0.811 to 0.821. This improvement was statistically significant (paired t-test, p < 0.05), suggesting that the synthetic images generated by the diffusion model enriched the training data and improved model generalization. Among all evaluated models, YOLOv9 achieved the best overall performance in detecting E. grisescens larvae. It attained top mAP@50 scores of 0.909, 0.869, and 0.702 in the full-instar, grouped-stage, and fine-grained detection tasks, respectively. Averaged across the three tasks, YOLOv9 reached a mean mAP@50 of 0.826 and a macro F1-score of 0.767, indicating a good balance between precision and recall. [Conclusion] This study demonstrated that integrating a controllable diffusion model with deep learning enabled accurate field-level instar detection of E. grisescens larvae, providing a reliable theoretical and technical foundation for intelligent pest monitoring in tea plantations.
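The abstract reports a paired t-test but does not say which measurements were paired. As a minimal sketch, assuming the pairs are per-task mAP@50 values for one model trained without and with the synthetic images, the test statistic can be computed as below. The numbers are placeholders for illustration only and are not results from the study; the function name is likewise hypothetical.

```python
import math
import statistics
from typing import Sequence


def paired_t_statistic(before: Sequence[float], after: Sequence[float]) -> float:
    """t statistic for paired samples: mean difference over its standard error."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Placeholder values for illustration only; NOT results from the study.
baseline = [0.80, 0.70, 0.60]
augmented = [0.81, 0.72, 0.63]

t_stat = paired_t_statistic(baseline, augmented)
degrees_of_freedom = len(baseline) - 1

if __name__ == "__main__":
    print(f"t = {t_stat:.3f}, df = {degrees_of_freedom}")
```

The p-value follows from comparing the statistic with a t distribution with n - 1 degrees of freedom; `scipy.stats.ttest_rel` returns both the statistic and the p-value if SciPy is available.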

Key words: Ectropis grisescens larvae, object detection, tea canopy environments, YOLO, diffusion models

中图分类号: