Smart Agriculture

   

Detection Method of Tea Geometrid Larvae in Canopy Environments Based on YOLO and Diffusion Models

LUO Xuelun1, GOUDA Mostafa1,2, SONG Xinbei1, HU Yan1, ZHANG Wenkai1, HE Yong1, ZHANG Jin3,4, LI Xiaoli1

    1. College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China
    2. Department of Nutrition & Food Science, National Research Centre, Dokki 12622, Egypt
    3. State Key Laboratory of Tea Plant Germplasm Innovation and Resource Utilization, Ministry of Agriculture, Hangzhou 310008, China
    4. Tea Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310008, China
  • Received: 2025-05-23 Online: 2025-08-13
  • Foundation items: The National Natural Science Foundation of China (32171889); The Key R&D Projects in Zhejiang Province (2023C02043; 2022C02044; 2023C020009)
  • About author:

    LUO Xuelun, E-mail:

  • corresponding author:
    ZHANG Jin, E-mail: ;
    LI Xiaoli, E-mail:

Abstract:

[Objective] Tea has become one of the most important economic crops globally, driven by the growing popularity of tea-based beverages. However, tea production is increasingly threatened by biotic stressors, among which the tea geometrid Ectropis grisescens stands out as a major defoliating pest. The larvae of this moth cause substantial damage by feeding on tea leaves, reducing yield and lowering crop quality. Manual detection methods are time-consuming, labor-intensive, costly, and subjective. Developing intelligent, accurate, and automated early detection techniques for tea geometrid larvae is therefore important for improving pest management, reducing economic losses, and promoting sustainable tea cultivation.
[Methods] A recognition framework was proposed for real-time, fine-grained identification of E. grisescens larvae at four instar stages in complex tea canopy environments. To capture the morphological differences across developmental stages, a hierarchical three-level detection system was designed: (1) full-instar detection, covering all instars from the 1st to the 4th; (2) grouped-stage detection, classifying larvae into early (1st–2nd) and late (3rd–4th) instar stages; and (3) fine-grained detection, targeting each instar separately. Because field-based entomological image datasets are commonly limited, imbalanced, and noisy, a semi-automated dataset optimization strategy was introduced to enhance data quality and improve class representation.
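The three detection levels described above can be viewed as three label sets derived from a single instar annotation. The following is a minimal sketch of that mapping, not the authors' implementation: the class names (`larva`, `early`, `late`, `instar_1` to `instar_4`) and the function `labels_for_instar` are hypothetical, since the abstract does not specify the label schema.

```python
# Hypothetical class names; the source does not specify its label schema.
FINE_LABELS = {1: "instar_1", 2: "instar_2", 3: "instar_3", 4: "instar_4"}


def labels_for_instar(instar: int) -> dict:
    """Return the class label one annotated larva would receive at each level."""
    if instar not in FINE_LABELS:
        raise ValueError(f"instar must be 1-4, got {instar!r}")
    return {
        # Level 1: full-instar detection, a single class for all larvae.
        "full": "larva",
        # Level 2: grouped-stage detection, early (1st-2nd) vs late (3rd-4th).
        "grouped": "early" if instar <= 2 else "late",
        # Level 3: fine-grained detection, one class per instar.
        "fine": FINE_LABELS[instar],
    }


if __name__ == "__main__":
    for instar in range(1, 5):
        print(instar, labels_for_instar(instar))
```

Under this assumption, one set of bounding-box annotations with instar numbers would be enough to produce the training labels for all three tasks.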
Building on this refined dataset, a controllable diffusion model was employed to generate a large number of high-resolution, labeled synthetic images emulating the real-world appearance of tea geometrid larvae under diverse environmental conditions. To ensure the reliability and utility of the generated data, a novel high-quality image filtering strategy was developed that automatically evaluated the images and retained those containing accurate, detailed, and visually realistic larval instances. The filtered synthetic images were then merged into the real training dataset, augmenting it and improving the diversity and balance of the training samples. This data augmentation pipeline improved the detection performance of multiple YOLO-series models (YOLOv8, YOLOv9, YOLOv10, and YOLOv11).
[Results and Discussions] The YOLO-series models performed strongly and consistently across the three detection tasks for E. grisescens larvae. In the full-instar detection task, which covered all larval stages from the 1st to the 4th instar, the best-performing YOLO model achieved an average mAP@50 of 0.904. In the grouped-stage detection task, where larvae were classified into early (1st–2nd) and late (3rd–4th) instar groups, the highest mAP@50 was 0.862, showing that the model could distinguish the two developmental groups with reasonable accuracy. In the more challenging fine-grained detection task, which required discriminating each instar individually, the best mAP@50 reached 0.697, demonstrating that stage-level classification is feasible despite subtle morphological differences.
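For readers unfamiliar with the metric, the "@50" in mAP@50 means that a predicted box counts as correct only if its class matches the ground truth and its intersection over union (IoU) with the ground-truth box is at least 0.5. The sketch below illustrates only this matching criterion; the function names are illustrative, and a full mAP computation additionally requires one-to-one matching of predictions to ground truths and integration of the precision-recall curve per class.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred_box, pred_cls, gt_box, gt_cls, threshold=0.5):
    """True if a prediction satisfies the class and IoU criterion used by mAP@50."""
    return pred_cls == gt_cls and iou(pred_box, gt_box) >= threshold
```

For example, a prediction shifted by 2 pixels against a 10 by 10 ground-truth box has an IoU of 80/120 and passes, whereas one shifted by 5 pixels has an IoU of 50/150 and does not.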
The proposed semi-automated data optimization strategy improved performance, particularly for YOLOv8, whose mAP@50 rose by 0.024, 0.027, and 0.022 (absolute) in the full-instar, grouped-stage, and fine-grained detection tasks, respectively. These gains underscored the effectiveness of dataset refinement in addressing data imbalance and noise. Incorporating the controllable diffusion model further improved all YOLO variants. YOLOv10 showed the largest gain, with its average mAP@50 across the three detection tasks increasing from 0.811 to 0.821. This improvement was statistically significant according to a paired t-test (p < 0.05), suggesting that the synthetic images generated by the diffusion model enriched the training data and improved model generalization. Among all evaluated models, YOLOv9 achieved the best overall performance in detecting E. grisescens larvae, attaining top mAP@50 scores of 0.909, 0.869, and 0.702 in the full-instar, grouped-stage, and fine-grained detection tasks, respectively. Averaged across all tasks, YOLOv9 reached a mean mAP@50 of 0.826 and a macro F1-score of 0.767, indicating the best balance between precision and recall among the evaluated models.
[Conclusion] This study demonstrated that integrating a controllable diffusion model with deep learning enabled accurate field-level instar detection of E. grisescens, providing a reliable theoretical and technical foundation for intelligent pest monitoring in tea plantations.
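The task-averaged mAP@50 and the macro F1-score reported above are both simple unweighted means. The sketch below shows how such summaries are typically computed; it is an illustration under that assumption, not the authors' evaluation code. The three mAP@50 values are those reported for YOLOv9, while the precision and recall pairs in the usage note are hypothetical.

```python
def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def macro_f1(per_class):
    """Macro F1: the unweighted mean of per-class F1 scores.

    per_class is an iterable of (precision, recall) pairs, one per class.
    """
    return mean(f1(p, r) for p, r in per_class)


# mAP@50 values reported for YOLOv9 in the three detection tasks.
yolov9_map50 = {"full_instar": 0.909, "grouped_stage": 0.869, "fine_grained": 0.702}
task_mean = mean(yolov9_map50.values())
```

Averaging the three rounded task scores gives approximately 0.827, which agrees with the reported 0.826 to within the rounding of the per-task values. As a hypothetical usage example, `macro_f1([(1.0, 1.0), (0.5, 0.5)])` evaluates to 0.75, because every class contributes equally regardless of how many instances it has.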

Key words: Ectropis grisescens larvae, object detection, tea canopy environments, YOLO, diffusion models

CLC Number: