
Smart Agriculture

   

Multi-scale Tea Leaf Disease Detection Based on Improved YOLOv11n

XIAO Ruihong1, TAN Lixin1,2, WANG Rifeng3, SONG Min1,4, HU Chengxi5

    1. College of Information and Intelligence, Hunan Agricultural University, Changsha 410125, China
    2. School of Electrical and Electronic Engineering, Hunan College of Information, Changsha 410200, China
    3. School of Artificial Intelligence, Guangxi Science & Technology Normal University, Laibin 546199, China
    4. Changsha Preschool Education College, Changsha 410000, China
    5. Hunan Software Vocational and Technical University, Xiangtan 411100, China
  • Received: 2025-09-05 Online: 2025-11-14
  • Foundation items: Guangxi Science and Technology Program (AD23026282)
  • Corresponding authors:
    1. TAN Lixin, E-mail: ;
    2. WANG Rifeng, E-mail: .

Abstract:

[Objective] Preventing and containing leaf diseases is a critical component of tea production, and accurate identification and localization of symptoms are essential for modern, automated plantation management. Field inspection in tea gardens poses distinctive challenges for vision-based algorithms: targets appear at widely varying scales and morphologies against complex backgrounds and at unfixed acquisition distances, conditions that easily mislead detectors. Models trained on standardized datasets with uniform distance and background often underperform, leading to false alarms and missed detections. To support method development under realistic constraints, YOLO-SADMFA (YOLO Switchable Atrous Dynamic Multi-scale Frequency-aware Adaptive), a detector based on the YOLOv11n backbone, was proposed. The architecture aims to preserve fine details during repeated re-sampling (down- and up-sampling), strengthen the modeling of lesions at varying scales, and refine multi-scale feature fusion. [Methods] The proposed architecture incorporates additional convolutional, feature extraction, upsampling, and detection head stages to better handle multi-scale representations, and introduces a DMF-Upsample (Dynamic Multi-scale Frequency-aware Upsample) module that performs upsampling through multi-scale feature analysis and dynamic, frequency-aware fusion. This module enables efficient multi-scale feature integration while effectively mitigating information loss during up- and down-sampling. Concretely, DMF-Upsample analyzes multi-frequency responses from adjacent pyramid levels and fuses them with dynamically learned frequency-selective weights, preserving high-frequency lesion boundaries and textures while retaining low-frequency contextual structure such as leaf contours and global shading. A lightweight gating mechanism estimates per-location and per-channel coefficients to regulate the contribution of each band, and a residual bypass preserves identity information to further reduce the aliasing and oversmoothing introduced by repeated resampling. Furthermore, the baseline C3k2 block is replaced with a switchable atrous convolution (SAConv) module, which enhances multi-scale feature capture by combining outputs from different dilation rates and incorporates a weight-locking mechanism to improve model stability and performance. In practice, SAConv aggregates parallel atrous branches at multiple dilation factors through learned coefficients under weight locking, which expands the effective receptive field without sacrificing spatial resolution and suppresses gridding artifacts while incurring only modest parameter overhead. Lastly, an adaptive spatial feature fusion (ASFF) mechanism is integrated into the detection head, forming an ASFF-Head that learns spatially varying fusion weights across feature scales, effectively filters conflicting information, and strengthens the model's robustness and overall detection accuracy. Together, these components form a deeper yet efficient multi-scale pathway suited to complex field scenes. [Results and Discussions] Compared with the original YOLOv11n model, YOLO-SADMFA improved precision, recall, and mAP by 4.4, 8.4, and 3.7 percentage points, respectively, indicating more reliable identification and localization across diverse field scenes.
The detector was particularly effective for multi-scale targets where the lesion area occupied approximately 10% to 65% of the image, reflecting the variability introduced by unfixed acquisition distance during tea garden patrols. Under low illumination and in complex backgrounds with occlusions and clutter, it maintained stable performance, reduced both missed detections and false alarms, and effectively distinguished disease categories with similar morphology and color. On edge computing devices, it sustained about 161 frames per second (FPS), which met the real-time requirements of mobile inspection robots and portable systems. These outcomes demonstrated strengthened robustness to background interference and improved sensitivity at extreme scales, consistent with practical demands where the acquisition distance was not fixed. From an ablation perspective, DMF-Upsample preserved high-frequency lesion boundaries while retaining low-frequency structural context after resampling, SAConv expanded receptive fields through multi-dilation aggregation under a weight-locking mechanism, and the ASFF-Head mitigated conflicts among feature pyramid levels. Their combination yielded cumulative gains in stability and accuracy. Qualitative analyses further supported the quantitative results: boundary localization improved for small, speckled lesions, large blotches were captured with fewer spurious edges, and distractors such as veins, shadows, and soil textures were less frequently misclassified, confirming the benefits of dynamic multi-scale frequency-aware fusion and adaptive spatial weighting in real field conditions. [Conclusions] The proposed YOLO-SADMFA effectively addressed the multi-scale disease detection challenge in complex tea garden environments, where the acquisition distance was not fixed, lesion morphology and color were diverse, and cluttered backgrounds easily caused misjudgments and omissions. It significantly improved detection accuracy and robustness relative to the original YOLOv11n model across a wide range of target scales, and it maintained stable performance under the low illumination and complex backgrounds typical of field inspections. It provided reliable technical support for automated tea leaf disease inspection systems by enabling accurate localization and identification of lesions in real operating conditions and by sustaining real-time inference on edge devices suitable for patrol-style deployment. It therefore had important application value for promoting the intelligent development of the tea industry, supporting large-scale, standardized, and continuous field monitoring and management, and offering a practical foundation for engineering implementation and subsequent optimization in real-world tea plantations.
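To make the frequency-aware upsampling idea concrete, the following PyTorch sketch shows one plausible reading of the DMF-Upsample description above: the feature map is split into a low-frequency band (blurred context) and a high-frequency residual, both bands are upsampled, and a lightweight gate predicts per-channel, per-location weights that recombine them on top of an identity bypass. The class name, the average-pool band split, and all layer sizes are illustrative assumptions rather than the published module.

# Minimal sketch of a frequency-aware upsampling block (illustrative only;
# not the authors' DMF-Upsample implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrequencyAwareUpsample(nn.Module):
    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.scale = scale
        # Dynamic gate: per-channel, per-location weights for the two bands.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1),
            nn.SiLU(),
            nn.Conv2d(channels // 2, 2 * channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        def up(t):
            return F.interpolate(t, scale_factor=self.scale,
                                 mode="bilinear", align_corners=False)

        low = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)  # low-frequency context
        high = x - low                                             # high-frequency detail
        low_up, high_up, x_up = up(low), up(high), up(x)
        # Frequency-selective weights in [0, 1], one map per band and channel.
        w_low, w_high = torch.chunk(torch.sigmoid(self.gate(x_up)), 2, dim=1)
        # Weighted recombination plus an identity bypass to limit oversmoothing.
        return w_low * low_up + w_high * high_up + x_up

if __name__ == "__main__":
    p4 = torch.randn(1, 128, 40, 40)              # dummy P4-level feature map
    print(FrequencyAwareUpsample(128)(p4).shape)  # torch.Size([1, 128, 80, 80])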

Key words: YOLO, tea leaf diseases, object detection, DMF-Upsample, ASFF
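The switchable atrous convolution can be sketched in a similar spirit: two dilated 3×3 branches reuse ("lock") a single shared weight tensor and are blended per location by a learned switch derived from local average pooling. The dilation rates (1 and 3), the sigmoid switch, and the exact weight-locking scheme below are assumptions for illustration, not the authors' implementation.

# Minimal sketch of a switchable atrous convolution (illustrative only;
# not the paper's SAConv replacement for the C3k2 block).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchableAtrousConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, dilations=(1, 3)):
        super().__init__()
        self.dilations = dilations
        # Weight locking: a single 3x3 kernel reused by every dilation branch.
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, 3, 3))
        nn.init.kaiming_normal_(self.weight, mode="fan_out", nonlinearity="relu")
        # Lightweight switch: local context -> per-location blending coefficient.
        self.switch = nn.Sequential(
            nn.AvgPool2d(kernel_size=5, stride=1, padding=2),
            nn.Conv2d(in_ch, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.switch(x)  # (B, 1, H, W) blending weights
        branches = [F.conv2d(x, self.weight, padding=d, dilation=d)
                    for d in self.dilations]
        # Small receptive field where s -> 1, large receptive field where s -> 0.
        out = s * branches[0] + (1.0 - s) * branches[1]
        return self.act(self.bn(out))

if __name__ == "__main__":
    feat = torch.randn(1, 64, 80, 80)                # dummy P3-level feature map
    print(SwitchableAtrousConv(64, 64)(feat).shape)  # torch.Size([1, 64, 80, 80])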

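Finally, the adaptive spatial feature fusion behind the ASFF-Head can be approximated as follows: each pyramid level is resized to a common resolution, a 1×1 convolution per level predicts an unnormalized spatial weight map, and a softmax across levels yields fusion weights that sum to one at every position, which is how conflicting information between scales gets filtered. Equal channel counts across levels and fusion at the finest level are simplifying assumptions of this sketch.

# Minimal sketch of adaptive spatial feature fusion (illustrative only;
# channel alignment and the detection branches of the real head are omitted).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASFFFusion(nn.Module):
    def __init__(self, channels: int, levels: int = 3):
        super().__init__()
        # One 1x1 conv per level predicts an unnormalized spatial weight map.
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(levels)]
        )

    def forward(self, feats):
        # Fuse at the resolution of the first (finest) level.
        target_hw = feats[0].shape[-2:]
        resized = [f if f.shape[-2:] == target_hw
                   else F.interpolate(f, size=target_hw, mode="bilinear",
                                      align_corners=False)
                   for f in feats]
        # Softmax across levels -> spatially varying weights that sum to 1,
        # filtering conflicting information between scales.
        logits = torch.cat([conv(f) for conv, f in zip(self.weight_convs, resized)],
                           dim=1)
        weights = torch.softmax(logits, dim=1)        # (B, levels, H, W)
        return sum(weights[:, i:i + 1] * resized[i] for i in range(len(resized)))

if __name__ == "__main__":
    p3, p4, p5 = (torch.randn(1, 64, s, s) for s in (80, 40, 20))
    print(ASFFFusion(64)([p3, p4, p5]).shape)         # torch.Size([1, 64, 80, 80])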