
Smart Agriculture

Grading Asparagus officinalis L. Using Improved YOLOv11

YANG Qiliang1,2,3, YU Lu1,2,3, LIANG Jiaping1,2,3

  1. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500, China
    2. Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources, Kunming 650500, China
    3. Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment, Kunming 650500, China
  • Received: 2025-01-24  Online: 2025-06-03
  • Foundation items: China National Funds for Distinguished Young Scientists (52209055); Yunnan Fundamental Research Projects (202501AU070148); Yunnan Province "Xing Dian Ying Talent Support Program" Young Talent Special Project (KKXX202423032); Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources (202449CE340014); Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment (202403AP140007)
  • About author:

    YANG Qiliang, E-mail:

  • Corresponding author:
    LIANG Jiaping, E-mail:

Abstract:

[Objective] Asparagus officinalis L. is a perennial plant with a long harvesting cycle and a fast growth rate. The harvesting period of the tender stems is concentrated and their shelf life is very short, so harvested asparagus must be graded by specification, packaged, and sold within a short time. At present, however, grading is done almost entirely by hand; distinguishing specifications by sensory judgment is difficult and consumes considerable money and labor. To reduce labor costs, a grading algorithm based on asparagus stem diameter was developed using deep learning and computer vision. YOLOv11 was selected as the baseline model and several improvements were made to it, with the aim of producing a lightweight model for the accurate grading of post-harvest asparagus. [Methods] The dataset was collected by photographing post-harvest asparagus with a smartphone at fixed camera positions. To improve the generalization ability of the model, the training set was augmented by increasing contrast, mirroring, and adjusting brightness. The augmented training set contained 2 160 images for model training, while the test and validation sets contained 90 and 540 images, respectively, for inference and validation. To enhance the performance of the model, four improvements were made to the baseline. First, the efficient channel attention (ECA) module was added to the twelfth layer of the YOLOv11 backbone network. ECA enhanced stem-diameter feature extraction by dynamically adjusting channel weights in the convolutional neural network, improving the recognition accuracy of the model. Second, the bi-directional feature pyramid network (BiFPN) module was integrated into the neck network.
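The ECA step described above (global average pooling, a cross-channel 1-D convolution, and a sigmoid gate that reweights each channel) can be sketched in NumPy. The kernel weights and the feature-map shape below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def eca_attention(x, kernel):
    """Efficient channel attention over a (C, H, W) feature map.

    x      : feature map, shape (C, H, W)
    kernel : 1-D convolution weights of odd length (e.g. k = 3)
    """
    y = x.mean(axis=(1, 2))                  # global average pooling -> (C,)
    z = np.convolve(y, kernel, mode="same")  # local cross-channel interaction
    w = 1.0 / (1.0 + np.exp(-z))             # sigmoid -> per-channel weights in (0, 1)
    return x * w[:, None, None]              # rescale each channel by its weight

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))        # toy feature map: 8 channels, 4x4
out = eca_attention(feat, np.array([0.25, 0.5, 0.25]))
print(out.shape)                             # (8, 4, 4): spatial layout preserved
```

Because the sigmoid outputs lie in (0, 1), the module only rescales channels and never changes the spatial layout, which is why ECA adds almost no parameters beyond the k convolution weights.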
The BiFPN module replaced the original feature fusion method so that key asparagus features were automatically emphasized, and it improved grading accuracy through multi-scale feature fusion. BiFPN also dynamically adjusted the importance of each layer to reduce redundant computation. Next, the slim-neck module, consisting of GSConv and VOVGSCSP, was applied to optimize the neck network: GSConv replaced the traditional convolution, and VOVGSCSP replaced the C3k2 module. This optimization reduced the computational cost and model size while improving recognition accuracy. Finally, the original YOLOv11 detection head was replaced with the EfficientDet head, which is both lightweight and accurate. This head was co-trained with BiFPN to strengthen multi-scale fusion and improve the performance of the model. [Results and Discussions] Ablation and comparison experiments were conducted to verify, respectively, the validity of the individual modules introduced into the improved YOLOv11 and the superiority of the improved model. A comparison of different attention mechanisms added to the baseline showed that the ECA module outperformed the alternatives in the post-harvest asparagus grading task: YOLOv11-ECA achieved higher recognition accuracy with a smaller model size, supporting the choice of ECA. Ablation experiments demonstrated that the improved YOLOv11 achieved 96.8% precision (P), 96.9% recall (R), and 92.5% mean average precision (mAP), with 4.6 GFLOPs, 1.67 × 10⁶ parameters, and a 3.6 MB model. The asparagus grading test indicated that the localization boxes of the improved model were more accurate and had higher confidence.
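BiFPN's learnable weighting of its input features is commonly implemented as "fast normalized fusion": non-negative per-input weights normalized by their sum. A minimal NumPy sketch, with toy arrays standing in for already-resized pyramid levels (the weight values are illustrative):

```python
import numpy as np

def fast_normalized_fusion(feats, weights, eps=1e-4):
    """Fuse same-shaped feature maps with learnable non-negative weights.

    feats   : list of arrays, all the same shape (already resized)
    weights : one scalar weight per input feature
    """
    w = np.maximum(np.asarray(weights, float), 0.0)   # ReLU keeps weights >= 0
    fused = sum(wi * f for wi, f in zip(w, feats))
    return fused / (w.sum() + eps)                    # normalize by the weight sum

a = np.ones((2, 2))
b = 3 * np.ones((2, 2))
fused = fast_normalized_fusion([a, b], [1.0, 1.0])
print(fused)   # close to 2.0 everywhere: equal weights average the inputs
```

During training the weights are learned, so the network itself decides how much each scale contributes at every fusion node; the small eps avoids division by zero without the cost of a softmax.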
Compared with the original YOLOv11, the improved model increased precision, recall, and mean average precision by 2.6, 1.4, and 2.2 percentage points, respectively, while reducing floating-point operations, parameter count, and model size by 1.7 GFLOPs, 9.1 × 10⁵, and 2.2 MB, respectively. Moreover, each improvement raised the accuracy of the model while keeping it lightweight. Comparative tests further showed that the improved YOLOv11 outperformed SSD, YOLOv5s, YOLOv8n, YOLOv11, and YOLOv12. Overall, the improved YOLOv11 had the best overall performance, although some shortcomings remained: its inference speed was not optimal, being inferior to that of YOLOv5s and YOLOv8n. On this basis, a statistical test was used to compare the inference speeds of the improved and original YOLOv11 models; the Wilcoxon signed-rank test showed that the improved YOLOv11 achieved a significant improvement in inference speed over the original model. [Conclusions] The improved YOLOv11 model demonstrated higher recognition accuracy, fewer parameters and floating-point operations, and a smaller model size in the asparagus grading task. It provides a theoretical foundation for intelligent post-harvest asparagus grading; deployed on grading equipment, it enables fast and accurate grading of post-harvest asparagus.
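The Wilcoxon signed-rank test used to compare the two models' inference speeds reduces to ranking the absolute per-image timing differences and summing the ranks on the positive and negative sides. A minimal NumPy sketch; the timing values below are made up for illustration and are not the paper's measurements:

```python
import numpy as np

def signed_rank_sums(x, y):
    """Return (W+, W-) for paired samples x, y (zero differences dropped)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    d = d[d != 0]                          # discard pairs with zero difference
    ad = np.abs(d)
    order = np.argsort(ad)
    ranks = np.empty(len(ad))
    ranks[order] = np.arange(1, len(ad) + 1)
    for v in np.unique(ad):                # average the ranks of tied |d| values
        mask = ad == v
        ranks[mask] = ranks[mask].mean()
    return ranks[d > 0].sum(), ranks[d < 0].sum()

# hypothetical per-image inference times (ms): baseline vs. improved model
base = [12.1, 11.8, 12.5, 12.0, 12.3]
improved = [11.5, 11.9, 11.7, 11.6, 12.3]
w_plus, w_minus = signed_rank_sums(base, improved)
print(w_plus, w_minus)                     # -> 9.0 1.0
```

The test statistic is the smaller of the two rank sums; in practice it is compared with a critical value or converted to a p-value (e.g. via `scipy.stats.wilcoxon`) to judge whether the speed difference is significant.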

Key words: asparagus, object detection, image recognition, intelligent grading, YOLOv11
