
Smart Agriculture


A Grading Detection Method for Post-Harvest Asparagus Based on Improved YOLOv11

YANG Qiliang1,2,3, YU Lu1,2,3, LIANG Jiaping1,2,3()   

  1. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
    2. Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources, Kunming 650500, Yunnan, China
    3. Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment, Kunming 650500, Yunnan, China
  • Received: 2025-01-24 Online: 2025-06-03
  • Foundation items:
    Young Scientists Fund of the National Natural Science Foundation of China (52209055); Yunnan Fundamental Research Projects (202501AU070148); Yunnan Province "Xing Dian Ying Talent Support Program" Young Talent Special Project (KKXX202423032); Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources (202449CE340014); Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment (202403AP140007)
  • About author:

    YANG Qiliang, Ph.D., Professor; research interest: agricultural informatization. E-mail:

  • Corresponding author:
    LIANG Jiaping, Ph.D., Associate Professor; research interest: agricultural informatization. E-mail:

Grading Asparagus officinalis L. Using Improved YOLOv11

YANG Qiliang1,2,3, YU Lu1,2,3, LIANG Jiaping1,2,3()   

  1. Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming 650500, China
    2. Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources, Kunming 650500, China
    3. Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment, Kunming 650500, China
  • Received: 2025-01-24 Online: 2025-06-03
  • Foundation items: Young Scientists Fund of the National Natural Science Foundation of China (52209055); Yunnan Fundamental Research Projects (202501AU070148); Yunnan Province "Xing Dian Ying Talent Support Program" Young Talent Special Project (KKXX202423032); Yunnan Key Laboratory of Efficient Utilization and Intelligent Control of Agricultural Water Resources (202449CE340014); Yunnan International Joint Laboratory of Intelligent Agricultural Engineering Technology and Equipment (202403AP140007)
  • About author:

    YANG Qiliang, E-mail:

  • Corresponding author:
    LIANG Jiaping, E-mail:

Abstract:

[Objective/Significance] To address the high cost and low efficiency of manually grading post-harvest asparagus before sale, a grading method based on an improved YOLOv11 model was proposed, aiming to develop a lightweight model for accurate grading of post-harvest asparagus. [Methods] First, an efficient channel attention (ECA) mechanism was introduced at the twelfth layer of the backbone network; by dynamically adjusting channel weights in the convolutional neural network, ECA strengthens the extraction of asparagus stem-diameter features. Second, a slim-neck module and a bi-directional feature pyramid network (BiFPN) module were introduced into the neck network. The slim-neck module replaces conventional convolution with GSConv and replaces the C3k2 module with the lightweight VoVGSCSP cross-stage partial module, reducing computation and model size while improving recognition accuracy; the BiFPN module changes the original feature-fusion scheme so that key asparagus features are automatically emphasized and redundant computation is reduced. Finally, the original YOLOv11 detection head was replaced with the EfficientDet head and trained jointly with BiFPN, making full use of multi-scale features and effectively improving model performance. [Results and Discussions] The improved YOLOv11 achieved a precision of 96.8%, a recall of 96.9%, a mean average precision (mAP) of 92.5%, 4.6 GFLOPs, 1.67 × 10⁶ parameters, and a model size of 3.6 MB. Compared with the original YOLOv11 model, precision, recall, and mAP increased by 2.6, 1.4, and 2.2 percentage points, respectively, while floating-point operations, parameter count, and model size all decreased markedly. Compared with other deep models (SSD, YOLOv5s, YOLOv8n, YOLOv11, and YOLOv12), the improved YOLOv11 delivered the best overall performance. [Conclusions] In the asparagus grading task, the improved YOLOv11 model shows better recognition, fewer parameters and floating-point operations, and a smaller model size, providing a theoretical basis for intelligent grading of post-harvest asparagus.
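The channel-reweighting idea behind the ECA module introduced in the backbone can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the learned 1-D convolution across pooled channel descriptors is replaced here by a fixed averaging kernel, purely to show how per-channel sigmoid gates rescale a feature map.

```python
import numpy as np

def eca(x: np.ndarray, k: int = 3) -> np.ndarray:
    """Sketch of Efficient Channel Attention for a single feature map.

    x: feature map of shape (C, H, W).
    k: 1-D kernel size over the channel axis (learned in the real module).
    """
    c = x.shape[0]
    y = x.mean(axis=(1, 2))                      # global average pool -> (C,)
    pad = k // 2
    yp = np.pad(y, pad, mode="edge")             # pad channel descriptor vector
    w = np.ones(k) / k                           # stand-in for the learned kernel
    z = np.array([np.dot(w, yp[i:i + k]) for i in range(c)])
    gate = 1.0 / (1.0 + np.exp(-z))              # sigmoid gate per channel, in (0, 1)
    return x * gate[:, None, None]               # reweight each channel
```

Because each gate lies in (0, 1), ECA attenuates less informative channels while leaving the tensor shape unchanged, which is why it adds almost no parameters to the backbone.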

Key words: asparagus, object detection, image recognition, intelligent grading, YOLOv11

Abstract:

[Objective] Asparagus officinalis L. is a perennial plant with a long harvesting cycle and a fast growth rate. The harvesting period of the tender stems is relatively concentrated, and their shelf life is very short. Harvested asparagus therefore needs to be graded by specification within a short time and then packaged and sold. At present, however, grading basically depends on manual work; sorting asparagus of different specifications by sensory inspection is difficult and requires substantial money and labor. To save labor costs, a grading algorithm based on asparagus stem diameter was developed using deep learning and computer vision technology. The method selected YOLOv11 as the baseline model and made several improvements, aiming to produce a lightweight model for accurate grading of post-harvest asparagus. [Methods] The dataset was obtained by photographing post-harvest asparagus with a cell phone at fixed camera positions. To improve the generalization ability of the model, the training set was augmented by increasing contrast, mirroring, and adjusting brightness. The augmented training set contained 2 160 images for training the model, while the test set and validation set contained 90 and 540 images, respectively, for inference and validation. To enhance the performance of the improved model, the following four improvements were made to the baseline model. First, the efficient channel attention (ECA) module was added to the twelfth layer of the YOLOv11 backbone network. ECA enhanced asparagus stem-diameter feature extraction by dynamically adjusting channel weights in the convolutional neural network and improved the recognition accuracy of the model. Second, the bi-directional feature pyramid network (BiFPN) module was integrated into the neck network. 
This module modified the original feature-fusion method to automatically emphasize key asparagus features and improved grading accuracy through multi-scale feature fusion. Moreover, BiFPN dynamically adjusted the importance of each layer to reduce redundant computation. Next, the slim-neck module was applied to optimize the neck network. The slim-neck module consisted of GSConv and VoVGSCSP: the GSConv module replaced the traditional convolution, and the VoVGSCSP module replaced the C3k2 module. This optimization reduced computational cost and model size while improving recognition accuracy. Finally, the original YOLOv11 detection head was replaced with an EfficientDet head, which is lightweight and accurate. The head was trained jointly with BiFPN to enhance multi-scale fusion and improve the performance of the model. [Results and Discussions] To verify the validity of the individual modules introduced in the improved YOLOv11 model and the superiority of its performance, ablation experiments and comparison experiments were conducted. A comparison of different attention mechanisms added to the baseline model showed that the ECA module outperformed the alternatives in the post-harvest asparagus grading task; YOLOv11-ECA achieved higher recognition accuracy with a smaller model size, supporting the choice of the ECA module. Ablation experiments demonstrated that the improved YOLOv11 achieved 96.8% precision (P), 96.9% recall (R), and 92.5% mean average precision (mAP), with 4.6 GFLOPs, 1.67 × 10⁶ parameters, and a 3.6 MB model. The asparagus grading test indicated that the localization boxes of the improved model were more accurate and had higher confidence levels. 
Compared with the original YOLOv11 model, the improved YOLOv11 model increased precision, recall, and mean average precision by 2.6, 1.4, and 2.2 percentage points, respectively, while the floating-point operations, parameter count, and model size were reduced by 1.7 G, 9.1 × 10⁵, and 2.2 MB, respectively. Moreover, the various improvements increased the accuracy of the model while keeping it lightweight. In addition, comparative tests showed that the improved YOLOv11 model performed better than SSD, YOLOv5s, YOLOv8n, YOLOv11, and YOLOv12. Overall, the improved YOLOv11 had the best overall performance, but it still had some shortcomings: its inference speed was not optimal, being inferior to that of YOLOv5s and YOLOv8n. On this basis, a Wilcoxon signed-rank test was used to compare the inference speeds of the improved YOLOv11 and the original YOLOv11. The test results showed that the improved YOLOv11 achieved a significant improvement in inference speed over the original YOLOv11 model. [Conclusions] The improved YOLOv11 model demonstrated better recognition, fewer parameters and floating-point operations, and a smaller model size in the asparagus grading task. The improved YOLOv11 provides a theoretical foundation for intelligent post-harvest asparagus grading; deploying it on asparagus grading equipment would enable fast and accurate grading of post-harvest asparagus.
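The Wilcoxon signed-rank comparison of inference speeds can be reproduced in outline as follows. This is a simplified sketch using the normal approximation (no tie averaging or continuity correction), applied to hypothetical per-image timings rather than the paper's measurements.

```python
import math
import numpy as np

def wilcoxon_signed_rank(a, b):
    """One-sided Wilcoxon signed-rank test via the normal approximation.

    Small p supports H1: values in a tend to be smaller than those in b.
    Simplified: zero differences are dropped; ties in |d| are not averaged.
    """
    d = np.asarray(a, float) - np.asarray(b, float)
    d = d[d != 0]                                   # discard zero differences
    n = d.size
    ranks = np.argsort(np.argsort(np.abs(d))) + 1   # ranks of |d|, 1..n
    w_pos = ranks[d > 0].sum()                      # rank sum of positive differences
    mu = n * (n + 1) / 4                            # mean of W under H0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_pos - mu) / sigma
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))        # P(Z <= z)
    return w_pos, p

# Hypothetical per-image inference times (ms); real values would come from timing runs.
rng = np.random.default_rng(0)
t_base = rng.normal(8.0, 0.3, size=30)                      # original YOLOv11
t_improved = t_base - 1.0 + rng.normal(0.0, 0.1, size=30)   # consistently faster model
w, p = wilcoxon_signed_rank(t_improved, t_base)             # small p -> significant speed-up
```

A paired test like this is appropriate here because both models are timed on the same images, so each difference removes per-image variation that an unpaired test would treat as noise.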

Key words: asparagus, object detection, image recognition, intelligent grading, YOLOv11

CLC number: